[ skins -> C vs. Scheme semantics ]
Where did I say anything about bridging incompatible semantics? How
would this be useful? Please, that's just a complete straw man. I
mean, I say I have a perfectly fine car and you say it doesn't fly
you to the moon. Well, flying to the moon has never been in the
requirements for a perfectly fine car.
Once again, I was talking about "skins", like the different skins
for Squeak-Smalltalk that are now appearing. An XML-based encoding
(marshalling) of parse-trees could serve as a skin-neutral storage
format. And could probably be easily translated into (some) other OO
languages as well.
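To make this concrete: the Smalltalk expression "3 + 4" might be
marshalled along these lines (the element names here are invented
for illustration, not taken from any actual schema):

  <message-send>
    <receiver><literal type="integer">3</literal></receiver>
    <selector>+</selector>
    <argument><literal type="integer">4</literal></argument>
  </message-send>

Each skin would render the same tree back into its own surface
syntax, whether that looks like Smalltalk or like C.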
Anyway, look at SOAP for an example of *easy* interoperability. You
can write SOAP commands using any old text editor, or even via
direct telnet connection. That's very low overhead for trying
something out and doing things in an ad-hoc way.
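For instance, a hand-typed request might look something like this
(host, endpoint and method names are made up for the example, and
I've left out the Content-Length header):

  POST /quote HTTP/1.1
  Host: example.com
  Content-Type: text/xml; charset="utf-8"
  SOAPAction: "urn:example:GetQuote"

  <SOAP-ENV:Envelope
      xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
    <SOAP-ENV:Body>
      <m:GetQuote xmlns:m="urn:example:quotes">
        <symbol>ACME</symbol>
      </m:GetQuote>
    </SOAP-ENV:Body>
  </SOAP-ENV:Envelope>

Type that at "telnet example.com 80" and you can read the reply
right off the socket.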
Doing this has no support for any kind of input-side validity.
Exactly! The point is that extremely light-weight or ad-hoc access
is possible without all that overhead, while at the same time fully
validated access is just as easy as (or easier than) with the
[more of same snipped]
Try the same with a CORBA service.
Why? I can do the same kind of thing from bsh or Tcl or Python
when I want
to play around with CORBA interfaces. The responsibility for
validation is split between client and server.
Compare the overhead of the two solutions:
On the client side, a simple text editor or even just telnet, both
of which are readily available, vs. a whole programming language plus
an integrated ORB, which has to implement IDL-parsing, IIOP
parsing/generation and all the other services, none of which leverage
common technologies but have to be implemented just for CORBA!
On the server side, you can hook up one of the many, many XML
parsers available off the shelf and then just read out the DOM (or
handle the SAX events), again compared to a full CORBA ORB with all
the trappings and no common components.
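As a sketch of how little that takes, here is the read-out in
Python with its stock DOM parser (document contents invented):

  from xml.dom.minidom import parseString

  request_body = "<Order><Quantity>1</Quantity></Order>"
  doc = parseString(request_body)               # off-the-shelf parser
  qty = doc.getElementsByTagName("Quantity")[0] # read out the DOM
  print(qty.firstChild.data)                    # -> 1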
The point here is that the barriers to entry are just incredibly
low. If you want to do something very simple, you can do so with
very simple means. If you want to do something complex or highly
secure, you need to spend more. But the people who want to do
something simple should not be penalized because some people want to
do something complex, and it would be good if both could use the same
basic protocol, and they can.
As for interoperability, the format is easy to
parse/generate with just about any language/OS I can think of.
It's only easy to generate if you don't care about correctness and
robustness. String manipulation is a truly awful basis for the core of
XML parsers and generators are commonly available and easy to write.
If you think XML-parsing should be done by regex-matching or
strcmp(), you get what you deserve. The CERT advisory clearly only
talks about incorrect servers, and you will note that my scenarios
have been about easy, ad-hoc *client* access. Furthermore, the
problem in the CERT advisory is really the insecurity of having
script access enabled. If you have that turned on, you're wide open
to attacks anyhow; the only difference is that the audience you're
wide open to has increased.
Still, I'll wager that a SOAP server requires a fraction of
the code of a fully functioning CORBA ORB. (Not that I am a SOAP
fan, but the general idea of encoding messages as XML has a lot of
appeal, and SOAP is one such spec I am aware of.)
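To put a rough number on that wager, here is about all the
scaffolding a toy SOAP endpoint needs, sketched in Python (no WSDL,
no validation, a canned reply; everything here is my assumption, not
any particular product):

  from http.server import BaseHTTPRequestHandler, HTTPServer
  import xml.etree.ElementTree as ET

  ENV = "{http://schemas.xmlsoap.org/soap/envelope/}"
  REPLY = ('<SOAP-ENV:Envelope xmlns:SOAP-ENV='
           '"http://schemas.xmlsoap.org/soap/envelope/">'
           '<SOAP-ENV:Body><ok/></SOAP-ENV:Body></SOAP-ENV:Envelope>')

  class SoapHandler(BaseHTTPRequestHandler):
      def do_POST(self):
          length = int(self.headers["Content-Length"])
          envelope = ET.fromstring(self.rfile.read(length))
          call = envelope.find(ENV + "Body")[0]   # method element in Body
          self.log_message("call: %s", call.tag)  # dispatch would go here
          self.send_response(200)
          self.send_header("Content-Type", "text/xml")
          self.end_headers()
          self.wfile.write(REPLY.encode("utf-8"))

  HTTPServer(("", 8080), SoapHandler).serve_forever()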
Deep down, the problem is that generating HTML or XML is *not* about
concatenating strings together.
Of course not, whoever said that it is? For programmatic access,
you should build the (simple) libs that do it properly. However, you
*can* reasonably build or edit a simple XML file in a text editor,
which you can't reasonably do with various binary formats, and you
can also *examine* XML files in a text editor.
Generating HTML or XML is about building
trees of elements and then marshalling them into ASCII.
Yes, of course. Where did I say anything different? I always groan
when I see code that concatenates HTML/XML tags to strings in an
ad-hoc fashion. My own XML-framework (for Objective-C) has both a
proper parser and a generator, with the generator being a type of
EncodingFilter.
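In other words, something like this, with Python's stock tree API
standing in for any such generator:

  import xml.etree.ElementTree as ET

  order = ET.Element("Order")                    # build the tree ...
  ET.SubElement(order, "Quantity").text = "1"
  print(ET.tostring(order, encoding="unicode"))  # ... then marshal it
  # -> <Order><Quantity>1</Quantity></Order>

The library, not the programmer, worries about escaping, nesting and
well-formedness.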
OK, so maybe this is a straw man.
You probably agree with me that real XML
and SOAP applications need a library or set of classes to
generate/parse/validate their data, since it's not strings.
My point from the start has been that it is *both* trees of
semantically rich objects *and* strings, though one has to take
care, because these are not simple strings.
But once you
have that library and code to it, the particular advantage over
(say) CORBA doesn't look so one-sided.
Yes it does. You get to leverage common components, the overall
complexity is much lower, you get a human-readable (and debuggable),
standardized interchange format/wire protocol, and you get easy
ad-hoc access.
Some of the magic binary bits are well-known and documented via
AFAIK, the whole Word file format is now documented (somewhere).
That doesn't change the fact that writing custom binary-format
parsers is going to be a lot more complex than (a) just opening the
XML file in a text editor and *looking* at it and/or (b) using one
of dozens/hundreds of off-the-shelf XML tools with it.
Yes, I agree, with a qualification: XML files *can* have readable
syntax. But you still have to understand what that syntax means.
Yes, if the semantics are inherently complex or intentionally
obscured, XML won't solve that, because, as I've explained before,
that's outside of the domain of problems it is trying to solve.
However, with XML you at least have a good chance of understanding
what is there, assuming that the creator of the schema is not an
adversary trying to confuse you.
<Quantity>1</Quantity> vs (hex) fc00000001 or was that fc01000000?
Furthermore, it combines the worlds of UNIX (everything is a text
file) and of OO systems.
This is a strawman (I hope). If everything is a text file, are
vi+pipelines the only applications I use to edit and view things?
I think you misunderstood: the "everything is a text file" is an
assumption that most of the UNIX tools are built around. It makes
many powerful things possible, especially for *ad hoc* processing.
However, it loses out on richer structures, for example those typical
in OO systems. XML is a way of combining the two worlds.
Again, only if you're willing to give up correctness, or do a lot
of work to patch around edge cases. The right way to process
objects is as objects.
The "right" way depends on your application and on the economics of
the situation. For example, say you have two databases and you need
to dump the contents of one into the other, once. They have schemas
that are compatible in principle but differ slightly.
Now, you could (and it seems you argue one should) build complete
class hierarchies modelling the schemas of the two databases, plus
routines reading correct object-graphs out of one database
(potentially running out of memory), transforming the object-graphs
in memory (running out again) and then writing the resulting object
graph to the other database.
Or you could dump one DB out to XML-format (with an off-the-shelf
tool), use an off-the-shelf generic XML-transformation tool that
remaps the file purely syntactically (with your knowledge of the
semantics controlling the syntactic transformations) and import the
resulting XML-file into database two.
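With XSLT as that off-the-shelf tool, the whole remap can be an
identity transform plus one rename rule. Say (purely hypothetically)
database one writes <Qty> where database two expects <Quantity>:

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- copy everything through unchanged -->
    <xsl:template match="@*|node()">
      <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
    </xsl:template>
    <!-- rename just the element that differs -->
    <xsl:template match="Qty">
      <Quantity><xsl:apply-templates select="@*|node()"/></Quantity>
    </xsl:template>
  </xsl:stylesheet>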
Notice that I didn't say "use sed", because that won't work.
Purely syntactic queries are limited to identity queries, which
aren't very useful.
Where do you get that idea? With an XML-aware tool, I could
certainly do more, based on structure, context and values. The user
of the tools can have knowledge (or make assumptions) about the
semantics.
You can move up to queries based on string operations (is
this attribute value in the canonical Unicode sort order between
... and "Weiher"?) but that's not much better.
What would prevent me from doing queries based on structural or
numerical relationships? Structural queries are already in the base
XPointer spec, and XML query languages such as Quilt support rich
queries expanding on those available in SQL. (see ...)
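For example, in base XPath (element names hypothetical):

  //Item[Quantity > 10]   all Items whose Quantity exceeds 10 (numeric)
  /Order/Item[2]          the second Item of an Order (structural)
  //Item[@unit="kg"]      selection on an attribute value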
So what we want is:
o A not-so-awful generic syntax
o self-describing, human-readable, editable, robust
o that has a framework for discussing, documenting, and validating
  vocabularies
o is usable *without* a machine-encoded semantics
o thus providing a low barrier to entry, just like HTML/HTTP
o but scales to higher demands just as well
o A political atmosphere where information and service providers are
  expected to open up and document their formats.
XML hype is *good* as long as it's a tool to accomplish that last
point, but we shouldn't forget that it's just one of many possible
vehicles to get this, and even with full documentation of formats
there is a vast amount of work in creating interoperation, which
often starts with agreeing on common vocabularies.
Yes, in many cases it is, and it is happening. Just
watch www.oasis.org to see industries scrambling to get their common
vocabularies defined. But once again, it is possible to do partial,
but correct, processing of files even when you don't know what half
(or more) of them means.
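A sketch of what I mean, again in Python (file and element names
invented): pull out the one element you understand and let the rest
of the vocabulary stream past, correctly parsed but ignored.

  import xml.etree.ElementTree as ET

  for event, elem in ET.iterparse("orders.xml"):
      if elem.tag == "Quantity":   # the one element we understand
          print(int(elem.text))
      elem.clear()                 # the rest is parsed, then discarded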