Jon Udell has been thinking so furiously about mixing namespaces and the meaning of markup that I imagine a visible swirl of superheated brain energy above his home office. I think that this whole area of thought is what over in the W3C TAG we refer to as a “rat-hole”: something you can vanish down never to re-appear, or at least a place where you can waste a lot of time scurrying along twisty little passages. Herewith some (I hope) demystification.

What Does Markup Mean? · There is typically all sorts of meaning associated with markup, and in some cases it’s obvious. For example, there is really no room to argue over the meaning of HTML’s <b> or <br/> elements.

Where does the meaning come from? I’m only aware of two ways for markup to take on semantic weight:

  1. The designer of the markup asserts that some tag or attribute is used to identify content with a particular semantic.

  2. The broad community of authors and programmers exhibits consensus as to the semantic of an item of markup.

The second of these is not just a theoretical formulation: the venerable <ul> tag has been around for a long time in a succession of dialects including HTML. The name is an abbreviation for “unordered list,” and at one point the idea was that you could treat the contents like a relational table and sort them any way that was convenient. Due to a cascade of implementations (I first saw it in Mosaic), <ul> eventually grew the semantic of “list with bullets.”

I can’t remember off the top of my head what the official HTML docs say about <ul>, but there’s really no point looking, because the semantics are locked down by a hundred million deployed implementations and a few million human HTML authors.

How is Meaning Communicated? · I know of very few widely-deployed applications where the meaning of markup is expressed formally or machine-readably. In practice, the only way to communicate the meaning of markup is human mind to human mind, either via designer assertion (“RTFM”) or user observation (“View Source”).

This should not be surprising, because at this point in history, only a few decades into the quest for intelligent machines, semantics is something that humans do.

Namespaces · Right now, in the context of the Pie/Echo/Atom/whatever project, people assert that crystallizing the meaning of embedded namespaces is the key to interoperability, the central problem, and so on. Huh? When someone proposes markup from another namespace for inclusion in a syndication feed, there are three possible outcomes:

  1. Nobody pays attention and it isn’t much adopted.

  2. It gets widely adopted, with semantics along the lines originally proposed.

  3. It gets widely adopted, with some semantic drift away from the original proposal becoming evident in the implementations. (Note that this has already happened with some RSS 2.0 markup.)

Oddly enough, this is exactly what will happen with proposed tags and attributes that aren’t in a different namespace.
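
To make that concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The feed fragment, the example.org namespace URI, the element names, and the MEANINGS table are all invented for illustration; none of them come from any real feed format. The point it tries to show is that, to the software, a namespaced element and a plain one both arrive as string labels, and any meaning has to be supplied from outside by whoever wrote it down.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment: the element names and the example.org
# namespace URI are made up for this sketch.
FEED = """
<entry xmlns:ext="http://example.org/ns/ext">
  <title>Markup and meaning</title>
  <ext:mood>skeptical</ext:mood>
</entry>
"""

# The "semantics" live out here, written by humans and keyed by label.
# Nothing in the XML itself carries them.
MEANINGS = {
    "title": "human-readable name of the entry",
    "{http://example.org/ns/ext}mood": "author's mood, per the extension's docs",
}


def labels(xml_text):
    """Return (label, text) pairs for each child of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


def describe(xml_text):
    """Pair each label with whatever meaning humans have recorded for it."""
    return [
        (label, MEANINGS.get(label, "unknown: go read the docs"))
        for label, _ in labels(xml_text)
    ]


if __name__ == "__main__":
    for label, meaning in describe(FEED):
        print(label, "->", meaning)
```

ElementTree reports the extension element as the namespace URI in braces followed by the local name, which is a longer label than "title" but still just a label. Whether either one gets adopted, or drifts, is decided by the people who fill in and consult the equivalent of that table.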

Just Labels · At the end of the day, markup is just a bunch of labels. We should be grateful that XML makes them (somewhat) human-readable and internationalized, and try to write down what we want them to mean as clearly and cleanly as we can, with a view to the needs of the downstream implementors and users.

But we shouldn’t try to kid ourselves that meaning is inherent in those pointy brackets, and we really shouldn’t pretend that namespaces make a damn bit of difference.


August 11, 2003
