XML Qualified Names: Good, Bad or Indifferent?
A couple of days ago, Phil Ringnalda published a short review of Microsoft’s newly unveiled RSS support. Among other things, he noted that the syntax of their sorting element broke his quick-and-dirty parser, because it includes an actual instance of the element being sorted (but with different syntax and semantics) inside the sort element, like so:
<cf:sort>
<title data-type="text">The title of the item</title>
</cf:sort>
Since Phil’s parser assumes that the title tag has specific semantics (i.e. those defined in the RSS spec), this use of the tag in a completely different context causes it to malfunction.
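To make the failure concrete, here is a minimal Python sketch of the kind of shortcut a quick-and-dirty parser might take. It is not Phil's actual code. The feed fragment is invented for illustration, and the namespace URI bound to the cf prefix is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment; the cf namespace URI is a placeholder.
FEED = """\
<rss version="2.0" xmlns:cf="http://example.com/cf">
  <channel>
    <title>Example feed</title>
    <cf:sort>
      <title data-type="text">The title of the item</title>
    </cf:sort>
    <item>
      <title>First post</title>
    </item>
  </channel>
</rss>
"""

root = ET.fromstring(FEED)

# Naive approach: treat every unqualified <title>, at any depth, as an RSS title.
naive_titles = [el.text for el in root.iter("title")]

# Context-aware approach: accept only titles that sit where RSS puts them,
# directly under <channel> or under an <item>.
strict_titles = [el.text for el in root.findall("./channel/title")]
strict_titles += [el.text for el in root.findall("./channel/item/title")]

print(naive_titles)
print(strict_titles)
```

The naive list picks up the title nested inside cf:sort as if it were real feed data, while the context-aware list skips it. The second version only works because it hard-codes knowledge of where RSS titles are allowed to appear.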
Among other wacky stuff going on here is the fact that the RSS 2.0 spec doesn’t define a namespace for the RSS grammar itself, as I discovered to my mild embarrassment when I claimed the opposite in the comments. Nonetheless, I believe it’s fair to say that when there’s a title tag in an RSS document, it can be assumed to be attached to the default namespace, which in this case is sadly lacking an identifying URI. Using the same unique identifier (i.e. title with no namespace) with a totally different meaning is wrong. I wouldn’t beat Microsoft up too much over this, however, as it looks to be a simple oversight.
More contentious was my suggestion to reference the sorted element using a qualified name inside an attribute value (taking the liberty of giving RSS a namespace prefix though it doesn’t really deserve one):
<cf:sort element="rss:title" />
This led to me being torn a new one by Dare Obasanjo, a member of Microsoft’s XML team, who puts forth as evidence a document by the W3C’s Technical Advisory Group (TAG). I proceeded to read the document, with some trepidation since, while I’m not one to shy away from a geek pissing contest, the TAG is the closest thing there is to XML royalty.
Luckily the document states no such thing. In fact, it seems to hint at a schism inside the TAG since, rather than offering a definitive view, it simply observes that using QNames in XML content is already common practice, so we should understand the implications. It then proceeds to pick apart these implications in a very thorough and perspicacious manner.
What it all comes down to is this: why did the XML Namespaces WG decide to use prefixes in QNames in the first place? After all, they could have just mandated that the full namespace URI be used instead, every time a QName occurs. Try typing out a few namespace-compliant XML documents in this way, however, and you’ll see why: it’s massively verbose, and typing a potentially cryptic URI over and over again is pretty darn error-prone. Evidently they felt that the extra processing firepower needed to interpret the prefixes was worth the cost, and despite having written my share of XML processing software that has to manage this added complexity, I agree.
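As a small illustration of what "interpreting the prefixes" means in practice, the Python sketch below parses the same element written three ways. Python's xml.etree.ElementTree reports each tag as an expanded name in the form {uri}local, so the choice of prefix disappears after parsing. The namespace URI is a placeholder.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/cf"  # placeholder namespace URI

# The same element written with two different prefixes and with a default namespace.
with_prefix = ET.fromstring(f'<cf:sort xmlns:cf="{NS}"/>')
other_prefix = ET.fromstring(f'<x:sort xmlns:x="{NS}"/>')
default_ns = ET.fromstring(f'<sort xmlns="{NS}"/>')

# After parsing, all three carry the same expanded name.
tags = {with_prefix.tag, other_prefix.tag, default_ns.tag}
print(tags)
```

The expanded form is exactly the "full URI every time" notation: fine for software, painful to type by hand.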
Okay, so if prefixes make sense in tag and attribute names, why not in XML content? In my opinion, it all comes down to the two warring factions of the XML world: the schema folks (with whom I am firmly allied) and the others (let’s call them “Perl Scripters” for convenience). An XML document containing namespaces encapsulates all the information (i.e. prefix mappings) required to process it. When QNames are placed in XML content, you can still do exactly the same types of automated processing (including the canonicalization example from the TAG paper), but you need a schema so you know where the QNames are.
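Here is a rough Python sketch of that idea, using only the standard library. The QNAME_ATTRS set is a hypothetical stand-in for a schema: it tells the processor which attributes hold QNames. Both namespace URIs are placeholders, and the one for RSS is pure invention, since RSS 2.0 defines none.

```python
import io
import xml.etree.ElementTree as ET

CF = "http://example.com/cf"    # placeholder namespace URI
RSS = "http://example.com/rss"  # placeholder; RSS 2.0 defines no namespace

DOC = f'<cf:sort xmlns:cf="{CF}" xmlns:rss="{RSS}" element="rss:title"/>'

# Hypothetical stand-in for a schema: (element, attribute) pairs whose values are QNames.
QNAME_ATTRS = {(f"{{{CF}}}sort", "element")}


def resolve_qname_attrs(xml_text):
    """Return (namespace URI, local name) for each attribute value known to be a QName."""
    scopes = []    # stack of (prefix, uri) declarations currently in scope
    resolved = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "end-ns", "start")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            scopes.append(payload)      # payload is a (prefix, uri) tuple
        elif event == "end-ns":
            scopes.pop()                # a declaration went out of scope
        else:
            in_scope = dict(scopes)     # inner declarations override outer ones
            for attr, value in payload.attrib.items():
                if (payload.tag, attr) not in QNAME_ATTRS:
                    continue            # without the "schema" this is just a string
                prefix, _, local = value.rpartition(":")
                resolved.append((in_scope.get(prefix), local))
    return resolved


print(resolve_qname_attrs(DOC))
```

The prefix mappings are all in the document, but the parser has to be asked to keep them around, and nothing in the document itself says that the element attribute is a QName rather than an ordinary string. That knowledge has to come from somewhere else.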
It is my firm belief that you can’t do proper generic processing of an XML document without a schema. Anything else is a quick-and-dirty hack, which is fine (I’ve written far, far more than my share) as long as it is recognized for what it is. This, not the use of QNames in XML content, is the crux of the matter and is clearly still the subject of great controversy. Until it’s resolved in some definitive way, XML spats like the QName debacle are unlikely to go away.