<?xml version="1.0" encoding="utf-8"?><!-- generator="wordpress/2.2" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Markup Madness, Part Two: Who&#8217;s Afraid of the XML Web?</title>
	<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/</link>
	<description>The official AllPeers blog</description>
	<pubDate>Sun, 12 Oct 2008 05:41:11 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2</generator>

	<item>
		<title>By: Matt</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-117302</link>
		<author>Matt</author>
		<pubDate>Thu, 11 Oct 2007 08:21:41 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-117302</guid>
		<description>Boris - fair points about RSS, but I still think it's a big step in the right direction when compared with the quality of the average HTML on the web.

Brendan - well you'd have to agree that sites like Facebook (for profile pages) and Amazon (for reviews, comments, lists, etc.), among many others, hide the complexity of authoring HTML from average users. Perhaps we'll never have a really effective WYSIWYG HTML editor, because WYSIWYG is simply poorly adapted to authoring markup (I don't believe this to be the case, but I can certainly see the argument), but one future I don't see happening is the average Joe or Jane authoring HTML by handcrafting tags.</description>
		<content:encoded><![CDATA[<p>Boris - fair points about RSS, but I still think it&#8217;s a big step in the right direction when compared with the quality of the average HTML on the web.</p>
<p>Brendan - well you&#8217;d have to agree that sites like Facebook (for profile pages) and Amazon (for reviews, comments, lists, etc.), among many others, hide the complexity of authoring HTML from average users. Perhaps we&#8217;ll never have a really effective WYSIWYG HTML editor, because WYSIWYG is simply poorly adapted to authoring markup (I don&#8217;t believe this to be the case, but I can certainly see the argument), but one future I don&#8217;t see happening is the average Joe or Jane authoring HTML by handcrafting tags.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: monk.e.boy</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116961</link>
		<author>monk.e.boy</author>
		<pubDate>Wed, 10 Oct 2007 10:20:57 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116961</guid>
		<description>Everyone on MySpace wouldn't be able to browse their own web pages.

They outnumber XML freaks 1,000,000 to 1. Please get a grip! Open your eyes to reality!

XML was a nice idea, but I'm a Python coder and I still find XML to be verbose, ill-defined, and mostly useless.

.csv is the future baby.

monk.e.boy</description>
		<content:encoded><![CDATA[<p>Everyone on MySpace wouldn&#8217;t be able to browse their own web pages.</p>
<p>They outnumber XML freaks 1,000,000 to 1. Please get a grip! Open your eyes to reality!</p>
<p>XML was a nice idea, but I&#8217;m a Python coder and I still find XML to be verbose, ill-defined, and mostly useless.</p>
<p>.csv is the future baby.</p>
<p>monk.e.boy</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brendan Eich</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116892</link>
		<author>Brendan Eich</author>
		<pubDate>Wed, 10 Oct 2007 04:32:46 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116892</guid>
		<description>WYSIWYG will save us, oh boy.

First, there are lots of WYSIWYG editors that create invalid markup (some popular SVG ones led to a bugzilla bug asking us to impute xmlns= settings just to interoperate -- this while SVG has tiny "web content market share"!).

Second, so long as the web is alive in the sense that new combinations (mashups) of existing and new content can be cheaply created, whether or not tools are in the loop, you will get copy/paste injection of markup that violates well-formedness; and at the margins you will get (and should want!) hand-tweaking.

I'll believe the WYSIWYG utopia when I see it, which may be the same as saying when I kick the bucket (not saying whether it will be heaven or hell ;-).

/be</description>
		<content:encoded><![CDATA[<p>WYSIWYG will save us, oh boy.</p>
<p>First, there are lots of WYSIWYG editors that create invalid markup (some popular SVG ones led to a bugzilla bug asking us to impute xmlns= settings just to interoperate &#8212; this while SVG has tiny &#8220;web content market share&#8221;!).</p>
<p>Second, so long as the web is alive in the sense that new combinations (mashups) of existing and new content can be cheaply created, whether or not tools are in the loop, you will get copy/paste injection of markup that violates well-formedness; and at the margins you will get (and should want!) hand-tweaking.</p>
<p>I&#8217;ll believe the WYSIWYG utopia when I see it, which may be the same as saying when I kick the bucket (not saying whether it will be heaven or hell ;-).</p>
<p>/be</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Boris</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116877</link>
		<author>Boris</author>
		<pubDate>Wed, 10 Oct 2007 03:30:44 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116877</guid>
		<description>It looks like your blog software eats URIs?  Let's try the HTML markup approach, I guess:

&lt;a href="http://googlereader.blogspot.com/2005/12/xml-errors-in-feeds.html" rel="nofollow"&gt;Google data&lt;/a&gt;

&lt;a href="http://blogs.msdn.com/rssteam/archive/2005/11/03/489065.aspx" rel="nofollow"&gt;Microsoft reference&lt;/a&gt;

It would be nice if I could avoid having to type that icky HTML, though....</description>
		<content:encoded><![CDATA[<p>It looks like your blog software eats URIs?  Let&#8217;s try the HTML markup approach, I guess:</p>
<p><a href="http://googlereader.blogspot.com/2005/12/xml-errors-in-feeds.html" rel="nofollow">Google data</a></p>
<p><a href="http://blogs.msdn.com/rssteam/archive/2005/11/03/489065.aspx" rel="nofollow">Microsoft reference</a></p>
<p>It would be nice if I could avoid having to type that icky HTML, though&#8230;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Boris</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116875</link>
		<author>Boris</author>
		<pubDate>Wed, 10 Oct 2007 03:27:47 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116875</guid>
		<description>Matt, I find it interesting that your example of "valid XML" is RSS.  Some hard numbers:

1) About 7% of the feeds Google Reader runs into are not well-formed XML.  See .  Feed readers apparently vary in how they handle the problem.  Firefox doesn't enforce encoding validity (a bug); not sure about the rest.  IE7 at least planned to enforce well-formedness, apparently: see .  Not sure whether they stuck by that.

2) A very large number of (popular) RSS feeds are not served with an XML MIME type.  This has necessitated that Firefox sniff _all_ incoming content to determine whether it might be an RSS feed.  Dedicated RSS readers ignore the MIME type altogether (treat everything as a feed), which is how we got to this mess.

If this is the future XML web, I want out.  ;)</description>
		<content:encoded><![CDATA[<p>Matt, I find it interesting that your example of &#8220;valid XML&#8221; is RSS.  Some hard numbers:</p>
<p>1) About 7% of the feeds Google Reader runs into are not well-formed XML.  See .  Feed readers apparently vary in how they handle the problem.  Firefox doesn&#8217;t enforce encoding validity (a bug); not sure about the rest.  IE7 at least planned to enforce well-formedness, apparently: see .  Not sure whether they stuck by that.</p>
<p>2) A very large number of (popular) RSS feeds are not served with an XML MIME type.  This has necessitated that Firefox sniff _all_ incoming content to determine whether it might be an RSS feed.  Dedicated RSS readers ignore the MIME type altogether (treat everything as a feed), which is how we got to this mess.</p>
<p>If this is the future XML web, I want out.  <img src='http://www.allpeers.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mawrya</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116784</link>
		<author>mawrya</author>
		<pubDate>Tue, 09 Oct 2007 20:34:37 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116784</guid>
		<description>I am seeing people interpret "enforce correct markup" to mean rendering a blank page with an error message for bad HTML. People seem to think that supporting XML means tossing out HTML.  It doesn't have to be this way.  Support both, completely, and if XML really does have a lot of advantages over HTML, the web will naturally become XML because the authors will take the path of least resistance and/or greatest advantage. These rewards must be significant, and significant to authors, not browser manufacturers. They might come from the merits of XML itself, which I am not sure is enough, or more likely from authoring tools (NVU, etc.) and additional features (SVG, XSLT) that require XML. Unfortunately, many of the rewards are being held hostage by IE.

IE needs to be fixed to play nicely with HTML served as XML, or XHTML.  The other browsers seem to have their MIME types in order, but until the big one does, we won't see much change. Maybe IE8?</description>
		<content:encoded><![CDATA[<p>I am seeing people interpret &#8220;enforce correct markup&#8221; to mean rendering a blank page with an error message for bad HTML. People seem to think that supporting XML means tossing out HTML.  It doesn&#8217;t have to be this way.  Support both, completely, and if XML really does have a lot of advantages over HTML, the web will naturally become XML because the authors will take the path of least resistance and/or greatest advantage. These rewards must be significant, and significant to authors, not browser manufacturers. They might come from the merits of XML itself, which I am not sure is enough, or more likely from authoring tools (NVU, etc.) and additional features (SVG, XSLT) that require XML. Unfortunately, many of the rewards are being held hostage by IE.</p>
<p>IE needs to be fixed to play nicely with HTML served as XML, or XHTML.  The other browsers seem to have their MIME types in order, but until the big one does, we won&#8217;t see much change. Maybe IE8?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116782</link>
		<author>Matt</author>
		<pubDate>Tue, 09 Oct 2007 20:18:48 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116782</guid>
		<description>Brendan - certainly any vision of an XML web is taking the long view. This isn't going to happen any time soon.

The problems with XML namespaces wouldn't be too hard to fix. This isn't to say that XML is perfect, but most of its flaws are man-made, not inevitable.

I'm not sure that I agree about the web continuing to be hand-crafted. More and more pages are coming out of publishing engines: WordPress, Facebook and Amazon, to name a diverse sample. And I would tend to believe that where people are publishing markup by hand, they'll use some sort of WYSIWYG tool. I know that usable HTML WYSIWYG editors have been a long time coming, but someone has to get it right eventually.</description>
		<content:encoded><![CDATA[<p>Brendan - certainly any vision of an XML web is taking the long view. This isn&#8217;t going to happen any time soon.</p>
<p>The problems with XML namespaces wouldn&#8217;t be too hard to fix. This isn&#8217;t to say that XML is perfect, but most of its flaws are man-made, not inevitable.</p>
<p>I&#8217;m not sure that I agree about the web continuing to be hand-crafted. More and more pages are coming out of publishing engines: WordPress, Facebook and Amazon, to name a diverse sample. And I would tend to believe that where people are publishing markup by hand, they&#8217;ll use some sort of WYSIWYG tool. I know that usable HTML WYSIWYG editors have been a long time coming, but someone has to get it right eventually.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brendan Eich</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116767</link>
		<author>Brendan Eich</author>
		<pubDate>Tue, 09 Oct 2007 19:23:41 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116767</guid>
		<description>jgraham already pointed out the Prisoner's Dilemma facing browser vendors trying to gain market share. Cooperate with the purity police while IE continues to defect? You lose.

But I think there are severe usability problems with XML apart from this "how to get there" issue. Micah Dubinko and Oliver Steele have blogged about just the problems with namespace usability, and we see this all the time in SVG content, with E4X users, etc.

What's more, I contend that the Web is and will remain a human-crafted artifact, not mostly machine produced in its hypertext content, therefore error correction must be part of normative specs for its main text-like content languages.

Finally, I dispute both claims in this paragraph: "The good news is that if major browser vendors enforced correct markup, authors would rapidly get a clue, just as they have been scrambling to fix pages that don’t play well in Firefox as the latter’s market share has grown."

Too many pages still don't work with Firefox, and often the page author (a consultant to some bigdumbcompany.com) is long-gone. Sure, this may be a variation on the Prisoner's Dilemma, but I wanted to point out that your hopeful statement here is over-optimistic, in our experience. The WHATWG work may pay off in a few years and change my reading of reality, but we're not there yet.

/be</description>
		<content:encoded><![CDATA[<p>jgraham already pointed out the Prisoner&#8217;s Dilemma facing browser vendors trying to gain market share. Cooperate with the purity police while IE continues to defect? You lose.</p>
<p>But I think there are severe usability problems with XML apart from this &#8220;how to get there&#8221; issue. Micah Dubinko and Oliver Steele have blogged about just the problems with namespace usability, and we see this all the time in SVG content, with E4X users, etc.</p>
<p>What&#8217;s more, I contend that the Web is and will remain a human-crafted artifact, not mostly machine produced in its hypertext content, therefore error correction must be part of normative specs for its main text-like content languages.</p>
<p>Finally, I dispute both claims in this paragraph: &#8220;The good news is that if major browser vendors enforced correct markup, authors would rapidly get a clue, just as they have been scrambling to fix pages that don’t play well in Firefox as the latter’s market share has grown.&#8221;</p>
<p>Too many pages still don&#8217;t work with Firefox, and often the page author (a consultant to some bigdumbcompany.com) is long-gone. Sure, this may be a variation on the Prisoner&#8217;s Dilemma, but I wanted to point out that your hopeful statement here is over-optimistic, in our experience. The WHATWG work may pay off in a few years and change my reading of reality, but we&#8217;re not there yet.</p>
<p>/be</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116737</link>
		<author>Matt</author>
		<pubDate>Tue, 09 Oct 2007 17:28:29 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116737</guid>
		<description>jgraham - I argued that an XML web has advantages over an HTML web, but I purposely avoided the question of how to get from here to there. :-) It's a fascinating question, and you raise very substantial issues. Perhaps the rise of RSS holds a clue: suddenly a lot of web content is being delivered as valid XML in a kind of parallel web. The question is: how could we extend this trend so that more and more content is available as XML? In the long run, this might lead to a tipping point where content authors can no longer assume that their messy invalid HTML will be viewable by most visitors.

Regarding WHATWG: I totally agree, this is a great effort and certainly a highly pragmatic approach to a really thorny issue. A true XML web would still be better, but even to a dreamer like me that's years away, so a better HTML web would be a great way to tide us over in the meantime.</description>
		<content:encoded><![CDATA[<p>jgraham - I argued that an XML web has advantages over an HTML web, but I purposely avoided the question of how to get from here to there. <img src='http://www.allpeers.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> It&#8217;s a fascinating question, and you raise very substantial issues. Perhaps the rise of RSS holds a clue: suddenly a lot of web content is being delivered as valid XML in a kind of parallel web. The question is: how could we extend this trend so that more and more content is available as XML? In the long run, this might lead to a tipping point where content authors can no longer assume that their messy invalid HTML will be viewable by most visitors.</p>
<p>Regarding WHATWG: I totally agree, this is a great effort and certainly a highly pragmatic approach to a really thorny issue. A true XML web would still be better, but even to a dreamer like me that&#8217;s years away, so a better HTML web would be a great way to tide us over in the meantime.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jgraham</title>
		<link>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116735</link>
		<author>jgraham</author>
		<pubDate>Tue, 09 Oct 2007 17:20:59 +0000</pubDate>
		<guid>http://www.allpeers.com/blog/2007/10/09/markup-madness-part-two-whos-afraid-of-the-xml-web/#comment-116735</guid>
		<description>&lt;blockquote&gt;The good news is that if major browser vendors enforced correct markup, authors would rapidly get a clue, just as they have been scrambling to fix pages that don’t play well in Firefox as the latter’s market share has grown.&lt;/blockquote&gt;

The problem with this argument is that major browser vendors would simply never choose to break 97%+ of existing content, let alone simultaneously, and even if they did, it would not have the effect that you describe. Any browser vendor who did refuse to render "broken" HTML would find that no user would be prepared to use their browser. Even if somehow all vendors decided to simultaneously release super-strict versions of their browsers, the net effect would be users sticking with their current web-compatible browser rather than taking an upgrade that would break the vast majority of the web. No doubt, in time, many sites would be fixed to adhere to the new requirements, but there would still be a great deal of valuable legacy content that would work better in a less strict browser. What are the odds that every browser vendor would stick with the coalition when, by loosening up a bit, they could capture a huge chunk of the market? Indeed, such loosening up need not be intentional; simple bugs in popular UAs can become depended upon to the extent that the buggy behaviour must be copied by other UAs. Therein lies the heart of the problem: requiring strictness in content handling is requiring everyone to maintain an unstable equilibrium. It may sound good in theory but it can't happen in practice.

&lt;blockquote&gt;By far the most significant argument for XML on the web, however, is the complexity of existing HTML processors, a direct result of the compatibility hacks required to deal with naughty content.&lt;/blockquote&gt;

Hopefully that argument has been substantially weakened now that the WHATWG has invested significant effort in documenting the behaviour required by a web-compatible HTML parser. That takes most of the effort out of implementing an HTML parser because you don't have to come up with your own scheme to deal with the hard issues like residual style. Indeed the general feeling amongst people familiar with both the WHATWG spec and the XML spec is that it is not significantly harder to implement the WHATWG spec than the XML spec (XML has a lot of complexity to do with the DTD, entities in the internal subset, etc.), nor need the result be significantly less performant. Of course, an HTML parser still has to do unintuitive things at times, like magically inserting tbody elements, but that is apparently not so difficult to understand that it has significantly affected the popularity of HTML.</description>
		<content:encoded><![CDATA[<blockquote><p>The good news is that if major browser vendors enforced correct markup, authors would rapidly get a clue, just as they have been scrambling to fix pages that don’t play well in Firefox as the latter’s market share has grown.</p></blockquote>
<p>The problem with this argument is that major browser vendors would simply never choose to break 97%+ of existing content, let alone simultaneously, and even if they did, it would not have the effect that you describe. Any browser vendor who did refuse to render &#8220;broken&#8221; HTML would find that no user would be prepared to use their browser. Even if somehow all vendors decided to simultaneously release super-strict versions of their browsers, the net effect would be users sticking with their current web-compatible browser rather than taking an upgrade that would break the vast majority of the web. No doubt, in time, many sites would be fixed to adhere to the new requirements, but there would still be a great deal of valuable legacy content that would work better in a less strict browser. What are the odds that every browser vendor would stick with the coalition when, by loosening up a bit, they could capture a huge chunk of the market? Indeed, such loosening up need not be intentional; simple bugs in popular UAs can become depended upon to the extent that the buggy behaviour must be copied by other UAs. Therein lies the heart of the problem: requiring strictness in content handling is requiring everyone to maintain an unstable equilibrium. It may sound good in theory but it can&#8217;t happen in practice.</p>
<blockquote><p>By far the most significant argument for XML on the web, however, is the complexity of existing HTML processors, a direct result of the compatibility hacks required to deal with naughty content.</p></blockquote>
<p>Hopefully that argument has been substantially weakened now that the WHATWG has invested significant effort in documenting the behaviour required by a web-compatible HTML parser. That takes most of the effort out of implementing an HTML parser because you don&#8217;t have to come up with your own scheme to deal with the hard issues like residual style. Indeed the general feeling amongst people familiar with both the WHATWG spec and the XML spec is that it is not significantly harder to implement the WHATWG spec than the XML spec (XML has a lot of complexity to do with the DTD, entities in the internal subset, etc.), nor need the result be significantly less performant. Of course, an HTML parser still has to do unintuitive things at times, like magically inserting tbody elements, but that is apparently not so difficult to understand that it has significantly affected the popularity of HTML.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
