I was recently asked about this by Xavier Borderie in an interview currently appearing at Journal du Net. Since not all ongoing readers will be able to read my incredibly polished French (well, actually, Xavier translated my English, but I nit-picked the translation), I thought I should give the English version here:
Micah Dubinko asks “Is HTML on the Web a special case?”, and the answer is obviously “yes”. Note that the HTML language being developed by the WHATWG is not XML at all, and I'm not brave enough to predict whether that is a good idea.
There have always been a few tools that processed XML data but also accepted broken (non-XML) data; for example, every Web browser. It seems unlikely to me that there will ever be an official new release called “XML 2.0” that has different error-handling rules. But I'm sure that the arguments about when to apply real XML error handling and when software should accept non-XML data will go on forever; among other things they are quite entertaining.
There's a spectrum of situations. At one end, if an electronic-trading system receives an XML message for a transaction valued at €2,000,000, and there's a problem with a missing end tag, you do not want the system guessing what the message meant; you want it to report an error. At the other end, if someone sends a blog post from their cellphone with a picture of a cute kitten, you don't want to reject it because there's an “&” in the wrong spot. The world is complicated.
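To make the two ends of that spectrum concrete, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The function names `parse_strict` and `parse_lenient`, the `STRAY_AMP` pattern, and the sample messages are all hypothetical, invented for illustration; the "repair" step handles only the stray-ampersand case and is in no way a general recovery strategy.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: an "&" that does not begin an entity or
# character reference such as "&amp;", "&#38;" or "&#x26;".
STRAY_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def parse_strict(text):
    """Trading-system end: any well-formedness error raises ET.ParseError."""
    return ET.fromstring(text)


def parse_lenient(text):
    """Kitten-blog end: on failure, escape stray ampersands and retry once.

    Anything else that is broken (a missing end tag, say) still raises.
    """
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        repaired = STRAY_AMP.sub("&amp;", text)
        return ET.fromstring(repaired)


if __name__ == "__main__":
    trade = "<trade><amount currency='EUR'>2000000</amount>"  # missing </trade>
    post = "<post><title>Kittens & more kittens</title></post>"

    try:
        parse_strict(trade)
    except ET.ParseError as err:
        print("rejected trade:", err)

    print("accepted post:", parse_lenient(post).findtext("title"))
```

Even the lenient version refuses to guess at a missing end tag; how much guessing is acceptable is exactly the part that depends on where you sit on the spectrum.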