A week or two ago, I was reading something which included a really silly statement hyperlinked to the Wikipedia entry for XML. I followed the link and discovered that the entry was appallingly bad. I looked with a shudder at the size and complexity of the brokenness and just failed to convince myself that it was somebody else’s problem. So we fixed it.

If you want to get a feel for the problem, here’s the August 4th version, as it stood before this work started.

How It Went · As a first step, I sent an email to the old xml-dev mailing list, a remarkable institution that has been in continuous operation (I think) since before XML 1.0 was actually finished in 1998.

Then I started a linear march through the entry, throwing out three or four paragraphs for every one I put in. By the time I got through the first pass, several others were involved, notably including Michael Kay, Rick Jelliffe, Ken Sall, and James Clark. Some of the discussion took place over on xml-dev, but quite a bit is where it really should be, the XML entry’s discussion page.

A Lesson · Right now I’m feeling good about the way this is coming out. Historically, the community of XML experts hadn’t really paid attention to the Wikipedia entry until it was “too late”, when it had become bloated, disorganized, and a theater for pro/anti-XML edit wars. The evidence suggested that the majority of people editing the entry were not overburdened with real XML expertise. Nobody really had the heart to take this mess on.

What changed was this: one engaged party (me) decided it was worthwhile investing the (single-digit number of) hours for an initial hosing-out of this particular Augean stable, and knew where to go to appeal for help going forward.

At the moment, my bet is that enough people with an intersection of XML expertise and Wikipedia-editing skills are paying attention that the entry should be in pretty good shape for the next little while.

A Problem? · Well, perhaps. If you go to the XML entry Discussion page, there’s a notice across the top with a big Attention glyph. It says:

An individual covered by or significantly related to this article has edited Wikipedia as TimBray (talk · contribs). This user's editing has included this article. Readers are encouraged to review Wikipedia:Autobiography for information concerning autobiographical articles on Wikipedia.

Well, yep, and the potential problem is obvious. XML is probably going to be the biggest thing on my gravestone after my name. The incentive for me to pump this entry up and make XML seem positively epochal in its importance is huge.

A more subtle but even more pernicious incentive would be for me, while editing the entry, to inflate my importance in the development of XML.

Maybe this isn’t just abstract. At one point a few years back, some XML-haters descended on the XML entry and added a section explaining why it was a crock of shit and anyone with any taste would use YAML or S-expressions or something.

Others who objected, but didn’t want simply to erase others’ edits, added countervailing evidence, and the entry ended up with a section entitled “Criticism of XML” with “Pro” and “Con” lists; sprawling, disorganized, of questionable relevance, and a frequent locus for edit wars.

I raised the question of whether this thing deserved to be in the entry at all; someone took this as “Will no one rid me of this turbulent priest?” and shitcanned the whole section. I hadn’t been quite ready to do this, but on the other hand I hear no voices raised asking it to be put back.

It would be reasonable to suspect someone like me of maliciously conspiring to censor criticism of my intellectual baby right out of Wikipedia. I honestly don’t know what the right answer is. We want experts to edit Wikipedia, but experts tend also to be partisans. I think the current approach (highlight when it happens) is a reasonable compromise.

Unfinished · The one thing I’m not claiming is that this entry is finished or perfect or not in need of further work. The entry could be better. It should be better. Can you help?



From: John Cowan (Aug 08 2009, at 19:26)

Thanks. Your work here inspired me to do some of my own on the article.


From: Andrew Wahbe (Aug 11 2009, at 06:46)

The REST wikipedia article is in a similar state -- Roy Fielding has blamed it in part for folks having a hard time understanding REST (http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven#comment-724). It looks like folks have been busy trying to fix it lately but I still think hypermedia is not given enough treatment. Roy has said that he has refrained from editing it himself as it would be too "self-referential" (http://roy.gbiv.com/untangled/2009/it-is-okay-to-use-post#comment-996) -- similar to you editing the XML entry.

But here's the thing: wikipedia covers much more specialized topics than a standard encyclopedia. For many, recent (by encyclopedia standards) technical topics, it's pretty damn hard to find an "expert" that had nothing to do with the definition of the technology. This is in part because of the participatory nature of the web (something that wikipedia itself is built from). If wikipedia is going to be "inclusive" it needs to include everyone -- especially the folks who understand a topic the best. It's everyone else's job to keep them honest.


From: Andrew Garrett (Aug 11 2009, at 08:18)

I'm really glad to see people with expertise working on technical articles – it's very pleasing that you had a good experience (we frequently have trouble with over-zealous admins who drive away content contributors).


From: Paul W. Homer (Aug 11 2009, at 08:51)

It is really too bad that these things tend to get bogged down into politics. XML is just a technology, it does some things extremely well, and others not. That is true of all technologies, they all have their strengths and weaknesses.

I guess some people get too invested in a specific instance and just can't seem to get back to being objective. The use or abuse of any of these technologies should never be personal. They exist (and get used) whether or not we love or hate them (and there will always be people on both sides).

Perhaps in the Wikipedia entry a section called something like "Recommended Usages" would be a non-confrontational way of listing out places where this specific technology works well (without having to list out all of the cons). If I'm going to view the entry, it is probably because I am asking the questions "what is this" and "what is it good for".



From: SJ (Aug 11 2009, at 18:04)

Write on! Calling on a group of interested people is a fine way to tackle a thorny problem.

There is nothing wrong with contributing to articles on topics you have been directly involved with -- as long as that connection is made clearly, and it is appropriate for this to raise extra attention and review from others.


From: Fred Bauder (Aug 12 2009, at 04:51)

I regularly counsel Wikipedia editors who are in the situation you are in. Usually after they have been blocked for violating our conflict of interest guidelines... Here are the canned responses I send to them before I unblock them:

As you apparently intend to edit articles about an organization which you are involved with, please review Wikipedia:FAQ/Organizations, Wikipedia:Conflict of interest, and Wikipedia:Conflict of interest/Noticeboard




Please confirm that you are willing and able to conform to those guidelines.

Thank you for your patience,

Fred Bauder

In your case, rather than trying to create an article, you might do better to make a request at Wikipedia:Requested articles


As you seem interested in creating or editing an article about yourself please review Wikipedia:Conflict of interest, Wikipedia:Autobiography, and Wikipedia:Conflict of interest/Noticeboard




Obviously I send the part of the canned message that is relevant to the particular user. In your case, we want you to edit the article on XML, provided you apply your expertise rather than any bias you might have. All we expect is common sense, of which you seem to have an abundance.

Fred Bauder


From: len (Aug 12 2009, at 09:42)

The idea seems to be that contributions from a large and hopefully dispersed set of contributors tend toward better coverage. YMMV.

Still, given sufficient respect over a number of years and iterations, a technical community does well to make the attempt. Over such time spans with many projects particularly those from the years when open communications among parties from different companies and universities with different projects and interests developing around the same emerging technologies (eg, SGML) and at the transition point from primacy of a technology origin to a new primacy, without the multiple contributors, history is lost. This often isn't noticed until it is time to present prior art.

It is the documentation of that which alone makes a clean wikipedia article highly valuable to the evolution of open social systems such as the WWW on the Internet.


From: Jim Harvie (Aug 13 2009, at 16:34)

I knew nothing about XML, at least until I read this post. First I went to Wikipedia and read your contribution. I grasped the gist of it and was curious. Then I went back to your site and linked to your archived version of the old one... then I read your Wikipedia entry again.


From: Sam Johnston (Aug 18 2009, at 11:45)

As someone who's been on the giving and receiving side of Wikipedia, and who's spent some time patrolling the conflict of interest noticeboards (WP:COIN) so as to better understand the problem, this approach (adding a notice to the talk page) is my preferred option. The usual practice is to revert the victim's (often valid) edits, tag their talk page with uw-coi and then brand the article itself as untrustworthy in its entirety. This obviously sucks so I encourage other editors to identify violations of policy that stem from the conflict (e.g. WP:N, WP:V, WP:NPOV) and remove the warning when the policy violations are resolved. I even created a COI-issues template for the purpose which I would like to see eventually becoming the default (problem is when 1000's of articles are tagged the real issues are buried until they pop up on WP:COIN).

Anyway overall the process generally works even if there is significant room for improvement. I personally found the atmosphere as an active Wikipedian fairly toxic and have moved on to more enjoyable endeavours (like writing standards) but nonetheless appreciate your efforts with the XML article - perhaps you would consider pushing it to GA or FA status?



August 08, 2009

The opinions expressed here
are my own, and no other party
necessarily agrees with them.

A full disclosure of my
professional interests is
on the author page.
