This week I had a pleasant, relaxed, sit-down conversation with Jimmy Wales, the main man behind the Wikipedia. The purpose of this note is to pass along some interesting facts about the project that I hadn’t previously known. This is timely in that there has been a recent flare-up of the usual Wikipedia controversies, with mostly the same old players flinging the same old slime; those who care might want to revisit my essay from last year, which takes a careful look at the project as contrasted to the world of conventional reference publishing. I stand by my conclusion: the Wikipedia dwarfs its critics. The rest of this piece is just a recitation of facts, but some of them are surprising. [Update: PHP@Yahoo!]
In November 2005, the Wikipedia is running at about 2.4 billion page-views per month, and that figure is doubling every three or four months. It has two data centers in Europe, one in the US, and one in South Korea. The Korean facility is provided and paid for (servers, bandwidth, everything) by Yahoo, and the one in Holland by Kennisnet, an ISP (details are here).
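For a rough sense of what that doubling rate implies, here is a back-of-the-envelope sketch. It assumes the growth stays cleanly exponential for a full year, which is my assumption and not something Jimmy claimed; the function name `projected_page_views` is made up for illustration.

```python
def projected_page_views(current, months_ahead, doubling_months):
    """Project monthly page-views, assuming they double every `doubling_months` months."""
    return current * 2 ** (months_ahead / doubling_months)


# 2.4 billion page-views per month in November 2005 (figure from the post).
CURRENT_VIEWS = 2.4e9

# The post gives a range of "three or four months", so try both ends.
for doubling_months in (3, 4):
    a_year_out = projected_page_views(CURRENT_VIEWS, 12, doubling_months)
    print(f"doubling every {doubling_months} months: "
          f"{a_year_out / 1e9:.1f} billion page-views/month after a year")
```

If the trend held, that would mean somewhere between roughly 19 and 38 billion page-views per month a year later, which helps explain why servers are the big expense.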
As of today, there are 124 servers, fairly heterogeneous, although these days they’ve pretty well standardized on dual-Opteron boxes. The MediaWiki software is PHP-based, mostly running on Fedora; I wonder if this is the world’s largest-scale PHP deployment, or would Yahoo top that? They get a pretty good hit rate on their Squid caches; basically, the whole system scales about linearly with the number of servers they deploy.
[Update: Someone who wishes to remain anonymous writes: “FYI, there are approximately 200 webservers running news.yahoo.com; many properties at Yahoo! (mail.yahoo.com, for example), have many, many more. Nearly all are running PHP.”]
Having said that, speaking both as a user and contributor, I find that Wikipedia’s performance is mostly pretty terrible; usable, but irritatingly slow. So there’s certainly room for improvement.
The Wikimedia Foundation, which runs the Wikipedia, is a US 501(c)(3) corporation, which means it’s a registered charity and donations are tax-deductible. Their major expense is servers; all the work is done by volunteers and co-ordinated largely via IRC. They are exploring new modes of funding from foundations and other granting agencies, but Jimmy thinks the organizational structure is viable and appropriate, for the moment at least. They are also considering book-publishing as a source of income; imagine the Wikipedia History of Rock & Roll or Wikipedia Compendium of TV SciFi Lore.
Whether you like the Wikipedia or not, it’s a New Thing In The World, and bears watching.