Here are some questions about the “Average Web Page”: How big is it? Does it have pictures? How many others does it point to? How many others point to it? Nine years ago I offered answers to those questions, with pretty pictures even (some included here), and those answers are still interesting, but it would be nice if someone would repeat the exercise for today’s Web. Plus, another reason to be mad at Microsoft.
The Missing Conference · What happened was, I was going to show someone a graph from my nine-year-old Measuring the Web paper, but the whole WWW5 site was completely gone. So I asked Tim Berners-Lee and he asked Vincent Quint and now it’s back—thanks Tim and Vincent!
This caused me to go and re-read the paper, and I decided that some of the graphics were worth sharing here, and the big question was worth re-asking. The paper won a gold medal from the Mayor of Paris, one of two so honoured as best of the conference. I am keenly aware of how much politics and blind luck and other people’s work have helped me along over the years and I’m pretty cynical about awards and trophies, but I worked like a dog for weeks and weeks on that paper and the gold medal sits at the front of the trophy shelf and I smile whenever I see it.
In 1995-96 I built and ran the Open Text Index, one of the first handful of Web Search Engines; you could go to Lycos or Infoseek or Webcrawler or us, and we were doing a million hits a day when almost nobody else was. So in late 1995 I took a snapshot of the index and tried to answer some of those questions from up there in the first paragraph.
The Web’s 1995 Measurements · The average web page (just the text, not the pictures and so on) was around 7K in size:
About half of web pages had images:
About three quarters of pages had links:
This last graph is my favorite, even though it’s a little hard to read; Edward Tufte could probably turn it into a marvel of transparent expressiveness. Have a look, and I’ll provide some exegesis below.
The cluster labeled “0” says that in 1995, about 80% of Web sites contained no outgoing off-site links, and that apparently, some 5% weirdly had no incoming off-site links.
The cluster labeled “1-10” says that in the 20% of sites that did link off-site, most had only a handful of such links, ten or fewer; and that well over 80% of sites had ten or fewer incoming links from elsewhere.
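For anyone tempted to run the numbers on a modern crawl, here is a rough sketch of how the clusters in that graph could be computed. It is not the code behind the 1995 paper; the function names, the input shape (a list of site names plus one source/target pair per hyperlink), and the catch-all “11+” cluster are all assumptions made for illustration.

```python
from collections import Counter


def bucket(count):
    """Map a link count to a cluster label; "11+" is an assumed catch-all."""
    if count == 0:
        return "0"
    if count <= 10:
        return "1-10"
    return "11+"


def offsite_link_clusters(sites, links):
    """Return (outgoing, incoming) dicts mapping cluster label -> share of sites.

    sites: iterable of site names (hypothetical input format)
    links: iterable of (source_site, target_site) pairs, one per hyperlink
    """
    sites = set(sites)
    out_counts = Counter()
    in_counts = Counter()
    for source, target in links:
        if source == target:
            continue  # on-site link, so it does not count as off-site
        if source in sites:
            out_counts[source] += 1
        if target in sites:
            in_counts[target] += 1

    total = len(sites)
    # Sites that never appear in a Counter read back as 0, landing in cluster "0".
    outgoing = Counter(bucket(out_counts[s]) for s in sites)
    incoming = Counter(bucket(in_counts[s]) for s in sites)
    return (
        {label: n / total for label, n in outgoing.items()},
        {label: n / total for label, n in incoming.items()},
    )
```

Feed it the site list and the link pairs from a crawl and you get back two small tables, one for outgoing and one for incoming off-site links, each giving the fraction of sites that falls in each cluster.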
I wonder what those numbers would look like today? How about it; Google, MSN, Yahoo? Somebody out there want to run the numbers?
Snickers, and a Sad Story · If you go read the paper you’ll probably enjoy a snicker at the laughably-small size of the 1995 Web, and also at my heroic attempt to build a Snow Crash-inspired immersive simulation of the whole thing, using the then-hot VRML technology. It actually ran and you could zoom around in it and it was drop-dead cool, but not many people ever saw it; here’s the sad story.
I’d got some interest from the titans of VRML culture, including Mark Pesce and YON, who were at the conference, and we’d scheduled a demo. Getting the thing to run was kind of a pain in the ass; you needed all sorts of Netscape extensions and the textures were mammoth, so it took a while to download.
So on the day of the demo, I got it all set up and went to get the guys, and in the 30 seconds I was away from the browser, this elderly conference attendee got in, closed all my windows, opened a shell window, and started going through his back 300 emails over a connection with an effective rate of 300 baud or so. We stood there for a few minutes and hinted politely, but he needed his email and just wasn’t moving and was old enough not to be intimidated. A pity, but we made a date to come back first thing the next morning.
Except for, during the night, the guys from Microsoft went around, erased Netscape from all the computers, and installed Internet Explorer. I have some perspective now, but at the time I literally had to go outside the conference and sit down away from everybody, because I was afraid I’d do physical violence to the first Microsoftie I saw.
But I shouldn’t cry too much; Neal Stephenson, as a cyberspace designer, is a great novelist, and the project never had legs.