The ongoing logfiles flip over early Sunday mornings, and sometimes I run some basic stats over them. This last Sunday they said that a total of 995,213 pages have been read, so there is a chance that if you’re reading this on the 29th or 30th of September, you will get the millionth page. Thanks to all; herewith a couple more statistics and some discussion of them.

But before the stats, I wanted to reiterate that thank-you to everyone who takes the time to read this. I really mean it: ongoing has filled a hole in my life that I previously didn’t know existed. I can’t imagine not doing this.

How Do You Count? · Anyone who publishes anything wants to know who’s reading it. On the Web, it’s hard to figure out the right questions, and then it’s hard to figure out the answers. So when I say “a million pages read,” what do I really mean? Well, for the unix-literate, here’s an exact characterization of what I mean:


zcat *.log.gz | \
 egrep '"GET /ongoing/.* 200 ' | \
 awk ' {print $7}' | \
 egrep -v '\.' | \
 wc -l

For the rest, an approximate English description would be “everything that was fetched successfully whose URI began with /ongoing/ and which didn’t contain a dot.” Excluding the dot excludes all graphics as well as the RSS feed and the CSS stylesheet. So it really is a pretty decent approximation of the number of times someone looked at a page.

It’s not perfect: it overestimates because some proportion of that million or so fetches were by Google, Inktomi, and many less-skilful robots and crawlers and so on. On the other hand, it underestimates because it excludes all the fetches of the full-size versions of the images, and all fetches of the source-code snippets and so on that I’ve posted. Also, it leaves out all the single-paragraph postings that are contained entirely in the RSS feed and are read that way. I’m willing to bet that the two errors kind of cancel each other out, and say that about a million stories have been read.
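For the curious, here is a rough sketch of how the robot overcount might be trimmed. It is not the pipeline used for the numbers in this piece. It assumes the logs are in Apache's combined format, so that the user-agent is the last quoted field, and the function name and the list of robot keywords are illustrative guesses rather than a vetted filter.

```shell
# Sketch only: count page reads on stdin, skipping obvious robots.
# Assumes Apache "combined" log lines, e.g.
#   ip - - [date] "GET /uri HTTP/1.1" 200 bytes "referer" "user-agent"
# count_human_pages and the keyword list are hypothetical, not from the original.
count_human_pages() {
  awk -F'"' '
    {
      split($2, req, " ")   # req[1] = method, req[2] = URI
      split($3, st, " ")    # st[1] = status code
      ua = tolower($6)      # user-agent string, lower-cased
    }
    req[1] == "GET" && st[1] == "200" &&
    req[2] ~ /^\/ongoing\// && req[2] !~ /\./ &&
    ua !~ /bot|crawl|spider|slurp/ { n++ }
    END { print n + 0 }
  '
}

# Usage (not run here): zcat *.log.gz | count_human_pages
```

Matching on the user-agent only catches robots that identify themselves, so this would still overestimate, just by less.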

In that same time-span, my RSS feed has been fetched 1,856,905 times.

How Many Different People? · Resources at ongoing have been accessed from 228,855 different IP addresses. The RSS feed has been fetched from 49,703; 21,836 since August first.
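Counting distinct addresses is a small variation on the page-count pipeline. The sketch below assumes the client address is the first field of each log line and the URI is the seventh, as in the awk command earlier; the function names are made up for illustration, and the feed path in the usage comment is a placeholder, not necessarily the real one.

```shell
# Sketch only: count distinct client addresses (field 1) on stdin.
# Function names are hypothetical.
count_unique_ips() {
  awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Same count, restricted to requests whose URI (field 7) equals "$1".
count_unique_ips_for() {
  awk -v path="$1" '$7 == path { print $1 }' | sort -u | wc -l | tr -d ' '
}

# Usage (not run here; /ongoing/feed.rss is a placeholder path):
#   zcat *.log.gz | count_unique_ips
#   zcat *.log.gz | count_unique_ips_for /ongoing/feed.rss
```

The `tr -d ' '` only strips the padding that some versions of `wc` put in front of the number.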

Everyone knows that IPs are a lousy way to count people; counting them estimates high because people move around: I have one address at home, another at work, and have shown up from any number of hotel rooms and conferences. On the other hand, everyone at AOL has one IP address, as does everyone at Microsoft. My gut tells me that the number of unique IP addresses overcounts the number of unique people, maybe by a factor of two? But we shouldn’t have to rely on my gut, since there are people out there who count subscribers properly with cookies and so on, and would have a good feel for what the real ratio is. Anyhow, I’d be surprised if I had fewer than five thousand subscribers or more than fifteen thousand.

The Hit Parade · Q: What do people like reading? A: You’re a bunch of hopeless geeks, but that’s OK, so am I. I live in hope that one of my notes about nature or politics or music gets noticed outside the coterie of markup-slingin’ webheads who apparently are my natural audience.

Fetches  Essay
 153116  ongoing
  83816  XML Is Too Hard For Programmers
  44539  Why XML Doesn’t Suck
  30650  The Web’s the Place
  17152  The Door Is Ajar
  14601  I Like Pie
  10133  Truth
   9232  Language Fermentation
   8649  What This Is
   7715  Author
   7402  Technology
   7106  What · Technology · XML
   6941  iYear
   6286  iTunes Music Store and the WWW
   5833  Business
   5762  On the Goodness of Unicode
   5739  Colophon
   5474  What
   5049  When
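A ranking like this one can be produced with the usual sort-and-count idiom. This is a sketch under the same assumptions as the counting command earlier (method in field 6, URI in field 7, status in field 9), not necessarily how the table above was built. It ranks URIs, so turning them into essay titles would be a separate step that is not shown, and the function name is hypothetical.

```shell
# Sketch only: list the most-fetched pages from log lines on stdin.
# Optional argument: how many rows to show (default 20).
# top_pages is a hypothetical name.
top_pages() {
  awk '$6 == "\"GET" && $9 == "200" &&
       $7 ~ /^\/ongoing\// && $7 !~ /\./ { print $7 }' |
    sort |        # group identical URIs together
    uniq -c |     # prefix each URI with its fetch count
    sort -rn |    # biggest counts first
    head -n "${1:-20}"
}

# Usage (not run here): zcat *.log.gz | top_pages 20
```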

Pix · More geekery; the only full-size pictures that people look at are screen grabs and pictures of Macintoshes. The top three non-tech pictures that people actually looked at were the panoramic second shot in the write-up on my Canon S50 (330 views), the close-up of Byron’s Troy at the end of the Slim Book of Verse photo-essay (307 views), and of course the Bit Bucket (298 views). The lesson for me is obvious; the way I present the pictures on the page is the way they’re gonna get seen, so maybe the current approach of crushing them all down to 300 pixels wide is sub-optimal.


September 29, 2003