Just How Miserable? · 29:31:30 elapsed, 29:30:52 CPU, i.e. it maxed one of the T2000’s 32 CPUs. That’s one day plus five hours and change. Ouch!
Small lessons · First of all, it was just using Ruby-by-default-out-of-the-box. As Haik pointed out, there’s a SPARC-optimized version available from Sun and I’ll install that.
Second, the code computes the various top-10 lists (hits, bytes, clients) by storing the numbers in a hash and running sort_by over it. It says right in the docs for that function that it's inefficient. Still, it wasn't too bad for sorting the hashes keyed by ongoing resources, because there are only a few thousand of those.
But when it sorted the hash keyed by the hostnames that had fetched resources, well... it wasn't pretty; that sort_by call took a painfully long time.
It was made worse by a programming error on my part; several people pointed out that one couldn’t diff the program output because if several of the top entries had the same numeric value, they could show up in arbitrary order. I simply changed the code to sort on the value, then the key. Of course, I stupidly didn’t take out the old sort, so it ended up sorting twice.
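The tie-breaking fix can be sketched in a few lines of Ruby (the counts hash here is made up for illustration): sort descending on the value, then ascending on the key, so entries with equal counts always come out in the same order and two runs' outputs can be diffed.

```ruby
# Hypothetical top-10 computation: counts maps resource name -> hit count.
counts = { 'a.html' => 5, 'c.html' => 9, 'b.html' => 5, 'd.html' => 9 }

# Sort descending by count, then ascending by key, so equal counts
# always appear in a stable order.
top = counts.sort_by { |key, count| [-count, key] }.first(10)
# => [["c.html", 9], ["d.html", 9], ["a.html", 5], ["b.html", 5]]
```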
Anyhow, I’ve changed the code to do a single pass through the big hashes, rather than sort them.
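A single pass over a big hash can look something like this sketch (the top_n method and its names are mine, not the actual WF code): keep a running list of the N largest entries and compare each new entry only against the current minimum, instead of sorting millions of pairs.

```ruby
# Sketch of a single-pass top-N over a big hash, avoiding a full sort.
def top_n(counts, n = 10)
  top = []  # kept sorted descending by count, so the smallest is last
  counts.each do |key, count|
    # Skip quickly if this entry can't possibly make the list.
    next if top.size == n && count <= top.last[1]
    top << [key, count]
    top.sort! { |a, b| b[1] <=> a[1] }  # only ever sorts n+1 entries
    top.pop if top.size > n
  end
  top
end
```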
Big Lessons · The naive Ruby code is making terrible use of the machine. Over the course of the run, it uses, on average, 3.2% of the available CPU resources, and 3.8% of the available memory.
If you look at the Bonnie results, this sucker can do simple block I/O at about 150M/second at a cost of about 75% of one CPU. So you ought to be able to plow through the 45G dataset in under five minutes, leaving 31 CPUs free to do the actual work, even without resorting to mmap() tricks. I bet somebody will write a WF2
implementation that runs in about five minutes as opposed to 29 hours.
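For what it’s worth, the five-minute figure is just I/O arithmetic (a sketch, treating 45G as decimal gigabytes to match the quoted 150M/second):

```ruby
# Back-of-the-envelope: time for pure block I/O over the dataset.
# Assumes decimal units: 45G = 45,000M.
dataset_mb = 45_000.0               # size of the logfile dataset
rate_mb_s  = 150.0                  # measured block-I/O throughput
seconds    = dataset_mb / rate_mb_s # => 300.0
minutes    = seconds / 60.0         # => 5.0
```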
Next Steps · I’m going to re-run the cleaned-up Ruby code with the SPARC-optimized MRI and propose that as the baseline benchmark for WF2 implementations to beat. Then the fun should start.