I had the idea that I’d chop up the disk on my Ultra 20 into a bunch of partitions and do some filesystem performance testing with UFS and ZFS and Ext3 and Reiser. This turned out to be a really bad idea, but I still got some interesting numbers.
Solaris vs. Partitions · I spent quite a while figuring out how to do the partitioning, and I started pestering some of the Solaris gurus that working here means you can pester, and their answers were kind of nervous and evasive. “Uh, why would you want to do that? The UFS and ZFS buffering will collide, it won’t run well.” I ignored them and, after a lot of beating through the bushes, ran across a write-up on how to create a Solaris fdisk partition, which actually explains it all.
This whole notion of partitioning is, I think, an artifact of “Personal Computer” culture, which includes Microsoft, Linux, and Macintosh computers; you’d slice up your disk into partitions and have different operating systems and filesystems and so on. Solaris isn’t really comfortable with it; the tools are lousy, the docs are lousy, and the terminology is really, really confusing. Which makes sense in an OS with its roots in server space, where you’ve always thought in terms of multiple disks rather than slicing up individual ones.
So first of all, it was hard to get the partitions going. Hard enough that I ended up re-partitioning and re-installing Solaris and Linux multiple times. Then, my Bonnie benchmarking results were weirdly inconsistent and variable, up, down, and sideways, after each re-install, and I totally couldn’t correlate the performance with anything. So I’m going to stick another disk in there and get some useful ZFS numbers.
Bonnie · In the course of this work, I fiddled with Bonnie some more. I apologize to Russell Coker and the people who do the Solaris HCL work and the people who maintain Bonnie for various Linux distros, because I didn’t co-ordinate with anyone, I just went ahead and did it. I guess I should get Bonnie a home on SourceForge or somewhere to bring some order to this chaos.
The most interesting thing I did was add a -r option, which ensures that the data in each 16K block Bonnie writes is randomized and different. This is to defeat compression (see below).
I ♥ ZFS · Just like all those guys said on their blogs, ZFS is ridiculously, laughably easy for the sysadmin. It takes like three simple commands to make a ZFS pool, a couple of filesystems on it, compress them, find out their stats, you name it. Assuming there isn’t some horrible gotcha that I haven’t seen yet, ZFS is going to be the way to go for all Solaris filesystems; of course, they have to figure out the little matter of making it bootable.
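For the record, the whole setup really is about that small. Something along these lines does the job; the pool, device, and filesystem names below are made up, and yours will differ:

```shell
# Create a pool on one disk, carve out a couple of filesystems,
# turn on compression, and look at the stats. Pool and filesystem
# names here are illustrative.
zpool create tank c1t1d0
zfs create tank/data
zfs create tank/scratch
zfs set compression=on tank/data
zfs list
zpool status tank
```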
The Tests · In general, this Ultra seems to have what I would call pretty good I/O performance for a desk-side machine. The machine has 2G of RAM so I ran all the tests with 20G of data, which should be effective at defeating caching.
In no case did I twiddle any filesystem parameters, or take any non-default settings.
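Concretely, each run was an invocation along these lines; the scratch directory and machine label are placeholders, -s is in megabytes, and I’m assuming the new -r option takes no argument:

```shell
# One benchmark run: 20G of data against a mounted test
# filesystem, with block contents randomized to defeat
# compression. Paths and labels are placeholders.
./Bonnie -d /export/test -s 20000 -m ZFS -r
```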
Linux: Ext3 vs. Reiser4 · Here are the results for the Ext3 (default) and Reiser4 filesystems, both run under an updated Ubuntu with a nice recent 2.6 kernel.
               -------Sequential Output-------- ---Sequential Input-- --Random--
               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine     GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
Ext3        20  38.5 54.7  38.1 17.0  17.8  6.6  44.5 53.3  47.2  7.3    67  0.3
Reiser4     20  38.8 53.5  43.9 20.1  18.2  7.3  42.1 60.9  44.5 10.1    73  0.4
In this test, Reiser is a little faster at writing data, a little slower at reading it, a little faster at random seeking, and burns a little more CPU. But for the stuff that Bonnie measures, it’s pretty well a wash. This makes sense, since I understand Reiser4’s sweet spot to be dealing with huge numbers of tiny files.
Solaris: UFS vs. ZFS · I already pointed out that my results were weirdly variable in the aftermath of my partitioning misadventures, so these numbers have little stand-alone value; I’ve picked pairs of adjacent runs where I just changed one variable.
First, here’s ZFS and UFS.
               -------Sequential Output-------- ---Sequential Input-- --Random--
               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine     GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
UFS         20  52.1 37.5  55.1 12.3   6.4  1.9  64.0 48.6  66.3  9.7    47  0.4
ZFS         20  36.7 22.0  42.5  9.6  26.6  4.9  61.9 42.9  62.2  4.6    48  0.2
Interesting; in an environment that was clearly unfriendly to ZFS, set up against all the best advice of my best experts, ZFS hangs in there: in most tests it burns less CPU than UFS, does sequential I/O a bit more slowly, but rewrites the disk immensely faster.
I want to run some more tests before I do any comparisons of Linux vs. Solaris; as I said, the Solaris numbers wobbled wildly every time I re-partitioned. These aren’t the fastest I saw, but I also saw some much slower ones.
Compression! · ZFS allows you to run your filesystem compressed. Here is a pair of runs, the first with compression turned on, the second with it turned on too, but with the data randomized and thus pretty well incompressible (the data that Bonnie writes by default is extremely compressible).
It shows two things. First, that compression can make your I/O scream; the main benefit isn’t saving disk space, it’s pumping less data back and forth to the disk, and thus saving time. This is the same reason that modern office-document formats use XML stored in a zipfile. Then there’s the second run, which is frankly weird and counter-intuitive: incredibly slow, and using dramatically less CPU. Huh? That doesn’t make sense. It suggests that Bonnie and Solaris (maybe abetted by the partition weirdness) are somehow going off the rails together, and I have to do a deep-dive to figure out what’s happening.
               -------Sequential Output-------- ---Sequential Input-- --Random--
               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine     GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
Compr       20 105.2 56.6 166.4 34.6 101.2 21.7 110.8 81.0 327.7 40.9    62  0.3
Compr+R     20  25.0 15.6  27.3  6.3  17.4  3.1  42.1 29.1  43.3  3.0    42  0.1
There’s lots more work to do here. First I have to get another big disk.