I did some research on storage-system performance for my QCon San Francisco preso and have numbers for your delectation (pictures too). Now is a good time to do this research because the hardware spectrum is shifting, and there are a bunch of things you might mean by the word “disk”. I managed to test some of the very hottest and latest stuff.
Methodology · Even after all these years, I still like to measure with Bonnie. Yeah, it’s old (18 years!) and is a fairly blunt instrument, but it has the virtue that you don’t have to think very much before running it, and I’m still proud of how clear and compact the output is, and I still believe that the very few things it measures are really useful things to measure.
I’m not alone, either. Last week at ApacheCon, Colm MacCárthaigh talked in his httpd-tuning session about using it (the Bonnie++ flavor) to get a grip on filesystem performance. He said, looking kind of embarrassed, something along the lines of “Yeah, it’s old and it’s simplistic but it’s easy to use and has decent output.” [Smile]
Also, Steve Jenson has been using it to look at MacBook Pro filesystem performance, see More RPMs means faster access times. No news there. (Hey Steve, it’s OK to cut out all Bonnie’s friendly in-progress chat about how it’s readin’ and writin’, and just include the ’rithmetic.)
And hey, just to brighten up this dry technical post, here’s a picture of Bonnie Raitt, after whom the program is named. She’s older than me; doesn’t she look great?
What Does “Disk” Mean? · I think it can mean three distinct things, these days:
A plain old-fashioned spinning-rust disk system attached directly to your computer through some sort of bus connection.
(This is new) A solid-state disk (SSD) device; essentially flash memory packaged up to look like a disk.
A network-accessed storage device, like for example the Storage 7000 storage-appliance line Sun just announced, which might well include both traditional and SSD storage modules.
Systems Under Test · There are four different tests here, representing (I think) a pretty fair sampling of the storage options system builders have to choose from. The titles in the next few sub-sections correspond to the row labels in the summary table below.
MacPro · This is my own Mac Pro at home that I use for photo and video editing. It’s a meat-grinder; dual quad-core 2.8GHz Xeons, 6GB of RAM. There’s one 250G disk; whatever Apple thinks is appropriate, which bloody well better be pretty damn high-end considering what I paid for this puppy.
T2K · This is the Sun T2000 that hosts the Wide Finder 2 project; eight 1.2GHz cores, 32G of RAM, two 170G disks; whatever Sun thinks is appropriate. There’s a single ZFS filesystem splashed across them, taking all the defaults.
7410 · This is a Sun Storage 7410 appliance, the top of the line that we just announced. It has an 11TB filesystem, backed by some combination of RAM and SSDs and spinning rust. They gave me a smaller box with 8G of RAM to run the actual test on, connected to the 7410 via 10G Ethernet.
IntelSSD · This is one of the latest-and-greatest; in fact the very one that Paul Stamatiou recently wrote up in Review: Intel X25-M 80GB SSD. It’s attached to a recent 4G MacBook Pro, which Paul also reviewed. What happened was, I filled out Paul’s contact form and asked politely whether he’d be open to doing a Bonnie run. He wrote back with the output; what a guy.
The Table · There are notes below commenting on each of the four lines of numbers but, if you’re the kind of person who cares about this kind of thing, take a minute to review them and think about what you’re seeing.
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
MacPro     12  64.7 82.0  67.4 10.0  29.8  5.0  64.8 76.7  67.9  6.5   190  0.7
T2K       100  20.5  100 150.1  100  61.4 64.8  19.8 98.9 148.9 76.7   214 10.7
7410       16 121.5 97.7 222.2 51.0  75.3 27.2 100.0 95.6 254.2 47.3   975 76.6
IntelSSD    8  44.8 66.4  69.3 12.8  51.5 10.7  73.4 94.3 246.0 27.0  7856 43.2
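As an aside, that summary line is compact enough to parse mechanically. Here’s a sketch; the field layout is just my reading of the table above, and parse_bonnie_line is a made-up name, not anything Bonnie ships with:

```python
# Hypothetical parser for one Bonnie summary row: machine name, file size
# in GB, then six (rate, %CPU) pairs in the column order of the table above.
FIELDS = ["seq_out_char", "seq_out_block", "rewrite",
          "seq_in_char", "seq_in_block", "random_seeks"]

def parse_bonnie_line(line):
    parts = line.split()
    result = {"machine": parts[0], "gb": float(parts[1])}
    rest = parts[2:]
    for name, rate, cpu in zip(FIELDS, rest[0::2], rest[1::2]):
        result[name] = {"rate": float(rate), "pct_cpu": float(cpu)}
    return result

row = parse_bonnie_line(
    "MacPro 12 64.7 82.0 67.4 10.0 29.8 5.0 64.8 76.7 67.9 6.5 190 0.7")
print(row["random_seeks"])  # {'rate': 190.0, 'pct_cpu': 0.7}
```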
Mac Pro Results · Given that Apple’s HFS+ filesystem is held in fairly low regard by serious I/O weenies, these numbers are not too bad. Salient points:
On this system with its heavy-metal CPUs, I/O is not in the slightest CPU-limited; the per-char and block input and output rates don’t differ much. Remember that Bonnie’s %CPU numbers are percentage of one CPU, and the Mac Pro, like several of the other boxes under test, has lots.
The maximum input and output rates are about the same. This is actually a little surprising; most modern I/O setups, including the others under test here, exhibit some asymmetry.
That under-30M/sec number for in-place update of a big file is pretty poor.
The ability to seek almost 200 times/second is quite cheering; as recently as a single-digit number of years ago, it was really hard to find hardware that could seek more than 50 times per second. Since disk subsystems benefit only slightly from Moore’s Law, these performance increases are pretty hard-won.
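If you want a rough feel for your own hardware’s seek rate, here’s a micro-benchmark sketch in the spirit of Bonnie’s --Seeks-- test. The file size, chunk size, and seek count are made-up parameters; a real measurement needs a file much larger than RAM so the page cache can’t lie to you:

```python
import os
import random
import tempfile
import time

SIZE = 16 * 1024 * 1024   # scratch file; far too small to defeat the cache,
CHUNK = 8192              # so treat the result as an optimistic upper bound
N_SEEKS = 2000

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(SIZE))
    path = f.name

fd = os.open(path, os.O_RDONLY)
start = time.perf_counter()
for _ in range(N_SEEKS):
    off = random.randrange(0, SIZE - CHUNK)
    data = os.pread(fd, CHUNK, off)  # positioned read: seek + read in one call
    assert len(data) == CHUNK
elapsed = time.perf_counter() - start
os.close(fd)
os.unlink(path)

print(f"~{N_SEEKS / elapsed:.0f} seeks/sec (cache-warm, so an upper bound)")
```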
T2000 Results · This thing has a much slower and wider CPU than the Mac Pro, and a massively more ambitious I/O subsystem; it’s designed for life as a Web server.
The kind of single-threaded I/O that Bonnie does (so do lots of other apps) is totally CPU-limited. See the per-character input and output, where the performance is lousy while one of its many cores is maxed. Even on the block-I/O tests it looks like the CPU may be the bottleneck.
Despite the CPU bottleneck, this box clearly has massively more I/O bandwidth than the Mac: the block-I/O numbers are more than twice as high. There’s no indication we’re maxing out the I/O bandwidth; if we got a few more cores pumping and tuned the ZFS parameters, I bet those numbers could be cranked way up.
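The per-char vs. block distinction is easy to demonstrate for yourself. This sketch is illustrative only (Bonnie itself does this in C with stdio getc()/putc()): the same bytes move either way, but one byte per call burns the CPU on call overhead instead of on moving data:

```python
import os
import tempfile
import time

SIZE = 1 << 20  # 1 MB is plenty to see the gap

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * SIZE)
    path = f.name

def read_per_char(p):
    n = 0
    with open(p, "rb") as f:        # one call per byte
        while f.read(1):
            n += 1
    return n

def read_block(p, block=1 << 16):
    n = 0
    with open(p, "rb") as f:        # one call per 64K chunk
        while chunk := f.read(block):
            n += len(chunk)
    return n

times = {}
for fn in (read_per_char, read_block):
    t0 = time.perf_counter()
    assert fn(path) == SIZE
    times[fn.__name__] = time.perf_counter() - t0
    print(f"{fn.__name__}: {times[fn.__name__]:.3f}s")
os.unlink(path)
```

On any machine, the per-char pass should be dramatically slower for the identical amount of data; that’s the gap between Bonnie’s -Per Char- and --Block--- columns.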
7410 Results · Remember, in this one, there’s a (fast) network in between the computer and the disk subsystem.
The numbers are really, really big. Way north of 200M/second both in and out, nearly a thousand random filesystem seeks a second. Yow.
The bottleneck here isn’t obvious: We had a close look with the Fishworks analytics (more on that later), and it was clear that the 10GigE link had lots of headroom and the 7410 itself was barely breaking a sweat. So something in the client or (quite likely) its network adapter was holding things back. Technically, it may not be correct to call this I/O “CPU-limited”, but it’s certainly some aspect of the single-threaded client that’s holding things back.
It shouldn’t be a surprise, but there’s an important lesson here: Given modern storage back-ends and network infrastructure, single-threaded programs are just not going to be able to max out the available I/O throughput. Of course, the 7410 is designed to serve a whole lot of threads and processes and clients; the total bandwidth this puppy could deliver under a serious load ought to be mind-boggling.
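What does “go parallel” look like for reads? One simple shape is to have each worker pread() its own byte range of the file, so there’s no shared file offset to fight over. The worker count and file size below are made-up knobs; against a filer like the 7410 the point is keeping several requests in flight so the pipe stays full:

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

SIZE, WORKERS = 8 * 1024 * 1024, 4

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(SIZE))
    path = f.name

fd = os.open(path, os.O_RDONLY)
stripe = SIZE // WORKERS

def read_range(i):
    # Positioned reads: no lseek(), so threads never disturb each other.
    got, off, end = 0, i * stripe, (i + 1) * stripe
    while off < end:
        chunk = os.pread(fd, end - off, off)
        if not chunk:
            break
        got += len(chunk)
        off += len(chunk)
    return got

with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    total = sum(pool.map(read_range, range(WORKERS)))

os.close(fd)
os.unlink(path)
print(total == SIZE)  # True
```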
Intel SSD · Um, one of these things is not like the other, and this would be the one.
The output numbers are just not that great. I’m not sure what’s wrong here; maybe HFS+ is getting in the way? Also, this is, after all, a notebook; high-volume output may not have been the design center.
The block-input number, at nearly 250M/sec, is pretty mind-boggling in a notebook. But you have to be smart and do block not per-char I/O; once again, it’s easy to get into CPU-limited I/O mode.
As for the random-access number... words fail me. I’ve never seen numbers like this on any disk-like storage device ever; nearly 8000 seeks/second. This is getting into territory that’s competitive with memcached and friends.
And, unlike the other local disks under test, this class of device has Moore’s Law in its corner; so the price, capacity, and performance will all be moving in the right direction. Ladies and gentlemen, you are looking at the future.
Lessons · SSDs are gonna win. They have fewer moving parts, better performance, and Moore’s Law on their side. Plus they burn less energy.
If your application is I/O-bound, and lots are, you’re going to have to go parallel, and be smart about doing block I/O.
It’s easy to be bottlenecked on your network link or your storage client performance. It’s getting harder and harder to actually max out the raw throughput of a big-league storage back-end.
What a great time to be in this business.