What
 · Technology
 · · Storage

Time Machine Completed the Backup · Re­cent­ly, I ac­quired a Synol­o­gy DiskS­ta­tion and wired up a nice com­fort­ing Time Ma­chine-to-Synol­o­gy-to-S3-to-Glacier back­up da­ta flow. But then I start­ed to see “Time Ma­chine couldn’t com­plete the backup” with some­thing about “could not be ac­cessed (er­ror 21)”. Here’s how it got fixed ...
 
Network Storage · A cou­ple days ago in New Home Net­work I post­ed a re­quest for ad­vice on a home NAS box and net­work­ing hard­ware. Now I have the stor­age box, and boy was it ev­er easy and straight­for­ward and anxiety-relieving. If you haven’t done this al­ready, you might want to ...
[11 comments]  
New Home Network · Holiday project: Redesign the domestic infrastructure. Looking for: Network and storage gear. Got any advice? ...
[22 comments]  
Maxed Book · My Google-issue Mac is pret­ty nice, but I de­cid­ed to im­prove it by swap­ping ob­so­lete op­ti­cal stor­age for not-obsolete-yet spin­ning rust. With bench­marks for the disk geeks in the crowd ...
[12 comments]  
Database Helper · I’ve been in a lot of Cloud-flavored dis­cus­sions re­cent­ly about what kind of Platform-as-a-Service of­fer­ings might hit sweet spot­s. On sev­er­al oc­ca­sion­s, Peo­ple Who Should Know have said things like “A huge pro­por­tion of app­s, even re­al­ly big app­s, can coast along just fine on a sin­gle MySQL in­stance with help from mem­cached.” Some num­bers crossed my radar to­day that would tend to sup­port that the­o­ry; and they’re sort of as­tound­ing ...
[4 comments]  
Tab Sweep — Tech · Here­with glean­ings from a cir­cle of brows­er tabs fac­ing in­ward at the world of tech­nol­o­gy. Some are weeks and weeks old: Am­ber Road, Clo­jure, tail re­cur­sion, cloud­zones, deep pack­et in­spec­tion, and key/­val­ue mi­crobench­mark­ing ...
[5 comments]  
2008 Disk Performance · I did some re­search on storage-system per­for­mance for my QConf San Fran­cis­co pre­so and have num­bers for your delec­ta­tion (pic­tures too). Now is a good time to do this re­search be­cause the hard­ware spec­trum is shift­ing, and there are a bunch of things you might mean by the word “disk”. I man­aged to test some of the very hottest and lat­est stuff ...
[10 comments]  
Transparent Storage · In prepa­ra­tion for that Disk Per­for­mance piece and ac­com­pa­ny­ing keynote last week, I spent quite a bit of time with the new Uni­fied Stor­age “Fishworks” An­a­lyt­ics Soft­ware, which is fas­ci­nat­ing stuff. Here­with an il­lus­trat­ed re­port ...
[2 comments]  
2008 Storage Hierarchy · We call them “computers”, but the soft­ware and hard­ware are over­whelm­ing­ly con­cerned with stor­ing and re­triev­ing data. Yesterday’s Disk Per­for­mance re­search fit in­to a larg­er con­text in the QConf pre­sen­ta­tion; a sur­vey of all the lev­els of stor­age that make up a mod­ern sys­tem ...
[8 comments]  
Storage 7000 · This is cer­tain­ly our biggest an­nounce­ment of the year so far; just pos­si­bly the biggest since I showed up here in 2004. The of­fi­cial name is the “Sun Stor­age 7000” and there are three sys­tems in the line-up. As usu­al, the re­al ac­tu­al tech­nol­o­gy news is in the blogs; the hub is at the Stor­age News blog, but I’d start with the co-conspirators: Bryan Cantrill’s Fish­work­s: Now it can be told and Mike Shapiro’s In­tro­duc­ing the Sun Stor­age 7000 Series. I have some opin­ions too ...
[4 comments]  
Memories · I’ve got this new Mac Pro, and the 2G it came with just isn’t going to do the trick. Last week, both Lauren and I were in the Valley, at different Sun meetings. So one lunchtime, we snuck away to geek-shop. I picked up 4G of high-performance RAM at S.A. Technologies, a little memory specialist that I totally recommend; their prices are pretty hard to beat. It cost about $360 including tax. On the way back, we stopped at a big tech emporium for some other odds and ends, and at the checkout they were advertising high-capacity USB disks for not much; Lauren picked up 8G for $29.99. That’s quite a pricing spread.
[8 comments]  
Online Data · That S3 out­age sure con­cen­trat­ed people’s mind­s. And al­most si­mul­ta­ne­ous­ly, EMC an­nounces that they’re get­ting in­to cloud stor­age. It’s ob­vi­ous to me that we’re nowhere near hav­ing worked out the eco­nomics and safe­ty and per­for­mance is­sues around where to put your data. There are some ar­eas of clar­i­ty; geek über-photog James Dun­can David­son, in The Eco­nomics of On­line Back­up, shows that for a per­son with a ton of per­son­al data, the on­line op­tion is re­al­ly unattrac­tive. And you do hear sot­to voce rum­bles about go­ing on­line in the geek hall­ways, for ex­am­ple “Amazon web ser­vices: 3x the price, 0.5x the re­li­a­bil­i­ty, and low­er scal­a­bil­i­ty than DYI. Buy on­ly for the low capex and lead time.” That’s from Stanislav Shalunov, who by the way is a damn fine Twit­ter­er. The big ques­tions re­main open.
[3 comments]  
Them Bits · Ear­li­er this evening, I fin­ished scan­ning the slides I have that my Dad took. That’s a lot of slides and a lot of bit­s. With ob­ser­va­tions about Wal-Mart and Ubun­tu and the end of op­ti­cal stor­age ...
[10 comments]  
Slow Bonnie · I’ve been notic­ing that it takes longer and longer to get a mean­ing­ful Bon­nie run. To make sure you’ve bust­ed the filesys­tem caching and are ac­tu­al­ly do­ing I/O, you need to use a test file two or three times the size of sys­tem mem­o­ry. Which can eas­i­ly get in­to a cou­ple of hun­dred gigs on a se­ri­ous serv­er these days. And while I/O has been get­ting faster, it still takes a while to pro­cess that much data; and Bon­nie does it five times. So, the ra­tio that gov­erns Bon­nie test­ing time is some­thing like memory-size over I/O-performance. Thus we ob­serve that, pro­por­tion­ate­ly, mem­o­ry size has grown faster than I/O speed. Thus, mem­cached and friend­s.
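To put rough numbers on that ratio, here is a minimal sketch. It assumes every pass runs at the same sustained throughput, which real Bonnie runs don't; the function name and the two example machines are made up for illustration, and only the "three times memory" and "five passes" figures come from the paragraph above.

```python
def bonnie_run_seconds(memory_gb, throughput_mb_s, file_multiple=3, passes=5):
    """Rough run time: `passes` trips over a file sized file_multiple x memory.

    Simplification (an assumption, not how Bonnie behaves): every pass
    moves data at the same sustained throughput.
    """
    file_mb = memory_gb * 1024 * file_multiple
    return passes * file_mb / throughput_mb_s


# Hypothetical machines; memory sizes and throughputs are invented.
small = bonnie_run_seconds(memory_gb=1, throughput_mb_s=50)
big = bonnie_run_seconds(memory_gb=64, throughput_mb_s=400)

print(f"small box: {small / 60:.1f} minutes")
print(f"big box:   {big / 60:.1f} minutes")
```

In this made-up case memory grew 64-fold while throughput grew only 8-fold, so the run takes eight times as long.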
[1 comment]  
Seeking Basement Disk · Dear LazyWe­b: we’re look­ing for a great big honkin’ stor­age serv­er to sit on the home net­work and be a back­up pool for the mot­ley crew of com­put­ers around the house: Mac, So­lar­is, Ubun­tu, & Win­dows. Si­mon Phipps has a Buf­fa­lo Ter­aS­ta­tion and is very hap­py with it. On the oth­er hand, there’s a wi­ki which sug­gests it’s kind of loud, and it’s go­ing to be hard for us to get it be­hind closed doors. Might the Net have a sug­ges­tion?
[38 comments]  
Postmodern Litigation · Wel­l, it’s all over the news; we and NetApp are in court. Blec­ch. There is one in­ter­est­ing side-note in this drea­ry sto­ry, a first I sus­pec­t: NetApp’s CEO pro­vid­ed col­or com­men­tary on his blog (no link­age from me to blog­gers who are su­ing us). And then lat­er on to­day, on our of­fi­cial PR blog, ap­pears Sun re­sponse to NetApp law­suit which says, more or less, “In yo face”. Now, I guess, it’s over to the lawyer­s. [Up­date: As of now, I’m re­ject­ing all com­ments on this one. There were a pile in the in-basket this morn­ing, and a cou­ple were en­tire­ly in­ap­pro­pri­ate in a mat­ter in­volv­ing lit­i­ga­tion, and I sud­den­ly be­came un­com­fort­able try­ing to make judg­ment call­s. So, sor­ry, but let’s just leave this.]
[Up­date: I think Bryan Cantrill’s DTrace on ONTAP? de­serves a link, since Bryan was one of the guys who built the tech­nol­o­gy that’s now in play in court.]

[1 comment]  
ORM Bien Phu · I thought the laugh line “Object-Relational Mapping is the Vietnam of Computer Science” was ancient, but Ted Neward claims that he made it up in 2004. Ted has written an immense, detailed essay on the subject, The Vietnam of Computer Science, which, just to be thorough, includes a capsule history of the Vietnam conflict. This ought to be required reading for all Computer Science undergrads, so they’ll at least be forewarned before they stumble into their own private Southeast Asia. Bonus: in the comments, the first commenter asks “If ORM = Vietnam, does SOA = Iraq?”
 
No Database!? · Recently, in discussion of a design for a comments system, I noted that I wasn’t planning to use a database, and I even allowed myself a little fun sneering at the idea. I got several reasonable-sounding emails from reasonable-sounding people saying “Why on earth wouldn’t you?” Here’s why ...
 
The Databox · After I reported on the Thumper announcement yesterday, Simon Phipps wrote: I want one. I kind of snickered, thinking “Simon, get real, that sucker weighs 77kg and probably sounds like a 747.” But last night, coincidentally, I ran a backup, which provoked thought, and you know, I think Simon’s right, I think there’s a huge opening for a consumer product in this space. [Update: Hah! Bill Pierce specs out a Databox, it’ll cost you $2,312.33; dig it!] ...
 
WinFS · Wow, it’s dead. You have to be sad when any­thing goes south that so many peo­ple have worked on so hard for so long. Stil­l, I re­mem­ber be­ing told in the ear­ly Nineties, when I was talk­ing up Unix server­s, that I was sil­ly and wrong be­cause the Cairo ob­ject filesys­tem would make ev­ery­thing else ir­rel­e­van­t. And then years lat­er, when I was sell­ing search and con­tent man­age­ment for a liv­ing, be­ing told once again that we’d all be ca­su­al­ties of the WinFS band­wag­on. I won­der if, in oth­er pro­fes­sions as in ours, the con­ven­tion­al wis­dom is so of­ten so wrong? [Up­date: Lots of thought­ful cov­er­age: The OS Re­view, Devel­op­ing on the Edge, The Fish­bowl, Dare Obasan­jo, Si­mon Phipps.]
 
The RAID in the Mirror · If you have lots of da­ta to store and are fig­ur­ing out how to lay out your disks, check out Roch Bourbonnais’ WHEN TO (AND NOT TO) USE RAID-Z. (Hey Roch; could you find a slight­ly less bru­tal way to for­mat your blog?) For RAID & filesys­tem wonks on­ly. It’s a lu­cid, quan­ti­ta­tive ex­pla­na­tion of the trade-offs be­tween mir­ror­ing, strip­ing, and RAID-ing. Some of the nar­ra­tive is ZFS-specific, but I sus­pect that the lessons are pret­ty gen­er­al. Out there in the re­al world of pro­duc­tion ap­pli­ca­tion­s, you’d be sur­prised how of­ten it is, when you’re wait­ing for a slow ap­p, you’re wait­ing for the disk, not the CPU. This stuff mat­ter­s.
 
Flat Files Rule · Yes, databas­es are use­ful. But there are a lot of good rea­sons not to use them: they’re a lot of work to ad­min­is­ter and it’s very easy to make them run slow. Par­tic­u­lar­ly when the al­ter­na­tive, or­di­nary flat files in an or­di­nary di­rec­to­ry tree, is so in­cred­i­bly use­ful. For more ev­i­dence, see Tim O’Reilly’s re­portage on the sub­jec­t, with in­puts from Mark Fletch­er (Blog­li­nes) and Gabe Rivera (Me­me­o­ran­dum). Note that both of them are sup­ple­ment­ing their flat files with memory-resident da­ta stores; it’s a pow­er­ful com­bi­na­tion. Now if Mark would on­ly put some of that pow­er­ful ma­chin­ery to fix­ing Bloglines’ bro­ken Atom 1.0 han­dling...
 
JDiskReport · Hey, this is cool; it’s a lit­tle doo-hickey that draws pie charts and graphs of what you’ve got on your disk. I won­der on what set of hard­ware/OS com­bi­na­tions the web-start Just Works like it did on my Mac? The pie-charts of my life were so cool I had to pub­lish a few. And I turned up a re­al prob­lem, too ...
 
More ZFS Data · I see that Dana H. My­ers has been dig­ging away at ZFS per­for­mance us­ing the on­ly met­ric that re­al­ly mat­ters to the re­al geek: OS build per­for­mance. The num­bers are in­ter­est­ing... I’m sur­prised that com­pres­sion made so lit­tle dif­fer­ence, both source and ob­ject code com­press quite well (I just ran a lit­tle test: the Emacs bi­na­ry com­pressed to 18% of its size, a bunch of Ja­va code to 19%.) Maybe the fact that it’s zil­lions of lit­tle files means that the file open/cre­ate over­head dom­i­nates the ac­tu­al in­put/out­put time? There is no doubt there is a huge amount of work to be done on I/O per­for­mance, both un­der­stand­ing it and im­prov­ing it. But ZFS is in­creas­ing­ly look­ing like a step for­ward.
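For anyone who wants to repeat that kind of little test, a minimal sketch follows. It uses Python's zlib (DEFLATE), which is an assumption: the entry doesn't say which compressor produced the 18% and 19% figures, and the sample bytes below are a stand-in, not the Emacs binary or real Java code.

```python
import zlib


def compressed_fraction(data: bytes, level: int = 6) -> float:
    """Return compressed size as a fraction of the original size."""
    if not data:
        raise ValueError("cannot measure an empty input")
    return len(zlib.compress(data, level)) / len(data)


# Stand-in input: highly repetitive, so it compresses far better than
# real source or object code would.
sample = b"public static void main(String[] args) { }\n" * 1000
print(f"compressed to {compressed_fraction(sample):.1%} of original size")
```

To try it on a real file, pass `pathlib.Path(some_path).read_bytes()` instead of `sample`.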
 
Protecting Your Data · I was watch­ing a mailing-list dis­cus­sion of back­up soft­ware, and how of­ten you should back up, and based on some decades’ ex­pe­ri­ence, found some of the think­ing slop­py. Here are my life lessons on keep­ing your da­ta safe while as­sum­ing that The Worst Will Hap­pen. Some of it is Macintosh-specific, but there may be use­ful take-aways even from those part­s, even for non-Mac-hacks ...
 
Filesystem Lessons · I had the idea that I’d chop up the disk on my Ul­tra 20 in­to a bunch of par­ti­tions and do some filesys­tem per­for­mance test­ing with UFS and ZFS and Ex­t3 and Reis­er. This turned out to be a re­al­ly bad idea, but I still got some in­ter­est­ing num­ber­s ...
 
Bonnie Z · In case you hadn’t no­ticed, yes­ter­day the much-announced ZFS fi­nal­ly shipped. There’s the now-typical flur­ry of blog­ging; the best place to start is with Bryan Cantrill’s round-up. I haven’t had time to break out Bon­nie and ZFS my­self, but I do have some raw da­ta to re­port, from Dana My­ers, who did some Bon­nie runs on a great big honkin’ Dell [Sure­ly you jest. -Ed.] server. The da­ta is pret­ty in­ter­est­ing. [Up­date: Another run, with com­pres­sion.] [And an­oth­er, with big­ger data. Very in­ter­est­ing.] ...
 
An Evening With Bonnie · Like al­most ev­ery­one, I have a long list of things that I re­gret not hav­ing done, and mine in­cludes writ­ing a Unix filesys­tem. So in­stead, I mea­sure ’em, with the help of my old friend Bon­nie. I just spent some time ad­dress­ing the ques­tion: “How much does FileVault slow down a Macintosh?” And turned up a cou­ple oth­er in­ter­est­ing re­sult­s, too, in­clud­ing a fair­ly startling three-way OS X/Lin­ux/So­laris com­par­ison. [Up­date: Many read­ers write on the sub­ject of Lin­ux and hd­par­m(8).] ...
 
Bonnie 64 · Fif­teen years ago I wrote a lit­tle filesys­tem bench­mark called Bon­nie. I hadn’t main­tained it in years and there are a few in­ter­est­ing forks out there. Sud­den­ly, by ac­ci­dent I found my­self fid­dling with the Bon­nie code and I think I’m go­ing to call the new ver­sion “Bonnie 64”. Here­with, for those who care about filesys­tem per­for­mance, the de­tail­s ...
 
Moore Who? · Cyberspace is buzzin’ tonight over the release of the Reiser4 filesystem, which seems to be pretty hot stuff. I was looking at their benchmarks page and was charmed to see an appearance by Bonnie++, a direct descendant of the original Bonnie mentioned here just the other day. The benchmarks suggest that on a good computer with a modern filesystem, you can expect to get 130 or so random seeks/second in 1G of data, 105 in 3G. That’s not bad... in fact it’s three or four times faster than the best results I was able to get in 1990 (search for “asymptotically”). Check out the computers I ran that on, they’re museum pieces. Per Moore’s law, in fourteen years the CPUs ought to have sped up by a factor of 2^(14/1.5) = 645 or so. Yep, one of them was a 4MHz 386, 4MHz×645=2.58GHz, damn that Moore is smart. I happen to remember that of the original computers I benchmarked, the biggest had 64M of memory. If you applied the same multiplier (645) to the memory, you’d get about 41G, quite a reasonable figure for a big modern Unix box. I think the lesson is obvious: for high-performance applications, keep your data away from those filthy disks, no matter what filesystem, use memory.
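The Moore’s-law arithmetic is easy to check. A minimal sketch, assuming the doubling-every-18-months reading that the 14/1.5 exponent implies; the function name is made up for illustration.

```python
def moore_factor(years, doubling_years=1.5):
    """Speed-up predicted by one doubling every `doubling_years` years."""
    return 2 ** (years / doubling_years)


factor = moore_factor(14)
print(f"factor: {factor:.0f}")                 # 2^(14/1.5), about 645
print(f"clock:  {4 * factor / 1000:.2f} GHz")  # starting from 4 MHz
```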
 


I am an employee of Amazon.com, but the opinions expressed here are my own, and no other party necessarily agrees with them.

A full disclosure of my professional interests is on the author page.