Distributed Hardness

The other day this, from Mathias Verraes, got thousands of retweets.

“There are only two hard problems in distributed systems: 2. Exactly-once delivery 1. Guaranteed order of messages 2. Exactly-once delivery.”

That’s really funny, but it dawns on me that it might not be obvious why. (Well, except to cloud infrastructure geeks.)

It’s like this: In the cloud, you’d like to operate at large scale with high reliability. (Take me out and shoot me if I’m ever heard uttering the phrase “infinite scale” unironically.)

Lists · Here’s a list of the ways that you can munch data at a scale that is, for the purposes of most apps, unbounded:

Sharding.

Which is to say, instead of trying to make one computer really fast, you take racks-full of ordinary computers (“shards”) and deal out the incoming data among them.
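To make the dealing-out concrete, here is a minimal sketch, assuming each record carries a string key and that a stable hash of that key picks the shard. The names `shard_for` and `deal_out` are made up for illustration, and real services have their own routing rules.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash (hypothetical helper)."""
    # md5 is used here only as a stable, well-spread hash, not for security.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def deal_out(records, num_shards):
    """Distribute (key, payload) records among num_shards lists."""
    shards = [[] for _ in range(num_shards)]
    for key, payload in records:
        shards[shard_for(key, num_shards)].append((key, payload))
    return shards

records = [(f"user-{i}", {"click": i}) for i in range(1000)]
shards = deal_out(records, num_shards=4)
print([len(s) for s in shards])  # roughly, but not exactly, even
```

The same key always lands on the same shard, which is handy. The catch is that changing `num_shards` in this naive scheme reshuffles most keys, which is one reason real systems tend to be more careful about it.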

Here’s a list of the ways that you can store data very reliably:

Replication.

So the messages or transactions or clicks or whatever flow in at an insane rate and, through the magic of DNS trickery and load balancers, get dealt out into shards, each of which can soak ’em up at some well-understood rate; and if they start flowing faster, you add more shards.

Since you want to be really sure that not one single precious click gets lost, you replicate, storing copies of each blessed little bundle of bytes on multiple widely-separated boxes.

The software I know best that illustrates these principles is Amazon Kinesis. (I just use it; I don’t build it or know the details of how it works.) Apache Kafka is an open-source alternative, said to be good. Kinesis will soak up frightening volumes of data and is very unlikely to lose any.

If you’re observant, you will have noticed that neither of the lists above has an entry #2. Which is OK, because in practice both these approaches work pretty damn well; as witness the cheery growth of my employer and its competitors.

Ointment, meet flies · There. Are. Limitations. For example, the shards are (quite possibly) being read by different machines, and those machines may not run at exactly the same speed, and the incoming bits might not get dealt out exactly evenly among the shards. So the data might not always come out in the same order it went in.
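A toy simulation shows the ordering problem. This is a sketch under loud assumptions: two shards, records dealt alternately, and readers that drain a fixed number of records per round. The `interleave` function and the `speeds` parameter are invented for illustration.

```python
from collections import deque

def interleave(shards, speeds):
    """Drain shards round by round; shard i gives up at most speeds[i] records per round."""
    queues = [deque(s) for s in shards]
    out = []
    while any(queues):  # an empty deque is falsy
        for q, speed in zip(queues, speeds):
            for _ in range(speed):
                if q:
                    out.append(q.popleft())
    return out

incoming = list(range(8))                  # sequence numbers in arrival order
shards = [incoming[0::2], incoming[1::2]]  # dealt alternately to two shards
merged = interleave(shards, speeds=[1, 3]) # the second reader is three times faster

print(merged)  # [0, 1, 3, 5, 2, 7, 4, 6]
```

Nothing is lost, and each shard's own records still come out in the order they went in, but the overall sequence is scrambled. Order within a shard is about the most this kind of setup can promise.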

And that replication? Well, we do it because sometimes boxes crap out. Which is OK because your bits are replicated, remember? Except that maybe the box that crapped out was network gear, so two boxes with replicas each think the other is down and both helpfully deliver the goods. Which means that your bits won’t get lost, but sometimes they might go in once and come out more than once.

“But wait!” you exclaim, “these messages are financial transactions and they have to be delivered exactly once in the right order!” And while you can scale that too, the basic shard-and-replicate stick may not turn out to be exactly the precision instrument you were looking for.

Be of good cheer: Hope is not lost. There are tricks (mostly at the application level) involving unique identifiers and progressive version stamps and being really careful about idempotence. Scaling out remains possible, but it’s pricier and trickier, and you have to know what you’re doing.
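Two of those tricks can be sketched in a few lines. This is only an illustration, assuming every message carries a unique ID and every update carries an increasing version number. The class names are hypothetical, and a real system would need durable storage and some way to expire old IDs.

```python
class IdempotentLedger:
    """Applies each message at most once, keyed on a unique message ID."""

    def __init__(self):
        self.balance = 0
        self.seen_ids = set()  # grows without bound in this sketch

    def apply(self, message_id, amount):
        if message_id in self.seen_ids:
            return False       # duplicate delivery: ignore it
        self.seen_ids.add(message_id)
        self.balance += amount
        return True


class VersionedValue:
    """Keeps the newest value and drops updates that arrive late."""

    def __init__(self):
        self.version = -1
        self.value = None

    def update(self, version, value):
        if version <= self.version:
            return False       # stale or repeated update: ignore it
        self.version, self.value = version, value
        return True


ledger = IdempotentLedger()
for message_id, amount in [("tx-1", 100), ("tx-2", -30), ("tx-1", 100)]:
    ledger.apply(message_id, amount)
print(ledger.balance)  # 70, not 170

status = VersionedValue()
for version, value in [(1, "open"), (3, "closed"), (2, "pending")]:
    status.update(version, value)
print(status.value)    # closed
```

The ledger shrugs off the duplicate `tx-1`, and the versioned value ignores the late-arriving version 2. Neither delivers exactly-once, in-order semantics from the pipe itself; they make the application indifferent to the pipe's sloppiness.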

The really good news is that lots of applications these days can survive dupes and ordering glitches just fine. If you’re doing statistics or metrics or certain classes of auditing and logging, scaling and retention have become a matter between your credit card and your cloud vendor.
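One reason some metrics shrug off dupes and reordering: certain aggregates simply don't care. The sketch below assumes events are (user, latency) pairs. The `summarize` function and the sample data are invented for illustration.

```python
import random

def summarize(events):
    """Return (distinct user count, peak latency) for (user, latency) events."""
    users = {user for user, _ in events}
    peak = max(latency for _, latency in events)
    return len(users), peak

clean = [("ann", 120), ("bob", 340), ("ann", 95), ("cy", 210)]

# Simulate a sloppy pipe: a couple of duplicates, then a shuffle.
messy = clean + [clean[1], clean[2]]
random.Random(0).shuffle(messy)

print(summarize(clean) == summarize(messy))  # True
```

Distinct counts and maxima come out the same either way. A plain sum or event count would not, so which side of the line your statistics fall on depends on what you're computing and how much error you can tolerate.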

Contributions

From: John Cowan (Aug 28 2015, at 12:37)

“These messages are financial transactions and they have to be delivered exactly once in the right order!”

In fact, banks had systems to recover from unreliable, duplicated, and out-of-order actions long before computers were ever thought of. Part of the point of double-entry bookkeeping (which is about 500 years old) is to make error recovery straightforward. If a bank accidentally pays a check twice and that sends your checking account into overdraft, it can recover the money from the payee, undo the loan automatically made from your revolving credit line, refund the interest charged on that credit line, and inform the credit bureaus that its downcheck of your credit rating was unjustified, all quite automatically once the error is caught, and even if months and thousands of transactions have passed since the error was made. (This actually happened to me.)

August 26, 2015

By Tim Bray.
