Here at Sun, the Fallacies of Distributed Computing have long been a much-revered lesson, and I personally think they’re pretty much spot-on. But these days, you don’t often find them coming up in conversations about building big networked systems. The reason, I think, is that we build almost everything on Web technologies, which let us get away with believing some of them.

1. The network is reliable · HTTP helps here in a couple of different ways. Most obviously, connections are brief; I’ve never seen much in the way of measurements, but I’d expect the average connection lifetime to be under a second. Compared to a traditional networked system with connections whose lifetime approximates that of the application, this moves the likelihood of experiencing a damaging connection breakdown while application code is running from “essentially always” to “rather rarely”.

Second, the clarity about GET, PUT, and DELETE being idempotent, while POST isn’t, helps hugely. Most simply, if a GET hits a network blowup, just do it again. And if the breakage hits a POST, well, it probably hits only one, and this places very clear boundaries around the repair and recovery that an app needs to handle.
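
To make that concrete, here’s a minimal sketch (mine, not anything from the original discussion) of the kind of retry logic HTTP’s idempotency rules permit: idempotent methods get retried automatically after a network failure, while a failed POST is surfaced to the application, which owns the repair and recovery. The URL, retry count, and backoff are all illustrative.

    import time
    import urllib.error
    import urllib.request

    IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE"}

    def fetch(url, method="GET", data=None, retries=3, backoff=1.0):
        """Retry idempotent requests on network failure; never auto-retry POST."""
        req = urllib.request.Request(url, data=data, method=method)
        attempts = retries if method in IDEMPOTENT else 1
        for attempt in range(attempts):
            try:
                with urllib.request.urlopen(req, timeout=10) as resp:
                    return resp.read()
            except (urllib.error.URLError, TimeoutError):
                if attempt == attempts - 1:
                    raise  # the one ambiguous failure is the app's to repair
                time.sleep(backoff * (attempt + 1))  # simple linear backoff

    # body = fetch("https://example.com/resource")                     # safe to retry
    # fetch("https://example.com/orders", method="POST", data=b"...")  # not retried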

2. Latency is zero · The Web actually makes the latency problem worse, because every interchange, on average, requires connection setup/teardown. Oddly, its population of users has apparently internalized this, and is keenly aware of the difference between the normal single-digit number of seconds it takes for a nicely-functioning server to assemble request results and get them to you, and behavior under overload.

Since it’s not realistic to expect anything like keystroke-level latency across the Net, the correct engineering solution is to defuse the expectation.

3. Bandwidth is infinite · Once again, the Web has been a wonderful teacher of networking realities to the non-technical. Time after time, you’ll see messages, between computing civilians, of the form “Sorry that this picture is so big” because they know perfectly well that it’s going to slow down the experience of seeing it.

4. The network is secure · This is probably the fallacy least-well-addressed by the Web. True, people have become more aware that There Are Bad Guys out there, and they need to be careful. But not nearly enough.

Also, let’s grant that TLS, properly deployed, has been pretty well Good Enough to run apps in a mostly-secure way in a hostile environment. But who among us would be surprised if someone turned up a catastrophic flaw, perhaps not in TLS itself, but in one or two widely-deployed implementations? Who’s to say that someone hasn’t, already?

Anyhow, the Web technologies mean that application builders can survive even while subject to one or more of The Fallacies. But not this one.

5. Topology doesn’t change · By making almost all our apps Web-based, and thus having everyone address everything with URIs, we all agree to share solutions to routing and addressing problems; solutions provided by the DNS, the network stack, and the Internet backbone operators. This doesn’t mean the solutions are easy or cheap or perfect; it just means that application builders almost never have to think about the problem.
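
As an illustration (again mine, not the post’s), here is roughly all the “topology” a Web application ever touches: parse a URI, let DNS map the host name to whatever addresses it currently has, and hand the rest to the network stack. The URI is hypothetical.

    import socket
    from urllib.parse import urlparse

    uri = "https://example.com/some/resource"   # hypothetical
    parts = urlparse(uri)
    host = parts.hostname
    port = parts.port or (443 if parts.scheme == "https" else 80)

    # DNS and the stack, not the application, decide where this name lives right now.
    for family, _, _, _, sockaddr in socket.getaddrinfo(host, port, type=socket.SOCK_STREAM):
        print(family.name, sockaddr)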

6. There is one administrator · Well yeah, there isn’t. But who cares, any more? Web architecture makes addressing decentralized. Thus when an administrator screws up, or imposes policies that seem good to him or her and insane to you, the damage is limited to that person’s URI space.

Also, Web architecture, which requires that you talk about things in terms of the URIs you use to address them and the data formats you use to transmit them, makes it a whole lot easier to achieve administrative coherence even when there are millions of administrators.

7. Transport cost is zero · (See #3 above.)

8. The network is homogeneous · This is perhaps the Web’s single greatest triumph. For decades we thought we could extend object models and APIs and lots of other programming concepts over the network. This was a fool’s errand, because all you can do with a network is send messages over it.

The Web doesn’t do APIs and object models; it’s just a set of agreements about what messages you’re going to send and what messages to expect back in return. Which, as a side effect, makes heterogeneity a non-issue.
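
To make the point literal, here’s a small sketch (mine, not the post’s) of a complete Web interaction as nothing but an agreed-upon message out and a message back over a plain TCP socket; the host and path are illustrative.

    import socket

    request = (
        "GET / HTTP/1.1\r\n"
        "Host: example.com\r\n"
        "Connection: close\r\n"
        "\r\n"
    )

    with socket.create_connection(("example.com", 80), timeout=10) as sock:
        sock.sendall(request.encode("ascii"))
        response = b""
        while chunk := sock.recv(4096):   # read until the server closes the connection
            response += chunk

    print(response.split(b"\r\n")[0].decode())   # e.g. "HTTP/1.1 200 OK"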

History · Did TimBL and Roy Fielding and Marca and those guys think, when they were pasting together the Web, “We’re about making those Fallacies of Distributed Computing irrelevant”?

I doubt it; they were just trying to ship hypertext around.

Conclusion · If you’re building Web technology, you have to worry about these things. But if you’re building applications on it, mostly you don’t.

Well, except for security; please don’t stop worrying about security.



Contributions

From: bingo (May 25 2009, at 15:13)

re 6: The web DOES only have one admin. And he lives in Pakistan.

From: Christopher Mahan (May 25 2009, at 23:08)

I would say, in addition to #2, that fallacy 2A is "latency is stable".

Latency can vary widely at random.

From: Martin Probst (May 25 2009, at 23:45)

Also very interesting in that regard is the paper "A Note on Distributed Computing", by Jim Waldo et al., at Sun.

They argue, and rightly so IMHO, that trying to make distributed programming somehow invisible and implicit in a programming model is a failure waiting to happen. And people on the Web seem to have understood this, as all the examples, snippets and so on make network operations explicit, failure modes and all.

This is both a cultural thing and a technical thing: the cultural part is that people understand the massive difference between local and non-local operations, and the technical part is that with HTTP as an application-level protocol we have a very solid foundation that covers many of the hard issues (relocating things, various modes of failure, idempotency and so on).

From: Steve Loughran (May 26 2009, at 00:46)

There are a couple of places where homogeneity is still assumed:

- that DNS works everywhere and resolves the right hostnames. Oddly enough, Java not only assumes this, it assumes rDNS works, i.e. that machines know who they are.

- that what works in the developer's browser/screen works everywhere else.

The second is worst in the enterprise, where the developers still run IE6 and assume that they really can mandate that everyone sticks to it. And now, as their IE7 release nears, IE8 has shipped. If you want to play that game, switch to Firefox for OS independence, and start testing properly.

From: Guy Murphy (May 26 2009, at 02:27)

Can you direct this at something like Erlang more specifically? I think addressing it at the Web approaches being a straw man argument... not quite, but heading in that direction :)

So how about Erlang vs. the Fallacies?

Now that's a piece of writing you know will grab commentary.

From: Christopher K (May 26 2009, at 05:20)

Especially with regard to what Martin said, but in keeping with the overall gist here: the visibility of the network connection's fragilities doesn't matter, because the local machines are already powerful enough for the most widely used apps.

From: len (May 26 2009, at 07:08)

The one we seldom see: many of the problems of web applications are social problems which the web designer cannot solve, but which he or she will nevertheless be asked to immunize the application against.

And they have to do it. Security is the easy part comparatively.

The fun one to watch right now is 'virtual world currencies' colliding with 'social web standards' as the gleam of gold farming spreads like swine flu.

From: Neil Conway (May 26 2009, at 12:39)

Has the web really made the fallacies "irrelevant"? You concede that, if anything, #4 is even more important on the web.

Bandwidth is not free, and latency and transport cost are not zero; simply because the user population has come to accept these truths does not mean that they are somehow no longer relevant in systems design.

The web hasn't yet succeeded in making the network homogeneous, either: given the enormous effort that goes into making web content portable across client applications, I think we're still pretty far from this fallacy being irrelevant. And the network being unreliable is still a fundamental truth in distributed computing; I don't see how the web makes much of a difference here at all (idempotency existed long before HTTP/1.0).

So of the 8 fallacies, 1, 2, 3, 4, 7, and 8 are still very much in effect. The network topology is essentially flat, thanks to URIs -- that is probably true. But otherwise, I don't think the web has changed very much at a fundamental level.

From: Christopher Ferris (May 26 2009, at 14:09)

Actually, Peter's advice is even more important and relevant than ever.

If a GET fails, just do it again? Well, maybe not so much if the GET is retrieving a resource representation that is gigabytes in size. You just might want to consider some compensating strategies for that inevitability or you'll have unhappy campers.

"Topology doesn't change" in Web terms translates to "Cool URIs don't change". The problem is that this axiom is an admonition, not a constraint of the Web. Too often, cool URIs DO change, and not enough attention is paid to addressing that (via redirects and, if need be, compensating means of locating a back-up service).

The point of Peter Deutsch's talk was that, while it might appear to the naif that you could ship a working solution without due consideration of the implications of each fallacy, inevitably the implications of one or more would rear their ugly head, usually at the most inopportune occasion.

The Web does not free developers from thoughtful consideration of Peter's sage advice. Frankly, it makes it even _more important_ that developers appreciate his pearls of wisdom.

From: Thomas Powell (May 26 2009, at 14:13)

I tend to agree with the last commenter that a number of these are still in full effect. I think the sense that the problem is gone comes mostly from a head-in-the-sand lack of reliable measurement of the failures, and an over-reliance on the "meat layer" to fix everything with reloads or, in some cases, acceptance of the status quo (slow loads).

I can tell you flat out, from running Ajax-based applications with full instrumentation, that network concerns are alive and well on the Web; you just don't hear about them much because of the lack of monitoring or in-page instrumentation out there.

I do think, though, that from the writer's perspective we are in "good enough" mode right now, which is both good and bad. With problem resolution currently mostly "did you reload the page?" or "did you clear your cache?", I think we aren't as robust as we need to be. Take a look around: root-cause analysis of site and Web application failures is the exception, not the norm.

BTW, if you add in true dependency concerns for multiple concurrent Ajax requests, wow do you have fun - things really are variably out of order, and basically no library supports a request queue, let alone a response queue. I had to make one just to show people what the problem was. And don't get me started about people in the dark about remote <script> include failures or even banner ad failures - people just don't have this data, and synthetic benchmarks and monitors don't show what a particular end user sees, so complaints from the wild are both rare (who to complain to?) and impossible to track down (try correlating this, and hope you even have the in-page instrumentation to do it).

Anyway, from where I sit the fallacies are still winning, but as long as we continue to deliver and improve, it hopefully won't matter. The network delivery and application quality battle is far from over. Users will always want more. Infinite bandwidth doesn't solve infinite expectations :-)

From: Tim (but not THE Tim) (May 26 2009, at 21:30)

I think the post is completely accurate only as long as readers realize that the fallacies as posted are fallacies held by developers and that the "Web"/HTTP layer removes the need for those same developers to worry about them.

The worry has mostly been pushed down to the layer where we infrastructure folk live. My phrase at work for this is "the smoother and cleaner it gets for the users/developers, the more snarly and tangled it is underneath": for every technology layer added to make development life easy, there's usually a library, a config file, possibly a server to be integrated and managed.

I'm not saying it's _wrong_, nor complaining, it's just how it is.

Maybe there's a Law of Conservation of Complexity at work.

From: Tomislav Jovanovic (Jun 11 2009, at 15:36)

ugh, i think the last 3 comments are missing the point.

i don't think tbray was arguing that on the web "latency is zero" (the 2nd fallacy), but rather that even computing civilians are aware of latency issues, and thus web developers are unlikely to fall into the fallacy of assuming it's zero.
