I’ve been trying to avoid editorializing in this “Java-of-concurrency” series. But since people are grumbling about my biases, here are a few notes about how I see things at this (early, I hope) stage in the process.
[This is part of the Concur.next series.]
Why Me? · I think I’m well-positioned to do this research and write-up. First of all, I’ve written a bunch of highly concurrent code over the years (Web crawlers, in-memory data-viz database, one specialized network-management database kernel). So I have a bitter personal appreciation for how hard it is to get this stuff right and debug it, and consequently, how unreasonably long it takes to ship robust working code.
Second, I’m not a language or platform designer, so I don’t really have a horse in this race. Granted, my employer does ship a very nice line of highly-parallel servers, but if you think that’s just a Sun problem you’re nuts.
Also, I have a nice little sample project around, the Wide Finder. More on that later.
History · I’m proficient in C, Java, and Perl, can get along in Ruby, have written snippets of working code in Python and Prolog and a couple of Lisps and Erlang.
As regards concurrency, I will freely admit, as I start to dig into this, the following biases:
In favor of problems that involve persistent stores and network communication.
In favor of the *n*x and Open-Source ecosystems, accompanied by a visceral aversion to .NET.
In favor of Test-Driven Development.
Against abstractions that try to hide too much.
In favor of object-orientation; history shows that programmers are on average capable of understanding this sufficiently well to produce acceptably-good software in acceptably-short timeframes.
Having had essentially no experience in conventional “HPC”, which is to say I’ve never been near MPI or any of its ilk. I don’t see a strong connection there to the kind of general-purpose business apps I want to run faster on general-purpose multicore business computers, but I could be wrong on that too.
Erlang · Also, in the context of the HECS languages, I’m obviously biased in favor of Erlang/OTP if only because it’s the one that I’ve actually used. Here are some of the things I like about it:
It’s a small language; there’s not much to learn.
To borrow some Rails-speak, it’s opinionated: Erlang thinks there’s exactly one way to build concurrent software, and that’s with processes and messages and no global data. If your problem can work that way, you’re in good shape with it.
I personally find the process/messaging model of the world fits neatly into my mind and provides a useful basis to think about solving problems. I’m not assuming this will be true for everyone; see below.
The implementation is remarkably solid and seems to bend gracefully under pressure rather than breaking. To the extent that there are metaphors or abstractions, the implementation seems to behave exactly in the way that they would encourage you to expect.
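For readers who have not seen the processes-and-messages style, here is a rough sketch of it in Python rather than Erlang. It rests on assumptions made purely for illustration: a thread plus a `queue.Queue` stands in for an Erlang process and its mailbox, and the names `counter` and `run_demo` are hypothetical. It says nothing about how Erlang itself is implemented.

```python
import queue
import threading


def counter(inbox):
    """A 'process' that owns its state; others can reach it only by message."""
    total = 0  # private to this thread, never shared
    while True:
        msg = inbox.get()
        if msg[0] == "add":
            total += msg[1]
        elif msg[0] == "get":
            msg[1].put(total)  # reply on the queue the sender supplied
        elif msg[0] == "stop":
            return


def send_adds(inbox, count):
    """A sender that knows nothing about the counter except its mailbox."""
    for _ in range(count):
        inbox.put(("add", 1))


def run_demo(senders=4, count=1000):
    inbox = queue.Queue()
    worker = threading.Thread(target=counter, args=(inbox,))
    worker.start()

    threads = [threading.Thread(target=send_adds, args=(inbox, count))
               for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # every "add" is now queued ahead of the "get" below

    reply = queue.Queue()
    inbox.put(("get", reply))
    result = reply.get()

    inbox.put(("stop",))
    worker.join()
    return result
```

No locks appear in the code that touches `total`, because only one thread ever sees it; all the coordination lives in the mailbox.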
On the other hand, Erlang is hardly ready for adoption by the broad mainstream of developers. Out of the box, its file-handling is pathetic and its string processing facilities are putrid. Also, its syntax is irritatingly irregular; the use of comma, semicolon, and period as line-enders makes refactoring your code subject to almost-inevitable syntax errors.
And finally, I’m not quite ready to toss our decades of experience with object-orientation on the scrap heap.
Switching back to the positive, there’s the really big factor: Erlang has been shown to work on highly concurrent systems. To start with, in lots of pieces of telecom equipment. It’s in production at Facebook. I’m impressed.
Recent News · If you care about this stuff, and you haven’t already done so, find 56 free minutes sometime in the next day or so and listen to Rich Hickey’s presentation of Persistent Data Structures and Managed References; you won’t regret a minute of it. Rich walks through, in detail, how you can co-ordinate safe lock-free reasonably-simple access to shared data in a highly-concurrent situation.
Historically, my simplest-thing-that-could-possibly-work thinking has led me to “Don’t share data. Send messages.” But maybe sharing it is not so bad.
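The general shape of the "share immutable values through a managed reference" idea can be sketched roughly as follows. This is not Clojure's API and not the code from the talk: the `Ref` class and its method names are hypothetical, a plain tuple stands in for a real persistent data structure (so there is no structural sharing), and because Python exposes no compare-and-swap instruction, a tiny lock inside `compare_and_set` stands in for that one atomic step.

```python
import threading


class Ref:
    """Hypothetical managed reference holding an immutable value."""

    def __init__(self, value):
        self._value = value
        self._cas_lock = threading.Lock()  # stand-in for a hardware CAS

    def deref(self):
        # Readers take no lock: whatever they get is an immutable snapshot.
        return self._value

    def compare_and_set(self, expected, new):
        with self._cas_lock:
            if self._value is expected:
                self._value = new
                return True
            return False

    def swap(self, fn):
        # Compute the new value outside any lock; retry if someone else won.
        while True:
            old = self.deref()
            new = fn(old)
            if self.compare_and_set(old, new):
                return new


def run_demo(writers=4, count=200):
    hits = Ref(())  # start with an empty immutable tuple

    def record(n):
        for i in range(count):
            hits.swap(lambda old: old + ((n, i),))

    threads = [threading.Thread(target=record, args=(n,))
               for n in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return hits.deref()
```

The function passed to `swap` may run more than once if another writer gets in first, so it has to be free of side effects; that is the price of never blocking readers.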
While you’re paying attention to Rich, check out his lengthy contribution to this conversation. I disagree with his assertion that I’m “focusing on same-(OS)-process concurrency”; I’m not religious on that subject at all. But Rich is definitely worth paying attention to.
I note in passing that one factor which has historically helped push certain new technologies into software’s mainstream is having an eloquent and appealing chief evangelist.
Functional Programming · Its advocates make two claims. First, that it makes truly concurrent computing tractable. I find this claim convincing.
Second, that it’s a better way to program because code which implements pure side-effect-free functions is easier to reason about and understand and re-use. Well maybe, but I don’t care. I don’t think we’re facing a huge urgent crisis in terms of being able to reason about the code we write. In fact, with the advent of TDD and higher-level languages and opinionated frameworks, we’re doing better at constructing pleasing applications in acceptable time-frames than at any other time in my lengthy sojourn in this profession.
But I think we are facing a crisis in understanding how to make our everyday codes take reasonable advantage of the kinds of computers we’re now starting to build. So... Dear FP: You may have all sorts of wonderful and endearing characteristics. But I only want you for your concurrency hotness.
And when I run across things like, for example, the sideways dance Haskell goes through to generate random numbers, my eyes roll. On the other hand, I guess it’s way cool that the language lets me play with infinite lists. I guess.
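To make the two ideas concrete for anyone who has not met them, here is a loose Python analogy, not Haskell and not its random-number library. A pure function cannot keep hidden generator state, so each call takes the state in and hands the next state back; an "infinite list" is a lazy sequence you only ever take a prefix of. The names `next_random`, `randoms`, and `first_n` are hypothetical, and the generator is a simple linear congruential one chosen for brevity, not quality.

```python
from itertools import islice

M = 2 ** 32  # modulus of the linear congruential generator


def next_random(seed):
    """Pure step: same seed in, same (value, new_seed) out, no hidden state."""
    new_seed = (1664525 * seed + 1013904223) % M
    return new_seed / M, new_seed


def randoms(seed):
    """A conceptually infinite, lazily produced sequence of values in [0, 1)."""
    while True:
        value, seed = next_random(seed)
        yield value


def first_n(seed, n):
    """Take a finite prefix of the infinite sequence."""
    return list(islice(randoms(seed), n))
```

Because the seed is threaded through explicitly, calling `first_n` twice with the same seed gives the same list, which is the property the pure-functional style is buying with that dance.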
Language Learning · I’m not going to say anything more about Clojure or Scala or Haskell until my understanding matures some. There’s lots of online material, and there’s the Wide Finder problem, which may not be perfect but I understand its trade-offs. There’s Scala code there now, and I guess I’ll just have to grind out some Clojure and Haskell as part of the education process.
The Berkeley Study · An extremely-rude commenter linked to a 2006 U Cal Berkeley Technical Report, The Landscape of Parallel Computing Research: A View from Berkeley (PDF), which I’d read but then filed away somewhere. A lot of it is of questionable relevance (in particular I’m unconvinced by the selection of benchmarks) but I was struck by section 5.1, “Programming model efforts inspired by psychological research”. The idea is that what we’re trying to do here is equip actual real-world programmers to solve actual real-world problems. So, as we look at our laundry list of tools and strategies, it’d be nice to, you know, actually measure how good people are at using them.
So I’ve got an action item to track down a few of their references and anything else in the space.
End Game · I don’t know at the moment. I’m envisioning a fully worked-out write-up, with lots of links, on each of the Laundry List top-level items. Then maybe a big honking grid with all the actual languages and platforms classified according to which laundry-list items they implement. Perhaps these last two items belong on a wiki.
We’ll see where this goes.