· · Coding
· I’m in the unaccustomed position of spending all my work time either writing software or in meetings about it. The current project is conventional big-system server-side Java. That’s not a landscape that anyone’s gonna wax lyrical over, but boy, have the tools ever come along while I was off Androiding and Go-ing and Rubying ... [4 comments]
R and G and J
· I just read Adam Jacob’s Rust and Go, comparing two new hotnesses. Me, I’ve been (unaccustomedly) working the last few months in a familiar codebase/toolset, on an Android app; so I thought I’d add an “…and Java.” ... [8 comments]
· I owe a whole lot to Perl. So does the practice of computing in general, and the construction of the Web in particular. Perl’s situation is not terribly happy; I wouldn’t go so far as to say “desperate”, but certainly these are not its glory days ... [22 comments]
Concurrent List Update With Shuffling
· This is a sketch of how to provide highly concurrent read and update access to sorted paged lists while requiring minimal locking. This particular trick has probably been covered before but if so I’ve missed it and haven’t seen anyone else using it ... [12 comments]
Forget the Defaults
· I was watching engineers argue; someone was bitching about a code reviewer asking him to put more parentheses in a conditional: “As if I don’t know the precedence rules!” I don’t know them, as a matter of principle ... [37 comments]
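To make the parentheses argument concrete, here is a minimal Java sketch. The class and method names are invented for illustration and do not come from the original post. It shows a conditional that leans on the rule that && binds tighter than ||, the same logic with the grouping spelled out, and the left-to-right reading a reviewer might fear.

```java
final class PrecedenceDemo {
    // Relies on the precedence rules: && binds tighter than ||.
    static boolean implicit(boolean a, boolean b, boolean c) {
        return a || b && c;
    }

    // The same logic with the grouping written out, so nobody has to remember the rule.
    static boolean explicit(boolean a, boolean b, boolean c) {
        return a || (b && c);
    }

    // What a reader who assumes strict left-to-right grouping would expect instead.
    static boolean misread(boolean a, boolean b, boolean c) {
        return (a || b) && c;
    }

    public static void main(String[] args) {
        // With a = true, b = false, c = false the two readings disagree.
        System.out.println(implicit(true, false, false));
        System.out.println(misread(true, false, false));
    }
}
```

The first two methods always agree; the extra parentheses cost nothing at run time and spare the next reader a trip to the precedence table.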
· Since I’m spelunking around the new-languages caverns these days, I really ought to mention the long-ongoing and very interesting Fortress, brain-child of our own Guy Steele, who knows one or two things about designing languages ... [5 comments]
Clojure N00b Tips
· Clojure is the new hotness among people who think the JVM is an interesting platform for post-Java languages, and for people who think there’s still life in that ol’ Lisp beast, and for people who worry about concurrency and state in the context of the multicore future. Over the last few days I’ve been severely bipolar about Clojure, swinging from “way cool!” to “am I really that stupid?” Herewith some getting-started tips for newbies like me ... [7 comments]
Tail Call Amputation
· This is perhaps a slight digression; just an extended expression of pleasure about Clojure’s recur statement. It neatly squashes a bee I’ve had in my bonnet for some years now; if it’s wrong to loathe “tail recursion optimization” then I don’t want to be right ... [25 comments]
· I’ve been thinking about test-driven development a lot lately, observing myself veering between TDD virtue and occasional lapses into sin. Here’s my thesis: As a profession, we do a lot more software maintenance than we do greenfield development. And it’s at the maintenance end where TDD really pays off. I’m starting to see lapses from the TDD credo as more and more forgivable the closer you are to the beginning of a project. And conversely, entirely abhorrent while in maintenance mode ... [26 comments]
Tab Sweep — Tech
· Herewith gleanings from a circle of browser tabs facing inward at the world of technology. Some are weeks and weeks old: Amber Road, Clojure, tail recursion, cloudzones, deep packet inspection, and key/value microbenchmarking ... [5 comments]
· This is a lengthy note to myself. I initially wanted to capture the thinking that went into the construction of mod_atom while it was still fresh in my mind, and dumped out the first dozen or so sections. Then as I expanded and refactored the code, I found that I was keeping this up to date. This is mostly by way of putting it in a place where I won’t lose it. I can write stuff for ongoing faster than for any other medium, and “On the Net” is a good place not to lose stuff. If mod_atom eventually gets picked up and used, this may be useful to me or anyone else who’s maintaining it; and if it doesn’t, there’ll still eventually be an AtomPub server module for Apache, and this might be useful to whoever builds it. But this is not designed to be entertaining or pedagogical; among other things, it’s in essentially random order ...
Build One to Throw Away
· This is a maxim from Fred Brooks’ The Mythical Man-Month. These days I’m thinking it’s the single most important lesson there is about software. It’s been brought rudely home to me by my recent work on mod_atom, whose design is terribly simple; but I still got the first cut wrong in important ways ... [9 comments]
· OSCON has traditionally featured bundles of short fifteen-minute keynotes. I gave one entitled “Programming-Language Questions” and loved the compressed intensity of the format. Unfortunately, my slides were irretrievably lost in the post-speech disk crash (yes, I usually drop ’em on a USB stick, this time I didn’t). There’s video online at blip.tv, but the quality is pretty basic, and I suspect that the on-screen URIs and code samples are not useful. Here are all the missing pieces, should you want to watch it (only 15 minutes, remember); plus a little extra commentary ... [4 comments]
· Wow, this one touched a nerve. Some guys here at Sun were arguing about which bug trackers and SCM tools were currently da bombiest, and they decided to ask the world. Hasn’t received hardly any publicity yet, and already over 200 responses. Join in, and pass the word. Here is the survey and here are the results. [10 comments]
· I’m back to working on mod_atom and kind of gloomy because it’s C programming; so much more effort to do a unit of work compared, for example, to Ruby. In this context, Terry Jones’ Embracing Encapsulation got my attention and made me feel vaguely dinosaurian. While he may be on to something important, there are some consolations for us grizzled pointer-wranglers. For example, the list of things implemented largely or wholly in C: Unix, Linux, Solaris, Windows, Java, .NET, Flash, Mozilla, Microsoft Office, the Apache server, the X window system, Perl, Python, and Ruby. Not bad, really not bad at all. [8 comments]
· I have started reading the Scala book (which doesn’t seem to have its own URL, but is for sale at the artima.com shop) and I have two remarks on programming-language books ... [14 comments]
NetBeans & C
· I’ve been using NetBeans for my mod_atom work for a while now, and while it was better than Emacs, the C support still had a way to go ... [4 comments]
Autotools 1, Tim 0
· I’m going to do some more work on mod_atom, but I have a problem; it doesn’t work on Leopard. That’s OK, the Ape blows it up repeatably, so should be no biggie. Hmm, except for apr_global_mutex_create is acting weird, removing the lockfile while failing. Docs no help... OK, let’s look at the code. Urgh. Let’s use the debugger to see where it’s going. Well... that was a day and a half ago. Since then, been in a maze of twisty little passages. I’m beginning to think that Brian McCallister has a point in saying Autotools are the Devil. I used to know how to compile C code, sigh.
[Update]: Hey, check out the follow-ups. I think this Open Source stuff is going to catch on.
[Again]: Hah! Paul Querna’s suggestion not only made the compilation problems go away, the original bug vanished too. You know, that Apache community is first-rate. [2 comments]
Year-End Sweep — Tech
· Over the course of the year, in browser tabs, bookmarks, and del.icio.us, I’ve built up a huge list of things that I felt I should write about, at least at the time I saw them. Well, dammit, I’m not gonna let 2007 end without at least making a try. Here goes. Categorized, even ... [7 comments]
· I started out nervous with the idea of adding closures to Java, and I think I’ve slid firmly into the “contra” camp. One reason is that it seems increasingly a bad idea to add anything to the Java language. On a related subject, my brief sojourn in Ruby-land has changed my thinking about closures entirely ... [19 comments]
· I suppose I could have entitled this A General Model for Progress In Adoption of Popular Programming Languages. What happened was, I was composing a rant intended for use in an internal discussion of developer futures, and it dawned on me that there’s a repeating pattern in the waves of programming languages that manage to succeed in finding broad usage ... [14 comments]
· Today’s fashionable programming languages, in particular Ruby, Python, and Erlang, have something in common: really lousy error messages. I guess we just gotta suck it up and deal with it. But today I got something really, uh, special from Erlang ... [15 comments]
· Which is to say, NetBeans 6.0 Beta 1 is out. Looks pretty good so far, they even revised the Borg Cube logo. I’ve got a couple tabs with .rb files open, and three more that end in .h and .c. I understand it can be used with Java too ... [15 comments]
Reducing C Pain
· Despite my brutal minimalism, mod_atom is getting kind of big. The main file has a few dozen (mostly pleasingly-small) functions, and navigating around in it was starting to be a chore. I’ve been using Emacs, and I seem to recall that it has all sorts of navigation magic. But then I thought about NetBeans’ excellent “Navigator” tool, and that there’s supposed to be some new C-support code. So I installed it and it kind of works ... [4 comments]
· OK, here’s the problem. It’s a warm day, and kind of stuffy, and what you’re working on isn’t that interesting, and you’re really having trouble keeping a grip. Here’s the solution: double iced latte! ... [4 comments]
· Here are two considerations of the fact that trying to get work done in just one programming language is no more likely to be possible in the future than it has been in the past: Neal Ford’s Polyglot Programming and Martin Fowler’s Should we strive to only have one language in our development efforts? Both are worth reading, partly because they take up more general issues like concurrency and the old static/dynamic debate. But both conclude that yeah, no surprise, the future is multilingual. So how do we integrate ’em all? ... [1 comment]
Tech Tab Sweep
· We’re all over the map today, from general theories of software development to low-level optimized bit-banging. Well, all over the software map, I guess ... [2 comments]
· I was chatting with one of the NetBeans guys the other day and he said “BTW, it’s up to 6.7 million lines of code now”. Gack. Here is their public schedule. Wish them luck... the thought of managing a code-base like that makes my flesh crawl. [2 comments]
· I was having trouble getting my partially home-baked Ruby WSSE implementation to play nice with Hiroshi Asakura’s NTT server-side, so I asked him to send me his WSSE client code. I eventually got it to work (not 100% sure how) but at one point I was peering closely at Hiroshi’s code and thinking “What does that do?” and realized I wasn’t sure what programming language I was reading. Then I realized it didn’t really matter, they all look more and more like each other. It turned out to be C#. [6 comments]
· This is the O’Reilly project that I was working on in January in Australia. In terms of the co-authors’ coding achievements, I am in the rounding error way off to the right of the decimal point. Still, it was fun writing my chapter, and it’s a book I think I’d snap up even if I hadn’t been part of it. Greg Wilson spills the beans. [1 comment]
· Thanks to the commenters on the previous RX piece who recommended ruby-prof (there’s a gem install), which is a much faster and thus better profiler than the built-in one. I learned a few more things ... [6 comments]
An RX for Ruby Performance
· This is an insanely long and gnarly essay about implementing, then optimizing, the low-level bits of a pure-Ruby XML parser. If you obsess about XML reading, deterministic finite automata, or Ruby code optimization, you may find some part of it interesting. There may perhaps be six people in the world who care about all three; others are warned that an attempt to read this end to end may lead to general paralysis and perhaps even clinical brain-death. [He’s not kidding. -Ed.] By way of compensation, I’ve tried to be offensive wherever the opportunity presented. [Update: Outstanding comment from Avi Bryant below, which he repeats and expands here.] ... [18 comments]
Data that Holds Still
· These last few days, I’ve been sketching in some code for an idea I have that you’ll hear about if it works. Unlike most of my recent projects, it’s got no network links or message-passing or socket-munging, it just processes some data and produces some other data. The main difference is, it’s incredibly easy to unit-test. There are lots of network-programming tasks where I just don’t even know how to unit-test, and where you can do it, it takes a lot of extra work and orchestration, and so the temptation to slack off can be irresistible, for me at least. Someone who really wanted to advance the state of the art in software could work on reducing the friction for developers who believe in TDD but have to write distributed code.
· That’d be Cédric Beust, who, writing both in my new comment system and his own space, declaims “The bottom line is that IDE’s for dynamic languages will *never* be able to perform certain refactorings, such as renaming” and asks “Who wants a refactoring IDE that ‘works most of the time’?” He closes with a major dynamic-language diss: “I'm convinced that they are not suited for large-scale software development”. Obviously a fun-loving fellow. Geek girls and boys, I think that man is getting in yo face. [21 comments]
: in front of constant strings. And forgetting parentheses. Especially empty parentheses. And semicolons. Especially semicolons. Bloody stupid useless semicolons.
Spolsky Starts a Language War
· In Joel Spolsky’s new Language Wars, he argues that .NET, Java, PHP, and maybe Python are the safe choices if you’re going to build out a Web app that’s really big and really critical. He ices this cake with a shovelful of classic FUD aimed at Ruby and Rails. Not surprisingly, David Heinemeier Hansson volleys back twice with Fear, Uncertainty, and Doubt by Joel Spolsky and Was Joel’s Wasabi a joke? Bruce Tate has a more thoughtful response over at InfoQ: From Java to Ruby: Risk. You may not agree with all of Bruce’s points, but they’re well argued. It may surprise some who’ve endured the flood of Ruby-red writing around here recently, but I think Joel’s correct that Python is quite a bit better proven than Ruby; and also that Ruby has a big Unicode problem. But I can’t get around the fact that Joel sounds exactly like a mainframe droid talking about Personal Computers, or a VMS droid talking about Unix, or an EDI droid talking about the Web, or a C++ droid talking about Java. Yeah, the new thing is kinda unproven and kinda shaky in places and kinda slow and not very full-featured. But it’s got ease-of-use advantages and programmer-productivity advantages and developers like to use it. See the Technology Predictor Success Matrix, and particularly the last three criteria: Happy Programmers, Technical Elegance, and especially the 80/20 Point. Joel’s probably wrong.
· I’ve been having fun writing Ruby (and I’ll post some thoughts on that, and the code, soon) and one of the things I’m trying to do is be idiomatic. I haven’t fully internalized where to use parentheses and where not to; but I’ve come to appreciate the virtues of leaving them out; one of the reasons that Ruby’s so easy to read is that there are fewer typographic squiggles to break up the flow of meaningful text. Also, having done mostly Java for some time, I use lowerCamelCase method names. But after some days soaking in the Ruby ethos, those names are starting to feel reallyKindOfOverExcited and I’m warming to Ruby’s cool_measured_rhythms instead.
OSCON—Perl & Python
· I managed to attend most of both Guido van Rossum’s talk on Python 3000, and Larry Wall with Damian Conway on Perl 6. It’s refreshing to look at technologies that have passed that tenth birthday that seems to be crucial for software to establish that it’s real, and to see that they’re living and squirming and growing. Python, per its culture, seems to be treading a straight-and-narrow path on a well-defined schedule guided by a ruthlessly rational set of design criteria. On a technical note, Python 3 will have a String type that is 100% Unicode and that’s all it is, and separately a byte-array type that lets you indulge your most squalidly-perverse bit-bashing fantasies. I approve. Perl, on the other hand, is whimsical and witty and unscheduled and blithely disregards many genera of conventional wisdom. One could easily have concluded, listening to Larry and Damian, that the problem with previous versions of perl was that they didn’t have enough syntax, and thus there was an urgent need to add more. It ill behooves me to diss Larry Wall’s language designs, since I have successfully internalized all but the most perverse (typeglob, blecch) of those that are here today and they have enabled me to wrangle large amounts of data in surprisingly little time with generally-popular results. Nothing would warm my heart more than Perl 6 leaping to the center of the dynamic-language stage and reclaiming mindshare. The jury’s out.
Want a New Pony?
· In Java, you’d say new Pony("black"), and the pony is made by Pony(String color). In Ruby, you’d say Pony.new("black") and the pony-maker is initialize(color). In Python, you’d say Pony("black") and the pony-maker is __init__(self, color). I wonder which is best? Does all this seem kind of ad-hoc? Anyhow, in real life, a Java programmer would start with an AbstractEquineBreederGenerator and work from there. And for Ruby you’d be able to say mySpecialPresent.get do | pony |, and it would clean up the pony poop for you (but the pony would be named Kurofune). Python programmers are Serious Men and Women who Don’t Have Time For Ponies. [Update: The email is already swirling in. Stand by for another virtual comments section.]
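For anyone who wants to see the Java pony-maker written out, here is a minimal sketch. The color field and its accessor are assumptions added for illustration; the post only names the constructor. The Ruby and Python spellings are noted in comments, since they differ only in what the maker is called and how it is invoked.

```java
final class Pony {
    private final String color;

    // The Java pony-maker: a constructor that shares the class's name.
    // Ruby spells this initialize(color), invoked via Pony.new("black").
    // Python spells it __init__(self, color), invoked via Pony("black").
    Pony(String color) {
        this.color = color;
    }

    // Hypothetical accessor, here only so the example has something to show.
    String color() {
        return color;
    }

    public static void main(String[] args) {
        Pony present = new Pony("black");
        System.out.println(present.color());
    }
}
```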
Friday Coding Hint
· I’m sure you know the feeling; an innocent-seeming refactoring causes little waves of disturbance all over your system and all of a sudden lots of your tests are failing, and you can’t seem to really get a handle on it. So yesterday after a couple of hours of hard slogging with no net gain, I threw up my hands in disgust and mowed the lawn. Halfway through it, I realized the refactoring was subtly wrong at the core, and when I came back in I made one little shift and was able to delete lots of special-case code and the tests passed. Problem is, I hate mowing the lawn.
More Binary-Search Breakage
· Peter Luschny writes in with yet another way to break my supposedly bullet-proof binary search algorithm. You’re searching an array of whatevers; well suppose that array is declared:
Whatever[] w = new Whatever[Integer.MAX_VALUE * 2];
I checked, and Java will compile that happily. Binary search fall down go boom. Sigh. So, if you think you might have more than a couple billion elements in your array, you’d be better off declaring all your indexing variables as long. (Which should be free on a 64-bit computer, right?) I’ll go update the binary-search article to add this caution. [Update: Maybe not. Greg Thompson and A. Sundararajan both point out that the Java Language Definition requires array indices to be integers, not longs. So I wonder why this compiles?]
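A small sketch of what the size expression in that declaration does at run time. The class and method names are made up for illustration, and Object stands in for Whatever. Java int arithmetic wraps silently, so Integer.MAX_VALUE * 2 is still an int and evaluates to -2, which the compiler accepts as an array size and the runtime then rejects.

```java
final class ArraySizeOverflow {
    // int multiplication wraps around: 2147483647 * 2 comes out as -2.
    static int doubledMax() {
        return Integer.MAX_VALUE * 2;
    }

    // Tries to allocate an array with the wrapped size; returns true if the
    // runtime refuses with NegativeArraySizeException.
    static boolean allocationFails() {
        try {
            Object[] w = new Object[doubledMax()];
            return false; // only reached if the allocation somehow succeeds
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(doubledMax());
        System.out.println(allocationFails());
    }
}
```

So the expression never asks for four billion elements; it asks for a negative number of them, and the failure shows up at allocation time, not in the search.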
Abstract Numbers Yadda Yadda
· Following on the update to the binary search piece, I am in receipt of multiple emails, and the target of multiple web links, all saying, in a superior kind of tone, “The poor boy, that primitive Java stuff broke because he doesn’t have auto-magical big numbers like Lisp-n-Smalltalk had back in the day.” Thank you for raising my consciousness. If you’ll grant that the trade-off between fixed-size hard-wired datatypes and more abstract ones has been under discussion since Turing was a tot, I’ll grant that many attempts to pack the data in tight are symptoms of premature optimization. But space-vs-time trade-offs are just not gonna go away; deal with it. And I’ve had my working set blown to hell more than once trying to build the parse tree for what seemed like a moderately-sized incoming message, in a language that turned out to be just a little too high level. And the “My thought-experiment language solved that in 1976” mantra is boring.
On the Goodness of Binary Search
· Anyone who regards themselves as a serious programmer has internalized a lot of different ways of searching: hash tables, binary, and many different kinds of trees. I've used pretty well all of these seriously at some point, but for a decade or so, as far as I can recall I've used almost exclusively binary search, and I see no reason to change that. Herewith an essay for programmers, with fully worked-out examples in Java, on why. [Updated 39 months after publishing when I read with horror Josh Bloch’s exposé of a long-lurking bug. If we can’t get binary search right, what chance do we have with real software?] ...
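As an illustration only, and not the code from the essay itself, here is a conventional Java binary search over a sorted int array. It assumes the long-lurking bug is the well-known midpoint overflow, where (low + high) / 2 goes negative once the sum exceeds Integer.MAX_VALUE. The sketch computes the midpoint from the width of the range instead. All names are hypothetical.

```java
final class SafeBinarySearch {
    // Returns the index of target in the ascending-sorted array, or -1 if absent.
    static int search(int[] sorted, int target) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = safeMid(low, high);
            if (sorted[mid] < target) {
                low = mid + 1;
            } else if (sorted[mid] > target) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    // The textbook midpoint: the sum can wrap to a negative int for large indices.
    static int naiveMid(int low, int high) {
        return (low + high) / 2;
    }

    // Overflow-free midpoint: high - low never exceeds the array length.
    static int safeMid(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 5, 7, 9};
        System.out.println(search(data, 7));
        System.out.println(naiveMid(2_000_000_000, 2_100_000_000));
        System.out.println(safeMid(2_000_000_000, 2_100_000_000));
    }
}
```

The two midpoint helpers agree for any array most of us will ever allocate, which is exactly why a bug like this can lurk for years.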
· I already wrote about how the NetBeans and EE guys are learning lessons from Rails. But when Roman Strobl asked me to look at his latest on instant persistence, I realized that they’ve learned the really important lesson; it’s all about instant-app screencasts featuring guys with cute European accents. Dig the way Roman says “scaffolding”. Clearly Django, Grails, and the other Web-framework wannabes need to go recruit some appealing Europeans... now here’s a radical idea: how about a woman? [Update: Django has a Eurowebcast too!]
The Rails Lesson
· Over at Geertjan’s blog, The Best Feature Of The Upcoming NetBeans IDE 5.5 is the strongest evidence I’ve seen that the mainstream Java universe is really paying attention to that lesson. Sure, over at the excellent Aquarium, you can read about how they’re slaving away in the engine room trying to make Java EE.next simpler and simpler and yet simpler. But I haven’t been convinced that they’ve got to a place yet where they’re going to win lots of converts from PHP and Rails. But this GlassFish+IDE combo is really coming along: in Geertjan’s example, he makes what looks like a basic CRUD app with no coding and no file editing. In particular, it looks like they’re getting close to Rails levels of DRY (“Don’t Repeat Yourself”). Geertjan skips lightly over the database-selection wizard; I wonder how much more than “use these tables” it needs? [Update: He follows up with the details.] And the Rails people will be asking “What is this ‘Deploy’ of which you speak?” But still, we’re in interesting territory. [Update: Not ten minutes after writing this, I ran across Java web frameworks - the Rails influence, which in turn led me to the (excellent, albeit in PDF) Java Web Frameworks Sweet Spots. Did I say “interesting territory”? Interesting times, too.] [Update: It turns out that the infrastructure Geertjan showed off was by Pavel Buzek, who writes about the process and seems like a Major Force for Good. It’s guys like him who are going to cost Berlind the price of a nice dinner.]
Having Done Java
· Here’s an observation: if there’s something you as a programmer want to do (connect to a website, read some XML, walk a filesystem, listen on a socket, whatever) there’ll be a library in whatever language you’re using to do that. I’ve observed that, on average, the quality of the libraries is better in Java than in the competition: Perl, Python, Ruby, whatever. Don’t get upset, those other languages have lots of other advantages and are The Right Tool for lots of jobs. And the delta isn’t universal—there are stinky Java libraries and lovely Ruby ones—but still, I’d say this is true way, way more often than not. This suggests a hypothesis: Having been a Java programmer will make you a better Ruby or Python or whatever programmer. Ooh, are people ever gonna get mad at me.
· In December of 1996 I released a piece of software called Lark, which was the world’s first XML Processor (as the term is defined in the XML Specification). It was successful, but I stopped maintaining it in 1998 because lots of other smart people, and some big companies like Microsoft, were shipping perfectly good processors. I never quite open-sourced it, holding back one clever bit in the moronic idea that I could make money out of Lark somehow. The magic sauce is a finite state machine that can be used to parse XML 1.0. Recently, someone out there needed one of those, so I thought I’d publish it, with some commentary on Lark’s construction and an amusing anecdote about the name. I doubt there are more than twelve people on the planet who care about this kind of parsing arcana. [Rick Jelliffe has upgraded the machine]. ...
· I followed a pointer from Bill de hÓra this morning and it cost me an unplanned hour while the rest of the family slept, on the subject of programming languages. If you care about such things, stop reading here or you’re about to get stuck too; but that’s because it’s good stuff. Bill pointed me at Steve Yegge, somehow I hadn’t run across him previously.
Item: Bruce Eckel on The Departure of the Hyper-Enthusiasts, which is too rich to summarize but if you had to, it would be: Ruby is good, but not really good enough to beat Python. I wrote about this before, but the conversation it started really has legs.
Item: Steve Yegge pushes back with A little anti-anti-hype, which argues that friendlier languages sometimes beat better languages, e.g. Perl vs. Python. The piece is, he admits, inflammatory.
Item: Speaking of friendly languages (if Steve is right, Ruby has won), check out why’s (poignant) guide to Ruby, which isn’t just friendly, it’s a cute little puppy bouncing in your lap, licking your nose.
Item: Back to Steve Yegge, who irritated enough people with that previous piece that he wrote a follow-up, Bambi Meets Godzilla, making the same points, but well enough that you don’t mind.
Item: Steve’s Tour de Babel is a really funny and entertaining romp through a bunch of languages.
Item: Steve’s also interested in other-languages-on-the-JVM, just like me. Unlike me, he positively despises the Java language. Memorable quote: “Java has lots of wonderful features, but Java isn’t one of them. Java’s appeal as a platform for doing real work rests precisely on its strengths as a platform, not as a language.” This is in JVM Languages: Java 5, from the series entitled Stevey’s JVM Language Soko-Shootout, a really interesting run at a sample programming problem in a bunch of different languages running on the JVM.
Item: Speaking of those languages, it turns out that Charles Nutter who (with Thomas Enebo) leads the JRuby project, has a blog, in which he’s recently written about Getting IRB Going which he kind of has (although it turns out to be hard), enough to type in Swing (!) code; and a piece which starts talking about JRuby on Rails, but veers into a very interesting discussion of JRuby performance.
· There’s much ado about Joel Spolsky’s The Perils of JavaSchools. I think that Joel’s largely right, in that I don’t think that you can really appreciate why Java is a good language unless you’re proficient in C, and programmers who don’t really appreciate Java won’t get the most out of it. But Joel is half wrong in claiming that Java bypasses pointers and recursion; I use recursion all the time in Java! If you learn programming via Java but remain ignorant of recursion, you’ve been poorly taught. Also, Bill de hÓra has a point when he says that the other really hard thing that good programmers need to have thought about is concurrency. My guess is that Java is actually a good language for teaching concurrency, because the parts of the problem it sweeps under the rug are not essential to deep understanding and anyhow aren’t the really hard bits. Having said all that, if I were developing a difficult, mission-critical piece of infrastructure, I might develop in Java but I’d be leery of hiring anyone who hadn’t been to the mat with C. My experience differs from Joel’s in another respect: Recursion is mildly hard. Closures and continuations are hard. Concurrency is very hard. I never found pointers hard at all.
· As the year winds down, the programming-language news keeps flowing; at this point I wouldn’t be surprised by major New Year’s Eve announcements. Bruce Eckel wrote The departure of the hyper-enthusiasts, a lengthy riff on Beyond Java, giving Bruce Tate a hard time on some issues and ranting away informatively on Ruby & Python & Zope & EJBs & Rails & lots more; he’s a little less enamored of Ruby than others. Speaking of Rails, David Heinemeier Hansson responds at length. If you’re going to read these, do not fail to read the comments, which are even more interesting and informative; Bruce Tate turns up in both conversations. The other item that caught my eye was Cameron Purdy noting that Caucho claims to have a module that compiles PHP to bytecodes and runs it on the JVM four times faster than mod_php (first benchmark, but on a real app not synthetic). It’s GPL’ed. This is more than a little surprising. I’ve been campaigning heavily in the Java community at large and here at Sun specifically to make dynamic languages on the JVM a major priority, but I’d never really focused on PHP, because I didn’t know anyone was even working on the problem. (Well, to be honest, also because PHP has always made me nervous.) This changes the “On Beyond Java” picture. [Late addition: last word to Steve Jenson.]
· I just spent some of the afternoon rewriting a chunk of the ongoing code. If anything’s broken, do let me know. Read on for some notes on the process and the technology ...
Radical New Language Idea
· I had this program that was running slow and fixed the problem by fixing the I/O buffering. If I had a quarter for every time I’ve done this over my career, I’d have, well, two or three bucks. I think the language-design community ought to take notice. Currently, they cook up new languages so object-oriented that each individual bit has a method repertoire, languages so concurrent that threads execute in incommensurable parallel universes, languages so functional that their effects are completely unobservable... How about a radical new language where the runtime system is hyperaggressive about ensuring that all of its I/O primitives are by default buffered unless the programmer specifically requests otherwise? I’ve heard that there’s something called “COBOL” that has this capability, I’ll have to check it out.
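Here is a rough Java sketch of why fixing the buffering helps so often. Everything in it is illustrative: CountingSink is a made-up stand-in for a slow device, and the 8192-byte buffer size is an arbitrary choice. It counts how many write calls actually reach the sink when the program emits one byte at a time, with and without a BufferedOutputStream in between.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class BufferingSketch {
    // Hypothetical stand-in for an expensive device: it only counts requests.
    static final class CountingSink extends OutputStream {
        int calls = 0;
        long bytes = 0;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    // Emits `count` single bytes, optionally through a buffer, and returns
    // the number of write calls that reached the sink.
    static int sinkCalls(int count, boolean buffered) {
        CountingSink sink = new CountingSink();
        try (OutputStream out = buffered ? new BufferedOutputStream(sink, 8192) : sink) {
            for (int i = 0; i < count; i++) {
                out.write('x');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls; // closing the stream has already flushed any buffered bytes
    }

    public static void main(String[] args) {
        System.out.println("unbuffered calls: " + sinkCalls(10_000, false));
        System.out.println("buffered calls:   " + sinkCalls(10_000, true));
    }
}
```

Unbuffered, every byte is its own trip to the device; buffered, the same output arrives in a handful of large chunks. Java makes you ask for the wrapper, which is the opposite of the buffered-by-default behavior the post is wishing for.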
Atom API Sketches
· I’m thinking about Atom 1.0 from the coder’s point of view. I’m not thinking about the Publishing Protocol, I’m thinking about how you, the programmer, should go about inhaling and exhaling the stuff. I’ve never believed in One True API for XML, it’s just too broad-spectrum, but Atom’s pretty tightly constrained. Obviously, you can use something generic like SAX or one of the many DOM-style APIs, or one of the modern pull APIs. Maybe for Atom we could use something simpler and more natural. I’m thinking out loud in this space, this is far from finished, not even a proposal yet. But, I bet there are other people out there who care ...
The Joy of Threads
· I’ve had quite a bit to say here about how concurrent software, which is getting more important, remains brutally difficult—beyond the reach, some say, of many application programmers. I’m a little worried about negative spin, because if you enjoy programming, you should give concurrency a try; some of us find it especially satisfying. I can remember like yesterday in the undergrad CS course when I first understood what a “process” was, and then a few years later the same feeling when I really got threads. Yeah, it’s tough; you’ll find yourself debugging by print statement, and sometimes with a compile-run-think cycle time measured in minutes. But when you have the computer doing a bunch of things at once, and they all fit together and the right things happen fast, well, that’s some pretty tasty brain candy. All this brought to mind during our recent long weekend in the English countryside; it seemed entirely reasonable to me to sit in a quiet corner of the pub, or with a view of the ocean, and get a few of those compile-run-think cycles in. I can understand that not everyone feels this way, but to all the coders out there: this stuff is not only good for your career, it can be its own reward.
· Brian Zimmer did most of the work behind this alpha release, but Bill de hÓra has the best write-up. For simple servlets, by the way, I’ve found the current 2.1 release just hunky-dory, but it’s good to see progress. Congratulations and thanks to Brian.
Putting the Soft Back in Software
· Check out Bill de hÓra’s No More Nails: Making Good Technology Choices. I’m not sure that Bill’s nails-vs-screws metaphor works that well, but he says some really smart things about how enterprise software is done. By the way, Paul Hoffman and I recently appointed Bill co-editor of the Atom Publishing Protocol draft, in preparation for the final charge to the finish line; I expect great things. (Of course, the IETF’s lamentable all-ASCII-all-the-time publishing policy will keep Bill’s name from showing up properly on the cover).
Mind Expansion by Mikael
· The Rails (& Ruby) hype is becoming deafening to the point that I can’t ignore it; while poking around I came across the home of one Mikael Brockman, yet another precocious Scandinavian hacker (what’s going on up there?). Anyhow, I’d always vaguely understood continuations and knew that smart people thought they were great, but I looked at the code from his essay Continuations on the Web [Sigh, that link is dead. It’s in the archive] and thought “I can’t believe that does what he says”, but it turns out that OS X comes with Ruby and yes, it does what he says. But I had to spend a long time looking at it to see why. Will this kind of idiom ever enter the mainstream? I’m not sure, but internalizing it will make you a little smarter.
Things That Just Work: CVS
· I’m working on a Java.net project that we’re getting ready to de-cloak, and I agreed to fix up some of the files. So I looked at the setup instructions, and the command-line CVS here on my Mac worked first time, and the whole thing checked out no problem, and there I was doing commit. Things aren’t perfect; Java.net seems to run awfully slow sometimes. But it’s so great when things that are supposed to plug together just plug together and work.
Sun EC: On XP and Agile
· Today through noon Friday I’m at the internal Sun Engineering Conference. We opened with a couple of speeches on XP and Agile Software Development by Ron Jeffries and John Nolan. I think there were some people in the audience who weren’t quite convinced, but I learned a couple of things ...
· I just wasted some time by making a real dumb mistake in my unit testing setup, and I think that when tech bloggers do this they should publish the details, because wisdom is in large part the knowledge of how to avoid doing dumb things, and thus grows globally as a function of the published inventory of stupid mistakes. Thus, herewith, a description of how you can waste time by doing your unit testing just slightly wrong. [Updated: A suggested best practice to avoid this.] [And again.] ...
One IDE to Rule Them All?
· Don Box has an interesting set of Predictions for 2005. Every one of them is thought-provoking and well-framed. There is one, though, that I have to push back on: the surface prediction (#2) is that “Sun will embrace Eclipse”. The deeper issue here pops up a sentence later, when Don talks about “unifying on a common tool platform”. Well, as I (and everyone else who attended) learned at the “IDE Shootout” event at the last Java One, the Java IDE landscape is like a messy, vigorous, noisy, public marketplace. Each of the big IDEs is here for the long haul; and it’s not just Eclipse and NetBeans. Don’t forget Emacs, JDeveloper, and the IDE with the most fanatical fans of all, IntelliJ IDEA. Unlike the Windows world, where Visual Studio is all that really matters, what we have here is an ecosystem, a market, a place where competition and evolution happen. There is absolutely zero chance that the Java world will ever “unify on a common tool platform”. Which is A Good Thing.
· Lauren wanted to visit her Mum on the farm for a few days before Christmas. The world is well into its pre-Christmas slowdown and I’m coding away on Zeppelin these days, which doesn’t require much Net access, so I said OK. So I’m sitting in front of NetBeans except when I’m out pushing the kid’s sled down the hill or visiting with the cows. Zeppelin, like most software, has lots of layers, and I haven’t fiddled with the bottom-layer APIs for a while. Except for I did, added this trivial little method that Couldn’t Possibly Go Wrong, but (arrrrgh) no JUnit test to be sure. Which cost me the best part of a day of debugging a completely incomprehensible application full stop because down at the bottom level there was an args instead of args. To all those who sat in rooms at one point or another this last couple of years and listened to me drone on in a superior tone of voice about the extreme importance and Karmic excellence of unit testing, you are now entitled to one large snicker in my general direction.
· I got an email from Phost, who works for Sun in Beijing; it turns out that my disk benchmark named Bonnie (not Russell Coker’s updated Bonnie++ but my original 1990 version — argh, that page is horribly unmaintained) has been part of the Solaris Hardware Compatibility Test Suite for years, and it needed some updates, and they’d made them, and they sent it to me and asked what I thought. So I popped it open in Emacs, what a weird feeling, all of a sudden it was 1989 and I was sitting in an office at the University of Waterloo. Anyhow, I suggested a little mod to the mods and sent it back. Given that I suspect the core semantics of Unix-style filesystems are not apt to change for, well, I can’t imagine how long, I realize Bonnie is certain to outlive me. Sobering.
· This is the permanent status page for Genx (tarball · docs). Genx is a library, written in the C language, for generating XML. Its goals are high performance, a simple and intuitive API, and output that is guaranteed to be well-formed; the output is also guaranteed to be Canonical XML, suitable for use with digital-signature technology. There is a Python wrapper. Genx comes with a GPL-Compatible but non-viral Open-Source license. Latest news: In production, carrying hundreds of thousands of subtitles per day; thinking of taking off the “beta” stamp ...
Java Coalface Notes
· I managed to ignore Atom for a few hours this week and get back to working on project Zeppelin, which leads to a few thoughts on object transmission, concurrency, Jython, and other stuff of possible interest to hands-on Javans ...
Last First Program?
· I just wrote my first Python program. It occurs to me, given the generally grey colour of my beard, that this may be the last time I learn a new programming language. Which, frankly, would be OK, it’s real work. This thing scans all the feeds coming out of Planet Sun using Mark Pilgrim’s Universal Feed Parser, detects any that have changed in the last day, and pings weblogs.com, technorati.com, and blo.gs to let them know. (Question: who else should be pinged? Answer: thanks to the many people who wrote about Ping-o-matic; doesn’t quite fit our bill, but interesting.) It’s only 57 lines of code, but I had to learn a modest amount of Web wrangling, string munging, time arithmetic, and data structure walking to get it going. I suspect it’s not a very good Python program, but I can live with that. If you’re going to scale the Pythonic slopes, you’ll need one browser tab open to Dive Into Python, another to the Python Tutorial, a shell window handy where you can type things like pydoc time, and a nontrivial chunk of Python code in a nearby editor buffer (I used the Feed Parser) so you can look up idioms. At the end of the day, the code looks distinctly weird to my eye, kind of ragged without a supporting visual lattice of ;’s. But I’m sure you get used to it quickly.
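The "changed in the last day" step is mostly time arithmetic, and a rough sketch of it fits in a few lines. This is not the 57-line program described above: `changed_recently`, the feed names, and the sample timestamp are all made up for illustration, and the feed fetching and pinging are left out so the sketch runs with the standard library alone. It assumes each feed's last-updated time is already available as a UTC `time.struct_time`.

```python
import calendar
import time

ONE_DAY = 24 * 60 * 60  # seconds


def changed_recently(feeds, now=None, window=ONE_DAY):
    """Return the sorted names of feeds updated within `window` seconds of `now`.

    `feeds` maps a feed name to a UTC time.struct_time, or None if the
    feed did not report when it was last updated.
    """
    if now is None:
        now = time.time()
    recent = []
    for name, updated in feeds.items():
        if updated is None:
            continue  # no timestamp, so no way to tell
        age = now - calendar.timegm(updated)  # struct_time (UTC) -> epoch seconds
        if 0 <= age <= window:
            recent.append(name)
    return sorted(recent)


# Arbitrary fixed "now" so the example is repeatable.
now = calendar.timegm((2004, 6, 15, 12, 0, 0, 0, 0, 0))
feeds = {
    "alpha": time.gmtime(now - 3600),         # updated an hour ago
    "beta": time.gmtime(now - 3 * ONE_DAY),   # updated three days ago
    "gamma": None,                            # no timestamp reported
}
print(changed_recently(feeds, now=now))  # prints ['alpha']
```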
· I’ve been using NetBeans to inflate the Zeppelin, and you know what, it’s not bad. The Mac integration could be a little better, but a whole lot of things I need to do are one keystroke away. It runs plenty fast enough on the PowerBook (mind you, only a few thousand lines of code so far). JUnit’s right there, which is nice. The debugger makes it a little too hard to keep an eye on class variables, but aside from that does about what I need; when I was driven to “print” statements the other day I was fighting a complex socket conversation between two machines where one side suddenly started seeing EOFs and I couldn’t even tell which side was wrong, I’m not sure there’s a debugger in the world that would have been much help there. Now all I need is to get Jython integrated, and we’re making progress on that, stay tuned.
A Really Satisfying Feeling
· Going through a bunch of source-code files and, one by one, removing the dozens of “print” statements that let you focus in tighter and tighter and tighter on a really obscure problem until you could finally see it. Debuggers are OK, but when the going gets tough, the tough use “print”.
Yet Another TDD Sermon
· I’ve preached here before (more than once) about the virtues of Test-Driven Development (TDD), but never given it top billing, so here goes. Over the last twenty years I’ve seen the rise of Structured Programming and Object-Oriented Programming and Message Passing and the Relational Model and those are all good things, but TDD is the single biggest advance in my lifetime. It might (finally) turn software from an amateur’s kitchen to an engineering discipline. Herewith some more anecdotal evidence, and practical advice for Software Development managers. [Updated with pointers to PyUnit, NUnit and PHPUnit.] ...
Coding Makes You Dumb
· I quote from an article in this week’s Economist (read it here if you’re a subscriber) arguing that the negative impact of “Offshoring” is exaggerated. The reasons we need not worry include ... the bulk of these exports will not be the high-flying jobs of IT consultants, but the mind-numbing functions of code-writing. [Update: my first cut of this had a snarky aside, but I decided to lose that and let the assertion above stand or fall without commentary.]
· I just posted a Genx tarball; the documentation is separately available here. This is Alpha code, not because it’s all that buggy (it doesn’t do that much, after all) but because it’ll quite likely change once some other smart people see the problems I haven’t. There are quite a few departures from the designs I posted earlier and where the ensuing discussion got to, simply because I’ve now written the code; and I’m never smart enough to understand the problem until I’ve written the code. For those who care about such things, discussion will probably be mostly on the XML-dev mailing list. Genx currently has an ultra-minimal copyright statement but I plan to adopt the latest rev of the Apache copyright before I do another release. [Updated: Oops, tarball was mis-placed; it’s there now.]
· In between beach time and rainforest time, I’ve been coding away on genx; herewith some impressions with one important lesson and an interesting bit of history ...
· It seems there’s some considerable demand for a C-callable API which will write XML safely and efficiently. I sketched out an interface design which you may peruse here; I think it’ll be pretty self-evident to the C-literate. It compiles and I wrote and tested the genxScanUTF8() method, so it’s not entirely vapor. Upon consideration, I think it will be virtually no extra work to make it emit Canonical XML, ready to be signed, sealed and delivered (and Rich Salz said he would help) so why not? Major thanks to Anthony J. Starks for the name; I am not a member of Gen X myself, but I do share a city with Coupland, so there you go. Since ongoing doesn’t have comments, I’ll post a pointer to this item over in the xml-dev mailing list, which is a natural place to discuss it. It would be very surprising if this first-cut sketch didn’t contain some stupid errors, so go get ’em.
On Writing XML
· In a recent essay I offered, given demand, to author some XML-writing software. There’s been quite a bit of feedback, and the consensus seems to be that the Java community is fairly well-served with XML writing software, but that this would be real useful at the C level. So that’ll be my coding fun for the month of February. The rest of this essay lists some of the Java options that people told me about, and introduces some issues around the C implementation ...
The Three-Legged Future
· There’s a real interesting note from Campbell and Swigart lamenting the fact that, down in the coding trenches, the worlds of objects and of RDBMSes and of XML are far from unified, and that attempts in that direction have been less than enthralling. I think we just have to get used to it, and from here on in, the practice of software engineering is a three-legged discipline ...
Do Databases Suck?
· A couple of months ago I was writing about a C coding project; that code is now wired into the 4.1 release of Visual Net, which comes out sometime early next year, and there’s an interesting optimization lesson or two buried in there ...
· Recently it became obvious that the Visual Net data-prep and index-build subsystems needed refactoring, and I took on the job. So I’ve been up to my elbows in heavy C coding for a week now—my first such excursion this millennium. Herewith some extremely technical low-level notes on the subject, probably not of interest to non-professionals, except perhaps for a paragraph on the world-view of the aging coder. There is some discussion of XML and scaling issues ...
· This is a programming war story with a moral that I think is important for those who care about their code running fast ...
· I just had a big Visual Net index update blow up because the data structures got to be bigger than 2^31 bytes (only a couple million records, but each with lots of wordy text and tons and tons of metadata). We’re OK, the solution is to use the more expensive 64-bit iron, but this does mark a turning point ...
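For readers who have not hit the 2 GB ceiling (2^31 bytes), a small sketch of the arithmetic. The teaser does not say exactly where the limit bit; one common culprit, assumed here, is a size or offset held in a signed 32-bit integer. Python's own integers never overflow, so `ctypes.c_int32` is used to mimic the fixed-width behaviour, and `as_int32` is a hypothetical helper name.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def as_int32(n):
    """Return n as a signed 32-bit C integer would store it (wrapping on overflow)."""
    return ctypes.c_int32(n).value


print(as_int32(INT32_MAX))      # prints 2147483647
print(as_int32(INT32_MAX + 1))  # prints -2147483648: one byte past the limit wraps negative
```

A 64-bit size or offset has no such trouble at this scale, which is why moving to 64-bit hardware makes the problem go away.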
· Perl is such a great language, except for when it’s not. There’s this problem, which is best illustrated by example: a colleague came into my office with a testy expression and said “You wrote this; what the $#@!%! does it do?!?” I told him I’d get right back to him. [Update: I get spanked.] ...
· I read, via Don Box, Jan Gray’s monumental piece about performance of managed code in .NET. If you care about performance in general it’s a good read. This provoked a lot of thought and I’ll write more, but also suggested a specific coding technique for making loops faster; I tried it out and it failed the first rough-cut test, but suggests an improvement for future language designs. (Warning: in-the-trenches geeky.) (Update: on iterators and dynamic languages and Java and C#, with more benchmarks.) (Update: massively-erudite write-up from Erik van Konijnenburg.) ...
On Software Performance
· This is provoked by a monumental essay over on MSDN by Jan Gray entitled Writing Faster Managed Code: Know What Things Cost. I think people who care about performance in modern programming environments, even those who don’t plan to go near the .NET CLR, ought to read this. A bunch of reactions and observations at varying levels of meta-ness ...
· If you're a programmer, I think you're lucky, because this is an exciting time we're living in: there's some powerful intellectual ferment in progress out there. This may be the golden age of programming, as Paul Graham argues, and maybe everything we thought we knew about strong typing is wrong. Herewith a bit of surveyware with a touch of debunking and a practical footnote on Java exception handling. (Warning: geeky.) (Update May 9, good feedback on Java Exceptions.) ...
I Was Going to Write About Programming Languages
· Because there've been a couple wonderful essays published recently about testing and typing and other deep stuff, and of course it's not long since Paul Graham's excellent The Hundred-Year Language. But when I hit Paul's site to grab the hundred-year URI, what do you know but he's posted another one, Hackers and Painters, and now I'm going to have to think some more before I write on this subject. Seriously, check it out.
Lap to Lamp
· A slight rework this evening for ongoing, most visible in the sidebar material to the left and right. The previous cut was Linux/Apache/Perl, this one adds MySQL to the mix. But the LAMP acronym comes up short, there really ought to be an X in there for XML, time for Udell to think up something. I thought it might be interesting to write up some of the design issues, but then I decided no, that wouldn't be interesting at all, so this is just to ask for feedback if I've broken anything, make a couple of general observations, and note that I now hate SQL much less ...
The Joy of Refactoring
· I wanted to make some changes to the code that generates ongoing (stand by: redesign incoming), and this required some fairly serious refactoring. Refactoring is right at the center of good coding practice; programmers often bend over backwards to avoid it, which is almost always wrong. This theme shows up in the best writing on the subject going back decades, and illustrates an even more central lesson about software development ...
On Hating SQL
· The initial cut of the software that drives ongoing is not scaling well as the number of essays grows, and it's keeping me from making a couple of changes I have in mind. So, I'm stumbling from LAPdog to LAMPlight status, that is to say adding MySQL to the existing Linux/Apache/Perl software basis. Which means that there are splodges of SQL dotting my Perl code like zits on a teenage face. In theory, I like SQL a lot. In practice it revolts me, and I'm not sure why ...
In the Perl Mines
· I spent most of Friday grinding out Perl code to pull data out of six different big Excel spreadsheets and integrate it and cross-match it and feed it into one of our maps. It gets harder to code as you get older, but there are rewards ...
Writing the Hard Line of Code
· When I'm writing code, and building something complicated, there's this place I usually get to where I'm about to write "the hard line of code". Usually, I've just finished setting up a bunch of code to aggregate the data I need both to read and update, getting the locks I need, and other housekeeping chores, and now I'm going to write the complicated bit of arithmetic or really subtle conditional that's going to compute the outcome value or decide whether to include the record or whatever the goal is. I always pause involuntarily and lean back. Then one of a few things happens:
By Tim Bray
I am an employee of Amazon.com, but the opinions expressed here are my own, and no other party necessarily agrees with them. A full disclosure of my professional interests is on the author page.