[RAD stands for Ruby Ape Diaries, of which this is part I.] To build a validator you need an HTTP engine and an XML parser, both of which Ruby is advertised as having. JRuby, when I first took this on, was at release 0.9.0 and had plenty of rough edges. But I decided to use it anyhow.

The Ape needs to check that various protocol messages are well-formed XML, and to validate them against the RelaxNG schemas for Atom and the Atom Protocol. Ruby’s built-in XML engine, REXML, has a casual attitude towards correctness. The pretty-good Ruby Cookbook offers this example:


The cookbook airily claims that the fragment above “is unambiguous, which means REXML can parse it”. This should be re-written to say that the fragment above “is malformed, which means REXML is egregiously broken”. There are some things to like (and dislike; more later) about REXML, but it’s really not an appropriate choice for a validator.
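For reference, here is a minimal sketch of what asking REXML for a well-formedness verdict looks like. It is not taken from the Ape or from the cookbook; the helper name `rexml_check` is made up for illustration. Given the laxness described above, a "yes" from this check is weaker evidence than it looks, though REXML does at least reject mismatched tags.

```ruby
require 'rexml/document'

# Hypothetical helper: returns [true, nil] when REXML parses the text
# without complaint, or [false, message] when it raises a parse error.
# A true result only means REXML accepted the input, which is not the
# same thing as the input being well-formed XML.
def rexml_check(xml_text)
  REXML::Document.new(xml_text)
  [true, nil]
rescue REXML::ParseException => e
  # The exception message spans several lines; keep only the first.
  [false, e.message.lines.first.to_s.strip]
end

accepted, _ = rexml_check('<entry><title>Hi</title></entry>')
rejected, why = rexml_check('<entry><title>Hi</entry>')
puts "first document accepted: #{accepted}"
puts "second document accepted: #{rejected} (#{why})"
```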

I poked around, and there are various bits of guerilla code you can get that link Ruby up to Libxml2 or Expat, with various degrees of cooked-ness, and their own APIs.

On the other hand, Java comes with a solid XML parser and standard APIs built in. Also, the only general-purpose RelaxNG validator I knew of was Jing, built in Java.

Also, under my Sun hat, I’ve been promoting the notion of other languages on the Java platform, but I’ve never actually written any serious code in any of ’em, so this was a chance.

The Bad Stuff · In the big picture, performance should be a non-issue. The Ape does trivial little bits of computing in between its exchanges of messages with an HTTP server out there; once you get JRuby in production in some sort of servlet container, I’d be astounded if you could measure the difference between JRuby and hand-crafted C.

But while I’m debugging it I’m eating the startup delay all the time, and at the moment, it’s brutal; I mean seriously bad. The JRuby guys have been doing some pretty bold chest-beating about how fast JRuby’s gonna be—bolder than I’d be in their position—but at the moment it’s kind of sucky to develop in.

I’m not going to complain too much about the fact that JRuby still has bugs (hey, I reported #65, #66, and #94), because things, by and large, Just Work until you start to rub up against the edges.

I was a little irritated by the fact that it’s too easy to produce a traceback that combines a handful of lines of Ruby stack with a couple of hundred lines of Java stack.

There is clearly a psychological gap between the Java Way of doing things and the Ruby way, and it leads to some really weird-looking Ruby code, but I don’t think it’s actually a problem.

The Good · Yep, if it’s out there in the vast echoing universe of Java APIs, you can get to it from JRuby. You can do things you wouldn’t expect, like subclass Java interfaces (!) and refer to getters and setters with Ruby idioms; by some magic foo.setBar() is also known as foo.bar=.
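A small sketch of both tricks, under some assumptions: it uses the spelling that current JRuby releases accept, which may well differ from what 0.9.0 wanted, and the method name `jruby_interop_demo` is invented for the example. It picks `java.lang.Runnable` and `java.lang.Thread` only because they ship with every JVM. Outside JRuby there is no Java bridge, so the method simply returns nil.

```ruby
# Sketch only: needs JRuby. On any other Ruby it returns nil
# rather than failing, since there is no Java bridge to talk to.
def jruby_interop_demo
  return nil unless defined?(JRUBY_VERSION)
  require 'java'

  # A Ruby class implementing a Java interface (java.lang.Runnable).
  worker = Class.new do
    include java.lang.Runnable
    attr_reader :ran

    def run
      @ran = true
    end
  end.new

  thread = java.lang.Thread.new(worker)
  thread.daemon = true   # Ruby-style spelling of thread.setDaemon(true)
  thread.start
  thread.join

  # isDaemon is the plain Java getter, called by its Java name.
  { ran: worker.ran, daemon: thread.isDaemon }
end

result = jruby_interop_demo
puts(result.nil? ? 'not running under JRuby' : result.inspect)
```

The point is that the Ruby object is handed straight to a Java constructor that expects a `Runnable`, and the setter is reachable under either name.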

Here’s an example: I have a little class that bridges to Jing; it loads a Relax schema and validates arbitrary chunks of XML. The Jing API is not built for comfort; it definitely has that factory factory factory feel. But the glue class is only 36 lines of Ruby, 9 of which are Java includes, and has never caused any trouble; this includes setting up a StringWriter to collect error messages and so on.

Surprises · First of all, I’ve been writing XML processing code longer than any other human being, but I’ve always been a lean, mean, stream-parser guy; I’d never been near the standard DOM. Well, blecch. Yes, I understand why it is the way it is, but it sure isn’t fun. If I were doing this again I’d look hard at JDOM or XOM or something else instead. But anyhow, I got it to work.

I also had managed to forget how lame Jing’s error messages are. It usually doesn’t tell you much more than “Missing element. Bye.” I’d like to use the Feed Validator code, which is in Python, hmmm...

More To Come · There’s lots more to write about here at the Ruby/Java coalface; interfaces and constants and glue code and coding conventions and so on. Keep reading the diaries.

August 17, 2006


I am an employee of Amazon.com, but the opinions expressed here are my own, and no other party necessarily agrees with them.

A full disclosure of my professional interests is on the author page.