I’m thinking about Atom 1.0 from the coder’s point of view. I’m not thinking about the Publishing Protocol, I’m thinking about how you, the programmer, should go about inhaling and exhaling the stuff. I’ve never believed in One True API for XML, it’s just too broad-spectrum, but Atom’s pretty tightly constrained. Obviously, you can use something generic like SAX or one of the many DOM-style APIs, or one of the modern pull APIs. Maybe for Atom we could use something simpler and more natural. I’m thinking out loud in this space, this is far from finished, not even a proposal yet. But, I bet there are other people out there who care.
Constraints · Here are some of the things that should make an Atom API easier:
Atom elements mostly have unsubtle programmer-friendly data types, easily represented in O-O terms, with the exception of Text Constructs and “atom:content”.
The order of elements is not significant, except for the useful fact that all the “atom:entry” children of “atom:feed” are grouped at its end.
Some non-Atom elements can be recognized as what section 6.4.1 calls “Simple Extension elements”, with a simple, easily modeled structure.
A Feed doc is guaranteed to have title, date, and unique-ID children, as well as possibly other known Atom elements.
An Entry doc is guaranteed to have title, date, and unique-ID children, as well as possibly other known Atom elements.
The computation of which metadata values apply to an Entry is nontrivial, involving values from the Entry itself, feed-level metadata, and from an embedded “atom:source” element.
Guesses · Here are some predictions about the likely characteristics of Atom data in the wild:
Feeds, in practice, may be arbitrarily and unpredictably long, that is to say, have huge numbers of entries.
Entries, in practice, will be reasonably short.
Recommendations · I’ll try to avoid language-specificity, but I see the world through trifocals with the panes labeled “C”, “Java”, and “P-languages”.
Streaming · Because of the unpredictable size of Atom feeds, DOM-style APIs for whole feeds are probably unusable in many scenarios. Thus, a general-purpose Atom API must include a streaming capability, preferably in a pull rather than callback flavor.
Similarly, generating an Atom Feed must be possible in a streaming fashion, with entries going out on the wire as they are generated.
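Here is a minimal sketch of what streaming generation could look like, in Python. The function name `write_feed`, its parameters, and the dict-shaped entries are all made up for illustration; only the required id, title, updated, and author elements are written, and the output stream is assumed to encode as UTF-8.

```python
import io
from xml.sax.saxutils import escape

ATOM_NS = "http://www.w3.org/2005/Atom"

def write_feed(out, feed_id, title, updated, author, entries):
    """Write an Atom feed to the text stream `out`, one entry at a time.

    Hypothetical helper: `entries` is any iterable of dicts with the keys
    "id", "title", and "updated".
    """
    out.write('<?xml version="1.0" encoding="utf-8"?>\n')
    out.write(f'<feed xmlns="{ATOM_NS}">\n')
    # Feed-level metadata goes first so the entries can follow as they arrive.
    out.write(f"  <id>{escape(feed_id)}</id>\n")
    out.write(f"  <title>{escape(title)}</title>\n")
    out.write(f"  <updated>{escape(updated)}</updated>\n")
    out.write(f"  <author><name>{escape(author)}</name></author>\n")
    for entry in entries:  # may be a lazy generator; nothing is buffered here
        out.write("  <entry>\n")
        for field in ("id", "title", "updated"):
            out.write(f"    <{field}>{escape(entry[field])}</{field}>\n")
        out.write("  </entry>\n")
        out.flush()  # push each finished entry toward the wire
    out.write("</feed>\n")
```

Because `entries` can be a generator, the producer never needs to know how many entries there will be before the first one goes out.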
Iterating · For modern languages that have the concept, why shouldn’t a feed just present as an Iterator over the entries?
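As a rough illustration of the iterator idea, here is a Python generator built on the standard library's `xml.etree.ElementTree.iterparse`. The name `iter_entries` is hypothetical, and so is the choice to hand back raw elements rather than some richer entry object.

```python
import io
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def iter_entries(source):
    """Yield each atom:entry child of the feed, one at a time.

    `source` is a filename or a binary file object. Only direct children of
    the document element are yielded, so an Entry document yields nothing.
    """
    depth = 0
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            if root is None:
                root = elem  # the document element, normally atom:feed
        else:
            depth -= 1
            if depth == 1 and elem.tag == ATOM + "entry":
                yield elem
                # The caller is done with this entry once it asks for the
                # next one, so drop it and keep memory use flat.
                root.remove(elem)
```

A caller then writes `for entry in iter_entries(path): ...` and never sees the parser. The catch is that each yielded element is only valid until the next iteration, which an API would need to document.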
Metadata Distribution · Per-entry metadata is sourced from a combination of feed-level, entry-level, and source-level child elements. The API should hide the mechanics and let readers pull out the per-entry data. Finer control over where the metadata goes is probably required on the Atom-generation side.
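To make the reader side concrete, here is a small Python sketch for one piece of metadata, the author. It assumes the lookup order of entry first, then the embedded atom:source, then the enclosing feed; the function name `effective_authors` is invented for illustration, and other metadata may follow different rules.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def effective_authors(entry, feed=None):
    """Return the author names that apply to `entry`.

    Hypothetical helper. Scopes are tried in order: the entry itself, its
    atom:source child (if any), then the enclosing feed (if given).
    """
    for scope in (entry, entry.find(ATOM + "source"), feed):
        if scope is None:  # explicit None test: childless elements are falsy
            continue
        names = [a.findtext(ATOM + "name") for a in scope.findall(ATOM + "author")]
        if names:
            return names
    return []
```

The caller just asks for the authors and never learns which level they came from, which is the hiding the paragraph above asks for.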
Foreign Markup · Support for what the spec calls Foreign Markup, where it does not qualify as Simple Extension Elements, is not required beyond offering a generic XML interface to the contents.
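A sketch of how an API might sort foreign children into the two buckets, in Python. It assumes a Simple Extension element is a non-Atom element with no attributes and no child elements; the names `is_simple_extension` and `split_foreign` are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def is_simple_extension(elem):
    """True for a non-Atom element with no attributes and no child elements."""
    return not elem.tag.startswith(ATOM) and not elem.attrib and len(elem) == 0

def split_foreign(parent):
    """Split the non-Atom children of `parent` into two groups.

    Returns (simple, other): `simple` is a list of (tag, text) pairs that are
    easy to model directly; `other` is the raw elements, left for a generic
    XML interface to deal with.
    """
    simple, other = [], []
    for child in parent:
        if child.tag.startswith(ATOM):
            continue  # a known Atom element, not foreign markup
        if is_simple_extension(child):
            simple.append((child.tag, child.text or ""))
        else:
            other.append(child)
    return simple, other
```

The simple ones come back as plain name and value pairs, and everything else is handed over untouched.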
Text Constructs · I expect these things to be the major sources of complexity and difficulty in dealing with Atom; here are some ideas on how to approach them.
The case where you want to display a text construct is probably very
common, and has two modes: you want HTML you can hand to a renderer, or you
want raw text you can pour into a display widget.
I envision two calls, perhaps named getText and getHTML, which take care of
figuring out the type attribute and doing the right amount of unescaping.
The only magic in getHTML is the case where you have
type="text"; in this case the call should wrap the text in an HTML element.
getText is simple for type="text"; in all other cases it
should simply remove all the markup and return the raw text.
There should be an isText call to find out whether a content
element can return something useful for getText and getHTML.
For non-text values of
type or remote content, I don’t think it’s
cost-effective to do much in the API other than exposing the
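Here is one way the text-construct calls could be sketched in Python, with `get_text`, `get_html`, and `is_text` standing in for getText, getHTML, and isText. Several things here are my assumptions rather than anything settled above: the wrapper for type="text" is a `<div>`, a missing type attribute means "text", the xhtml case returns the enclosing div with namespaces stripped, and `is_text` simply rejects remote content and any other type.

```python
import copy
import html
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

XHTML = "{http://www.w3.org/1999/xhtml}"

class _TextOnly(HTMLParser):
    """Collects character data and drops every tag."""
    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def _kind(elem):
    return elem.get("type", "text")  # assumption: absent type means "text"

def is_text(elem):
    """True when get_text and get_html can return something useful."""
    return elem.get("src") is None and _kind(elem) in ("text", "html", "xhtml")

def get_text(elem):
    """Return the construct as raw text with all markup removed."""
    kind = _kind(elem)
    if kind == "text":
        return elem.text or ""
    if kind == "html":
        parser = _TextOnly()
        parser.feed(elem.text or "")  # the XML parser already unescaped once
        parser.close()
        return "".join(parser.parts)
    if kind == "xhtml":
        div = elem.find(XHTML + "div")
        return "".join(div.itertext()) if div is not None else ""
    raise ValueError(f"not a text construct: type={kind!r}")

def get_html(elem):
    """Return the construct as HTML that can be handed to a renderer."""
    kind = _kind(elem)
    if kind == "text":
        # Assumed wrapper element; the text is re-escaped so it stays literal.
        return "<div>" + html.escape(elem.text or "", quote=False) + "</div>"
    if kind == "html":
        return elem.text or ""
    if kind == "xhtml":
        div = elem.find(XHTML + "div")
        if div is None:
            return ""
        div = copy.deepcopy(div)  # leave the parsed document untouched
        div.tail = None
        for node in div.iter():
            if node.tag.startswith(XHTML):
                node.tag = node.tag[len(XHTML):]  # drop the namespace
        return ET.tostring(div, encoding="unicode", method="html")
    raise ValueError(f"not a text construct: type={kind!r}")
```

A caller would check `is_text(content)` first and fall back to the raw attributes and element when it returns False.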
RSS · This API should work just fine with Atomic RSS, and might optionally try to deal with other flavors.