People who hang around with bloggers all know what RSS is (if you don't, I'll introduce it). RSS is headed for some interesting times as regards client software, traffic management, and business models, and it would be reasonable to expect some breakage along the way.

RSS for the Unini­ti­at­ed · The his­to­ry of RSS is fraught and com­pli­cat­ed and I'm not go­ing there. To sum­ma­rize, RSS is a lit­tle XML lan­guage that you use to de­scribe changes in a web site. That is to say, on your web­site at http://www.ex­am­ple.­com/news you put a file with a name like http://www.ex­am­ple.­com/news/rss.xml, and it con­tains a sum­ma­ry of what's new on the web­site, in XML. Usu­al­ly this is called an "RSS feed". Then all kinds of dif­fer­ent pro­grams can read the RSS feed and give you click­able news sum­maries that mean you don't ac­tu­al­ly have to vis­it all those web­sites un­less you know there's some­thing there you want to read.
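To make that concrete, here is a rough sketch of what such a feed might look like and how a program could pull the headlines out of it. The sample feed, its titles, and the `summarize` helper are all invented for illustration, and an RSS 2.0-style layout is assumed; other RSS flavors arrange things a little differently.

```python
import xml.etree.ElementTree as ET

# A made-up feed in an RSS 2.0-style layout; real feeds differ in detail.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/news</link>
    <description>What's new at example.com</description>
    <item>
      <title>First story</title>
      <link>http://www.example.com/news/1</link>
      <description>A short summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://www.example.com/news/2</link>
      <description>A short summary of the second story.</description>
    </item>
  </channel>
</rss>
"""


def summarize(feed_xml):
    """Return (title, link) pairs for every item in the feed."""
    root = ET.fromstring(feed_xml)
    headlines = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="")
        headlines.append((title, link))
    return headlines


for title, link in summarize(SAMPLE_FEED):
    print(f"{title}: {link}")
```

A reader program does essentially this for each feed it watches, then turns the pairs into clickable summaries.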

If you want to learn more about the his­to­ry and de­tail­s, type "RSS" in­to any search en­gine.

There are a bunch of different RSS variations but don't let that bother you; the software out there can deal with it.

Most people, once they start using RSS to check the news, just don't go back; the amount of time and irritation saved is totally, completely addictive.

The Prob­lem of Be­ing Away from Your Home Ma­chine · I use a pro­gram called NetNewsWire for RSS, a nice lit­tle desk­top ap­pli­ca­tion that reg­u­lar­ly vis­its the RSS feeds of the places I read (New York Times, BBC, C|Net, Kuro5hin, some blog­ger­s) and lets me see at a glance which have new stuff. It works great (it's Mac-only, but there are nice ones for Win­dows and oth­ers that run through a brows­er and run any­where).

The prob­lem is, as soon as I go to an­oth­er com­put­er, NetNewsWire isn't there and there's no quick way for me to check the world and see what's hap­pen­ing.

Now, I have a sim­i­lar prob­lem with my home page, which is a hand-crafted HTML file sit­ting on my hard disk. For those times when I'm not us­ing that com­put­er, there's a copy on my pub­lic web site so I can al­ways get at it.

So what I want is a file out on the Web that sum­ma­rizes the RSS feeds I want to read, that I can point any old RSS soft­ware at and find out what's hap­pen­ing in the world.

What To Read RSS With? · It seems ob­vi­ous to me that I should ag­gre­gate and read RSS feeds right there in my browser. Much and all as NetNewsWire is a tru­ly great lit­tle piece of soft­ware, why do I have to use a non-Web-browser to chase Web con­tent? Right now the RSS of­fer­ings from User­land and oth­ers do the work on the serv­er and gen­er­ate a pure-HTML in­ter­face. That's fine, but NetNewsWire shows that in prin­ci­ple you don't need a serv­er at al­l, you just need to know which RSS feeds to read and you're self-contained.

Which brings us to...

The Traf­fic Prob­lem · When I turn on my lap­top in the morn­ing, NetNewsWire goes out and scans 21 RSS feed­s. Then it checks up on them at 30-minute in­ter­vals af­ter that (this is con­fig­urable). I don't know how typ­i­cal that is, but I know there are peo­ple who track way more than I do. There's a prob­lem here - if RSS be­comes as wild­ly pop­u­lar as a lot of prog­nos­ti­ca­tors (in­clud­ing me) pre­dic­t, there is go­ing to be an un­god­ly traf­fic bulge ev­ery morn­ing, and then at half-hour in­ter­vals all day.
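A back-of-the-envelope sketch of that bulge, using the 21 feeds and 30-minute interval mentioned above. The eight hours a day with the reader open and the 10,000 subscribers are purely hypothetical numbers, picked only to show how the arithmetic scales.

```python
def scans_per_day(interval_minutes, hours_online):
    """One scan at startup, then one more per interval while the reader is open."""
    return 1 + (hours_online * 60) // interval_minutes


def requests_from_one_reader(feeds, interval_minutes, hours_online):
    """Total feed fetches a single desktop reader sends out in a day."""
    return feeds * scans_per_day(interval_minutes, hours_online)


def requests_to_one_feed(subscribers, interval_minutes, hours_online):
    """Total fetches a single feed receives in a day from desktop readers."""
    return subscribers * scans_per_day(interval_minutes, hours_online)


# 21 feeds, checked every 30 minutes, laptop open 8 hours (assumed).
print(requests_from_one_reader(21, 30, 8))      # 357
# One feed with 10,000 desktop subscribers (hypothetical).
print(requests_to_one_feed(10_000, 30, 8))      # 170000
```

The per-feed number is the one that hurts the publisher, and it arrives in synchronized lumps rather than spread evenly through the day.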

Peo­ple who read RSS through web-based prod­ucts like the User­land of­fer­ing are go­ing to present a much small­er load to the sites pro­vid­ing the RSS, but I think that it's go­ing to get wired in­to Mozil­la and IE and Sa­fari and peo­ple will just do it from their desk­top.

For­tu­nate­ly, I think the Web's caching mech­a­nisms will hold up un­der the load as­sum­ing ev­ery­one plays by the rules. Un­for­tu­nate­ly, at the mo­ment we're not...
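One of the rules in question is presumably HTTP's conditional GET: the client remembers the `ETag` and `Last-Modified` values from its last fetch and sends them back, so the server can answer with a tiny "304 Not Modified" instead of the whole feed. What follows is a minimal sketch of that idea using Python's standard library; the function names are invented here, and it is not a description of how NetNewsWire or any other reader actually does it.

```python
import urllib.error
import urllib.request


def build_request(url, etag=None, last_modified=None):
    """Build a GET that lets the server reply 304 if nothing has changed."""
    req = urllib.request.Request(url)
    if etag:
        req.add_header("If-None-Match", etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when the feed is unchanged."""
    req = build_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(req) as resp:
            return (
                resp.read(),
                resp.headers.get("ETag"),
                resp.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep using the cached copy
            return None, etag, last_modified
        raise
```

A reader would call `fetch_if_changed` on each polling round, store the returned validators next to its cached copy, and only re-parse the feed when the body is not `None`.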

The Media-type Problem · Web engineers know, but most people don't have to, that when a web server sends a page to your browser, it also sends along some things called "HTTP Headers" that tell the browser what kind of thing it's sending, for example an HTML page or a JPG image or a Quicktime movie or whatever, and some other useful facts about the page. The HTTP Header that says what the page is is called "Content-Type", and the value it carries is the "Media-type". This allows the browser to wake up the right software without having to peek inside the data to figure out what it is.
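A small sketch of that dispatch step. On the wire the header is spelled `Content-Type`, and it may carry extra parameters such as a charset after the media type itself. The `HANDLERS` table and the handler names are made up for illustration; a real browser's dispatch logic is far more involved.

```python
from email.message import Message


def media_type(content_type_header):
    """Strip parameters such as charset and return the bare, lowercased media type."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


# Hypothetical mapping from media type to whatever software should wake up.
HANDLERS = {
    "text/html": "HTML renderer",
    "image/jpeg": "image viewer",
    "video/quicktime": "movie player",
}


def pick_handler(content_type_header):
    """Choose a handler from the header alone, without looking at the data."""
    return HANDLERS.get(media_type(content_type_header), "save to disk")


print(pick_handler("text/html; charset=ISO-8859-1"))
print(pick_handler("application/rss+xml"))
```

Note that nothing in the table knows about RSS, so the second call falls through to the default.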

At the moment, there doesn't seem to be much agreement about what Media-type to use for RSS files. Userland, which is kind of the RSS industry leader, serves them as text/xml. Infoworld, a fairly sophisticated tech pub, uses text/html. Mark Pilgrim, a popular blogger who thinks really hard about these things, uses application/rss+xml.

Web Ar­chi­tec­ture wee­nies would prob­a­bly be hap­pi­er with some­thing like Mark Pilgrim's choice, but one way or an­oth­er I think it's prob­a­bly im­por­tant that the com­mu­ni­ty get its act to­geth­er and de­cide what Media-type to use and start us­ing it. This will help solve two prob­lem­s:

  1. When some­one writes RSS-reader code to live in the Web Browser, it's go­ing to need a con­sis­tent Media-type to be able to rec­og­nize RSS.
  2. To manage the traffic load we're going to have to do some caching. Fortunately, RSS contains some publication and expiry-date data to help intermediate software do this, but first that software has to recognize the data as RSS and read this stuff. This isn't going to happen until RSS gets served with the proper Media-type.
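Here is one way the second point could play out inside a cache, sketched under some loud assumptions: that the community has settled on `application/rss+xml` (it hasn't yet), and that the feed is RSS 2.0-flavored, with a `lastBuildDate` or `pubDate` and a `ttl` given in minutes. The function names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

RSS_MEDIA_TYPE = "application/rss+xml"  # assumed; no agreed type exists yet


def fresh_until(media_type, feed_xml):
    """Return when a cached copy of the feed expires, or None if we can't tell."""
    if media_type != RSS_MEDIA_TYPE:
        return None  # not recognizably RSS, so don't peek inside
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return None
    built = channel.findtext("lastBuildDate") or channel.findtext("pubDate")
    ttl = channel.findtext("ttl")
    if not built or not ttl:
        return None
    return parsedate_to_datetime(built) + timedelta(minutes=int(ttl))


def is_fresh(expires, now=None):
    """True while the cached copy may be served without re-fetching."""
    now = now or datetime.now(timezone.utc)
    return expires is not None and now < expires
```

The first check is the whole argument in miniature: serve the same feed as text/html and the cache never gets as far as reading the expiry data.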

The Business Model Problem · I hate to be a wet blanket but I just don't see RSS readers persisting for too long as a standalone application class; this stuff just belongs in the browser. It will take a couple of years for this stuff to get cooked into mainstream browsers in a mature enough form to be usable, so the guys with the RSS-reader software should make hay while the sun shines and start figuring out their Next Big Thing.

RSS technology was driven by the Weblog-technology companies and I suspect they'll continue to do just fine; Weblogging ain't going away any time soon. Also, anyone who does any kind of publishing software had better start offering a real-easy-to-use RSS interface, sooner rather than later, or they're just not going to be in the game.


January 26, 2003

I am an employee of Amazon.com, but the opinions expressed here are my own, and no other party necessarily agrees with them.

A full disclosure of my professional interests is on the author page.