ongoing by Tim Bray · Blog & Tweet

If you blog and also are on Twitter or a competitor, I think that’s a problem. Twitter doesn’t (yet) have a business model, and doesn’t make it terribly easy to refer back to the beginning of your tweet-stream, and the data is full of fragile URL-shortener output. So it’s time to reel things in.

First, I used BackupMyTweets as a quick-and-easy way to get an XML copy of my Twitterlife back to March 2007.

The next step is, I’m working on a script that’ll morph the XML into ongoing entries splashed back over the date hierarchy. I was thinking of grouping them into per-day batches, but maybe not. Hmmm...
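As a rough sketch of the per-day batching idea: the element names below (`statuses`, `status`, `created_at`, `text`) and the timestamp format are assumptions made for illustration, and the real BackupMyTweets export may well be laid out differently. `group_by_day` is a hypothetical helper, not part of any existing script.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime

# Assumed export layout; the real BackupMyTweets XML may differ.
SAMPLE_XML = """<statuses>
  <status><id>2</id><created_at>Wed Aug 12 18:00:00 +0000 2009</created_at><text>Second tweet</text></status>
  <status><id>1</id><created_at>Wed Aug 12 17:43:00 +0000 2009</created_at><text>First tweet</text></status>
  <status><id>3</id><created_at>Thu Aug 13 09:15:00 +0000 2009</created_at><text>Next day</text></status>
</statuses>"""

# Assumed timestamp format, e.g. "Wed Aug 12 17:43:00 +0000 2009".
STAMP_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def group_by_day(xml_text):
    """Return {date: [(datetime, text), ...]}, each day's tweets in time order."""
    days = defaultdict(list)
    for status in ET.fromstring(xml_text).iter("status"):
        when = datetime.strptime(status.findtext("created_at"), STAMP_FORMAT)
        days[when.date()].append((when, status.findtext("text")))
    for batch in days.values():
        batch.sort(key=lambda pair: pair[0])  # oldest first within a day
    return dict(days)


if __name__ == "__main__":
    for day, batch in sorted(group_by_day(SAMPLE_XML).items()):
        print(day, [text for _, text in batch])
```

Each key maps naturally onto one slot in a date hierarchy, so switching between one entry per tweet and one entry per day only changes how the batches are written out.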

The step after that is, I’ll write a script that every day or so goes and uses the Twitter API to fish out everything I posted since its last run, and mashes that into the ongoing stream. Should I put them in the Atom feed? One at a time, or grouped by day or week or whatever? Hmm, not sure.
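The "everything since the last run" part mostly comes down to remembering the highest tweet ID already stored. A minimal sketch, assuming a small JSON state file and a caller-supplied `fetch_since` function; both are hypothetical, and `fetch_since` stands in for whatever Twitter API call actually gets used (it is faked here so the example runs offline).

```python
import json
import os


def load_last_id(state_path):
    """Return the highest tweet ID seen so far, or 0 on the first run."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as f:
        return json.load(f)["last_id"]


def run_once(state_path, fetch_since):
    """Fetch tweets newer than the stored ID, record the new high-water mark,
    and return the new tweets oldest first."""
    last_id = load_last_id(state_path)
    # Filter again locally in case the fetch returns overlapping tweets.
    new = sorted(
        (t for t in fetch_since(last_id) if t["id"] > last_id),
        key=lambda t: t["id"],
    )
    if new:
        with open(state_path, "w") as f:
            json.dump({"last_id": new[-1]["id"]}, f)
    return new


# Hypothetical stand-in for the real API call. It deliberately ignores
# last_id and returns everything, to show that the local filter copes.
FAKE_TIMELINE = [
    {"id": 101, "text": "a"},
    {"id": 102, "text": "b"},
    {"id": 103, "text": "c"},
]


def fake_fetch(last_id):
    return list(FAKE_TIMELINE)
```

Running `run_once` from a daily cron job would then hand back only the tweets that have not been stored yet, whether they end up as individual entries or grouped batches.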

I could also write a local URI-shortener to make all my Twitter links at least as persistent as this blog, but if I just un-shorten the links in the versions I store here, that seems just about as good.
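Un-shortening can be sketched as "find each URL, follow its redirects, store the final address". The version below is one possible approach under stated assumptions: the regular expression is a simplification, `expand_links` and `resolve_by_redirect` are hypothetical names, and the resolver is passed in so the text-rewriting part can be tried without touching the network.

```python
import re
import urllib.request

# Simplified URL pattern; real tweets may need something more careful.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def resolve_by_redirect(url, timeout=10):
    """Follow HTTP redirects and return the final URL (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.geturl()


def expand_links(text, resolve=resolve_by_redirect):
    """Replace every URL in text with whatever resolve() returns for it.
    A URL that cannot be resolved is left exactly as it was."""
    def swap(match):
        url = match.group(0)
        trailing = ""
        # Keep sentence punctuation out of the URL itself.
        while url and url[-1] in ".,;:!?)":
            trailing = url[-1] + trailing
            url = url[:-1]
        try:
            return resolve(url) + trailing
        except (OSError, ValueError):
            return match.group(0)
    return URL_RE.sub(swap, text)


if __name__ == "__main__":
    # Made-up mapping for illustration only.
    table = {"http://tr.im/abc": "http://www.tbray.org/ongoing/"}
    print(expand_links("See http://tr.im/abc.", lambda u: table.get(u, u)))
```

Leaving a dead short link untouched, rather than dropping it, keeps at least the original text of the tweet intact when a shortener has gone away.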

Because whenever you see a vendor owning a communications medium, that’s part of the problem, not part of the solution. Even if the vendor is as lovable as Twitter; and I do love ’em. So I’m going to route around the breakage, and you might want to think about doing the same.

Contributions

From: Seth A. Roby (Aug 12 2009, at 10:43)

I vote not in the Atom feed; I already see those updates via Twitter, so why would I want to be notified when they reappear here? Maybe you could have another Atom feed that did include them, for those people who don't follow you on Twitter?

From: Matthew Laird (Aug 12 2009, at 10:45)

Sounds like a great plan. I know ongoing runs on a very custom engine, but I hope you'll consider releasing this series of scripts when completed. Why have everyone reinvent the wheel, eh?

Good luck.

From: Jeff Waugh (Aug 12 2009, at 10:45)

WordPress users will enjoy Alex King's "Twitter Tools" plugin, which maintains the history of your Twitter stream in a separate database table.

Although that history is not exposed anywhere (other than recent tweets in the widget), it's still great to know your Twitter history is backed up in your blog. :-)

From: Bill Mill (Aug 12 2009, at 10:50)

Please do not include your tweets in your Atom stream.

I want essays in my feed reader, not tweets; I've had to unsubscribe from several blogs I like for just this reason.

From: mxt (Aug 12 2009, at 11:16)

It seems like the "net" needs some type of "There and Back Again" system, call it the Hobbit Protocol - HBBP. ;-)

Where once you place some info on a web site it lives there for a while and then they send it on to a service that you've set up.

Maybe a paid cloud service that retains and backs up your data or maybe something like Opera's Unite - http://unite.opera.com/ - that you run on your personal machine.

After the web site has sent you that block of data, if they require it again, say for an historical search request, then they pull the data from you.

That way they don't have to manage big racks of data storage that hardly ever get called on.

Can you imagine how much disk space this would save Flickr?

mxt

THINK

think different

Think Open Source

From: Drew (Aug 12 2009, at 11:23)

I consider Twitter archival akin to having a voice recorder on whenever you say _anything_. You could, but why would you or anyone else want to sort through the noise?

Just my two cents.

From: David (Aug 12 2009, at 11:42)

Have you considered going the other way around, where you extend your blog system to give you the option of automatically cross-posting to Twitter? It means you have to go through your website to post something, so you can't use any third-party Twitter clients, but on the other hand it seems vastly simpler. It would also make it very easy to provide your own URL shortening system, eliminating another dependency.

From: Norman Walsh (Aug 12 2009, at 11:54)

Yep. I've been planning to do the same thing, though I hadn't found BackupMyTweets. Thanks!

From: Kevin Spencer (Aug 12 2009, at 11:59)

Please do not put your tweets in your Atom feed. Perhaps a separate feed but not the main one.

As a sidenote, Movable Type users have the rather lovely Action Streams plugin to pull and store (amongst many other things) your tweets.

From: Terry Jones (@terrycojones) (Aug 12 2009, at 12:51)

Hi Tim

I have some Python to do some of that. One script pulls back all your tweets, at least as far as Twitter allows it. It's a bit of a hack as I had to monkeypatch the Twitter Python library, but it works well enough. See http://jon.es/other/timbray.html

Another that I find useful shows intersections and set differences of followers, followees, etc. See http://blogs.fluidinfo.com/terry/2008/10/13/digging-into-twitter-following/ Some example output is linked at the bottom of that page.

We'll do some stuff with FluidDB on this. Imagine queries like "has timbray/tweet" or "has timbray/follows except has timoreilly/follows" etc.

From: Terry Jones (@terrycojones) (Aug 12 2009, at 13:10)

See also http://jon.es/other/timbray-sets.html

From: gvb (Aug 12 2009, at 14:17)

PLEASE, Please please write a URI lengthener. There is little reason for shortening URIs and none for a captured tweet stream that (I presume) will be archived and made available on a web page.

As you observed, shortened URIs are even more brittle than "real" URIs (tr.im, we all are looking at you) and are a severe form of information _hiding_, which is antithetical to the web.

From: Kevin H (Aug 12 2009, at 14:25)

Will your process maintain the "In Reply To" information? Because without it, it can be somewhere between difficult and downright annoying to figure out what the one-sided half of a conversation you are reading is about.

As it stands, I sometimes come across tweets that are clearly replies but don't have an "in reply to" link ( example: http://twitter.com/cwgabriel/status/2659353080 ). The only recourse in this case is to go to the @username link, drill down to the date in question and hope that the reply came shortly after the initiating tweet.

From: Bob Aman (Aug 12 2009, at 14:26)

It's obviously not for everybody, but I feel that it's probably best to embrace the transient nature of the stuff on the internet. Information is created and information is destroyed. If it's sufficiently valuable, there will be backups somewhere, either electronic ones, or the kind of backup that occurs in the interested mind. It may be just me, but I don't think Twitter has ever lent itself to the latter.

From: Aristotle Pagaltzis (Aug 12 2009, at 14:56)

Another nay on tweets in the feed. Weblogs and Twitter are vastly different media, each with strengths that in many ways are exactly opposed. In isolation, I like them both; but mixing them together produces a high-noise digital slush that lacks anything that is good about either of the sources. Don’t do it.

From: Paul Bartlett (Aug 12 2009, at 15:21)

I'm sure you've considered this, and apologies if I've completely misunderstood, but why not write a Twitter client that saves messages off in parallel to posting them to the Twitter API?

- Paul.

From: Eddie Welker (Aug 12 2009, at 17:24)

I agree with the others, keep the tweets out of the Atom feed. They aren't substantial enough to sit alongside regular posts.

I've thought about backing up, and probably will at some point. I'm not very interested in re-syndicating them (at least not at the moment while twitter gives that to me for free), but I am not quite sure what else I would do with them. Have you given any thought to other uses they may have?

From: Howard Katz (Aug 12 2009, at 17:36)

Tim,

I know it's only distantly related, but have you ever taken a look at a StreamGraph dump of your twitter history? If you haven't seen this before, I'm sure you'd be fascinated, if only from a data-visualization perspective:

http://www.neoformix.com/Projects/TwitterStreamGraphs/view.php

The author's Canadian btw.

Ta,

Howard

From: Gerard (Aug 13 2009, at 12:37)

Have to vote as well against seeing all that twittering in this feed. Maybe a different feed for all the photos too while you're at it?

From: len (Aug 13 2009, at 16:54)

Tweets are infrequently interesting reading. They are a great alert system.

As to the business model and the vendor, the interesting question is what does happen to web business models if web servers are at the edges?

From: Ciaran (Aug 14 2009, at 10:41)

There was a time when I thought this guy had figured out how to route around the breakage you describe:

http://identi.ca/tbray

But apparently not. ;)

From: Alex Morega (Aug 18 2009, at 01:50)

It's distracting to see the "[Original.]" text next to each entry; better to use one of the many symbols in Unicode (although I couldn't quite find an arrow I liked for this purpose).

Also -1 for including tweets in the ongoing feed.

August 12, 2009