Comments on Comments

Last week I started sketching a design for a commenting system, and asked for feedback. I got lots, and I’m reproducing it here.

Sam Ruby · His piece Comments Please has lots of good stuff and, of course, lots of good comments. ¶

Walter Higgins · I think adding comments to ongoing is a mistake. What's wrong with good ol' email? Anyone who wants to contact you about an article will use it (I know I have in the past). If you add comments you'll end up doing one of two things... ¶

fighting spam because the feedback threshold is too low
ticking off honest commenters because the threshold is too high.

Yes - ongoing is a bit of an anomaly in that it's one of the few blogs which doesn't support comments. But that's part of its charm.

Good luck anyway, whatever you decide to do.

Rimantas Liubertas · glad to know you are going to implement comments for your blog, I think this is long overdue--at least I missed this ability more than a couple of times. ¶

So, my humble suggestions:

Framework: do try Rails. I am not sure how this will work w/o database, but other parts of Rails are nice too, so, give it a try.
A couple of things I'd like to try out for comments myself. These may be "nice to have" features or a complete overkill, but I find them interesting anyway.

One is behavior which was called "playing dead" by Joel Spolsky [1], a way to deal with spam and unacceptable posts in forums:

"Show it to the original poster, so he feels smug and moves on to the next inappropriate discussion group. But don't show it to anyone else. Indeed one of the best ways to deflect attacks is to make it look like they're succeeding. It's the software equivalent of playing dead."

I think it is good for moderated system too.
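A rough sketch of the "playing dead" idea in Ruby, assuming each comment records some token identifying its poster (a cookie value, say). The FlaggableComment struct, its fields, and the visible_comments helper are all hypothetical names made up for illustration.

```ruby
# Hypothetical comment record; the field names are assumptions for illustration.
FlaggableComment = Struct.new(:id, :author_token, :body, :flagged)

# Returns the comments a given viewer should see. Flagged comments stay
# visible to whoever posted them and are hidden from everyone else.
def visible_comments(comments, viewer_token)
  comments.select do |c|
    !c.flagged || c.author_token == viewer_token
  end
end

comments = [
  FlaggableComment.new(1, "alice-cookie", "Nice post.", false),
  FlaggableComment.new(2, "spammer-cookie", "Buy cheap pills!", true)
]

visible_comments(comments, "spammer-cookie").map(&:id)  # both comments
visible_comments(comments, "bob-cookie").map(&:id)      # only comment 1
```

The same filter would work in a moderated system, with "flagged" meaning "rejected by the moderator".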

Another is the limited time window, just a few minutes after posting a comment, when the poster has the ability to edit the comment. Some may enjoy the possibility to correct nasty typos which would be stuck otherwise...
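The edit window could be checked with something like the following sketch. The five-minute length and the editable? helper are assumptions; the text only says "a few minutes".

```ruby
# Assumed length of the edit window; the text only says "a few minutes".
EDIT_WINDOW_SECONDS = 5 * 60

# True while the poster may still edit a comment created at posted_at.
def editable?(posted_at, now = Time.now)
  elapsed = now - posted_at
  elapsed >= 0 && elapsed <= EDIT_WINDOW_SECONDS
end

posted = Time.now
editable?(posted, posted + 60)       # inside the window
editable?(posted, posted + 10 * 60)  # window has closed
```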

As for HTML in comments I'd consider using Markdown or Textile. Both have ports to Ruby (BlueCloth and RedCloth).

Looking forward to seeing your system in action :)

Regards,
Rimantas
http://rimantas.com/

Bob Aman (1) · I'd recommend using a version 1 UUID instead for the filenames of comments. RFC 4122 if you feel like a long read. Has the added advantage of being ideal for use within the id field of any comment feeds. It is my opinion that, in general, comments should be abstracted as a related data structure to posts whenever possible. Ideally as either a sibling class or a child class. Makes a lot of tricky things very easy. ¶

In any case, a version 1 UUID contains a timestamp, a random number, and the MAC address of the computer that generated it. Ideal for what you're trying to do, if a bit hard to parse.

I'd encourage the use of Textile markup for rich HTML entry. Just be sure to have some kind of preview option, because it confuses the heck out of people sometimes.

Anyways, glad you're enabling comments. It'll be nice to break the monopoly that hoodwinkers currently have on commenting on your blog.

Bob Aman (2) · -grumble- ¶

This'll teach me to shoot off an email before finishing the reading of the blog post in question.

Since you're thinking about writing it in Ruby (which I didn't realize), I suppose I'll shamelessly plug my UUIDTools library:

require 'rubygems'
require 'uuidtools'

comment_uuid = UUID.timestamp_create.to_s
# => "c4c5cb40-e47b-11da-8552-00112486f05c"

uuid = UUID.parse(comment_uuid)
uuid.timestamp
# => Mon May 15 21:31:59 EDT 2006
uuid.mac_address
# => "00:11:24:86:f0:5c"
uuid.version
# => 1

Also, now that I'm looking at my code again, I may have been wrong: I think version 1 UUIDs don't have a random number component, but since they're guaranteed to be universally unique across all computers, I'm assuming that shouldn't be an issue. Version 4 UUIDs, however, do have a random number component.
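For anyone curious about what "a bit hard to parse" means, here is a library-free sketch that pulls the fields out of the example UUID above, following the RFC 4122 layout. The decode_v1_uuid helper is a made-up name, and the sketch ignores the clock sequence field (which RFC 4122 suggests initializing to a random value, possibly the source of the "random number" confusion).

```ruby
# Offset between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01),
# in 100-nanosecond intervals, as given in RFC 4122.
UUID_EPOCH_OFFSET = 0x01B21DD213814000

# Splits a version 1 UUID string into its version, timestamp and node fields.
def decode_v1_uuid(str)
  time_low, time_mid, time_hi_and_version, _clock_seq, node = str.split("-")
  version = time_hi_and_version.hex >> 12
  # Reassemble the 60-bit timestamp: high 12 bits, middle 16, low 32.
  ticks = ((time_hi_and_version.hex & 0x0FFF) << 48) |
          (time_mid.hex << 32) |
          time_low.hex
  unix_seconds = (ticks - UUID_EPOCH_OFFSET) / 10_000_000
  {
    version: version,
    time: Time.at(unix_seconds).utc,
    mac_address: node.scan(/../).join(":")
  }
end

info = decode_v1_uuid("c4c5cb40-e47b-11da-8552-00112486f05c")
info[:version]      # 1
info[:mac_address]  # "00:11:24:86:f0:5c"
info[:time]         # 2006-05-16 01:31:59 UTC, i.e. May 15 21:31:59 EDT
```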

Robert Hahn · I just read your post about noodling out a comment system. The use of Rails sounds surprising, and I had a couple of comments to share. ¶

I understand that you don't want to use any sort of DB infrastructure for the system, and you know what? I think that's very easy to do. As you know, files that are stored in the model directory generally look like this:

class Comment < ActiveRecord::Base
...
end

And that would be how you'd normally invoke the DB infrastructure. Well, there's nothing that keeps you from writing something like this:

class Comment < File
...
end

which would allow you to create a class that inherits from File, extended by all the methods that you would want. Even better, you could create your own set of objects that did exactly what you want, save them in the vendor/ directory, and it would become part of the search path if you want to extend it as part of your model. Your model might then be:

class Comment < Tims_Comment_Infrastructure
...
end

As you can guess, you'll be giving up a lot of functionality that ActiveRecord has, but given that you're working in a completely different domain space, you're probably better off writing code specific to that domain instead of trying to use ActiveRecord in a way it's not designed to be used.

Now, when it comes time to use your Comment model in the controller, it's as simple as doing something like this:

def post_comment
	c = Comment.new(params[:comment])
	...
end

Rails automatically handles any require's that would be, um, required to load up the model for you.
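To make the database-free model a little more concrete, here is a minimal sketch of a plain Ruby Comment class that stores each comment as a file. It does not inherit from File or depend on Rails; the one-file-per-comment layout, the attribute names, and the save/load methods are all assumptions chosen for illustration.

```ruby
require "fileutils"
require "securerandom"
require "tmpdir"

# A plain Ruby class standing in for an ActiveRecord model. The directory
# layout and the attribute names are assumptions made for this sketch.
class Comment
  attr_reader :id, :author, :body

  def initialize(attrs)
    @id     = attrs[:id] || SecureRandom.uuid
    @author = attrs[:author]
    @body   = attrs[:body]
  end

  # Writes the comment to <dir>/<id>.txt, author on the first line.
  def save(dir)
    FileUtils.mkdir_p(dir)
    File.write(File.join(dir, "#{id}.txt"), "#{author}\n#{body}")
    self
  end

  # Reads a comment back from the file written by #save.
  def self.load(dir, id)
    author, body = File.read(File.join(dir, "#{id}.txt")).split("\n", 2)
    new(id: id, author: author, body: body)
  end
end

dir = Dir.mktmpdir
saved = Comment.new(author: "Robert", body: "No database needed.").save(dir)
Comment.load(dir, saved.id).body  # the body that was saved
```

A controller action like post_comment above would then call Comment.new(...).save(some_directory) where it would otherwise have saved an ActiveRecord object.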

So. Hopefully you can see that this would be fairly trivial to do. Here's my other comment.

You might want to consider using TurboGears, which is a Python-based interpretation of Rails. What's cool about TurboGears is that it's built on best-of-breed infrastructure that has already been built and maintained separately. I have not used this, and cannot comment on it on a technical level, but I have read about one shop that uses TG for all their work; they like it because if a site doesn't require a DB, they can simply drop it and still enjoy the advantages the framework has to offer. Sounds pretty good to me, if it's true.

Thought I'd share this as well: I'm actually taking a look at a microframework called Camping[1], written up by why. It's fantastically minimalist, and could be a good fit for you as well. That said, it's so new, people are having a bit of trouble figuring out how to unwire the DB or templating engine from it - it's possible, I'm sure (I'm wanting to use it w/o the DB engine myself), but most people simply aren't sure how that works precisely. I point this out only because it sounds to me like you're more interested in learning Ruby than Rails, and the one compelling advantage that Camping has is a much simpler deployment (think on the level of a cgi) phase.

[1] http://code.whytheluckystiff.net/camping/

http://www.roberthahn.ca

John Cowan · For your comment system, do what Norm does for comments: he lets people type arbitrary HTML (and expands blank lines into <p> so that plain text paragraphing works too) and then runs it through TagSoup and an XSLT script that only keeps the tags and attributes he trusts. Contrary to what his input form says, href *is* permitted on a elements. ¶
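The approach can be approximated in Ruby without TagSoup or XSLT, as in the sketch below. This is not Norm's actual pipeline: the tag whitelist is an assumed one, the sanitize_comment helper is a made-up name, and unlike TagSoup the sketch makes no attempt to repair unbalanced or badly nested tags. It keeps bare whitelisted tags, allows href on a elements only for http and https URLs, escapes everything else, and expands blank lines into paragraphs.

```ruby
require "cgi"

# Tags allowed through unchanged; an assumed list, not Norm's actual one.
ALLOWED_TAGS = %w[a b i em strong code blockquote].freeze

# A bare opening or closing tag with no attributes, e.g. <b> or </b>.
SIMPLE_TAG = /\A<(\/?)([a-zA-Z][a-zA-Z0-9]*)\s*>\z/
# An <a> tag whose only attribute is a double-quoted http(s) href.
LINK_TAG = /\A<a\s+href="(https?:\/\/[^"<>\s]+)"\s*>\z/i

# Keeps whitelisted tags, escapes everything else, and turns chunks of
# text separated by blank lines into <p> elements.
def sanitize_comment(text)
  text.strip.split(/\n\s*\n/).map do |para|
    # Split into alternating text and tag-like pieces, keeping the tags.
    safe = para.split(/(<[^<>]*>)/).map do |piece|
      if (link = LINK_TAG.match(piece))
        "<a href=\"#{CGI.escapeHTML(link[1])}\">"
      elsif (tag = SIMPLE_TAG.match(piece)) && ALLOWED_TAGS.include?(tag[2].downcase)
        "<#{tag[1]}#{tag[2].downcase}>"
      else
        CGI.escapeHTML(piece)
      end
    end
    "<p>#{safe.join}</p>"
  end.join("\n")
end

sanitize_comment("Hello <b>world</b>\n\n<script>alert(1)</script>")
# The <b> survives; the <script> tags come out escaped as text.
```

Because every piece that is not an exact whitelist match gets escaped, the only raw angle brackets in the output are ones the function wrote itself.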

http://www.ccil.org/~cowan

Eric Dobbs · My experience is with Movable Type 2.6ish. As such I was hammered with automated spam attacks. I agree that your homegrown code is a less likely target, but thought I'd share anyway. I spent a fair amount of time hacking the MT comment system in an effort to stem the flow of comment spam. Though I eventually just shut comments down, there's one thing that made me mostly happy. ¶

I left in the email notification MT sends for comments while disabling the writing of comments to the database. That gave me two principal benefits:

I could just delete the spam in Mail.app (where I'm accustomed to such stuff) and I could skip having to touch MT.
It also meant I could re-use Mail.app spam filtering which has been rather thoroughly trained in what I think of as spam.

What I didn't finish before I shut down the comments was a system allowing me to publish the comments via a reply to the notification.

I'm still interested in the general idea of re-using one well-trained spam filter for many of the in-bound streams of data in my digital life. Not sure that email is the right interface for that, except for the fact that I'm in Mail.app probably more often than any other application.

-Eric
http://dobbse.net

Jonas Galvez · I'm sure you'll get a bazillion e-mails from people encouraging you to use Rails, and since I've been (luckily enough) using it for the past 10 months on my day job, I could surely be amongst them. ¶

But I'm not.

Rails is great, very well designed, written and maintained. It's a great way to get applications up and running quickly. And there's no doubt Rails apps do very well in production.

But when it comes to minimalistic, personal applications, such as the comment system of your blog, I think you'll soon notice, and be uncomfortable with, the fact that it's got simply way too many files. Rails apps unfortunately still resemble J2EE apps in this aspect, no matter how concise and well organized they try to make it.

Not every programmer will share this concern, but if you are like me, you'll want to keep your applications to only a few files, in a directory tree as simple as possible. So what I'm saying is, Rails is great for big applications, but if you're doing a personal system, give Camping[1] or web.py[2] a whirl.

Camping wraps all Rails functionality in a single file and lets you build full applications using also a single file. Same thing with web.py. In fact, at work (Blogamundo.com), I've two main trunks of code, one for our web user interface, which is built in Rails, and one for a series of tiny Python daemon services that talk to the Rails app via HTTP/JSON. I did this[3] initially because I wanted to use Mark Pilgrim's Universal Feed Parser for our system and there was nothing close to that written in Ruby.

[1] http://redhanded.hobix.com/bits/campingAMicroframework.html
[2] http://webpy.org/
[3] http://blogamundo.net/dev/2005/10/23/shared-exceptions/

Nicola Larosa · This is an advocacy piece for Python and Django (http://www.djangoproject.com/). Here's why. ¶

I've been using Python, and doing web programming, since 1999. I think Python is a better choice than Ruby for a dynamic language, for reasons of clarity and popularity, RoR hype notwithstanding.

I've been using a number of Python web frameworks through the years, including Zope2, some Zope3, Quixote, Twisted, and Nevow. Recently I discovered Django: it definitely feels like a better tool than the aforementioned ones, at least for CMS, publishing-oriented web applications.

Is it better than RoR? I don't know, but it seems to be at least on the same level:

A comparison of Django with Rails here.

Django and Rails here.

Django is at the same time tightly integrated and modular, with a distinct REST feel. Bill de hÓra, among others, is using it without database, and with RDF:

Django without database here.

A final, less relevant note. Since discovering Twisted in 2003, I've been an asynchronous events rabid fanatic, despising preemptive multithreading. Well, Django does not support asynchronous events, while it does support multithreading, but I feel compelled to use it just the same. It's that good.

My web site (URL below) is not built with Django. Yet. It will be soon. :-)
--
Nicola Larosa - http://www.tekNico.net/

Deepak Sarda · You should take a look at Demokritos. From the project page: ¶

"Demokritos is a Python library and content repository implementing the Atom Syndication Format (RFC4287) and Atom Publishing Protocol (draft-ietf-atompub-protocol-08). Persistence is via a subversion repository."

I don't know how far along it is, though!

URL: http://www.jtauber.com/demokritos.

As for Rails, I suspect it'll be an overkill for the task. If Ruby is what you want, then consider Camping (http://camping.rubyforge.org/). Or web.py (webpy.org) on Python - which I've used and recommend!

And yeah, ongoing does need comments :-)

deepak
--
http://www.antrix.net/

Stuart Langridge · A couple of comments about comments on ongoing: ¶

First, think about whether you want to ask for people's email addresses. Most sites don't display a commenter's email, so it's only of benefit to you specifically, and I personally find it highly annoying that I have to type my email address into most of the sites out there to make a comment. URL, fine, and if you need someone's address to reply privately then it should be fairly easy for you to find it from the associated URL.

Second, please don't put the comments on a separate page; you're much less likely to get people reading them if you have to click from your RSS reader to read the post (which you already have to do, unless I've missed a full-text syndication feed for ongoing) and then click *again* to see the discussion.

Can't help you with Ruby, I'm afraid; I'm a Python guy for web apps, and Wordpress runs my site.

If you're going to have to approve all comments, then please also add something which says "your comment has been sent for approval" after posting, since otherwise people think that the browser broke in some way and post the same comment twice. And, without wishing to be mean and didactic, approval of everything requires you to stay pretty on top of it, or it'll kill conversations.

Best of luck!

Stefan Tilkov · You might want to look at Camping instead of Rails as a Web framework for your comment solution: ¶

http://camping.rubyforge.org/files/README.html

Camping is done by the simultaneously weirdest and most brilliant guy in the Ruby community, a fellow who goes by the handle of "why the lucky stiff" (http://whytheluckystiff.net/)

Best regards,
Stefan
--
Stefan Tilkov, http://www.innoq.com/blog/st/

Ramin Miraftabi · I just read your post on *finally* implementing spam prevention on ongoing. Your thought that ongoing wouldn't be the subject of automated spam attacks is, unfortunately, erroneous. ¶

I implemented a simple commenting system for the photoblog we have (http://fierymill.net/loj/), which in fact was among the first systems to support Atom 1.0 in its feeds ;) and thought that comment spam wouldn't be much of an issue because of the small install base and relatively few incoming links.

However, I was wrong and quickly implemented a hidden field with a key that is checked (the key doesn't change). This method worked for several months keeping the system spam free, but then one morning I woke up to 50 spam messages. Now, while designing a changing key system that doesn't require any work from the user I've disabled comments from entries over a week old, which seems to help at the moment.
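One way a changing key might work without asking anything of the user is to sign the issue time with a server-side secret and put the result in the hidden field. The sketch below is only an illustration of that idea, not Ramin's design: the secret, the one-hour lifetime, and the form_token and valid_token? helpers are all assumptions.

```ruby
require "openssl"

# Server-side secret and token lifetime; both values are placeholders.
FORM_SECRET = "change-me"
TOKEN_LIFETIME = 3600 # seconds

# Builds the value for a hidden form field: the issue time plus an HMAC
# over the entry id and that time, so the value changes from one page
# load to the next and cannot be reused on another entry.
def form_token(entry_id, now = Time.now)
  issued = now.to_i
  mac = OpenSSL::HMAC.hexdigest("SHA256", FORM_SECRET, "#{entry_id}:#{issued}")
  "#{issued}-#{mac}"
end

# Accepts a token only if the signature matches and it has not expired.
def valid_token?(token, entry_id, now = Time.now)
  issued, mac = token.to_s.split("-", 2)
  return false unless issued.to_s.match?(/\A\d+\z/) && mac
  expected = OpenSSL::HMAC.hexdigest("SHA256", FORM_SECRET, "#{entry_id}:#{issued}")
  age = now.to_i - issued.to_i
  mac == expected && age >= 0 && age <= TOKEN_LIFETIME
end

token = form_token("entry-1")
valid_token?(token, "entry-1")  # accepted while fresh
valid_token?(token, "entry-2")  # rejected: signed for a different entry
```

A bot that replays a harvested form later, or against a different entry, fails the check; a bot that fetches the page fresh each time does not, so this raises the cost rather than solving the problem.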

But, I'm rambling on. My point is that a small install base will not prevent you from being the target of automated spam attacks, especially since ongoing is quite well known.

With greetings from Finland,

ramin
--
Ramin Miraftabi http://fierymill.net/

Aristotle Pagaltzis · you may want to consider supporting ¶

PGP-Signed Comments
http://golem.ph.utexas.edu/~distler/blog/archives/000320.html

Regards,
--
Aristotle Pagaltzis // http://plasmasturm.org/

Bryan Feeney · I've been reading your blog for a while (still readying one of my own), and I noticed your article about adding comments. There's just one thing, you almost certainly will get spammed. I set up a simple message board for a college club a few years ago (http://www.bfeeney.uklinux.net/msgboard and http://www.ucc.ie/mountaineering respectively). It can be configured to force registration or work without it. At the moment, it works without it, as registration was found to put a lot of people off. ¶

However here's the thing: traffic tends to vary from 50 posts a day to 3 acceptable posts a day depending on the time of year, but for at least a year now I've been getting a huge amount of additional comment spam: mainly from bots, but also from people. Yesterday there were 17 normal posts and 46 blocked spam messages.

When this started, I extended the usual nuisance-user blocking to block spam. I use regexps to block message content, and simple text to block names. Regexps would be better for names, but it was a lot of work at the time. I've included the code and blocks file in case it's useful. It's all written in Perl: the blocking is taken care of by /mboard-sys/lib/Blocks.pm and the blocking rules are specified in /msgboard/blocks.dat
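The blocking scheme described here (regexps for message content, plain text for names) is simple enough to sketch. The original is Perl and is not reproduced; this is a Ruby approximation, and the rules below are invented examples rather than anything from the real blocks.dat.

```ruby
# Example rules, invented for illustration; the real ones live in blocks.dat.
BLOCKED_CONTENT = [/cheap\s+pills/i, /online\s+casino/i].freeze
BLOCKED_NAMES   = ["casino king", "seo expert"].freeze

# True when the message body matches a content regexp or the poster's
# name equals a blocked name (case-insensitive, surrounding spaces ignored).
def blocked?(name, body)
  clean_name = name.to_s.strip.downcase
  BLOCKED_NAMES.include?(clean_name) ||
    BLOCKED_CONTENT.any? { |pattern| pattern.match?(body.to_s) }
end

blocked?("Mary", "Buy CHEAP   pills now")  # blocked by a content rule
blocked?(" Casino King ", "hello")         # blocked by name
blocked?("Mary", "Great climb on Sunday")  # allowed
```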

The other alternative would be to use the Typekey system (http://www.sixapart.com/typekey/api) to force registration, without writing all that user-management code yourself. Typekey is, of course, the exact same idea as Microsoft's passport, but people seem more trusting of it at the moment.

Aaron Quint · First of all, I read ongoing every day, and I'm sure a comments system would be a great addition. (This message, for example, would have been a great candidate for a comment.) ¶

I think you should definitely look into using Rails. I'm just getting to a level of proficiency with Ruby and Rails myself, and I can tell you it's the most fun I've ever had with any sort of framework or language. The main reason being, every time you've done something very cleanly and elegantly, you realize there's a way to do it even MORE cleanly and with half the code. There are examples of Rails being used without a database; the new book Rails Recipes (http://www.pragmaticprogrammer.com/titles/fr_rr/) even features a section on it. If you wanted to really delve in, I think you could even use ActiveRecord (Rails DB Abstraction) to manage your flat-file comment storage.

In terms of letting people post rich comments you should look into Cal Henderson's (one of Flickr's lead developers) lib_filter: http://iamcal.com/publish/articles/php/processing_html/ (and part 2: http://iamcal.com/publish/articles/php/processing_html_part_2/ ). It's written in PHP, but I'm sure it wouldn't take much to implement in Java or Ruby as it's all just regex parsing. The main idea is that he created a white-list of allowable tags, and then stripped the rest, even the trickier stuff like double quoted and nested <script> tags.

I hope this helps. Thank you for 'ongoing', it has a way of lightening my day.

--AQ
----------------------------------------
Aaron Quint
: http://www.quirkey.com/blog/

Jeremy Dunck · Since you're Perl-y, and since you're doing such a small thing (Rails, Django and most other frameworks are heavier (err, have different sweet spots) than what you want), maybe you'd like HTML::Mason. ¶

I haven't personally used it, but I understand it's well-regarded. del.icio.us runs on it, for example.

http://www.masonhq.com/

Don Davies Brackett · I urge you to reconsider threading your comments. Not only does it expose you to the management headache of "oops, I intended to reply to that post over *there*, can you move it please?" but it also makes discussions more repetitive and fragmented; different people end up posting the same thing twice or three times over without adding much value, because it's much harder to read everything that's been said when there's clicking involved. Discussions also tend to be longer-lived without actually progressing, as two people get into a protracted argument that other people occasionally interject into. ¶

I feel like unthreaded discussions self-moderate a little better too, though I'm not sure why I have that feeling.

Alan Griffith · KISS when it comes to adding comments. Have people email in comments to maybe comments@tbray.org. The subject line could contain the URL of which page to add it to. ¶

James A. Robinson · I'm a reader of your blog, and I was interested in your most recent entry re comments. I thought you might be interested in a piece of Java software I've started to use to transform HTML typed by users into a semblance of XHTML: ¶

http://home.ccil.org/~cowan/XML/tagsoup/

paired along with something like JDOM (or whatever your favorite XML manipulation technology happens to be), it's possible to build a piece of software which puts HTML input into a semblance of order, and can also filter out the nasty things we don't want to allow (e.g., scanning for embedded javascript).

I'll be curious to see how you decide to put together the comment system. Thanks for sharing your thoughts over the years. :)

David Magda · Most commenting systems ask for name, e-mail, and URL. Some people are hesitant to give out the latter two (especially e-mail). It may be useful to have the fields, but have a checkbox to toggle whether the information should be published publicly. It would be visible to you, but the person in question would have some control over whether things would be available to the world. ¶

Also, having each comment individually addressable would probably be handy (an obvious feature, but many weblogging systems only have a "#comments" option, with not much more granularity).
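Both suggestions fit in a few lines of rendering code. The sketch below is hypothetical throughout: the hash keys, the publish_contact flag standing in for the checkbox, and the markup are all assumptions. Each comment gets its own anchor, and the e-mail address and URL are shown only when the commenter opted in.

```ruby
require "cgi"

# Renders one comment with its own anchor. Contact details appear only
# when the commenter ticked the (assumed) "publish my details" checkbox.
def render_comment(comment)
  id = CGI.escapeHTML(comment[:id].to_s)
  parts = ["<div id=\"comment-#{id}\">"]
  parts << "<b>#{CGI.escapeHTML(comment[:name].to_s)}</b>"
  if comment[:publish_contact]
    parts << CGI.escapeHTML(comment[:email].to_s) if comment[:email]
    parts << CGI.escapeHTML(comment[:url].to_s) if comment[:url]
  end
  parts << "<p>#{CGI.escapeHTML(comment[:body].to_s)}</p>"
  # Permalink pointing at this comment rather than at a shared "#comments".
  parts << "<a href=\"#comment-#{id}\">#</a>"
  parts << "</div>"
  parts.join("\n")
end

private_comment = {
  id: 42, name: "Dave", email: "dave@example.com",
  url: "http://example.com/", body: "Hello", publish_contact: false
}
render_comment(private_comment)  # no e-mail or URL in the output
```

The stored record still carries the address, so the site owner can reply privately either way.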

May 20, 2006
By Tim Bray.
