I have comments, but no spam in my comments. Here’s why.

Moderation · I’ve had a comment system running for about three years here, but I haven’t got around to turning off moderation. Still, when I moderate, I almost always just one-click accept all the comments. When I reject them, it’s mostly for being vacuous, not adding anything to the discussion. Then there are a very few that are toxic or poisonous or maybe libellous. Finally, there are a single-digit number every year that are sort of spam; written by real people trying to place a comment with a link back to their lame or porny or MLM or whatever site. I covered the latest variation in Is This Spam?

Should I turn off moderation? It would require me to implement some code for quick easy one-click removal of a comment when one gets through that I don’t like. I’m really not sure it’s worth the trouble.

Be Different · This site is different from all the other blogs, of course, in that I wrote the software and it only runs here. So a spammer who figured out a route into my comments would only have one site to attack; the rewards for subverting WordPress or Blogger are way higher.

It’s different in a subtler way. There’s no comment form at the bottom of the entries; just a link that invites you to click on it to make a contribution. A minor barrier for a spambot, but my feeling is that a succession of minor barriers is the best way to fight back. The economics of spam are such that everything you do to make your target a little more complicated and less soft discourages a certain proportion of the bad guys.

Having looked at WordPress a bit, I suspect that it wouldn’t be too hard to introduce a lot of simple, maybe random, variation in the way that any given blog asks for comments. Which might help.

Be Human · The single best way to defeat the economics of spam is to make it non-free. The best way I know of to do that is to force a human to get involved. This is the purpose of the Captchas you see on so many comment sites.

My approach is simpler. If you try to comment and ongoing hasn’t seen you before, it’ll ask you to answer a simple question that I think almost every real English-speaking human should be able to handle. Here are a few random examples taken from the just over 200 I cooked up:

  • Is France a city or nation?

  • How many sides has a square?

  • Which word is longer, alphabet or box?
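
A check like this can be sketched in a few lines of Python. This is an illustration only, not the code behind ongoing: the `QUESTIONS` table, the sets of accepted answers, and the function names are all made up, and the normalization shown is just one lenient choice.

```python
import random
import string

# Hypothetical question table: each question maps to the answers accepted
# after normalization. The questions are the three samples listed above;
# the accepted answers are assumptions.
QUESTIONS = {
    "Is France a city or nation?": {"nation", "a nation"},
    "How many sides has a square?": {"4", "four"},
    "Which word is longer, alphabet or box?": {"alphabet"},
}


def normalize(answer: str) -> str:
    """Lower-case, trim, and drop punctuation so 'Four.' matches 'four'."""
    cleaned = answer.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def pick_question(rng: random.Random) -> str:
    """Choose one challenge question at random."""
    return rng.choice(sorted(QUESTIONS))


def is_human(question: str, answer: str) -> bool:
    """True if the normalized answer is one of the accepted ones."""
    return normalize(answer) in QUESTIONS.get(question, set())
```

Being generous about case, whitespace, and digits versus words matters more than the questions themselves; a real person who types "Four." should not be turned away.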

It turns out this is overkill. Joe Gregorio just asks prospective commenters to include a single hard-wired string which he helpfully provides in an explanation right there on the comment form. Maybe I’ll go to something simpler too.

Be Sneaky · If the comment system here decides you’re a spammer (you try to post too often, you fail the “any-human-should-know-this” test), it takes evasive action by pretending it’s had an enterprisey-sounding programming error. Here are some of the messages you might see:

  • Error: Document pre-parse replicator unavailable; exiting.

  • Error: Module synchronization initializer terminated; exiting.

  • Error: File marshalling responder invalid; exiting.

It’s a simple random-phrase generator. Some of the messages verge on haiku; go refresh www.tbray.org/atompub/ouch a few times to savor the flavor.
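
For anyone who wants to try the same trick, here is a minimal sketch of such a generator in Python. The word lists are lifted from the three sample messages above, and the four-slot template is a guess at the pattern; neither is the real generator's vocabulary or structure.

```python
import random

# Word lists taken from the three sample messages; the real generator
# presumably has a much larger vocabulary. The slot names are invented.
SUBJECTS = ["Document", "Module", "File"]
PROCESSES = ["pre-parse", "synchronization", "marshalling"]
AGENTS = ["replicator", "initializer", "responder"]
STATES = ["unavailable", "terminated", "invalid"]


def fake_error(rng: random.Random) -> str:
    """Build one plausible-sounding but meaningless error message."""
    slots = (SUBJECTS, PROCESSES, AGENTS, STATES)
    parts = [rng.choice(words) for words in slots]
    return "Error: " + " ".join(parts) + "; exiting."
```

Calling `fake_error(random.Random())` on each rejected request gives a different message most of the time, which is what makes the failure look like a flaky back end rather than a deliberate block.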

To protect the guilty, I won’t name the correspondents who’ve written me with detailed instructions on how to fix my Rails routing or Spring configuration to make the problem go away.

Does This Help? · I don’t know. Maybe it’s just some combination of moderation and luck that keeps me spam-free. But maybe one or two of these tricks will help someone else.



Contributions


From: Bob Aman (Jul 27 2009, at 13:53)

I maintain that the two most effective and unobtrusive anti-spam measures you can take are:

1) Be unique.

2) Put a hidden field with an initial value inside your form. Use JavaScript to set the value to something else. Verify on the server side that the value is something else. Put it in the moderation queue if the value is initial. Drop it on the floor if the value isn't present at all. Use a proper nonce setup if you want to go all-out.

[link]
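
The server-side half of Bob's point 2 might look roughly like the Python sketch below. The field name `token` and the value `unset` are placeholders invented for the example, and the nonce variant he mentions is not shown.

```python
from typing import Mapping

# Hypothetical names: the hidden field and its initial value are placeholders.
FIELD = "token"
INITIAL_VALUE = "unset"  # the value shipped in the HTML before any script runs


def triage(form: Mapping[str, str]) -> str:
    """Classify a submission as 'drop', 'moderate', or 'accept'."""
    if FIELD not in form:
        return "drop"      # field missing entirely: not posted from our form
    if form[FIELD] == INITIAL_VALUE:
        return "moderate"  # script never ran: maybe a bot, maybe no JavaScript
    return "accept"        # value was changed on the client side
```

Sending the untouched-value case to moderation rather than dropping it keeps commenters without JavaScript from being silently discarded.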

From: David Magda (Jul 27 2009, at 14:19)

For Bob's point (2), this is what Mike Davidson does:

http://www.mikeindustries.com/blog/

So the initial HTML has one thing:

[form action="http://www.mikeindustries.com/wp/wp-comments-post.php" method="post" id="commentform"]

But just below that he has:

[script type="text/javascript"]
function changeaction () {
  document.getElementById('commentform').action = '/deke';
}
setTimeout("changeaction()",5000);
[/script]

Jeremy Zawodny doesn't even have different questions for his 'CAPTCHA'. You simply have to type the word "Jeremy" in the correct box:

http://jeremy.zawodny.com/blog/

[link]

From: gvb (Jul 27 2009, at 14:35)

Related to point #2 of Bob's comment, see "Beating comment spam: Improving the Honey Pot". Per that post, all you need is a hidden form field as a honeypot: the spambots cannot resist filling it in, but humans don't know to do so since it is hidden.

[link]
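
The honeypot check described above is the mirror image of the JavaScript-set field: here a filled-in value is the bad sign. A minimal Python sketch, assuming a hypothetical field called `website_url` that the page hides with CSS:

```python
from typing import Mapping

# Hypothetical field name; the form would hide this input from human visitors.
HONEYPOT_FIELD = "website_url"


def looks_like_bot(form: Mapping[str, str]) -> bool:
    """True if the hidden honeypot field came back with any content."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

A bot that posts without the field at all passes this test, so it only catches the form-filling kind of bot.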

From: Tuom A. (Jul 27 2009, at 14:37)

Maybe you know this, maybe you don't:

http://en.wikipedia.org/wiki/Xrumer

Also, some tricks include hidden form fields, JavaScript-generated forms, Akismet, eCAPTCHA, ... but I guess you know all this.

[link]

From: barristor henry williams (Jul 27 2009, at 15:07)

hello sir-

i have great offer to make you. pleaze give me ur ear. my uncle who was a very rich oil miner had died and has left me with the $10 million dollars that he had. the problem is that i cant access his bank acc. please send me 1k dollars and i will give you 5 million dollars in return

waiting to hear from you!

henrywil492@yahoo.co.uk

[link]

From: Jach (Jul 27 2009, at 15:17)

JavaScript methods are pretty good, but I still think they're overkill, and as more people switch to NoScript the laziness factor will kick in and you won't get so many comments. I almost didn't comment because I had to enable JS to see the "make a comment" link.

My own system (which I'm getting around to finishing) is two-fold. A visiting user has two choices: they can either sign up, or not. If they sign up, they provide a little information such as email, display name (which no one else can then use), and password, and must pass a simple arithmetic question this once. Then when they post a comment and are logged in, they just have to type in their comment. (And maybe down the line extra features.)

A visiting user who opts to not sign up just has to enter their desired display name and message and then answer an arithmetic question, on every single comment. So this encourages repeat visitors to sign up.

Oh yeah, and it all works even if the user has JS disabled, but they're not going to have at all the same experience. (They'll just get to see the JSON string the server passes back. I figure anyone smart enough to not be running JS everywhere is smart enough to realize it worked and they can just go back a page.)

Anyway, being unique is to me the best solution regardless of how your captcha system is set up.

[link]

From: stand (Jul 27 2009, at 15:20)

I like your comment setup on the whole, Tim. The randomized error thing is especially cute. I may have to steal that idea sometime :).

I'm curious though. I think the combination of stuff that you do wrt comments seems to discourage voluminous and free form, rambling conversations. Assuming you agree, was that a design goal? I'm interested in how your type of approach would scale to sites with orders of magnitude more comments.

[link]

From: Craig Loftus (Jul 27 2009, at 15:40)

Of the previous suggestions it seems that only the first unobtrusive JS honey pot types don't degrade functionality.

I've never been a fan of the question type Captchas. As a mock example, take the question I've just been asked, 'How many musicians in a trio?': do I answer 3, three, a trio, a few?

As a slightly stupid idea, perhaps you could crowd source the problem (apologies) by taking your current moderation system and feeding the flagged comments to previously vetted commenters?

[link]

From: Joao Pedrosa (Jul 27 2009, at 15:45)

When I first created my unique blog system I allowed all posts to go through and it took just a week for spammers to find their way around that. It gave me an excuse to create a Bayesian filter and it was fun to block posts automatically based on their content. Until it was not fun any more. ;-)

With more Javascript programming, I had the posting depend on Javascript and Ajax and on a unique API around that. The result was I started getting 0 spams so the Bayesian filtering did not need to exist at this point.

Many people do not share that enthusiasm because they do not want to program their own systems.

[link]
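
For readers who have not met the Bayesian filter Joao mentions, here is a toy naive Bayes classifier in Python. It is a generic textbook sketch, not his filter: the class name, the tokenizer, and the smoothing choices are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import List


class NaiveBayesFilter:
    """Tiny two-class (spam/ham) naive Bayes text classifier."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text: str) -> List[str]:
        """Split text into lower-case word tokens."""
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text: str, label: str) -> None:
        """Record one labelled example; label is 'spam' or 'ham'."""
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, words: List[str], label: str) -> float:
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        # Add-one smoothed log prior for the class.
        docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (docs + 2))
        # Add-one smoothed log likelihood, with one extra vocabulary slot
        # reserved for words never seen in training.
        for word in words:
            score += math.log((counts[word] + 1) / (total + vocab + 1))
        return score

    def is_spam(self, text: str) -> bool:
        """True if the spam class scores higher than the ham class."""
        words = self.tokenize(text)
        return self._log_score(words, "spam") > self._log_score(words, "ham")
```

Train it with `train(text, "spam")` or `train(text, "ham")` on comments you have already judged by hand, then ask `is_spam(text)` about new ones. With only a handful of examples the verdicts are not worth much.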

From: Elaine Nelson (Jul 27 2009, at 15:56)

For some reason, the faux errors remind me of the faux messages that scroll by when starting up Sims/Sim City. You need some reticulated spline failures in there someplace.

[link]

From: Peter Brooks (Jul 27 2009, at 16:07)

I do like your methods. As for the Human part: how long until answer engines such as Wolfram Alpha or Google Squared give a tool to bypass such a feature?

But I think we're safe for now and can remove the tin foil hats.

[link]

From: Derek Martin (Jul 27 2009, at 16:40)

From 1999 - 2009 I used a custom made piece of blogging software, written & updated in PHP... It had a few captcha-esque things to try to prevent comment spam... but I started getting upward of 3000 comment spams per day. Added Akismet and things got much much more manageable, but I just didn't want to manage my blog anymore. I wanted to write blog posts! So, I switched to WordPress, wrote a content importer, and brought all my old content over. Now I use Akismet for WordPress, and the built-in moderation system, and it works flawlessly.

[link]

From: Paul Thrasher (Jul 27 2009, at 16:41)

Palin free Cialis Click here for free stuff: http://donkeyshow.com

Just testing. Let's see if that gets through! :)

-Paul

[link]

From: Justin Watt (Jul 27 2009, at 17:45)

Apparently I also do what Bob suggests:

http://justinsomnia.org/2007/03/escalating-the-war-on-comment-spam/

I think forced moderation unnecessarily slows down/interrupts the conversation. Mike does this on The Online Photographer and it irks me to no end. If you're ever away from the keyboard (and I hope that happens once in a while), a whole stunted conversation happens in a vacuum because no one was able to get a word in.

Obviously you have a larger audience willing to comment, and so forced moderation might actually act as a positive throttle that prevents threads from getting out of hand while you are AFK. But really, that's putting very little trust in the community you foster here.

The best weapon against human-generated spam/vitriol is to delete comments vigorously and ruthlessly. Rather than approving every comment that comes through, why not delete/redact every comment that comes through that you find inappropriate? Thus the conversation continues unhindered, but you reserve the right to redact any comment that you find out of line.

Also check out Donncha O Caoimh's WordPress plugin, Cookies For Comments. Yeah, I know you're not on WP, but you could implement the concept.

http://ocaoimh.ie/cookies-for-comments/

[link]

From: Tony Fisk (Jul 27 2009, at 18:41)

...bloody vikings!

[link]

From: matthew (Jul 27 2009, at 18:50)

have you considered sending response status 404 instead of a funny error code?

[link]

From: Andre Bogus (Jul 28 2009, at 02:39)

The downside of your system is that commenting requires javascript on the client side.

If I hadn't read your article I wouldn't even have seen the "contribute" link with NoScript turned on until I trust a site.

Otherwise, it's bad for accessibility. Why not use a normal link?

[link]

From: Mark Bell (Jul 28 2009, at 06:00)

I also have a custom blog (hand coded in ASP.NET, no Wordpress or Movable Type anywhere to be seen), and a simple question/answer field below my comment form seems to keep spammers at bay nicely.

Admittedly, this might be because my site is fairly new and hence hasn't had much exposure yet; however, my previous blog used the same system (and indeed exactly the same question and answer), and I had no spam comments in the entire four years it was running - from bots, anyway...

To tackle Craig Loftus' point above: There's no need to degrade functionality, you just need to be careful about your questions and lenient with the answers. For example, the answer to my question is 5, but it will accept '5' or 'five' or 'FIVE' or... well, you get the picture.

[link]

From: Mohit Soni (Jul 28 2009, at 06:21)

Using a captcha is a proven way to block spamming by automated spam bots. But spam bots used in tandem with a captcha breaker might turn out to be a nightmare for bloggers. In that case, comment moderation is the one that comes to the rescue.

[link]

From: Geoffrey Sneddon (Jul 28 2009, at 12:56)

Both previously with WordPress and now with Habari, my main line of defense has been simply replacing all references to the @name and @id of the input elements with numeric character references (for all characters in them). Amazingly, this cut (and still cuts) out almost all spam.

[link]
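
Geoffrey's trick is easy to try. The Python sketch below rewrites an attribute value as decimal numeric character references; the function names and the sample markup are invented for the example and are not taken from WordPress or Habari.

```python
def to_ncr(text: str) -> str:
    """Encode every character as a decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def input_tag(name: str) -> str:
    """Render an <input> whose name and id are written entirely as NCRs."""
    encoded = to_ncr(name)
    return f'<input type="text" name="{encoded}" id="{encoded}">'
```

A browser decodes the references while parsing the attribute, so the form still submits under the plain field name. A scraper that searches the raw HTML for a literal name such as `url` or `comment` finds nothing to match.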

July 27, 2009

I am an employee
of Amazon.com, but
the opinions expressed here
are my own, and no other party
necessarily agrees with them.

A full disclosure of my
professional interests is
on the author page.