8.5. Dealing with comment spam

Expect it to happen. Some folks get comment spam trickling in and others get a torrential downpour. It's best to deal with it from the start.

As of contributed plugins pack 1.2, the comments plugin has a "comment_reject" callback which allows other plugins to examine the comment and reject it according to their individual heuristics.

You can also run multiple comment rejection plugins. The comments plugin calls them one after another until one of them rejects the comment or all of them have accepted it.
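The "one after another until one rejects" behavior can be pictured with a small sketch. This is not the comments plugin's actual code; run_comment_reject_chain, reject_empty, and reject_casino are hypothetical names made up for illustration, and the "description" key for the comment body is an assumption.

```python
def run_comment_reject_chain(callbacks, args):
    """Call each rejection callback in order and stop at the first rejection.

    Returns 1 if any callback rejected the comment, 0 if all accepted it.
    """
    for callback in callbacks:
        if callback(args):
            return 1
    return 0


def reject_empty(args):
    # Hypothetical heuristic: reject comments with no body text.
    return 1 if not args["comment"]["description"].strip() else 0


def reject_casino(args):
    # Hypothetical heuristic: reject comments that mention "casino".
    return 1 if "casino" in args["comment"]["description"].lower() else 0


chain = [reject_empty, reject_casino]
spam = {"comment": {"description": "Best online CASINO bonuses"}}
ham = {"comment": {"description": "Thanks for the write-up."}}

print(run_comment_reject_chain(chain, spam))
print(run_comment_reject_chain(chain, ham))
```

Because the chain stops at the first rejection, it probably makes sense to think of each rejector as independent: none of them should rely on another one having run first.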

8.5.1. wbgcomment_blacklist

I wrote a simple comment rejector which rejects based on whether certain words show up in the comment. I noticed that "casino", "blackjack", and "pharmacy" show up with reckless abandon and yet none of my posts talks about anything related to these terms.

To get it running, make sure the comments plugin is installed and working first. Then get the wbgcomment_blacklist plugin from my web-site. Then set the comment_rejected_words property in your config.py file like this:

py["comment_rejected_words"] = ["poker", "casino", "gambling"]

Note: good blacklists

Each blog covers different topics and thus your word list will almost certainly differ from mine. I figured my word list out mostly by waiting for my blog to get comment spam and then picking out specific words in the spam to use for signifying automatic rejection.

Every month or so, make sure the list of rejected words still makes sense. For example, if I started talking about sleazy poker nights, then I should probably remove most of the poker-related words.
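To make the idea of a word-based rejector concrete, here is a minimal sketch of the check itself. It is not the source of wbgcomment_blacklist; contains_rejected_word is a hypothetical helper, the comment is assumed to be a dict of strings, and matching on whole words only is a design choice of this sketch rather than something the plugin is known to do.

```python
import re


def contains_rejected_word(comment, rejected_words):
    """Return True if any rejected word appears as a whole word in the comment.

    comment is assumed to be a dict of field name -> text.
    """
    text = " ".join(str(value) for value in comment.values()).lower()
    for word in rejected_words:
        # \b keeps "poker" from matching inside a longer word like "pokers".
        pattern = r"\b" + re.escape(word.lower()) + r"\b"
        if re.search(pattern, text):
            return True
    return False


rejected_words = ["poker", "casino", "gambling"]
spam = {"author": "bob", "description": "Play POKER online now"}
ham = {"author": "alice", "description": "Nice post about fireplace pokers."}

print(contains_rejected_word(spam, rejected_words))
print(contains_rejected_word(ham, rejected_words))
```

Whole-word matching trades a little coverage for fewer false rejections. A plain substring test would catch more variations of a word, but it would also reject legitimate comments that merely contain the word inside a longer one.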

8.5.2. nospam plugin

The nospam plugin was written by Steven Armstrong. It uses PIL (Python Imaging Library) to create images of 5-digit numbers. The person writing the comment has to type in the number they see in the image for the comment to go through.

As of this writing, there are more details here: http://pyblosxom.sourceforge.net/blog/registry/input/nospam and http://www.c-area.ch/code/pyblosxom/plugins/nospam.py
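The challenge-and-answer part of such a scheme can be sketched with the standard library alone. This is not the nospam plugin's code: new_challenge and answer_is_correct are hypothetical names, and drawing the number into an image with PIL is deliberately left out.

```python
import random


def new_challenge(rng=random):
    """Return a 5-digit challenge as a string, zero-padded if needed."""
    return "%05d" % rng.randint(0, 99999)


def answer_is_correct(expected, submitted):
    """Compare the submitted text to the expected number, ignoring stray whitespace."""
    return submitted.strip() == expected


challenge = new_challenge()
print(len(challenge))
print(answer_is_correct(challenge, " " + challenge + "\n"))
```

In a real plugin the expected number would have to be remembered on the server between showing the form and receiving the comment; how nospam does that is not covered here.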

Note: a warning about captchas

Captchas require that the user be able to see and recognize the text they need to type in. As such, a captcha prevents anyone who cannot see, or whose web browser doesn't display images, from leaving comments on your site.

If you want your site to be available to all people, don't use captchas.

8.5.3. rolling your own

It's not hard to roll your own comment rejection plugin. First figure out what heuristics you want to use. Then write a plugin with a cb_comment_reject function in it. In that function, look at the data provided and reject the comment if it seems appropriate to do so.

A basic template for writing a plugin to reject comments is as follows:

Example 8-1. Template for plugin for rejecting comments

"""
FIXME - Documentation for what your plugin does and how to set it up
goes here.

FIXME - License information goes here.

FIXME - Copyright information goes here.
"""
__author__      = "FIXME - your name and email address"
__version__     = "FIXME - version number and date released"
__url__         = "FIXME - url where this plugin can be found"
__description__ = "FIXME - one-line description of plugin"

def verify_installation(request):
    # FIXME - code to verify that this plugin is installed correctly
    # goes here.

    return 1

def cb_comment_reject(args):
    req = args["request"]
    comment = args["comment"]

    blog_config = req.getConfiguration()

    # FIXME - code for figuring out whether this comment should
    # be rejected or not goes here.  If you want to reject the
    # comment, return 1.  Otherwise return 0.

    # Accept the comment by default.
    return 0
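As an illustration of filling in the template, here is a sketch of a rejector that refuses comments containing too many links. Several things in it are assumptions: the "comment_max_links" config property is made up for this example, the comment body is assumed to live under the "description" key, and FakeRequest is a stand-in used only so the sketch can be run outside of PyBlosxom.

```python
import re

# Matches the start of an http or https URL.
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def cb_comment_reject(args):
    req = args["request"]
    comment = args["comment"]

    blog_config = req.getConfiguration()

    # "comment_max_links" is a hypothetical property; default to 2 links.
    max_links = int(blog_config.get("comment_max_links", 2))

    # Assumption: the comment text is stored under "description".
    body = comment.get("description", "")

    if len(LINK_PATTERN.findall(body)) > max_links:
        return 1
    return 0


class FakeRequest(object):
    """Stand-in for the PyBlosxom request object, for trying the sketch out."""

    def __init__(self, config):
        self._config = config

    def getConfiguration(self):
        return self._config


request = FakeRequest({"comment_max_links": 1})
spam = {"description": "http://a.example http://b.example http://c.example"}
ham = {"description": "See http://example.com for details."}

print(cb_comment_reject({"request": request, "comment": spam}))
print(cb_comment_reject({"request": request, "comment": ham}))
```

Reading the threshold from the config file rather than hard-coding it follows the same pattern as comment_rejected_words: each blog can tune the heuristic without editing the plugin.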