We live in a world that does not tolerate a vacuum. Once captchas became too difficult for automated solvers, spammers found another way around them: instead of writing complex captcha solvers, it was cheaper to hire people to solve the captchas. That means a single kind of protection is not enough.

At this point I run three different anti-comment-spam protections on my blog:

  • captcha
  • comment approval
  • new: my “commentards” script

That list could be extended with Akismet, a very good service that fights comment spam (subscriptions for commercial sites are paid). With the amount of traffic on my blog, however, I feel comfortable doing some filtering by hand. But then came the day when I simply said: “enough”. To be honest, I was also curious whether blocking spamming IPs would have any effect on the spam rate.

On 10.12.2012 I built a very simple bash script that reads the IP addresses of all comments marked as spam from my comments table and puts them into a dedicated iptables chain that filters HTTP traffic. At this point I do not have enough data to draw any meaningful conclusions, but after 2-3 months I’d like to post a part 2 and see whether my idea made any sense. If this reminds you of fail2ban: yes, it is more or less the same idea.

Comment spam rate

As you can see in the picture, there were days when spammer activity was pretty high. Below are a few interesting bits. Please keep in mind that all data was gathered between 07-08-2012 and 20-12-2012!

  • spam came from 73 unique IP addresses
  • maximum 6 spam comments from one IP address
  • the majority of spam comments came from addresses that posted only once (56 IPs)
  • a total of 100 spam comments were “donated” to my blog
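
Numbers like these can be pulled out of a plain list of IP addresses with a few lines of shell. The sketch below assumes one address per line, one line per spam comment; the function name spam_stats is made up for illustration and is not part of commentards.sh.

```shell
#!/bin/bash

# Summarise a list of IP addresses read from stdin
# (one address per line, one line per spam comment).
spam_stats() {
    sed '/^$/d' | sort | uniq -c | awk '
        {
            total += $1                 # comments overall
            if ($1 > max) max = $1      # busiest single address
            if ($1 == 1) once++         # addresses seen exactly once
        }
        END {
            # NR is the number of distinct addresses after uniq -c
            printf "comments=%d unique_ips=%d max_per_ip=%d single_comment_ips=%d\n",
                   total, NR, max, once
        }'
}
```

It could be fed straight from the database, for example with mysql --defaults-file=/etc/mysql/debian.cnf --batch --skip-column-names --execute="SELECT comment_author_IP FROM mywp.wp_comments WHERE comment_approved='spam'" | spam_stats.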

Spam comments by IP

As I already wrote, I am very much interested in repeating this analysis in 2-3 months. Below is the source of the simple commentards.sh bash script. The script is extremely simple, and I am already thinking about expanding it a bit with blocking of whole subnets and with delays similar to fail2ban’s, but let us first see whether it all makes sense. The script works on Ubuntu 10.04 LTS and in theory it should be compatible with most (if not all) Debian-based distros.

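#!/bin/bash

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games

BLOCKLIST=/tmp/block.txt
IPT=iptables

# INTO OUTFILE refuses to overwrite an existing file, so remove the old list first.
rm -f "$BLOCKLIST"
mysql --defaults-file=/etc/mysql/debian.cnf --execute="SELECT DISTINCT comment_author_IP FROM mywp.wp_comments WHERE comment_approved='spam' ORDER BY comment_author_IP INTO OUTFILE '$BLOCKLIST'"

# Create the chain on the first run (the error is ignored if it already exists), then empty it.
$IPT -N ctards 2>/dev/null
$IPT -F ctards

# One DROP rule per spamming IP address.
while read -r ip; do
    [ -n "$ip" ] && $IPT -A ctards -s "$ip" -j DROP
done < "$BLOCKLIST"

# Everything else returns to the calling chain.
$IPT -A ctards -j RETURN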
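
One caveat: iptables only handles IPv4, while the comments table may also hold empty values or IPv6 addresses, and those would make the DROP rule fail. A minimal sketch of a filter that could sit in front of the loop follows; the helper names is_ipv4 and filter_ipv4 are hypothetical and not part of the original script.

```shell
#!/bin/bash

# Succeeds only for dotted-quad IPv4 addresses: four octets,
# no leading zeros, each octet in the range 0-255.
is_ipv4() {
    local ip=$1 octet
    local re='^(0|[1-9][0-9]{0,2})(\.(0|[1-9][0-9]{0,2})){3}$'
    [[ $ip =~ $re ]] || return 1
    local IFS=.
    for octet in $ip; do            # split the address on dots
        (( octet <= 255 )) || return 1
    done
}

# Copy stdin to stdout, keeping only lines that are valid IPv4 addresses.
filter_ipv4() {
    local line
    while read -r line; do
        is_ipv4 "$line" && printf '%s\n' "$line"
    done
    return 0
}
```

With this in place the loop could read its input from filter_ipv4 < /tmp/block.txt instead of from the raw file.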

The script should be run from crontab, say once a day. Please keep in mind that it only updates the ctards chain. You need another rule in your INPUT chain that sends the HTTP traffic to it:

iptables -A INPUT -p tcp --dport 80 -j ctards

This rule sends all TCP traffic heading for port 80 to your ctards chain, where the filtering happens. You can of course add more ports, e.g. port 443, if you also wish to protect blogs/sites behind SSL.
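
Covering several ports with a single rule is possible with the multiport match of iptables. What follows is only a sketch: the function name hook_ctards, the script path and the time in the cron line are assumptions for illustration, not something taken from my setup.

```shell
#!/bin/bash

# Set IPT=echo before calling hook_ctards to preview the command
# instead of running it.
IPT=${IPT:-iptables}

# Send TCP traffic for the given comma-separated ports to the ctards chain.
# Defaults to HTTP and HTTPS.
hook_ctards() {
    local ports=${1:-80,443}
    $IPT -A INPUT -p tcp -m multiport --dports "$ports" -j ctards
}

# Example /etc/cron.d entry for the daily refresh of the chain
# (hypothetical path and time):
#   15 4 * * * root /usr/local/sbin/commentards.sh
```

The hook should be added only once, for instance from your firewall setup, and not from the cron job, since every call appends another copy of the rule to INPUT.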