Blacklist that spam, with PHP!

Introduction


Spam is a big problem lately, and if you run any kind of website where a comments system is in place, you’ll probably have to deal with spam sooner or later as well. You might already be receiving several spam comments each day.

To combat this problem you can use several different methods. For example, you could set up a moderation system, whereby all comments must be approved before appearing on the website. Or you could use a registration system, where visitors must register before they can post comments. But both of these solutions make it harder for your visitors to post comments, which is definitely not a good thing.

Another option is a blacklist. The goal of spam is usually to increase the PageRank of certain websites (usually porn websites), so that they rank higher in Google and other search engines. If you can prevent comments that contain these URLs from ever being posted, the spammers no longer have a reason to spam your website.

Implementing a blacklist is relatively simple using some easy PHP. In this tutorial I will show you how to validate a message against a blacklist. I’m not including the comment form or any other form validation. I’m only going to show how to create a blacklist system, in a few easy steps.

The blacklist format


The blacklist we will be using has a specific format that is already used by other blacklist systems, e.g. MT-Blacklist, a blacklist system for MovableType. Each blacklist entry goes on its own line. Blank lines and lines starting with a # (comments) are ignored. To make our blacklist system even more powerful, it will also support regular expressions, so one rule can match thousands of domains. For a good start to your blacklist, you might want to download Simon Willison’s Blacklist.
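To make the format concrete, here is a minimal sketch that parses a small blacklist held in a string. The entries and the .example domains are invented for illustration and are not taken from any real blacklist; the filtering rules (skip blank lines, skip lines starting with #) are the ones described above.

```php
<?php
// Hypothetical blacklist contents, invented for illustration only.
$sample = <<<'TXT'
# Comment lines start with a hash and are ignored

cheap-pills\.example
casino[0-9]*\.example\.(com|net)
TXT;

// Split the text into lines, then keep only the real entries.
$entries = [];
foreach (preg_split('/\R/', $sample) as $line) {
    $line = trim($line);
    if ($line === '' || $line[0] === '#') {
        continue; // skip blank lines and comments
    }
    $entries[] = $line;
}

print_r($entries);
```

The second entry shows why regular expression support is useful: a single rule covers casino1.example.com, casino22.example.net, and so on.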

The blacklist code


Creating our blacklist system is very easy. The first step is to load the blacklist, which is easily done with the file() function.

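<?php
// Load the blacklist file into an array, one element per line.
// Each element still ends with its newline, so it is trimmed later.
$blacklist = file("/home/you/public_html/blacklist.txt");
?>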
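If the file is missing or unreadable, file() returns false and a later foreach() over it will complain. A slightly more defensive variant might look like the sketch below. The helper name load_blacklist() is hypothetical and not part of the original script; the two flags are standard file() options.

```php
<?php
// Hypothetical helper, not part of the original tutorial.
// Returns the blacklist lines, or an empty array if the file cannot be read.
function load_blacklist($path)
{
    if (!is_readable($path)) {
        return [];
    }

    // FILE_IGNORE_NEW_LINES strips the line endings;
    // FILE_SKIP_EMPTY_LINES then drops lines that are completely empty.
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

    return $lines === false ? [] : $lines;
}
```

Lines that contain only whitespace, and comment lines, are still returned, so the trim() and # checks in the loop remain necessary.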
The blacklist is now loaded into an array called $blacklist, with each entry as a separate array item. All that’s left to do is loop through the entries using foreach(), skip any entry that is a comment or blank, and check the message against each remaining entry using preg_match() (after all, we did want regular expression support). The code looks something like this:

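<?php
// Loop through the blacklist and check the message.
// $message is assumed to hold the submitted comment text.
foreach ($blacklist as $item) {
        // Strip the trailing newline and surrounding whitespace
        $item = trim($item);

        // Skip blank lines and comments (lines starting with #)
        if ($item !== "" && substr($item, 0, 1) != "#") {
                // Use ~ as the delimiter so entries containing / (common in URLs) still work
                if (preg_match("~" . $item . "~i", $message, $match)) {
                        // Blacklisted URL found; escape it before echoing it back
                        die("Sorry, your message has been denied because a blacklisted URL (" . htmlspecialchars($match[0]) . ") has been found.");
                }
        }
}
?>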
That’s all there is to it. You can certainly improve it by changing the error message, checking other fields (e.g. a separate ‘homepage’ field), and even adding relevancy points. But the above code will already stop a lot of spam, if you’ve got a decent blacklist.
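As a rough sketch of those improvements, the function below checks several fields and returns a score instead of calling die(). Treating "relevancy points" as one point per matching entry and field is my own assumption, and the function name blacklist_score(), the field names, the sample entries, and the threshold of 2 are all hypothetical.

```php
<?php
// Hypothetical helper: returns a spam score instead of calling die().
// One point is added for every (entry, field) pair that matches.
function blacklist_score(array $fields, array $blacklist)
{
    $score = 0;

    foreach ($blacklist as $item) {
        $item = trim($item);
        if ($item === '' || $item[0] === '#') {
            continue; // blank line or comment
        }

        $pattern = '~' . $item . '~i';
        foreach ($fields as $value) {
            // @ silences the warning an invalid pattern would raise;
            // preg_match() then returns false, which is not === 1.
            if (@preg_match($pattern, $value) === 1) {
                $score++;
            }
        }
    }

    return $score;
}

// Assumed field names, entries, and threshold, chosen only for this example.
$fields = [
    'message'  => 'Visit http://cheap-pills.example today',
    'homepage' => 'http://casino7.example.com',
];
$blacklist = ['# sample', 'cheap-pills\.example', 'casino[0-9]+\.example'];

$score = blacklist_score($fields, $blacklist);
if ($score >= 2) {
    echo "Rejected with score $score\n";
}
```

Returning a score leaves the decision to the caller, which could reject the comment outright, hold it for moderation, or let it through, depending on how many entries matched.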

About the author
Dennis Pallett is the main contributor to PHPit. He owns several websites, including ASPit and Chill2Music. He is currently still studying.

Dennis Pallett has written 2 FAQs, 19 codesnippets, and 5 tutorials.