Solving spam problems without using captcha

Some of you know that automated spam bots on the web can be a pain in the ass.

And while there are many ways to deal with this problem, most of them complicate things for the regular user and are not hacky enough to satisfy me. 🙂 For example, captchas, math problems and the like mostly work fine, but they also scare away some users. There are also paid tools (Akismet?) that help deal with spam, but they don’t catch all of it.

I will share information on how I stop almost all of the automated spam messages to my WordPress blog without complicating things for the user.

 

Knowing the limitations of HTTP libraries and “exploiting” them

I have written a lot of scripts that automate GET’ing and POST’ing things on the web. Many libraries help with this, and one of the most popular (if not the most popular) is Curl.

While Curl is great, it still lacks some features, such as building a zero-length file upload part in multipart POST data. In other words, it can’t simulate a file upload field that has no file selected. And since many spam bots use Curl and similar libraries, this can be used to tell a real browser from a script.
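To make the idea concrete, here is a minimal sketch of the server-side check, written in Python rather than PHP and not taken from my actual code. It assumes you have the raw request body and its boundary string; the function name and the field name "attachment" are made up for illustration. A browser submitting a form with an untouched file input sends a part with filename="" and no content, and that is what the check looks for.

```python
import re


def has_empty_file_part(body: bytes, boundary: str, field: str) -> bool:
    """Return True if the multipart body has a part for `field` with an
    empty filename and zero-length content (an untouched file input)."""
    delimiter = b"--" + boundary.encode("ascii")
    # \b keeps this from matching the name=" that is inside filename="
    name_pattern = re.compile(r'\bname="%s"' % re.escape(field))
    for chunk in body.split(delimiter):
        # The preamble and the closing "--" marker have no header block.
        if b"\r\n\r\n" not in chunk:
            continue
        raw_headers, content = chunk.split(b"\r\n\r\n", 1)
        # Drop the CRLF that precedes the next delimiter.
        if content.endswith(b"\r\n"):
            content = content[:-2]
        headers = raw_headers.decode("latin-1")
        if (name_pattern.search(headers)
                and 'filename=""' in headers
                and content == b""):
            return True
    return False
```

A request that lacks this part would then be treated as coming from a script and logged or dropped.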

Example usage in PHP:

Not just Curl

I haven’t used Curl for a year now because I switched to Perl’s LWP, and it has the same problem. I solved it by writing my own function for building the multipart form data. It can simulate the browser behavior, but LWP doesn’t do so by default.

And while this was rather simple with LWP, doing it with Curl (used from PHP) would probably require recompiling the Curl library and then recompiling the module for the programming language that uses it.
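My own function was written in Perl and is not reproduced here. As a rough illustration of the same idea, here is a Python sketch that builds a multipart body by hand and includes a zero-length file part. The function name, its parameters and the field names are assumptions for this example only.

```python
import uuid


def build_multipart(fields, empty_file_fields, boundary=None):
    """Build a multipart/form-data body by hand.

    `fields` maps names to text values; every name in `empty_file_fields`
    becomes a file part with an empty filename and no content, the way a
    browser sends a file input with nothing selected.
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"',
            "",
            value,
        ]
    for name in empty_file_fields:
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"; filename=""',
            "Content-Type: application/octet-stream",
            "",
            "",  # zero-length file content
        ]
    lines += [f"--{boundary}--", ""]
    body = "\r\n".join(lines).encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body
```

The returned pair is the Content-Type header value and the request body, which can be passed to whatever HTTP client you use.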

 

Changing field names

Another trick I use is changing field names and adding extra fields that are meant to be left empty.

This method also seems to catch a share of the spam messages.
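A minimal sketch of that check, in Python and with made-up field names (my blog uses different ones and the real check lives in PHP): a submission is suspicious if it misses the renamed fields or puts anything into the fields that should stay empty.

```python
def looks_like_bot(form, required_fields, trap_fields):
    """Return True if a submitted form looks automated.

    `form` maps submitted field names to values, `required_fields` are the
    renamed real fields, `trap_fields` are the ones meant to stay empty.
    """
    # A bot posting the stock field names will miss the renamed ones.
    if any(name not in form for name in required_fields):
        return True
    # A bot that fills in every input will put text into the trap fields.
    return any(form.get(name, "").strip() for name in trap_fields)
```

For example, with required_fields = {"msg_x7"} and trap_fields = {"comment"}, a post that fills "comment" is flagged while a normal one passes.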

 

From the user’s point of view

The user obviously doesn’t see those extra fields because they are hidden using CSS. The comment form looks like any other standard comment form.

[Screenshot: the comment form]

Conclusion

I have been using these techniques for about a year now and haven’t had any spam problems since. I also have a huge log file of spam that was caught using these methods. This really does work. 🙂

Of course, it won’t help against directly targeted spam, but in that case captchas won’t help either.

And I understand that writing about this will probably contribute to fixing Curl and other libraries, eventually making this protection method useless. Well, at least they will finally fix those libraries! 😀

Jailkit, mini_sendmail and custom HELO

To make sure a server stays safe when one site is compromised, I try to lock every single site in its own chroot jail. To make this a bit easier, I use Jailkit.

Since you probably don’t want to set up sendmail for each chroot, you can use mini_sendmail. It works as a relay and passes messages to the actual sendmail.

The problem is that mini_sendmail offers no way to specify a custom username or hostname, and this can be quite important in some cases.

To solve this problem, I made some quick and dirty modifications. Here is the patch in case you need it:

Save it as some.patch, move it into the mini_sendmail source directory and run:

You can specify the username with the -u parameter and the hostname (which is also used in the HELO message) with the -h parameter.

If you are going to use it with PHP, change sendmail_path in php.ini to something like this:

This should make PHP connect to sendmail running on 127.0.0.1, port 5555, and send example.com as the HELO name and noreply as the username.

The patch was made for version 1.3.6.
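To show where those two values end up on the wire, here is a small Python sketch of the opening commands of a plain SMTP client session. It is not the patched mini_sendmail code, and it assumes the sender address is built as username@hostname; the function name and the recipient address are made up for illustration.

```python
def smtp_opening_commands(helo_host, username, recipient):
    """List the first commands an SMTP client sends, to show where the
    HELO name and the username appear in the session."""
    # Assumption: the envelope sender is simply username@hostname.
    sender = f"{username}@{helo_host}"
    return [
        f"HELO {helo_host}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
```

With example.com and noreply, the session would start with HELO example.com followed by MAIL FROM:<noreply@example.com>.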

Nginx and other web servers

Many years ago I hosted my blog on an old Pentium 2 box with, I think, 128 MB of RAM. Needless to say, the machine was not one of the fastest, and it was a great way to see how demanding various applications are. At first I used Windows (XP) + Apache + PHP + MySQL. To improve performance, I tried Linux for the first time (if I’m not mistaken, Slackware with XFCE, because the terminal was something foreign to me). Apache + PHP + MySQL ran considerably better on Linux, Linux didn’t use as many resources, and I was happy.

Later I tried Lighttpd. It used even fewer resources, but I didn’t like the configuration syntax, and people on the internet complained that it had leaks which, it seems, went unfixed for a couple of years. (I didn’t observe any memory leaks myself.)

Some time later I noticed that draugiem.lv and other large sites had switched from Lighttpd to Nginx. I didn’t try it right away, because on more powerful boxes I was happy with Apache2, and the documentation available at the time was mostly in Russian (maybe I didn’t search properly). But in early/mid 2010 the moment finally came when I installed it on my home router. After exploring its capabilities a little, I was pleasantly surprised. It is the most convenient and coolest web server/proxy I have used so far. The configuration is super convenient and offers many options (in my opinion, one of Nginx’s biggest advantages), there is a pile of additional modules available, it uses few resources, and it has great community support.

Having tried it on my own box, I installed it on a couple more servers, and I recommend it to you as well. 🙂

If I’m not mistaken, the major Linux distributions offer it in their repositories, but I recommend compiling it yourself so that you can add an extra module or remove the ones you don’t need.

Which web server do you use, and why that one?

Web panel for Uploader

Since some people prefer shiny user interfaces over the command line, I have added a web administration panel to the auto uploader.

Since it is a really early über pre-alpha version, it doesn’t have many options yet, but that will change over time.

The idea is to create an admin panel that lets you manage Uploader using only your browser and forget about terminal commands.

Alpha version of web panel for Uploader

As you can see in the screenshot above, you can already search, remove, upload and redownload releases.

The web panel is built using PHP and jQuery. To make it work, make sure you have SQLite3 support for PHP installed (it is enabled by default since PHP 5.3.x). On Debian/Ubuntu you should be able to install it with sudo apt-get install php5-sqlite.

It is available in the SVN repository, inside trunk/web_panel/.