Sigh. I guess it is a good sign that I’m researching CAPTCHAs at the moment. Spammers finally seem to have found our site and deemed it good enough to spam. Surely a sign of success, but also a nuisance for the normal people using the site. Making the signup procedure a bit more difficult for computers seems the way to go, so some kind of CAPTCHA has to go in. The other approach would be to use a DNS-based blocking list, e.g. through an Apache module like mod-defensible, but in general I’m not a big fan of blocking lists due to false positives.

Besides, CAPTCHAs still seem to be working fine as a deterrent, so I don’t have real doubts about their effectiveness. But putting up a crappy image and expecting people to parse and retype it is an additional hurdle, so my attention got caught by the term ‘invisible captcha’, where some JavaScript is used to add a hidden field containing a secret to a form. This would be trivial to work around if the spam robot actually executed the JavaScript, but that doesn’t seem to be the case for most robots.
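To make the idea concrete, here is a minimal Ruby sketch of the server side of such a scheme. It is an illustration under my own assumptions, not code from any of the articles: the module name, the session key, the field name captcha_token and the form id are all hypothetical, and a plain Hash stands in for the Rails session.

```ruby
require 'securerandom'

# Hypothetical helper: every name here is an assumption made for illustration.
module InvisibleCaptcha
  SESSION_KEY = :captcha_token

  # Generate a fresh secret and remember it in the session.
  def self.issue(session)
    session[SESSION_KEY] = SecureRandom.hex(16)
  end

  # JavaScript to embed in the page. A browser runs it and adds the hidden
  # field; a robot that does not execute scripts never submits the secret.
  def self.script(token, form_id = 'signup')
    <<~JS
      var field = document.createElement('input');
      field.type = 'hidden';
      field.name = 'captcha_token';
      field.value = '#{token}';
      document.getElementById('#{form_id}').appendChild(field);
    JS
  end

  # One-time check: the stored secret is removed whether or not it matches.
  def self.valid?(session, submitted)
    expected = session.delete(SESSION_KEY)
    !expected.nil? && !submitted.nil? && expected == submitted
  end
end
```

In a controller one would presumably call issue when rendering the form and valid? with the submitted parameter when processing it. The token is only hex digits, so interpolating it into the script is safe in this sketch.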

Some examples: this method in .NET just uses a random GUID value for each form. This PHP example creates a simple sum to be done, which has the added benefit that the user can be asked this task as well, thus degrading gracefully for people without javascript. Things seem to have started with this article on a Lightweight Invisible CAPTCHA control which also provides a simple sum to make.
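The simple-sum variant could look roughly like this in Ruby. Again this is only a sketch under my own assumptions (the SumChallenge name and its methods are hypothetical, not taken from the PHP or .NET articles): the page's JavaScript would fill in the answer silently, and the question text is what a user without JavaScript gets to see.

```ruby
# Hypothetical sketch of a "simple sum" challenge; names are assumptions.
SumChallenge = Struct.new(:a, :b) do
  # Build a challenge from two random single-digit numbers.
  def self.random(rng = Random.new)
    new(rng.rand(1..9), rng.rand(1..9))
  end

  # Fallback question shown to people whose browser runs no JavaScript.
  def question
    "What is #{a} + #{b}?"
  end

  # Compare the submitted form value (a string, possibly nil) with the sum.
  def correct?(answer)
    answer.to_s.strip == (a + b).to_s
  end
end
```

The two numbers would have to be kept on the server (in the session, say) between rendering the form and checking the answer, otherwise a robot could simply submit its own sum.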

I’ve ended up cobbling my own invisible captcha together based on the ideas put forward in the .NET article mentioned above, and so far it seems to work fine, keeping the spambots out while letting normal people in. I’d link to our signup page so you can see for yourself, except that it’s an invisible CAPTCHA. :-)

Published on 26/04/2007 at 16h48 by Hans de Graaff


Today I had to map free text to plausible filenames, with the caveat that the text could contain UTF-8 characters with accents. Even though it is possible to have filenames with these characters, I wanted to end up with ASCII-only filenames for easier handling. Also, the filenames will be exposed via URLs, and just having ASCII there takes away a lot of headaches. But how to convert this?

I quickly found the apparently wonderful Text::Unidecode for Perl, which seemed to do everything I wanted, but since we build our web services with Ruby on Rails I needed a Ruby solution. I hoped that someone would already have created a Ruby version of Text::Unidecode, but that’s not the case (or I could not find it). I did find the Asciify gem, though. Although simpler in design and reach than Text::Unidecode, it does enough for my purposes, and custom mappings can be created for it.

Asciify’s documentation is pretty much non-existent, but some reading of the source code revealed that this was how I could convert my text: …, '_')).convert('some text')

The default replacement character for Asciify is a question mark, which makes sense in general, but not in URLs, so I opted to use the underscore character instead for lack of a better candidate. Since I’ve included the gem as a plugin in the Rails project I’ve just changed the default mapping to include some characters rather than using my own mapping.
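For comparison, the same kind of result can be approximated with nothing but the Ruby standard library. To be clear, this is not Asciify's API and not what I actually used; ascii_filename is a hypothetical helper, and it assumes a Ruby recent enough to have String#unicode_normalize (2.2 or later).

```ruby
# Hypothetical helper built on the standard library only; it does not use
# or imitate the Asciify gem.
def ascii_filename(text, replacement = '_')
  text.unicode_normalize(:nfd)                 # split accented letters into letter + combining mark
      .gsub(/\p{Mn}/, '')                      # drop the combining marks, keeping the base letter
      .gsub(/[^A-Za-z0-9.\-]+/, replacement)   # replace each run of other characters
end
```

With this, 'Café crème.txt' becomes 'Cafe_creme.txt'. Characters that have no decomposition, such as ß, are not transliterated but replaced by the underscore, which is where a proper mapping table like Asciify's or Text::Unidecode's does a better job.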

Published on 27/02/2007 at 16h59 by Hans de Graaff
