Spam registrations even after putting a captcha and 2 questions

tomyake · January 28, 2013, 11:34:09 PM

Recently I've been getting a lot of spam registrations every day, even though I've enabled CAPTCHA verification and 2 questions on the registration form. Does anyone know how I can fix this?

Ricky. · January 28, 2013, 11:42:43 PM

Well, bots are getting smarter day by day. You could try mods like Stop Forum Spam or httpBL; try searching the mod site.

Arantor · January 28, 2013, 11:48:30 PM

The questions are almost certainly too easy and you can ditch the CAPTCHA.

tumbleweed · January 29, 2013, 12:05:48 AM

Here is a good article that lists some mods that may help:
http://wiki.simplemachines.org/smf/Spam_-_my_forum_is_flooded_with_spam,_what_can_I_do

Otherwise, change your questions to something more specific to your forum's content.

Mick. · January 29, 2013, 12:09:29 AM

Don't use the CAPTCHA. Use 1 question related to your website. That stops them dead in their tracks.

Arantor · January 29, 2013, 12:10:51 AM

Use a *good* question related to your website. A generic question whose answer is the first hit on Google if you search for it is likely a bad question.

tomyake · January 29, 2013, 06:41:21 AM

Okay thanks a lot, will try all of this...

tomyake · January 29, 2013, 07:46:20 AM

I've added a firewall mod and everything worked out. Thanks a lot.

But I'm curious: how can a bot bypass the CAPTCHA or recognize the letters when a human sometimes can't? And if it's trying random possibilities, there would be a huge number of them, since we are working with 7 letters. Unless there's a bug in the CAPTCHA?
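To put a rough number on the "random possibilities" point, here is a small sketch of the arithmetic. It assumes the code is 7 characters drawn from the 26 uppercase letters; the actual character set SMF uses is not stated in this thread, so treat `ALPHABET_SIZE` as an assumption.

```python
# Rough size of the search space for blind guessing.
ALPHABET_SIZE = 26  # assumption: uppercase A-Z only (not confirmed in the thread)
CODE_LENGTH = 7     # "7 letters", as stated in the post above

combinations = ALPHABET_SIZE ** CODE_LENGTH

# A blind guess against a freshly generated code succeeds with this probability.
chance_per_guess = 1 / combinations

print(f"{combinations:,} possible codes")
print(f"chance per blind guess: {chance_per_guess:.2e}")
```

With numbers like these, blind guessing is not a realistic explanation for the registrations, which leaves reading the image (by software or by a person) as the likelier route.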

Arantor · January 29, 2013, 07:52:13 AM

Bots can actually do a better job of reading it than humans can, and have been able to for years. Remember, they're not looking at it the same way humans do.

tomyake · January 29, 2013, 07:55:03 AM

I'm doing research in the field of handwriting recognition, and I can tell you that recognizing such a CAPTCHA can be quite hard. I'm going to try to implement my own CAPTCHA, remove the firewall and see what happens.

Storman™ · January 29, 2013, 08:12:05 AM

Sounds interesting. Don't forget to report back on the outcome.

Mick. · January 29, 2013, 08:13:41 AM

Quote from: tomyake on January 29, 2013, 07:55:03 AM
I'm researching in the field of hand writing recognition and I can tell you that recognizing such captcha can be quite hard. I'm going to try and implement my own captcha, remove the firewall and see what happens

Trust me, CAPTCHAs suck. Don't use them. Just use a security question related to your website.

Don't use generic questions whose answer is something like "green" or "2+2=4".

The question should be related to your forum genre. If your forum is about cars, use a security question like:

- How many spark plugs does a Ford F-150 V8 have? The answer, of course, would be 8.

Captchas are annoying to all potential new members. Make it easy for them, not for you.
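For illustration, here is a minimal sketch of how an answer to a site-specific question like the one above could be checked leniently, so that real members are not rejected over capitalisation or stray spaces. The names `normalise`, `is_correct` and `ACCEPTED` are hypothetical; this is not how SMF implements its verification questions.

```python
import unicodedata
from typing import Iterable


def normalise(answer: str) -> str:
    """Fold case, unify Unicode forms and collapse whitespace."""
    text = unicodedata.normalize("NFKC", answer)
    return " ".join(text.casefold().split())


def is_correct(given: str, accepted: Iterable[str]) -> bool:
    """True if the submitted answer matches any accepted answer after normalising."""
    return normalise(given) in {normalise(a) for a in accepted}


# Hypothetical accepted answers for the spark plug question.
ACCEPTED = {"8", "eight"}

print(is_correct("  Eight ", ACCEPTED))
print(is_correct("6", ACCEPTED))
```

Accepting a couple of spellings keeps the question easy for humans who know the topic without making it any easier to guess.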

Arantor · January 29, 2013, 09:53:51 AM

Quote: I'm researching in the field of hand writing recognition and I can tell you that recognizing such captcha can be quite hard.

Handwriting, yes, because letters are not consistent.

But when you're dealing with a set of letters that are identical every time and you're just looking for variations of those letters, it's not remotely a problem. (The CAPTCHAs use fonts that are bundled with SMF, so it is not hard to identify which font is in use.)

Quote: I'm going to try and implement my own captcha, remove the firewall and see what happens

Having actually done this myself, I can tell you that you're probably going to fall at the first hurdle. That's not me being rude or self-aggrandising; it's my experience.

Your first step pretty much needs to be to add your own font. If you use the fonts that are shipped with SMF, any attacker has a ready-made set of exemplars to work against. In my case I reused those fonts knowing that limitation, but with the things I had in mind it wasn't really much of a problem.

Your second step revolves around noise and lettering. The standard way to beat a CAPTCHA is to find the letters: first identify what is background and what is not, then eliminate the background noise. That noise is usually low contrast against the background or, when it's not, it's usually sparse, which is also reasonably easy to remove. A single low-frequency blur filter will usually do the job of figuring out what's noise and what isn't.
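As a toy illustration of the blur idea described above, the sketch below runs a 3x3 mean filter over a small black-and-white grid and keeps only the foreground pixels whose neighbourhood is dense enough. The function names, the `0.4` threshold and the sample grid are all assumptions made for the example; a real attack would work on full images, and this simple filter would also eat strokes that are only one pixel wide.

```python
def box_blur(img):
    """3x3 mean filter; pixels outside the image count as background (0)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total += img[ny][nx]
            out[y][x] = total / 9
    return out


def remove_sparse_noise(img, threshold=0.4):
    """Keep a foreground pixel only if its blurred neighbourhood is dense."""
    blurred = box_blur(img)
    return [
        [1 if cell and blurred[y][x] >= threshold else 0
         for x, cell in enumerate(row)]
        for y, row in enumerate(img)
    ]


# Hypothetical sample: a thick 3x3 "stroke" plus two isolated noise pixels.
NOISY = [
    [0, 0, 0, 0, 0, 0, 1],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
]

for row in remove_sparse_noise(NOISY):
    print("".join("#" if v else "." for v in row))
```

The isolated pixels have almost nothing around them, so their blurred value stays low and they are dropped, while the thick stroke survives intact.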

If you have lots of colours, invariably this makes it much easier to beat, especially if all the letters are the same colour against a multi-coloured background.

Now let me explain what I did. I created a CAPTCHA that would put out multiple different styles of image, rather than just a single style. That makes it significantly harder to break.

From there:
* one style creates a background colour and a foreground colour, draws the text in the foreground colour then proceeds to draw a grid over it. The effect is that it's much harder to figure out where the grid is and where the letters are.
* one style creates a background colour then a sort of 'LED array' effect, as if the entire CAPTCHA is a matrix of bulbs, some lit, some dimmed, to make the letters up. The effect is that the usual methods of finding consecutive pixels of colour don't really help because there are gaps in between the bulbs. There's also an animated version of this.
* one style (with some variations) is what I call 'recompose' - over the period of about 20 animation frames, we draw in pixels frame by frame to get to the point where the cumulative effect has all the letters. The variations cycle this back and forth.
* one style creates the letters over a series of frames like the recompose animation, except it doesn't draw in the letters; it draws in the *shadows* of those letters as if the letters are solid. The effect is that the human eye can usually deal with it but bots can't.
* one style draws in equal noise across the page then erases some of the foreground pixels to draw the letters in what amounts to negative space. The effect is to make it very difficult to pinpoint where the letters are because the letters are effectively the same as the background.
* one style draws a series of diagonal lines, then rubs out the space occupied by the letters, before drawing another series of diagonal lines the other way. There's enough space in the negative space to be readable in most cases, just not by bots.
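To make two of the styles in the list above more concrete, here is a very rough sketch on a tiny 0/1 grid: a grid overlay drawn in the same value as the text, and a negative-space variant where noise is erased wherever the text sits, with one style picked at random per image. This is not the implementation described in the post; every name, the `step` and `density` values and the sample bitmap are assumptions for illustration.

```python
import random


def grid_style(mask, step=3):
    """Draw the text, then overlay grid lines in the same foreground value."""
    return [
        [1 if cell or y % step == 0 or x % step == 0 else 0
         for x, cell in enumerate(row)]
        for y, row in enumerate(mask)
    ]


def negative_space_style(mask, rng, density=0.5):
    """Fill the image with random noise, then erase it where the text is."""
    return [
        [0 if cell else int(rng.random() < density) for cell in row]
        for row in mask
    ]


def render(mask, rng):
    """Pick one style at random, so an attacker cannot tune for a single look."""
    style = rng.choice(["grid", "negative"])
    if style == "grid":
        return style, grid_style(mask)
    return style, negative_space_style(mask, rng)


# Hypothetical bitmap of a letter "T"; 1 marks a text pixel.
MASK = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

style, image = render(MASK, random.Random())
print(style)
for row in image:
    print("".join("#" if v else "." for v in row))
```

In both variants the text no longer stands out as the only connected region of foreground colour, which is the property the post relies on.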

This is still no substitute for proper Q&A but it certainly works as a better line of defence than a CAPTCHA that has remained mostly unchanged in years.

MrPhil · January 29, 2013, 10:00:14 AM

Don't forget that it may not necessarily be a bot that's reading the CAPTCHA image. Spammers can employ "farms" of poorly paid Third World peasants to sit at PCs all day and do nothing but sign up for forums. Still beats toiling in the fields, I guess. Good Questions are likely to be beyond the built-in knowledge of these people, and the spammers might not want them to take the time to look up the answers -- just go on and find an easier target.

Arantor · January 29, 2013, 10:02:15 AM

That's the point of the animated ones: it's a little slower for them to complete.
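As a rough picture of why an animated CAPTCHA takes longer to read, the sketch below mimics the 'recompose' idea from earlier in the thread: the text pixels are revealed a few at a time across about 20 frames, and only the cumulative result shows the whole text. The function name, the sample bitmap and the way pixels are split across frames are assumptions; the post does not describe the actual implementation.

```python
import random


def recompose_frames(mask, n_frames=20, seed=None):
    """Reveal the text pixels cumulatively, in random order, over n_frames."""
    rng = random.Random(seed)
    pixels = [(y, x) for y, row in enumerate(mask)
              for x, cell in enumerate(row) if cell]
    rng.shuffle(pixels)

    h, w = len(mask), len(mask[0])
    current = [[0] * w for _ in range(h)]
    frames = []
    for i in range(n_frames):
        # Each frame adds its share of the shuffled pixels to the running image.
        start = i * len(pixels) // n_frames
        end = (i + 1) * len(pixels) // n_frames
        for y, x in pixels[start:end]:
            current[y][x] = 1
        frames.append([row[:] for row in current])
    return frames


# Hypothetical bitmap of a letter "T"; 1 marks a text pixel.
MASK = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

frames = recompose_frames(MASK, n_frames=20, seed=42)
print(len(frames), "frames; last frame complete:", frames[-1] == MASK)
```

Anyone solving it, human or not, has to wait for enough frames to accumulate before the text is legible, which is where the extra time per sign-up comes from.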

tomyake · January 30, 2013, 12:26:46 PM

Do you have a sample of the captcha you've created?

Arantor · January 30, 2013, 03:00:57 PM

I do have a sample but the powers that be do not like me linking to the site here because it is a rival to SMF... (though *based* on SMF's code)

tomyake · February 03, 2013, 02:25:41 PM

Hehe, okay, thank you. I'll see what I can do on my part and get back to you with feedback as soon as I finish.
