
Anti spambots

Started by gkawa, October 22, 2018, 12:42:16 PM


gkawa

Hi guys

New here. A friend asked me to help with his forum, which uses SMF 2.0.15. It was on an older version, but he wanted to change hosts, so I did a whole new installation. That went well, but the site started to get a nasty spambot attack. Not a crippling one but certainly annoying: 30 to 40 registration attempts every day.
I have it under control at this point. I'm seriously considering reCaptcha, it's my last option because I hate it. Right now, I'm letting them apply to check the IPs and the methods they use. I don't want to use antispam mods that let them get in and block them from posting, I want them out because that's the safest option. Maybe it's the impossible one, but that's what I want.

In the meantime, I found a couple of interesting questions that may be used against them.

One thing that's killing me is how easy it is for them to break the SMF native captcha. I wonder, is the validation code a function of the text in the captcha? Can they calculate the text from that? For a second I thought they could use human interaction, but the sequence is too fast for that. I don't think they're even downloading the image. I forgot to check that, but I'm almost sure I didn't see it in the traffic.

The questions don't work. I checked many times, changing them over and over. The bot kind of stops for a moment and then starts again, so I think there's some human interaction there. It can be solved by increasing the number of questions and making them random or adding random elements. It's a drag anyway. As a user, I hate those questions and I'm sure everyone else does.

One thing that (I think) is easy to implement and could make a huge difference is to use the strength of the bot against it. I checked the traffic and the timing of the bot is beautifully precise. The registration form is completed in almost zero time, something a human couldn't do. So, if the GET of the registration form is remembered with an ID and a timestamp and the POST arrives below a reasonable time margin (let's say 1 or 2 seconds), we know it's a bot and the registration can be automatically discarded. No need for captcha. The same system could be applied to posting.
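The GET/POST timing idea above can be sketched roughly as follows. This is a language-agnostic illustration in Python, not SMF code and not how SMF 2.1 implements its own timing check. The names `SECRET_KEY`, `MIN_FILL_SECONDS`, `issue_form_token` and `is_probably_bot` are all made up for the example. Instead of storing the ID and timestamp on the server, the sketch signs them with an HMAC and puts them in a hidden form field, which is one assumed way of "remembering" the GET.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical per-site secret; a real site would load a fixed value from config.
SECRET_KEY = secrets.token_bytes(32)
# Assumed threshold, taken from the "1 or 2 seconds" suggested in the post.
MIN_FILL_SECONDS = 2.0


def _sign(payload):
    """Return a hex HMAC-SHA256 signature for the payload string."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_form_token(now=None):
    """Call when the registration form is served (the GET).

    Returns "<form_id>:<issued_at>:<signature>" for a hidden form field.
    """
    issued_at = time.time() if now is None else now
    form_id = secrets.token_hex(8)
    payload = f"{form_id}:{issued_at:.3f}"
    return f"{payload}:{_sign(payload)}"


def is_probably_bot(token, now=None):
    """Call when the form is submitted (the POST).

    True if the token is malformed, forged, or came back too quickly.
    """
    now = time.time() if now is None else now
    try:
        form_id, issued_at, signature = token.split(":")
        expected = _sign(f"{form_id}:{issued_at}")
        if not hmac.compare_digest(signature.encode(), expected.encode()):
            return True  # tampered or forged token
        elapsed = now - float(issued_at)
    except ValueError:
        return True  # wrong number of fields or unparsable timestamp
    return elapsed < MIN_FILL_SECONDS
```

A submission that returns the token half a second after it was issued would be flagged, while one that takes ten seconds would pass. The sketch does not expire old tokens or prevent reuse, so a bot that simply waits before posting still gets through; it only removes the speed advantage.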

I'm guessing the HTTP referrer is already validated by SMF. The bot is not going straight to the POST. It GETs the main menu and then goes to the registration. But I doubt it checks the content of the web pages; I'm sure it creates the URLs as needed. I was thinking about changing the action names.

Like changing

      'register' => array('Register.php', 'Register'),
      'register2' => array('Register.php', 'Register2'),

for

      'registratione' => array('Register.php', 'Register'),
      'registratione2' => array('Register.php', 'Register2'),

And all the scripts that generate URLs pointing to them. I'm guessing the registration link and the Register.php script. Could it work? Is it more complicated than that?
I understand that this could be a problem eventually, not using the original script will interfere with upgrades and mod installations.
But if it's a small change, I think I can keep it under control.

Thanks to you all for the work you do. I hope you can get something back from this aimless brainstorming.



Arantor

Firstly, you seem to be assuming bots have reverse engineered a weakness in SMF's captcha. They haven't, they just OCR it because that's trivially doable even in JavaScript these days. Changing the URL doesn't slow them down much either, we've tried that. And yes, it is more complicated than the change you allude to. The bots know they have to go to the registration page because they know they need the session details (which can't be faked, they have to be retrieved).

Our first, best recommendation is still questions, but if you're not having luck, that presumes questions that are too easy. Bots can use Google too...

More info: Spam - my forum is flooded with spam, what can I do

Sesquipedalian

The trick is to use questions that are easy for humans to answer but not easy for bots to guess using a brute force attack.

For example, I run a couple of forums geared towards church groups, so I created some fill in the blank questions where the user had to fill in the missing item in a sequence of the books of the Bible.

This kind of question has several virtues. First, the answer is long and would not appear on a list of common answers. Second, the answer is obvious to anyone in the target demographic (although they might need to look up the correct spelling in some cases). Third, the question has exactly one, clear, and unambiguous answer.

I require answering two such questions for guests to post or register. Using this approach I have had zero spam bots get through, and no complaints from users.
I promise you nothing.

Sesqu... Sesqui... what?
Sesquipedalian, the best word in the English language.

Arantor

While that's true, it's not impervious. Bots do have a circulating list of sites, questions and answers and if someone does get to your site, the questions can be added to the list and then other bots can get in.

Kindred

Adding about 30 questions, asking 2 at a time -- and changing the question pool about once a year seems to be the ideal solution that I have found.

(because eventually, your questions do get cataloged and added to the spammer's database of auto-check)
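The "large pool, ask 2 at a time" setup described in this thread can be sketched like this. It is an illustration in Python, not SMF's verification-question code. `QUESTION_POOL`, `pick_questions` and `check_answers` are made-up names, and the four sample questions only stand in for a real pool of about 30 questions suited to the forum's audience.

```python
import random

# Hypothetical pool in the fill-in-the-blank style mentioned earlier in the thread.
# A real forum would hold about 30 entries and rotate them roughly once a year.
QUESTION_POOL = {
    "Leviticus, Numbers, ____": "Deuteronomy",
    "Ezra, ____, Esther": "Nehemiah",
    "Nahum, ____, Zephaniah": "Habakkuk",
    "Philippians, ____, 1 Thessalonians": "Colossians",
}


def normalize(text):
    """Ignore case and extra whitespace when comparing answers."""
    return " ".join(text.split()).casefold()


def pick_questions(pool, count=2, rng=random):
    """Pick `count` distinct questions at random for one registration attempt."""
    return rng.sample(list(pool), count)


def check_answers(pool, asked, submitted):
    """True only if every question that was asked has a correct answer.

    `asked` must come from the server side (for example the session),
    so a bot cannot choose which questions it answers.
    """
    return all(
        normalize(submitted.get(question, "")) == normalize(pool[question])
        for question in asked
    )
```

With 30 questions and 2 asked per attempt there are 435 possible pairs, so a bot needs most of the pool catalogued before it can pass reliably, which is why rotating the pool matters.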
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

gkawa

Quote from: Arantor on October 22, 2018, 12:48:34 PM
Firstly, you seem to be assuming bots have reverse engineered a weakness in SMF's captcha. They haven't, they just OCR it because that's trivially doable even in JavaScript these days.
It's not that I assumed, I thought it was a possibility. It came to my mind because I thought they were not asking for the image at all. But it's there, action=verificationcode.   

Quote from: Arantor on October 22, 2018, 12:48:34 PM
Changing the URL doesn't slow them down much either, we've tried that. And yes, it is more complicated than the change you allude to. The bots know they have to go to the registration page because they know they need the session details (which can't be faked, they have to be retrieved).
Yeah. I was afraid it could be difficult and pretty easy to break.

Quote from: Arantor on October 22, 2018, 12:48:34 PM
Our first, best recommendation is still questions, but if you're not having luck, that presumes questions that are too easy. Bots can use Google too...
I'm under the impression that they have human interaction to solve them once. Tricky questions are discouraging for real users; I'd go back to that if there's no other solution. For the time being, admin validation is a burden but seems to be the safest. Eventually I'll have a clear idea of their networks and can move all that to the .htaccess, which will be a lot more efficient.

Quote from: Arantor on October 22, 2018, 12:48:34 PM
More info: Spam - my forum is flooded with spam, what can I do
I went through it. And I'm sure you're all right but I'm still hoping to find a better solution. Call me naive  ;D

What about timing control? I understand that they will eventually find out and adjust their bots. However, setting a time margin of 10 seconds (a human would rarely fill in that form in less time) could reduce their capacity to less than one tenth (I'm assuming that they use less than one second from GET to POST).
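The "less than one tenth" estimate can be checked with a small calculation. This is a back-of-the-envelope sketch that assumes a single bot working through attempts one after another. The function name and the 1-second and 10-second figures are taken from the post or made up for illustration. A bot running many sessions in parallel would not be slowed in the same proportion.

```python
def attempts_per_hour(seconds_per_attempt):
    """Upper bound on sequential registration attempts in one hour."""
    return 3600 / seconds_per_attempt


# Assumed figures from the post: about 1 second today, 10 seconds with a margin.
current = attempts_per_hour(1.0)
throttled = attempts_per_hour(10.0)

# Fraction of the original capacity that remains once the margin is enforced.
remaining_fraction = throttled / current
print(f"{current:.0f} -> {throttled:.0f} attempts/hour "
      f"({remaining_fraction:.0%} of the original rate)")
```

If the bot actually needs less than one second per attempt, the remaining fraction is smaller still, which matches the "less than one tenth" wording.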

Thank you very much for your answer.

Arantor

Timing control is a feature in 2.1. I did make it available as a mod for 2.0, but it is currently not available and not likely to return any time soon.

Kindred

I have
Stop Spammers
Bad Behavior + HttpBL
30+ questions (ask 2)

I have not had a single spambot in 3 years.
I had one human spammer make it through the questions. He got caught, and I changed the questions.


So, you say that you're looking for a "better solution" --- what is better than 3 years of no spam?

Shambles

Quote from: Kindred
--- what is better than 3 years of no spam?

10 years of no spam ;)

I'm with Kinders here. Devise the questions/answers pertinent to your forum content.

GigaWatt

Quote from: Sesquipedalian on October 22, 2018, 02:06:10 PM
The trick is to use questions that are easy for humans to answer but not easy for bots to guess using a brute force attack.

Like this ;). I have mine set up to ask only a single question (no captcha), not one bot registered since I followed that template ;).
"This is really a generic concept about human thinking - when faced with large tasks we're naturally inclined to try to break them down into a bunch of smaller tasks that together make up the whole."

"A 500 error loosely translates to the webserver saying, "WTF?"..."

gkawa

Thank you all for your answers. I'll give it a try.
