question about creating anti spam or bot questions

Started by MacGig, June 26, 2011, 09:35:50 PM

MacGig

what type of anti spam questions would work the best to keep bots out?

math questions like 4-3=1

or something like "how many months are there in a year?"

or something else?

Just wondering what sort of questions are best, what do others use? or does it really matter that much?

thoughts?

mashby

Maybe try this approach. Two simple questions. Change them every month or so (or when you start getting new spam members). Or try one of the anti-spam mods.
Always be a little kinder than necessary.
- James M. Barrie

xrunner

I made a paragraph with random words in sentences, and posted on my website as a stand-alone page.

I then made up a question like this:

Find the 4th word in the 6th sentence in the paragraph located here (link to the paragraph) and enter it as the answer to this question.

MacGig

That's a cool idea. I had a few ideas, just not sure what to go with. I know the admin section says to keep the questions fairly simple... I may do a combination of math questions or showing a word and asking them to spell/type it backwards... so CAT must be entered as TAC.. or something like that... since it looks like the questions rotate, I plan on making 20 or so at least... for starters.

Topman

Be aware that number answers may be entered as "six" or "6"

better to have a really simple daft question like

What is the third word in this question?   Answer "the"

Aleksi "Lex" Kilpinen

Math questions are bad really, we all know computers can count ;)
Something simple, that a living person can solve fast but a computer won't know, is best.
Slava
Ukraini!
"Before you allow people access to your forum, especially in an administrative position, you must be aware that that person can seriously damage your forum. Therefore, you should only allow people that you trust, implicitly, to have such access." -Douglas

MacGig

thanks for the tips. I agree, math may be a bad idea, unless I use words??

one + three = four


Illori

then i bet some of your users will type 4 and not four and complain to you about it
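
One way around the "4" versus "four" complaint is to accept several spellings of the same answer. Below is a minimal Python sketch of that idea. It is not how SMF itself checks answers (SMF is PHP); the function names and the accepted-answer set are hypothetical and only illustrate the approach.

```python
def normalize(answer: str) -> str:
    """Lowercase the answer and collapse runs of whitespace."""
    return " ".join(answer.lower().split())


def is_correct(given: str, accepted: set[str]) -> bool:
    """Return True if the normalized answer matches any accepted variant."""
    return normalize(given) in {normalize(a) for a in accepted}


# Hypothetical question: "one + three = ?"
ACCEPTED = {"4", "four"}

print(is_correct(" Four ", ACCEPTED))  # True
print(is_correct("4", ACCEPTED))       # True
print(is_correct("five", ACCEPTED))    # False
```

Listing every variant you are willing to accept keeps the check strict while sparing members the guesswork.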

Topman

Stick with "what is the capital of France?"  and the like. 
If they don't know that they are too thick to be useful members!

MacGig

yep I want to keep it simple... good ideas thanks. :)

MacGig

will a robots.txt file help keep bad bots off the site or is that useless?

Illori

that is only for search engine type bots and not all bots follow that file.

WantSome

I saw in another forum that someone had questions 'what colour do you get when you mix blue and yellow?'  Everyone knows the answer is green, right?

Red and blue = purple, etc.

I thought that was a clever way of getting around the math type questions.

On my own forum I have 2 questions with answers, and the third one is 'if you are human leave this block blank'.  Bots tend to try and fill something in there and so are rejected.  So far only one human or two spammers have gotten through.
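
The "leave this block blank" trick can be sketched in a few lines. This is only an illustration in Python, assuming the registration form arrives as a dictionary of field names to values; the field name `leave_blank` and the function name are made up for the example.

```python
def passes_honeypot(form: dict[str, str], trap_field: str = "leave_blank") -> bool:
    """Accept the submission only if the trap field is missing or empty."""
    return form.get(trap_field, "").strip() == ""


# Hypothetical submissions: a human leaves the trap empty, a bot fills it in.
human = {"username": "alice", "leave_blank": ""}
bot = {"username": "x9z", "leave_blank": "http://spam.example"}

print(passes_honeypot(human))  # True
print(passes_honeypot(bot))    # False
```

The check costs a real person nothing, which is why it pairs well with one or two ordinary questions.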

Topman

"Only one human or two spammers"  ????
Sounds like you are blocking everyone!

WantSome

I'm going crazy

I mean one or two human spammers!

BigMike

I have some questions to add to this discussion. I know the topic is old, I looked at a few other old ones before deciding to reply here.

My question refers to the psyche or behavior of the spammer.

At face value it seems that using verification questions is the de facto solution to preventing spam.

Consider this example: "What did Chuck Norris eat for lunch today?" Obviously only people who were with Chuck during his lunch would know the answer (dictionary attacks aside).

So my question is, generally speaking how many spammers are actual real humans going from registration form to registration form? Are there organizations that actually hire people to sit at home all day spamming? When I look at someone spamming a site by registering and posting links for XYZ shoe company... Either this was an automated script or it was an actual person, perhaps hired by XYZ shoe company to try to increase their search engine rankings. Are there actually companies who do this?

Certainly it's not possible to bypass the Anti Spam Question, right? It could be made impossible, for instance, if I had the question: "What is on the other side of the moon?" and the answer was exactly "There are three toaster ovens, a box of rocks, 11 rookie football cards of Jerry Rice, four nitro powered radio controlled cars, and pictures of Bob, Mark, Ted, and Frank", then wouldn't this effectively be the same as turning off registrations altogether? (since no one other than the admin would surely know the answer, and a dictionary attack would fail or take an eternity to pass)

I started thinking about all this after doing a lot of recent work at combating spam. I discovered there is a lot of spam prevention software available that I didn't know about. So why not just create questions that you'd only expect your members or target market to know, and then update them monthly? Wouldn't this be an end-all solution without needing any other levels of protection? (barring any known security exploits in the website)

What are your thoughts? I'm trying to learn what or who spammers are so I can better deal with them.

Thanks
Mike

Arantor

Quote: generally speaking how many spammers are actual real humans going from registration form to registration form?

It's typically fairly small. And the human spammer population is typically slow enough that you'll notice them. But don't rule out what I've come to term human-backed spammers, where a human does the work of breaking CAPTCHAs and then provides access to bots that way. People being paid $1 per 1000 CAPTCHAs (think India, China) are not unheard of, either.

Quote: Either this was an automated script or it was an actual person, perhaps hired by XYZ shoe company to try to increase their search engine rankings. Does this occur on a small scale? Large scale?

Evidence suggests this is somewhere in the middle. The account may have been a human posting, or it may have been a human registering and defeating any measures, or even just straight up a bot.

Quote: Certainly it's not possible to bypass the Anti Spam Question, right? It could be made impossible, for instance, if I had the question: "What is on the other side of the moon?" and the answer was exactly "There are three toaster ovens, a box of rocks, 11 rookie football cards of Jerry Rice, four nitro powered radio controlled cars, and pictures of Bob, Mark, Ted, and Frank", then wouldn't this effectively be the same as turning off registrations altogether? (since no one other than the admin would surely know the answer, and a dictionary attack would fail or take an eternity to pass)

Yes, that is true for all practical matters.

Quote: I discovered there is a lot of spam prevention software available that I didn't know about. So why not just create questions that you'd only expect your members or target market to know, and then update them monthly? Wouldn't this be an end-all solution without needing any other levels of protection? (barring any known security exploits in the website)

This is essentially what is being recommended, with one caveat: the question should be something that, when put into Google, does not give you the right answer as the first few links - e.g. someone just performing a search on the question to find the answer. There is evidence to support the notion that bots are doing this.

But yeah, essentially this is the way to do it.

BigMike

Arantor,

I have a lot of respect for you and I appreciate your input. This is all very interesting to say the least!!

I read in another post of yours that this is the single most important measure in your opinion.

I'm sure the bots are programmed to look for common key words or phrases, such as "What color is the sky" and they'd enter "blue".

We are fortunate enough to be in a specific industry. To test this, I have removed my visual verification and increased the verification questions to 4 (from a pool of one dozen) using relatively specific questions that only our audience and target users would know. Indeed a human spammer could visit our product website and search for information (and answers) to my questions but that is going to take some time. There is always the chance of scaring away genuine members, but our site is also unique in our marketplace so the demand should negate this for the most part. I'll keep a watchful eye on registrations moving forward.

I guess I never cared too much about this until just recently we started getting a flood of new users that weren't doing anything at all, just a bunch of stale accounts (probably someday later to spam with) (I soon discovered nearly all of them are in the StopForumSpam DB). I did have questions configured but they were pretty easy.

I set a monthly reminder on my phone to update our questions. Gonna see how this does!

Have a nice evening,
Michael

Arantor

Yeah, writing a good question is most of the battle on this one. Writing a math question is no defence at all, because it's solvable by machine, even if that's only as far as dumping it in Google (which is partly what leads me to believe bots are doing it; I've never been able to get hold of a bot).

Being in a specific niche is great because you can reasonably assume that your target audience would know the material and thus will probably be able to find the right answer with little effort - one of the better examples I have of this is a forum for a game called Elements, and the question was how many elements there are. The joy of that question is that not only is it something a player is going to know, and a non player isn't, there is sufficient ambiguity to distract bots with a false answer. To me, that's the ideal question but the confluence of events is not common, sadly.

One of the other things I tend to do is purge unfinished registrations - if a registration has come in and not validated their email in, say, 3 days, I just delete it because it's more than likely a spammer gone wrong, or a human who put in the wrong email address. Either way, it's not a loss to the forum as such (humans who cared would put in the right address in the first place)

BigMike

Quote from: Arantor on January 10, 2013, 09:06:16 PM
I have of this is a forum for a game called Elements, and the question was how many elements there are.
That is really cool. Now you'll know that every member at least knows the basics of what the community is all about in the first place :)

Quote from: Arantor on January 10, 2013, 09:06:16 PM
One of the other things I tend to do is purge unfinished registrations - if a registration has come in and not validated their email in, say, 3 days, I just delete it
This is a good thing also. I have a 15 day policy and I send out reminder emails. I think you're right, if the person genuinely wanted to sign up and she didn't get a confirmation email, then she would jump through hoops to get it sorted out.

Hang on a second...

Okay I just removed a bunch of old pending registrations. Felt really good haha

Mike

Arantor

I actually added auto-prune to the daily maintenance task on my stuff for that reason, setting a 3 day window for them.
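
For anyone wanting to picture what such a prune step does, here is a rough Python sketch. It is not the actual maintenance task; the `Registration` record, its fields, and the function name are assumptions made up for illustration, and only the 3 day window comes from this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Registration:
    username: str
    registered_at: datetime
    email_validated: bool


def prune_unvalidated(regs, now, max_age=timedelta(days=3)):
    """Keep validated accounts and unvalidated ones still inside the window."""
    return [r for r in regs if r.email_validated or now - r.registered_at <= max_age]


# Hypothetical data: fixed timestamps so the result is reproducible.
now = datetime(2013, 1, 10, 21, 0)
regs = [
    Registration("member", datetime(2012, 12, 1), True),    # validated, kept
    Registration("pending", datetime(2013, 1, 9), False),   # 1 day old, kept
    Registration("stale", datetime(2013, 1, 2), False),     # 8 days old, dropped
]

kept = prune_unvalidated(regs, now)
print([r.username for r in kept])  # ['member', 'pending']
```

Passing `now` in as a parameter instead of reading the clock inside the function makes the rule easy to test before letting it delete anything.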

ayuub

I used to have so many spammers registering every day. I used to use number questions, for example 5+1 or 10+2,

but since I changed my spam questions to written questions I have been getting 0 spammers.

Here is my sample
Q) How many days in July? A) 31
Q) How many days in a week?  A) 7
Q) What date is Christmas day?  A) 25 December

This worked for me well

Deprecated

Quote from: MacGig on July 05, 2011, 02:32:49 PM
will a robots.txt file help keep bad bots off the site or is that useless?
Robots.txt is honored only by *GOOD* robots. The bad ones either don't read it, or they might even analyze it for places they are supposed to stay out of -- and look there first!
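
To see why robots.txt is only advisory, note that the crawler itself decides whether to consult it. The sketch below uses Python's standard `urllib.robotparser` the way a well-behaved crawler might; the robots.txt contents, the bot name, and the `forum.example` URLs are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite crawler asks before fetching; nothing forces a bad bot to ask.
print(parser.can_fetch("PoliteBot", "https://forum.example/admin/"))     # False
print(parser.can_fetch("PoliteBot", "https://forum.example/index.php"))  # True
```

The server never enforces any of this, so a spam bot that skips the check simply requests whatever it likes.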

Deprecated

Quote from: WantSome on July 06, 2011, 02:59:09 AM
I saw in another forum that someone had questions 'what colour do you get when you mix blue and yellow?'  Everyone knows the answer is green, right?

Red and blue = purple, etc.

Bad question -- not everybody is an artist.

Deprecated

The best way to use this feature is to have a very large question pool. Even 20-30 questions is not unreasonable. Then present only maybe 3 questions at random out of your large pool.
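
Picking a few questions at random from a larger pool might look like the following Python sketch. This is an assumption about how one could do it in custom code, not a description of the forum software's internals; the pool contents and the function name are placeholders.

```python
import random


def pick_questions(pool, count=3, rng=random):
    """Return `count` distinct questions chosen at random from the pool."""
    return rng.sample(pool, count)


# Placeholder pool of 30 questions.
POOL = [f"Question {i}" for i in range(1, 31)]

asked = pick_questions(POOL)
print(asked)  # three different questions, varying from run to run
```

`random.sample` never repeats an item within one draw, so a registrant is not shown the same question twice on one form.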

Here is how spammers beat this challenge when you have only 3 questions and ask all of them: a human registers once, the answers get programmed into a bot, and the next thing you know you have hundreds of new members. That actually happened to me, and over a few months I had to delete hundreds of accounts.

Instead, build a big pool of questions, at least 20 and up to maybe 30, but ask only 3 for each registration. One human registration then reveals only 3 answers, and there are 30 x 29 x 28 = 24,360 possible ordered sets of three questions, so a single recorded set of answers rarely matches what the bot is asked next. A human would have to register many times over (likely dozens of times) before being asked all 30 questions.
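
The arithmetic here can be checked with a short Python sketch. The product 30 x 29 x 28 counts the ordered ways to draw three questions from thirty. How many registrations it takes to be shown every question at least once is a separate quantity, and under the assumption that each registration draws 3 distinct questions uniformly at random it works out to a few dozen, not thousands. The function name is made up for the example.

```python
from fractions import Fraction
from math import comb, perm


def expected_registrations(pool: int, asked: int) -> Fraction:
    """Expected number of registrations until every question has been seen.

    Uses inclusion-exclusion over the sets of questions that are still unseen,
    assuming each registration shows `asked` distinct questions chosen
    uniformly at random from `pool`.
    """
    total = comb(pool, asked)
    result = Fraction(0)
    for k in range(1, pool + 1):
        # Chance that one registration avoids a fixed set of k questions.
        p_missed = Fraction(comb(pool - k, asked), total)
        sign = 1 if k % 2 else -1
        result += sign * comb(pool, k) / (1 - p_missed)
    return result


print(perm(30, 3))  # 24360 ordered sets of three questions
print(comb(30, 3))  # 4060 sets if order is ignored
print(float(expected_registrations(30, 3)))
```

A large pool still helps, because any single recorded set of three answers matches only one of the 4,060 possible forms.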

As far as how to get the right answer when the correct answer could be maybe '6' or 'six,' ask them this way:

Q: "What do you put your hat on? (4 letters)" A: "head"

You provide the information you need to get the correct answer in the question.

"What doesn't fit" questions work well too. Q: "What doesn't fit? up, down, left, backwards, green." (obvious answer)

It is also good to use questions that are specific to your forum theme. For a sports forum, Q: "what doesn't fit? Kings, Dodgers, Giants ... Crips." I'm not a sports fan (obviously) but Crips is an urban gang. Any sports fan should know there's no team named that.

Here's another one: Q: "Type this word backwards: spammer" (obvious answer)
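
Questions of this kind are easy to generate in bulk for a large pool. A tiny Python sketch follows, with a made-up function name. Keep in mind the point made earlier in the thread that anything a machine can compute is a weak defence on its own, since a bot can reverse a string as easily as this code does.

```python
def backwards_question(word: str) -> tuple[str, str]:
    """Build a (question, answer) pair asking for the word typed backwards."""
    return f"Type this word backwards: {word}", word[::-1]


question, answer = backwards_question("spammer")
print(question)  # Type this word backwards: spammer
print(answer)    # remmaps
```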

I'll leave my post with this most useful idea: have 20-30 questions to ask, but ask only maybe 3 at a time. Bot programmers will cry in frustration trying to get all your questions written down. Bots don't have a chance since they need answers programmed in.

I still have a few spammers but now they are all English speaking humans.

At some point if your 'bot registrations increase, change all the questions. :)
