Bots, bots, bots

Started by MrMike, December 31, 2008, 04:19:56 PM

Charles Hill

I added this to the post form of zCommunity and my forum's post forms.  I also have recaptcha and a few other minor anti-bot thingamajiggers.  You think of any other stuff to kill those bots, let us know :)

MrMike

Quote from: Charles Hill on January 04, 2009, 09:12:52 PM
I added this to the post form of zCommunity and my forum's post forms.  I also have recaptcha and a few other minor anti-bot thingamajiggers.  You think of any other stuff to kill those bots, let us know
I have several more tricks up my sleeve, I just have to work out some of the details. :)

Let me know how the bot-buster code works, I'd be curious to see how effective it is on your forum. I've also got a list of ~28,000 IP addresses used by bots that I'm thinking of preloading into the "failed" table.

So far I haven't had a single one get through and they're trying like crazy. The "been here before" detection code seems to be working very well so far. :) In fact, in the time it took for me to type this message it caught 4 more, two of which had been previously flagged by IP, email, or name. LOL!!

Charles Hill

I'll keep thinking about new ways of tackling bots.  I assume you already have the minimum-time anti-bot check in place, where a form submitted in under a second is auto-rejected.  I also do funky stuff to all the form fields of my post forms for zCommunity.  I first add random letters/symbols to the end of them, then hash them using sha1.  This was very tricky to get working properly with spell checking / previewing / etc.  But before anyone asks for that code, it is designed for zCommunity's post form... which is totally different from SMF's.  Sorry :(
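For anyone wondering what that field trick might look like in general terms, here is a rough Python sketch. It is not zCommunity's code (that isn't posted in this thread). It assumes the trick is applied to the field *names*, that the random suffix is a per-form salt kept server-side (for example in the session), and the function names `obfuscate_fields` and `decode_submission` are made up for illustration.

```python
import hashlib
import secrets


def _hashed_name(name, salt):
    # sha1 of "real field name + random suffix", as a 40-character hex string
    return hashlib.sha1((name + salt).encode("utf-8")).hexdigest()


def obfuscate_fields(field_names):
    """Return (salt, {real name: hashed name}) for one rendering of the form."""
    salt = secrets.token_hex(8)  # random suffix; assumed to be stored server-side
    return salt, {name: _hashed_name(name, salt) for name in field_names}


def decode_submission(posted, field_names, salt):
    """Map hashed names in the posted data back to the real field names.

    Anything posted under a plain, un-hashed name is ignored, which is what
    trips up bots that fill in fields by their well-known names.
    """
    decoded = {}
    for name in field_names:
        hashed = _hashed_name(name, salt)
        if hashed in posted:
            decoded[name] = posted[hashed]
    return decoded
```

The form template would print the hashed names, and the handler would call `decode_submission` with the salt it saved when the form was rendered. Preview and spell-check pages would need the same salt, which is probably where the "very tricky" part comes in.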

MrMike

Quote from: Charles Hill on January 05, 2009, 12:30:46 AM
I'll keep thinking about new ways of tackling bots.  I assume you already have the minimum-time anti-bot check in place, where a form submitted in under a second is auto-rejected.
This sounds like it would be relatively straightforward to add/implement in the bot-buster code. Let me see what I can do.
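As a rough idea of how a minimum-time check could work (a Python sketch, not the bot-buster code itself): the form carries a hidden token holding the time it was rendered, signed so a bot can't simply forge an older timestamp. `SECRET_KEY`, `MIN_SECONDS`, `issue_token`, and `is_too_fast` are all hypothetical names, and the one-second threshold is just the figure mentioned above.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
MIN_SECONDS = 1.0          # reject forms that come back faster than this


def _sign(ts):
    return hmac.new(SECRET_KEY, ts.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(now=None):
    """Build the value for a hidden form field: '<render time>:<signature>'."""
    ts = str(time.time() if now is None else now)
    return f"{ts}:{_sign(ts)}"


def is_too_fast(token, now=None):
    """True if the token is missing, tampered with, or younger than MIN_SECONDS."""
    try:
        ts, sig = token.split(":", 1)
        if not hmac.compare_digest(sig, _sign(ts)):
            return True
        elapsed = (time.time() if now is None else now) - float(ts)
    except (ValueError, AttributeError, TypeError):
        return True
    return elapsed < MIN_SECONDS
```

Treating a missing or broken token the same as a too-fast submission is a design choice: a real browser always sends the hidden field back, so its absence is itself a hint that something automated is posting.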

WhiteEagle

Very interesting. I will be watching this to see how it continues to develop. I currently am running two SMF forums.
I fold for team 52482. Do you Fold@Home?
SMF powered sites: Leet Link LeetSpace.com

MrMike

Quote from: WhiteEagle on January 05, 2009, 05:52:01 PM
Very interesting. I will be watching this to see how it continues to develop. I currently am running two SMF forums.
From my own stats I'm catching about 100 a day. I also preloaded my "failed" table with ~2000 of the most active bot IPs I found on another site. I added some code to my personal installation to send me emails when it bounced a bot, added a bot, or caught a previously seen bot, and I'm just about to turn the email notifications off as they're flowing in like crazy. If you have problems with bots, this might work for you.

I'm going to post a zip with the code packaged up as an include() file so it'll be very easy to add, maybe a 10 to 15 minute job with very minor edits.

Night09

I've been thinking over the actual legitimate fields and looking at adding a set of random variations to them.

So it still sends the correct data on a real submission, but it will confuse bots that may be programmed to avoid hidden fields. For example, the name label could be written multiple ways and the template chooses a random one to show.

Username:
Desired username:
Forum username:
Forum name:

They all mean the same thing, but this will help stop a bot from reading the preset fields and ignoring the additional ones added by anti-bot programmers, since the forum can show a random field even for legitimate inputs.

I've not had time to look at any code because of a full system rebuild, but thought I'd share the idea in case you haven't thought of this already.
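The random-label idea could be sketched roughly like this in Python. This is only an illustration, not anything from the bot-buster package: the input name `user` and the function `render_username_field` are invented, and only the four label texts come from the list above.

```python
import random

# The four interchangeable labels suggested above
USERNAME_LABELS = [
    "Username:",
    "Desired username:",
    "Forum username:",
    "Forum name:",
]


def render_username_field(rng=random):
    """Return the HTML for the username row with a randomly chosen label.

    The input keeps the same name every time, so a real submission still
    sends the correct data; only the visible wording changes per page load.
    """
    label = rng.choice(USERNAME_LABELS)
    return f'<label for="user">{label}</label> <input type="text" name="user" id="user">'
```

A person reads whichever label shows up and types a username as usual, while a bot that matches on one fixed label text has a harder time telling the real field from a decoy.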

MrMike

Quote from: nightbre on January 05, 2009, 06:35:30 PM
I've been thinking over the actual legitimate fields and looking at adding a set of random variations to them.
The current code (see the attached ZIP) randomly selects among several fields (name, age, color, country, street), but you can change them or add as many as you want. I've added some extra ones to my installation. If you add them in the BotBuster.php file, make sure you add them in the Register.template.php file (and vice versa).

Also attached as "preload1.zip" are ~2000 high-activity bot IPs that you can preload the 'smf_failed_reg' table with.

Charles Hill

I haven't added the database tables aspect of what you have done yet... but I imagine it is one of a very few ways you can possibly stop the human spammers who take over where their bots got stopped - unless they're using a different computer/IP

MrMike

Quote from: Charles Hill on January 05, 2009, 07:01:02 PM
I haven't added the database tables aspect of what you have done yet... but I imagine it is one of a very few ways you can possibly stop the human spammers who take over where their bots got stopped - unless they're using a different computer/IP
The database table is a key part of this, as it allows the system to "remember" previous attempts by bots that are now using a different name, IP, or email. If you're not using it then I highly recommend adding it. It catches about 30% to 50% of the attempts I've been seeing.

Charles Hill

But wouldn't your other anti-bot techniques that don't involve database tables catch those bots anyways?  Unless, of course, a human took over where a bot left off and could beat your anti-bot techniques.

MrMike

Quote from: Charles Hill on January 05, 2009, 07:46:47 PM
But wouldn't your other anti-bot techniques that don't involve database tables catch those bots anyways?  Unless, of course, a human took over where a bot left off and could beat your anti-bot techniques.
Not necessarily, and that's the beauty of recording the attempts for later comparison. If a human tried to register from a previously seen IP or using a previously seen name or email address, they'd get booted too.

Seriously, if you aren't using the table then you're decreasing the efficiency of the code radically and potentially going to miss identifying a lot of the bots (and some of the humans as well). I've seen lots of them in the last day or so skipping some of the fields in the form...if not for the comparison table they might very well have gotten through.

Sabre™

Just a quick question slightly off topic, but yet, still on topic lol

With those IP lists floating around.
Say I created bots, and one was sent from my IP and got logged (or however it works). If my IP is dynamic, and the next receiver of my old IP tries to join, they will not be able to, right?

Thanks for your time and knowledge :)
Do NOT give admin and/or ftp details to just anybody, see if they are trust worthy first!!  Do your homework ;)


Charles Hill

Quote from: Sabre™ on January 05, 2009, 10:00:08 PM
Just a quick question slightly off topic, but yet, still on topic lol

With those IP lists floating around.
Say I created bots, and one was sent from my IP and got logged (or however it works). If my IP is dynamic, and the next receiver of my old IP tries to join, they will not be able to, right?

Thanks for your time and knowledge :)

That's correct. That's the problem I am seeing with the whole idea. You should probably put a time limit on the IP storage at the very least.  Something like 36 hours sounds reasonable.
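A time limit like that could be as small as one periodic DELETE. The Python sketch below assumes a SQLite table called `failed_reg` with a numeric `seen_at` column holding a Unix timestamp. Both names are invented for the example, and the 36-hour default is just the figure suggested above.

```python
import sqlite3

MAX_AGE_SECONDS = 36 * 3600  # 36 hours


def purge_old_entries(db, now, max_age=MAX_AGE_SECONDS):
    """Delete logged attempts older than max_age seconds; return how many were removed."""
    cutoff = now - max_age
    cur = db.execute("DELETE FROM failed_reg WHERE seen_at < ?", (cutoff,))
    db.commit()
    return cur.rowcount
```

Run from a scheduled task or on every Nth registration attempt, this keeps recycled dynamic IPs from staying blocked forever. A longer `max_age` trades more false positives for a longer memory of repeat offenders.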

MrMike

Quote from: Sabre™ on January 05, 2009, 10:00:08 PM
Say I created bots, and one was sent from my IP and got logged (or however it works). If my IP is dynamic, and the next receiver of my old IP tries to join, they will not be able to, right?

Technically yes.....but look at it this way: there are 4,294,967,296 (4.3 billion) possible IP addresses, of which 2,147,483,648 (2.1 billion) are normally available to folks like you and me. (And bots.)

2.1 billion is a lot. If you have 100,000 IPs stored and you had 100,000 users that tried to sign up tomorrow, the odds of a collision would still be vanishingly small. If you actually got one I'd tell you to run out and buy a lotto ticket immediately, because your luck is running red hot. :)

Yes, in theory if you collected enough IP addresses you could/would eventually start running across "real" users who ended up with one, but the odds are against it.

I thought about this early on and that's why I log the date and time in the "smf_failed_reg" table....my intent was to clear the oldest IPs every few months.

Honestly though, you could probably run it for years before you'd be statistically likely to ban the IP of someone who actually wants to register on your forum. Unless you're getting millions and millions of signups it's probably not going to be an issue.
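For anyone who wants to plug their own numbers in, here is a small Python sketch of the arithmetic. It assumes every signup comes from an address drawn uniformly at random from the 2.1 billion pool, which real IP allocation certainly is not, so treat the results as a rough guide only. The function names are made up.

```python
POOL = 2_147_483_648  # the "2.1 billion" usable addresses quoted above
STORED = 100_000      # IPs sitting in the failed table
SIGNUPS = 100_000     # legitimate users trying to register


def per_signup_probability(stored=STORED, pool=POOL):
    """Chance that one particular signup lands on a stored IP."""
    return stored / pool


def expected_hits(signups=SIGNUPS, stored=STORED, pool=POOL):
    """Average number of signups that would land on a stored IP."""
    return signups * stored / pool


def prob_at_least_one(signups=SIGNUPS, stored=STORED, pool=POOL):
    """Chance that at least one of the signups lands on a stored IP."""
    return 1 - (1 - stored / pool) ** signups
```

Under that assumption a single signup has roughly a 1 in 21,500 chance of hitting a stored IP. Across a full 100,000 signups a handful of hits would be expected, while a forum seeing a few hundred signups would expect well under one.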

Sabre™

So what you're saying is that there is a chance an innocent could be affected?
Oh my!  :o

pmsl..  just playin ;)
Of those 2.1 billion IPs, that's levelled out by ranges, so it narrows the window dramatically.
But enough to grow old waiting for a possible conflict lol
I see pros n cons (like anything), but luckily there are other methods to adopt as well.
Keep up the good work guys, and thanks for your time. :)

regards


Night09

Even if a dynamic IP was allocated to someone legit by accident after a bot, they would only need to log off the net and release their data (dns cache) and get allocated a new IP, which would likely not be one used by a bot.

Also, dynamic IP ranges will be getting smaller as more people use static IPs with always-on connections. If someone did have a problem, they could still email the forum's admin if they were totally stuck to get the IP removed, but I doubt the bot users would be relying on dynamic IPs to be active. They're more likely to be on proxies, since they don't want to tell you their real IP anyway if they can.

Atomic Blaze

Quote from: Sabre™ on January 05, 2009, 11:57:38 PM
So what you're saying is that there is a chance an innocent could be affected?
Oh my!  :o

pmsl..  just playin ;)
Of those 2.1 billion IPs, that's levelled out by ranges, so it narrows the window dramatically.
But enough to grow old waiting for a possible conflict lol
I see pros n cons (like anything), but luckily there are other methods to adopt as well.
Keep up the good work guys, and thanks for your time. :)

regards

According to my logic there are 4,228,250,625 possible IP combinations, chance that a bot and a real user might be sharing the same IP, 1/4.2 billion.
Trick number one, looketh over there. Doth endeth the trick.


Sabre™

Quote from: Atomic Blaze
According to my logic there are 4,228,250,625 possible IP combinations

Thank you Spock, we've established, there is a chance. The cup is half full, not only half empty ;)

lol

Anyways, back on topic yeah?


MrMike

Quote from: Atomic Blaze on January 06, 2009, 07:36:06 AM
According to my logic there are 4,228,250,625 possible IP combinations, chance that a bot and a real user might be sharing the same IP, 1/4.2 billion.

And the chance that they'd stumble across my forum and want to sign up: 1/600 billion, lol. ;)
