Massive bot attack

Started by spiros, February 11, 2025, 01:45:02 AM

spiros

For over a week I have been getting massive traffic from bots spread across multiple IP ranges, which makes them difficult to ban via CSF, and at times they slow the forum down significantly. For example, I would get 2000+ concurrent visitors where the normal number is about 100. Any advice on that would be appreciated.
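
One way to see which ranges the traffic clusters in, before banning anything, is to group the client IPs from the access log by network. The following is only a minimal sketch: the function name `top_networks`, the /24 grouping, and the sample addresses (documentation ranges) are all assumptions for illustration, not something from this thread.

```python
import ipaddress
from collections import Counter

def top_networks(ips, prefix=24, limit=5):
    """Count hits per IPv4 network of the given prefix length (hypothetical helper)."""
    counts = Counter()
    for ip in ips:
        if ipaddress.ip_address(ip).version != 4:
            continue  # this sketch only groups IPv4 addresses
        # strict=False lets us pass a host address and get its enclosing network
        net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(net)] += 1
    return counts.most_common(limit)

# Made-up sample data; in practice these would come from the web access log.
sample_ips = ["203.0.113.5", "203.0.113.77", "203.0.113.200", "198.51.100.9"]
print(top_networks(sample_ips))
```

Networks that dominate the list are candidates for a single CIDR ban rather than many per-IP bans, though it is worth checking first that no real visitors come from them.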

Kindred

Block in htaccess by bot type
Glory to Ukraine

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

shawnb61

If it helps, feel free to borrow from my robots.txt & .htaccess.

Since I change them regularly, I now keep them up on GitHub here:
https://github.com/sbulen/SMF-bot-hygiene

They may not suit your needs; consider it a starter pack. 

Yes, things have gotten pretty bad with the bots...  It's the wild wild west out there... 
A question worth asking is born in experience & driven by necessity. - Fripp

vbgamer45

That's great, Shawn.

I ended up blocking all of China and Russia yesterday on a hardware firewall; it was just too much...

Also, note you can find lists with apache/other webserver configs to block on a country level at https://www.ip2location.com/free/visitor-blocker
Community Suite for SMF - Grow your forum with SMF, Gallery,Store,Classifieds,Downloads,more!

SMFHacks.com - Paid Modifications for SMF

Mods:
EzPortal - Portal System for SMF
SMF Gallery Pro
SMF Store SMF Classifieds Ad Seller Pro

shawnb61

I was getting smothered by activity from some US and European networks; I am assuming these are ISPs used by unscrupulous corporate entities trying to hide their identities.  I.e., plagiarism engines (some call them AI bots)...

I ran a quick script to check, & confirmed none of my users were in their IP ranges. 
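
The script itself is not shown in this thread. A check of that kind could look roughly like the sketch below, which assumes you have already exported member IPs and the candidate CIDR ranges as plain lists; the function name and the sample addresses are hypothetical.

```python
import ipaddress

def members_in_ranges(member_ips, cidr_ranges):
    """Return {member_ip: [matching ranges]} for members inside any candidate range."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidr_ranges]
    hits = {}
    for ip in member_ips:
        addr = ipaddress.ip_address(ip)
        matched = [str(net) for net in networks if addr in net]
        if matched:
            hits[ip] = matched
    return hits

# Made-up sample data using documentation address ranges.
member_ips = ["192.0.2.10", "198.51.100.20"]
candidate_bans = ["198.51.100.0/24", "203.0.113.0/24"]
print(members_in_ranges(member_ips, candidate_bans))
```

An empty result suggests the ranges can be banned without locking out existing members, at least as far as their recorded IPs go.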

Once I confirmed that, I basically cut off LOTs of IP ranges for FastPlanet, TrafficTransit, Fine Group & a couple others.

I didn't want to cut off Russia, because frankly, I have a fair amount of actual Russian members in my forum.

China is odd...  A lot of the funky Alibaba/Huawei activity actually reports out as HK, not China.  I've blocked a lot of HK IPs. 

OTOH, I have about 3 or 4 valid users who participate in forum discussions with actual Chinese IP addresses, not HK... 

spiros

Thank you all, guys. I switched to DirectAdmin with an nginx_apache server, so some things are not as simple as editing .htaccess anymore.

@shawnb61 methinks Googlebot does not accept Crawl-delay, and the Crawl Rate Limiter Tool in Search Console has been deprecated.


shawnb61

Quote from: spiros on February 13, 2025, 01:31:44 PM
@shawnb61 methinks Googlebot does not accept Crawl-delay, and the Crawl Rate Limiter Tool in Search Console has been deprecated.

Yep: https://www.simplemachines.org/community/index.php?topic=590038.0

I do see that they honor the disallows though.  I've been checking what they're linking to, and altering robots.txt accordingly.

Disallowing msg-level links helps a lot, I believe.  It's a waste anyway; they end up loading the same page over & over.

Sir Osis of Liver

Admin > Server Settings > Disable hostname lookups

Check that.  If bot traffic is heavy enough, hostname lookups can crash your forum.

When in Emor, do as the Snamors.
                              - D. Lister

Kindred

Shawn,

question on your htaccess --

You use
BrowserMatchNoCase 01h4x.com bad_bot
My htaccess is using
SetEnvIfNoCase User-Agent "01h4x.com" bad_bot

Is there an appreciable difference in the approach?

My MySQL usage has tripled in the last few days (on a 2.0.19 forum), but with no influx of actual users or posts, and the "bots visiting" count in Who's Online seems constant at 125+ at any point in any day.

shawnb61

Yes, they're the same.  You're good.

It'd help to have a sense of the activity they're doing...  If you see a lot of message-level GETs in the web access logs, it may help to disallow those in robots.txt; for me, it really did.
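
To get that sense from the logs, a rough sketch like the following can count message-level GETs per user agent. It assumes the Apache/nginx "combined" log format and standard SMF query-string URLs of the form `index.php?topic=123.msg456`; the regexes, function name, and sample lines are illustrative assumptions and would need adjusting for a custom log format or search-engine-friendly URLs.

```python
import re
from collections import Counter

# Assumed "combined" log format: request, status, size, referer, user agent.
LOG_LINE = re.compile(
    r'"GET (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
# Assumed SMF message-level link, e.g. ?topic=123.msg456
MSG_LINK = re.compile(r'[?;&]topic=\d+\.msg\d+')

def msg_hits_by_agent(lines):
    """Count GET requests for message-level links, keyed by user agent."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and MSG_LINK.search(match.group("path")):
            counts[match.group("agent")] += 1
    return counts

# Made-up sample lines; in practice, iterate over the access log file.
sample_log = [
    '203.0.113.5 - - [11/Feb/2025:01:45:02 +0000] "GET /index.php?topic=123.msg456 HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '203.0.113.6 - - [11/Feb/2025:01:45:03 +0000] "GET /index.php?topic=123.msg457 HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '203.0.113.7 - - [11/Feb/2025:01:45:04 +0000] "GET /index.php?topic=123.0 HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '198.51.100.9 - - [11/Feb/2025:01:45:05 +0000] "GET /index.php?topic=99.msg100 HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]
for agent, hits in msg_hits_by_agent(sample_log).most_common():
    print(hits, agent)
```

If a handful of agents account for most of the message-level hits, that supports disallowing those links in robots.txt for well-behaved crawlers and blocking the rest by user agent or IP.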

If the query report from your host shows a lot of session writes that cumulatively add up to a significant number, you can make a 2.0 version of this in the write session function, near the top:
Quote
   // Don't bother writing the session if cookies are disabled; no way to retrieve it later
   if (empty($_COOKIE))
      return true;

Finally, I added a lot of IP bans for groups like FastPlanet, TransitTraffic, Fine Group, etc.  I couldn't find any users in those IP ranges, and they were hitting me very hard.

For further clues that might help: https://github.com/sbulen/SMF-bot-hygiene

spiros

Despite having

Disallow: /forum/index.php?action=printpage
https://www.translatum.gr/robots.txt

I get many bots visiting print pages; is there a way to eliminate that? Perhaps some sort of JavaScript to create the print link?

Also, is this syntax acceptable? I.e. multiple user agents and at the end the Disallow.

User-agent: Zeus
User-agent: ZumBot
Disallow: /
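
On the syntax question: the robots.txt standard (RFC 9309) allows several User-agent lines to share one group of rules, so the form above is valid. As a quick sanity check, the sketch below feeds those three lines to Python's standard-library parser; it only shows how that one parser reads them, and the `allowed` helper and test URL are assumptions for illustration. Individual crawlers may behave differently, and badly behaved bots ignore robots.txt altogether.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: Zeus
User-agent: ZumBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

def allowed(agent, url="/forum/index.php"):
    """Return True if this parser would let the given agent fetch the URL."""
    return parser.can_fetch(agent, url)

for agent in ("Zeus", "ZumBot", "Googlebot"):
    print(agent, allowed(agent))
```

Both listed agents are treated as disallowed by this parser, while an agent not named in the group is unaffected.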
