PHP Session IDs "exploded" my database to 1 Gigabyte, exceeding hosters limits..

Started by Medizinmann99, April 27, 2024, 09:09:10 AM


Medizinmann99

Quote from: shawnb61 on June 25, 2024, 12:04:01 PM
Quote from: Medizinmann99 on June 25, 2024, 10:24:54 AM
If you can recommend any additional bots that are currently creating useless traffic and that I could add, please let me know, thanks!

See reply #10 above for a list.

Or...  I built my list out from this resource:
https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/blob/master/_generator_lists/bad-user-agents.list

Thanks :-)

In my current .htaccess file, I added only the few bots mentioned above, but since I added them, guest numbers have dropped from about 400 to around 10, so I guess I must have blocked exactly the bots which caused the activity :-)

Hm - but it is good to know that I can still add many more to my .htaccess file, if the problem shows up again!

Thanks so far!!

Medizinmann99

Short update, since I modified the .htaccess file as I mentioned, the number of guests (or bots) has never exceeded 25.

I also set my hoster's Cloudflare security level to "high". However, a few days ago, even setting it to "I am under attack!" had ZERO influence on the "guest bots" when there were around 400 in my forum all the time, so this is most likely not the reason for the change.

The numbers dropped rapidly as soon as I modified the .htaccess file, so it must indeed be the robots which I added to the file. The robots to block were, by the way, recommended to me by an AI, which I asked which robots are currently creating the most problems for a forum running Simple Machines Forum version 2.0.19. Seems like the AI was spot on :-)

Steve

I'll mark this solved then. If you have any further issues with this particular problem, by all means, mark it unsolved and let us know.
My pet rock is not feeling well. I think it's stoned.

Medizinmann99

@Steve, ok, thanks.

In the last few days, a few hundred bots showed up again, but their behaviour has entirely changed. They stay in the forum only for a very short time, don't create much traffic at all, then vanish again. The "permanent" number of guests is still always around 20 or so.

I guess this means that these hundreds of bots must be ones which I did not block in the .htaccess file, but they behave in a much more "civilized" way: it seems they "just" grab the latest forum postings and disappear, while the bots I blocked in the .htaccess file stayed and downloaded and downloaded and downloaded.

So I guess the AI was right about which bots are the most problematic at the moment.
The "additional bots" are seemingly unproblematic, so I don't see a reason for any modifications to the .htaccess file.

Well, strange. I will keep watching.

Perhaps it will soon be time for the simplemachines.org developers to implement captchas for suspected bots pretending to be (human) guests, as I guess this problem will intensify.

Kindred

You can't require a captcha to browse the forum... that would block normal users as well
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

Arantor

And search engines, assuming you want people to find your forum...
Holder of controversial views, all of which my own.


shawnb61

Bots aren't static.  There are constantly new ones, pretty much every day.  Some choose to crawl you & many don't - they're looking elsewhere (for the time being...).  Plus, existing ones change their behavior over time.

So it's a never-ending game of whack-a-mole...  You monitor MySQL time & CPU, and when it spikes, analyze the logs to see if there's a new one you need to block.
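
For anyone who wants to try the "analyze the logs" step, here is a minimal sketch in Python. It assumes the web server writes the common "combined" access log format; the function name top_user_agents and the regular expression are my own illustration, not something from SMF or from the posts above. It simply counts hits per user agent so a new heavy crawler stands out.

```python
import re
from collections import Counter

# Assumed layout: Apache/nginx "combined" log format, where the user agent
# is the last double-quoted field on the line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def top_user_agents(lines, limit=10):
    """Return the most frequent user agents as (agent, hits) pairs."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # lines that do not fit the assumed format are skipped
            counts[match.group("agent")] += 1
    return counts.most_common(limit)
```

To run it against a real log, open the file and pass the file object as lines, e.g. `top_user_agents(open("access.log", errors="replace"))`; the path is a placeholder for wherever your hoster keeps the log.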

The ones that really bug me are the "standard vulnerability script" bots...  These ones look for things like various WordPress files, .env files, .conf, .ini, xmlrpc.php, port scanning tools, etc.  There's LOTS of them.  They check you out & move on, never to be seen again.  On any given day, I get at least 30 of these, all different IPs.  They usually vary their useragents, using common browser useragents.

They're like flies.

I don't know why they bug me so much, I guess I'm concerned some day one of 'em will find a real chink in the armor. 

I'd love to, in real time, detect calls for their common requests, e.g., xmlrpc.php or wlmanifest.xml, and put a 48 hour ban on that IP address.  I wonder how much CPU that would save...
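
A rough sketch of that idea in Python, purely as an illustration: the class name ProbeBanList, the in-memory dictionary and the short probe list are all assumptions of mine, and a real setup would more likely sit in the web server or firewall and persist its bans. It only shows the logic: a request for a known probe path bans the IP for 48 hours, and banned IPs are refused until the ban expires.

```python
import time

BAN_SECONDS = 48 * 60 * 60  # the 48-hour ban suggested above

# Illustrative list only: files a normal forum visitor has no reason to request.
PROBE_PATHS = ("xmlrpc.php", "wlmanifest.xml", ".env")


class ProbeBanList:
    """In-memory map of IP address -> time at which its ban expires."""

    def __init__(self, ban_seconds=BAN_SECONDS, clock=time.time):
        self.ban_seconds = ban_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self.expires = {}

    def is_banned(self, ip):
        expiry = self.expires.get(ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self.expires[ip]  # the ban has run out
            return False
        return True

    def check_request(self, ip, path):
        """Return True if the request should be served, False if refused."""
        if self.is_banned(ip):
            return False
        bare_path = path.split("?", 1)[0].lower()  # ignore any query string
        if bare_path.endswith(PROBE_PATHS):
            self.expires[ip] = self.clock() + self.ban_seconds
            return False
        return True
```

Whether this actually saves CPU would depend on where the check runs; refusing the request before PHP and MySQL are involved is the part that matters.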

Here's one such IP address:  https://scamalytics.com/ip/62.146.233.30

Some of that guy's activity:
[Attachment not available]

And he's not the only one:
[Attachment not available]
A question worth asking is born in experience & driven by necessity. - Fripp

shawnb61

Another useragent I've started blocking is python-requests.

Always VERY suspicious behavior.  Lots of IPs; I think folks are sharing some scripts somewhere.
A question worth asking is born in experience & driven by necessity. - Fripp

Arantor

Python-requests just means they're using the requests library, basically Python's version of Guzzlehttp, and not bothering to set the user agent.
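
To make that concrete: when the user agent is left unset, the requests library sends one of the form python-requests/<version>. A small Python sketch of a check for it follows; the function name is made up for illustration, and anyone who does set a custom user agent will not be caught by it.

```python
import re

# Default User-Agent of the requests library: "python-requests/<version>".
DEFAULT_REQUESTS_AGENT = re.compile(r"^python-requests/\d", re.IGNORECASE)


def is_default_requests_agent(user_agent):
    """True if the header looks like the unmodified requests default."""
    return bool(DEFAULT_REQUESTS_AGENT.match(user_agent or ""))
```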
Holder of controversial views, all of which my own.

