Bingbot

Started by SomeoneElse, May 20, 2020, 04:57:20 AM

SomeoneElse

Looking at the web server logs, bingbot 2.0 makes over ten times the visits that the Googlebot does - and that's after telling it to reduce visits to once every ten seconds.
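For anyone wanting to reproduce this kind of comparison, here is a rough sketch of counting hits per crawler from an access log. It assumes the common "combined" log format, where the User-Agent is the last quoted field on each line. The function name `count_bot_hits` and the sample lines are made up for illustration and are not taken from the logs discussed here.

```python
import re
from collections import Counter

# In the combined log format the User-Agent is the last double-quoted field.
_LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def count_bot_hits(lines, bots=("bingbot", "googlebot")):
    """Count log lines whose User-Agent mentions each bot name (case-insensitive)."""
    hits = Counter()
    for line in lines:
        match = _LAST_QUOTED.search(line)
        if not match:
            continue  # skip lines that do not end in a quoted field
        agent = match.group(1).lower()
        for bot in bots:
            if bot in agent:
                hits[bot] += 1
    return hits


# Made-up sample lines, for illustration only (documentation IP ranges).
sample = [
    '203.0.113.5 - - [20/May/2020:04:57:20 +0000] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.6 - - [20/May/2020:04:57:30 +0000] "GET /index.php?topic=1 HTTP/1.1" 200 733 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '198.51.100.7 - - [20/May/2020:04:58:01 +0000] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.44 - - [20/May/2020:04:58:10 +0000] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/76.0"',
]
bot_hits = count_bot_hits(sample)
print(bot_hits)
```

This matches on the User-Agent string alone, so anything pretending to be bingbot gets counted as bingbot too.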

But we get a tiny fraction of the traffic from Bing compared to Google. Much less than a twentieth. If I added up all of the Google domains, Bing would be a couple of percent of its total. (We're primarily aimed at a country where Bing trails Google by a long way in terms of people using it.)

Is it really this clueless in terms of crawling or do plenty of people fake being it?

For this level of traffic, I'm getting tempted just to ban it.



shawnb61

Yes, bingbot goes crazy on forums sometimes.  Unbelievable.  They really need to fix that, but haven't, in years.

One solution outlined here:
https://www.simplemachines.org/community/index.php?topic=571935.msg4047207#msg4047207

In that thread, someone reported that bingbot honored robots.txt restrictions.  I didn't see that myself.  But if you could address it via robots.txt to slow them down, that would be preferred, as it would still remain a valid search engine. 

I completely cut them off with an update to .htaccess:
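# Tag any request whose User-Agent contains "bingbot" (case-insensitive)
SetEnvIfNoCase User-Agent bingbot bad_bot

# Apache 2.2-style access control: allow everyone, then deny tagged requests
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>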


Of course, we no longer show up on their searches, but they no longer cause CPU spikes.
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Sir Osis of Liver

Tried robots.txt on a forum that was getting hammered, didn't even slow it down.  Have to use .htaccess to kill it.
Ashes and diamonds, foe and friend,
 we were all equal in the end.

                                     - R. Waters

Arantor

If robots.txt doesn't fix it, it's because of impersonations. Real Bingbot does respect it.

Impersonation is not uncommon.
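One way to tell the real crawler from an impersonator is a reverse-then-forward DNS check on the visiting IP. The sketch below is a minimal version of that idea, not an official tool. It assumes genuine Bingbot addresses reverse-resolve to a hostname ending in `.search.msn.com` (my understanding of Microsoft's published guidance, so verify it before relying on it). The function name `is_verified_bingbot` and its two injectable lookup parameters are hypothetical, added so the logic can be exercised without touching the network.

```python
import socket

# Assumption: genuine Bingbot hosts reverse-resolve to names under this suffix.
BING_SUFFIX = ".search.msn.com"


def is_verified_bingbot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip passes a reverse-then-forward DNS check for Bingbot."""
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, IPv4 addresses)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        if not host.lower().rstrip(".").endswith(BING_SUFFIX):
            return False
        # Forward-confirm: the claimed hostname must resolve back to the same IP,
        # otherwise anyone controlling their own reverse DNS could fake the name.
        return ip in forward_lookup(host)
    except OSError:
        return False  # lookup failed, so treat the visitor as unverified
```

Calling `is_verified_bingbot(ip)` with an address pulled from the logs performs live DNS lookups. The default forward lookup only handles IPv4.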

shawnb61

I checked IP addresses at the time & they were definitely Microsoft IPs.

I even tried using Bing webmaster tools, and they (somewhat) honored that.  But BWT would consider those changes temporary & periodically reset them, discarding my input, and the problem would return...  That was when I gave up on them. 

Again, this was years ago & maybe they've gotten better since.

Bing recently announced that, if you have a bingbot section in robots.txt, they ignore the defaults...  I.e., it may be HOW the entries are configured in robots.txt.
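To illustrate the point about section handling, here is a small sketch using Python's standard `urllib.robotparser`. It is only a stand-in: Bing's own parser may behave differently, and the robots.txt content below is a made-up example, not anyone's real configuration. It shows how a crawler that picks the most specific matching section would never see a `Crawl-delay` that appears only under `User-agent: *`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the delay is set only in the catch-all section.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/

User-agent: bingbot
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler with no section of its own falls back to the * section.
default_delay = parser.crawl_delay("Googlebot")

# bingbot has its own section, so the * section is not consulted for it.
bingbot_delay = parser.crawl_delay("bingbot")
bingbot_private = parser.can_fetch("bingbot", "/private/page")

print("default delay:", default_delay)
print("bingbot delay:", bingbot_delay)
print("bingbot may fetch /private/page:", bingbot_private)
```

If a crawler works this way, the delay and any general Disallow lines would need to be repeated inside the bingbot section to have any effect on it.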

I'm too nervous to remove the restrictions & experiment.  I spent days on it last time, to no avail, while the forum was getting hammered...

Sir Osis of Liver

Found my notes on this, it was msnbot, not bingbot.  Wasn't bingbot supposed to replace msnbot?  As of a couple months ago, both were still running.

Arantor

It was, but many more of the msnbot entries are fake compared to bingbot, at least in my logs.
