SMF Support > SMF 1.1.x Support

How to block baidu spider?

peps1:
I know Baiduspider is a legitimate Chinese search engine bot, but I have no Chinese content and no desire for Chinese traffic, and there are hundreds of the little buggers bleeding my limited bandwidth!

The only mod I can find to block them is for SMF 2.

Please help!

DavidCT:
In my experience Baiduspider obeys robots.txt. You can deny it either by name or globally; it seems to respect the * wildcard.

robots.txt

--- Code: ---#Baiduspider
User-agent: Baiduspider
Disallow: /

#Others
User-agent: *
Disallow: /

--- End code ---

Some bots refuse to obey robots.txt, if they even bother to check it. You can block those through your .htaccess file.
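
For bots that ignore robots.txt, a rule along these lines in the .htaccess file in the forum root is one way to refuse them by user agent. This is only a sketch: it assumes Apache 2.2-style access directives with mod_setenvif enabled, and bad_bot is just an illustrative variable name.

--- Code: ---# Flag any request whose User-Agent contains "Baiduspider" (case-insensitive)
SetEnvIfNoCase User-Agent "Baiduspider" bad_bot

# Allow everyone except the flagged requests (Apache 2.2 syntax)
Order Allow,Deny
Allow from all
Deny from env=bad_bot

--- End code ---

On Apache 2.4 the Order/Allow/Deny lines would be replaced by Require directives (Require all granted and Require not env bad_bot inside a <RequireAll> block). Refused requests get a 403 response instead of a full page, which is where the bandwidth saving comes from.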

peps1:
Thanks DavidCT, I just slap this in the root, right?

Also, will it block only Baiduspider, or every bot?

DavidCT:
Yes, robots.txt goes in the root, so you'd see it if you visited yourdomain.com/robots.txt.

The * wildcard will block any bot that respects * and isn't specifically mentioned elsewhere in the file. If you haven't defined Googlebot, Slurp, MSNBOT, etc., and want those to crawl, you need to allow them by giving each its own entry with the / removed from Disallow:

--- Code: ---User-agent: Googlebot
Disallow:

--- End code ---

peps1:
The Baidu spiders are still crawling the forum... about 50 every couple of hours.

Is there a way to just block the IP range 220.181.7.***?
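
A range like that can be refused in the .htaccess file in the forum root. A minimal sketch, assuming Apache 2.2-style access directives; the range is the one quoted in the question, and whether it covers all of Baidu's crawlers is not confirmed here.

--- Code: ---# Refuse every address from 220.181.7.0 to 220.181.7.255
Order Allow,Deny
Allow from all
Deny from 220.181.7

--- End code ---

Apache 2.4 would express the same thing with Require all granted and Require not ip 220.181.7.0/24 inside a <RequireAll> block.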
