I have recently been noticing Google bots on my site. Even if I ban the IPs, they still get in.
I'm wondering if this is bad or good. Can this slow my forum down?
And why do they do this? Checking traffic? Gathering information?
crawl-66-249-66-112.googlebot.com
IP: 66.249.66.112
There are others too, and they're routing through NY to California.
Any knowledge on this is appreciated.
aloha
Google has thousands of different IPs that it uses, in different ranges.
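Since the crawler addresses change, banning single IPs tends not to keep up. If the aim is to tell a real Googlebot from something only claiming to be one, Google documents a reverse-then-forward DNS check. Here is a minimal Python sketch of that idea. The function names are made up for illustration, and the accepted suffixes (.googlebot.com, .google.com) are an assumption based on the hostname in the first post and Google's published guidance.

```python
import socket

# Assumption: hostnames of genuine Google crawlers end in one of these.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def hostname_is_google(hostname):
    """Return True if the hostname sits under a Google crawler domain."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_CRAWLER_SUFFIXES)


def verify_googlebot(ip):
    """Reverse-resolve the IP, check the domain, then forward-resolve the
    hostname and confirm it maps back to the same IP. Needs network access."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:
        return False
    if not hostname_is_google(hostname):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return False
    return ip in forward_ips
```

For the address in the first post this would be called as verify_googlebot("66.249.66.112"). The suffix check alone is not enough, because anyone can set a reverse DNS name; the forward lookup is what closes that gap.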
If you want to stop Google, rather than IP banning, you should use a robots.txt file.
Also, I don't see much point in trying to stop them either, unless it's a private site or something. When they're crawling the site, they're adding your pages into the Google index, so people can search for them.
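To see what a given robots.txt would actually allow before relying on it, Python's standard urllib.robotparser can evaluate the rules offline. This is only a sketch: the /private/ rule and the example.com URLs are placeholders, not anything from this thread, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules: block every crawler from /private/ only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
# Blocked area versus the rest of the site.
print(parser.can_fetch("Googlebot", "http://www.example.com/private/topic.html"))
print(parser.can_fetch("Googlebot", "http://www.example.com/index.php"))
```

Keep in mind that robots.txt is a request that well-behaved crawlers honour, not an access control, so a truly private area still needs a login.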
Thank you for the information.
And just to answer your other question: no, they don't slow down your site ;)
Please, I have submitted my sitemap to Google, and I have a robots.txt file in place to keep Google out of some places I don't want indexed, but Google does not index the other parts of my site.
Please check my forum's XML links to confirm.
Maybe I am right?
http://www.dejimanaire.com/sitemap.xml
http://dejimanaire.com/index.php?type=rss;action=.xml
In your robots.txt you don't need wildcards at the end. It is IMPLIED.
Also note, only Google/Yahoo support wildcards.
I've been doing a lot in this area on my forum recently.
I've had to implement a smart robots.txt (http://www.youposted.com/robots.txt) and serve a specific robots.txt to Yahoo/Google vs msnbot vs every other bot, because using the wildcards invalidated my robots.txt in several checkers.
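The "smart robots.txt" idea above, serving a different file depending on which bot asks, could look roughly like this in Python. This is not karlbenson's code. The three bodies are placeholders, the user-agent tokens ("googlebot", "slurp", "msnbot") are assumptions about how those crawlers identify themselves, and robots_txt_for is a hypothetical name.

```python
# Placeholder bodies; a real site would put its own rules in each variant.
ROBOTS_WILDCARD = "# variant for bots that understand wildcards\nUser-agent: *\nDisallow: /attachments/\n"
ROBOTS_MSNBOT = "# variant for msnbot\nUser-agent: *\nDisallow: /attachments/\n"
ROBOTS_DEFAULT = "# variant for every other bot\nUser-agent: *\nDisallow: /attachments/\n"

# Assumed substrings of the User-Agent header, checked in order.
VARIANTS = [
    (("googlebot", "slurp"), ROBOTS_WILDCARD),  # Google and Yahoo
    (("msnbot",), ROBOTS_MSNBOT),
]


def robots_txt_for(user_agent):
    """Pick the robots.txt body to serve for a request's User-Agent."""
    ua = user_agent.lower()
    for tokens, body in VARIANTS:
        if any(token in ua for token in tokens):
            return body
    return ROBOTS_DEFAULT
```

The web server or forum script would call robots_txt_for with the incoming User-Agent header and return the result as text/plain. The trade-off is that validators fetching the file with their own user agent only ever see the default variant.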
You fluffed the link, Karl. You have a double http:// there. ;)
Quote from: Ðyєgσv - February 08, 2008, 11:04:40 PM
And just to answer your other question: no, they don't slow down your site ;)
Yahoo can even bring a site completely down :D But yeah, usually search bots don't do damage of any kind. :)
Indeed, there are two very aggressive bots I'm aware of:
Omgilibot and Yahoo.
As a first step, I think it's better to use robots.txt to block off some areas (ones you can be sure Yahoo Slurp is crawling) rather than use a crawl-delay.
+ Edit: oops, fixed my link ;)
Quote from: karlbenson - April 13, 2008, 08:35:45 PM
In your robots.txt you don't need wildcards at the end. It is IMPLIED.
Also note, only Google/Yahoo support wildcards.
I've been doing a lot in this area on my forum recently.
I've had to implement a smart robots.txt (http://www.youposted.com/robots.txt) and serve a specific robots.txt to Yahoo/Google vs msnbot vs every other bot, because using the wildcards invalidated my robots.txt in several checkers.
Please, how can I add my keywords to Google for my SMF forum? How can I get the link for the SMF sitemap that I am to submit to Google? Did I create this sitemap (http://www.dejimanaire.com/sitemap.xml) correctly? Should I change all the html to php in the created sitemap? I would appreciate your help.
Please, how would Google index the other parts of my site with titles?
You can submit the sitemap to Google via Google Webmasters.
You can also add the Sitemap autodiscovery thing to your robots.txt like I've done in mine (http://www.youposted.com/robots.txt)
E.g.:
Quote
Sitemap: http://www.youposted.com/sitemap.xml

User-agent: *
Disallow: /attachments/
... {and the rest of my disallows}
Note, the blank line between Sitemap and User-agent MUST be there.
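One way to confirm that a Sitemap line is being picked up is to run the file through a parser. A small sketch with Python's urllib.robotparser follows; site_maps() needs Python 3.8 or later. The robots.txt text mirrors the quoted example with the elided rules left out, and the file name under /attachments/ is made up.

```python
from urllib.robotparser import RobotFileParser

# The quoted example, with a blank line after the Sitemap line.
ROBOTS_TXT = """\
Sitemap: http://www.youposted.com/sitemap.xml

User-agent: *
Disallow: /attachments/
"""


def parse_robots(robots_txt):
    """Parse robots.txt text locally, without any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = parse_robots(ROBOTS_TXT)
# List of sitemap URLs declared in the file, or None if there are none.
print(parser.site_maps())
# The Disallow rule still applies to every user agent.
print(parser.can_fetch("*", "http://www.youposted.com/attachments/file.zip"))
```

Python's parser is only one reader among many, and other checkers may be stricter about layout, so keeping the blank line as advised above is harmless.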
Please, I want you to review the sitemap and the robots.txt that I submitted to Google Webmasters. Here is the link for my sitemap: http://www.dejimanaire.com/sitemap.xml
Google's robots only index my sitemap.
And here is the link for my robots.txt: www.dejimanaire.com/robots.txt
Please, am I on the right track?
I really need Google's robots to crawl all around my site except the restricted areas.
Thanks for your support.
You don't need wildcards on the end of all those Disallows.
It is implied.
So
action=arcade
would also block
action=arcade;andanythingthatfollows
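The implied wildcard is just prefix matching: a Disallow value blocks every URL that starts with it. A tiny Python sketch of that rule, assuming the full rule is written as /index.php?action=arcade (the thread only shows the action=arcade part) and using a made-up helper name:

```python
def is_disallowed(url_path, disallow_rules):
    """Return True if url_path starts with any non-empty Disallow value.

    An empty Disallow value means "allow everything", so it is skipped.
    """
    return any(rule and url_path.startswith(rule) for rule in disallow_rules)


# Assumed full form of the rule discussed above.
RULES = ["/index.php?action=arcade"]

print(is_disallowed("/index.php?action=arcade", RULES))
print(is_disallowed("/index.php?action=arcade;andanythingthatfollows", RULES))
print(is_disallowed("/index.php?action=profile", RULES))
```

This ignores the wildcard extensions that only some crawlers support; it covers just the plain prefix behaviour described in the post.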
Quote from: karlbenson - April 15, 2008, 01:30:05 PM
You can submit the sitemap to Google via Google Webmasters.
You can also add the Sitemap autodiscovery thing to your robots.txt like I've done in mine (http://www.youposted.com/robots.txt)
E.g.:
Quote
Sitemap: http://www.youposted.com/sitemap.xml

User-agent: *
Disallow: /attachments/
... {and the rest of my disallows}
Note, the blank line between Sitemap and User-agent MUST be there.
Can you recommend a good tutorial on setting up a sitemap? Never having done it before I'm completely at a loss as to how to proceed.
There are mods for SMF which can make them.
I didn't actually look at any sitemap tutorials.
I looked at SlammedDime's mod and then wrote my own.
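For anyone wondering what "writing your own" involves, a sitemap is a small XML file listing URLs in the sitemaps.org format. The sketch below is neither SlammedDime's mod nor karlbenson's code, just a minimal Python illustration. build_sitemap is a hypothetical helper and the example.com topic URLs are placeholders; a real generator would pull boards and topics from the forum database.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text.
        ET.SubElement(url, "loc").text = loc
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    "http://www.example.com/index.php?topic=1.0",
    "http://www.example.com/index.php?board=2.0",
]))
```

Optional elements such as lastmod can be added per URL in the same way, and the resulting file is what gets submitted to Google.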
Cool. Thanks. I'll check out his mod.
Quote from: antechinus - April 17, 2008, 08:01:56 PM
Quote from: karlbenson - April 15, 2008, 01:30:05 PM
You can submit the sitemap to Google via Google Webmasters.
You can also add the Sitemap autodiscovery thing to your robots.txt like I've done in mine (http://www.youposted.com/robots.txt)
E.g.:
Quote
Sitemap: http://www.youposted.com/sitemap.xml

User-agent: *
Disallow: /attachments/
... {and the rest of my disallows}
Note, the blank line between Sitemap and User-agent MUST be there.
Can you recommend a good tutorial on setting up a sitemap? Never having done it before I'm completely at a loss as to how to proceed.
Please, did you use SlammedDime's mod for your sitemap? When I installed his sitemap mod I received an error in the index.template.php file. Will it work with that error?
I looked at SlammedDimes mod. But wrote my own. (I have however previously used his mod)
No, if you get an error installing you'll need to install it manually.
Use either the package parser in the mod site or an external one such as http://www.adrevenueshare.com/parser
I just uninstalled his mod and tried using Google Sitemap Generator. You can now check my sitemap: http://www.dejimanaire.com/sitemap.xml
But I have a problem. While validating my sitemap with http://www.w3.org I received these errors:
Schema validating with XSV 3.1-1 of 2007/12/11 16:20:05
Schema validator crashed
The maintainers of XSV will be notified, you don't need to
send mail about this unless you have extra information to provide.
If there are Schema errors reported below, try correcting
them and re-running the validation.
* Target: http://www.dejimanaire.com
(Real name: http://www.dejimanaire.com
Last Modified: Fri, 18 Apr 2008 21:06:47 GMT
Server: Apache/2.2.8 (Unix) mod_ssl/2.2.8 OpenSSL/0.9.8b mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635)
* The target was not assessed
Low-level XML well-formedness and/or validity processing output
Error: Mismatched end tag: expected </td>, got </table>
in unnamed entity at line 58 char 10 of http://www.dejimanaire.com
Please, what should I do?
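The "Mismatched end tag" message in the output above is a well-formedness error: a closing tag does not match the tag that is currently open. Note also that the Target line shows http://www.dejimanaire.com rather than the sitemap URL, so the validator may have been reading the HTML home page instead of sitemap.xml. As a rough local check, Python's standard XML parser reports the same class of problem. The helper name and the sample strings below are made up for illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup parses as XML, else the parser's error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


# A tiny well-formed sitemap-like document.
print(well_formedness_error("<urlset><url><loc>http://www.example.com/</loc></url></urlset>"))
# An unclosed <td>, similar in kind to the error reported by the validator.
print(well_formedness_error("<table><tr><td>cell</table>"))
```

Saving the sitemap to a file and feeding its contents to this function would show whether the sitemap itself is well-formed, separately from any problem in the forum's HTML templates.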