Simple Machines Community Forum

SMF Support => SMF 1.1.x Support => Topic started by: hotrod007 on June 29, 2011, 08:49:35 AM

Title: robots.txt location
Post by: hotrod007 on June 29, 2011, 08:49:35 AM
I've been having some legacy issues, and I can see the bots are really tearing up my site.

My forums are located in /forums, here: http://njsaltwaterfisherman.com/forums/index.php. I'm starting to wonder if I have this set up wrong. Is the /forums folder the right place for robots.txt, given what I'm using below?


User-agent: Slurp

Crawl-delay: 60

User-agent: *

Crawl-delay: 30

User-agent: Googlebot
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage

User-agent: Slurp
Allow: /sitemap.xml$
Allow: /robots.txt$
Allow: /index.php$
Allow: /index.php?topic=*.0$
Allow: /index.php?topic=*.*0$
Allow: /index.php?topic=*.*5$
Allow: /index.php?board=*.0$
Allow: /index.php?board=*.*0$
Allow: /index.php?board=*.*5$



User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /editor/
Disallow: /help/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /mambots/
Disallow: /media/
Disallow: /modules/
Disallow: /templates/
Disallow: /installation/

User-agent: *
Disallow: /attachments/
Disallow: /Packages/
Disallow: /Smileys/
Disallow: /Sources/
Disallow: /Themes/
Disallow: /index.php?action=activate
Disallow: /index.php?action=admin
Disallow: /index.php?action=calendar
Disallow: /index.php?action=emailuser
Disallow: /index.php?action=findmember
Disallow: /index.php?action=help
Disallow: /index.php?action=helpadmin
Disallow: /index.php?action=login
Disallow: /index.php?action=logout
Disallow: /index.php?action=mlist
Disallow: /index.php?action=modifykarma
Disallow: /index.php?action=pm
Disallow: /index.php?action=post
Disallow: /index.php?action=printpage
Disallow: /index.php?action=profile
Disallow: /index.php?action=recent
Disallow: /index.php?action=register
Disallow: /index.php?action=reminder
Disallow: /index.php?action=search
Disallow: /index.php?action=theme
Disallow: /index.php?action=unread
Disallow: /index.php?action=unreadreplies
Disallow: /index.php?action=verificationcode
Disallow: /index.php?action=who
Disallow: /index.php?theme
Disallow: /archive.php
Disallow: /index.php?action=blog
Disallow: /index.php?action=viewblog
Disallow: /index.php?action=chess
Disallow: /index.php?action=comment
Disallow: /index.php?action=downloads
Disallow: /index.php?action=links
Disallow: /index.php?action=reporttm
Disallow: /index.php?action=recenttopics
Disallow: /index.php?action=mm
Disallow: /index.php?action=sitemap
Disallow: /index.php?action=staff
Disallow: /index.php?action=tags
Disallow: /index.php?action=thankyou
Disallow: /index.php?action=viewkarma
Disallow: /index.php?action=viewers
Disallow: /index.php?f=
Disallow: /index.php?filter
Disallow: /index.php?referredby
Disallow: /Downloads/
Disallow: /index.php?action=arcade;favorites
Disallow: /index.php?action=arcade;sa=highscore
Disallow: /index.php?action=arcade;sa=play;random
Disallow: /index.php?action=arcade;category
Disallow: /index.php?action=arcade;sort
Disallow: /index.php?action=arcade;stats
Disallow: /index.php?action=stats;expand
Disallow: /index.php?action=stats;collapse

User-agent: Twiceler
Disallow: /

User-Agent: W3C-checklink
Disallow: /

User-Agent: MJ12bot
Disallow: /index.php?PHPSESSID
Title: Re: robots.txt location
Post by: lorth on June 29, 2011, 05:49:24 PM
Put it in your server root, one level above /forums.

Also, your robots.txt could be optimized somewhat; there is some redundancy.

Also, you are aware that a robots.txt only works on bots that honor it, yes?
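
To illustrate why the file belongs in the server root, here is a minimal Python sketch of how a crawler derives the robots.txt address from any page URL: it keeps the scheme and host and discards the rest of the path. The function name `robots_txt_url` is made up for this example and is not part of SMF or of any crawler.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would request for page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


forum_url = "http://njsaltwaterfisherman.com/forums/index.php"
print(robots_txt_url(forum_url))
```

For the forum URL above this prints http://njsaltwaterfisherman.com/robots.txt, so a copy sitting in /forums/robots.txt is never requested.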
Title: Re: robots.txt location
Post by: hotrod007 on June 30, 2011, 08:00:01 AM
Thanks for the info, lorth.

Wow. Yeah, I have it in my forums folder. Maybe that's why I'm still getting hammered. So should they all have /forums in front of them?

Yeah, I understand there are other bots that cause a lot of problems. I'm trying to slow down the main one for now.

Any suggestions on optimizing it?

So I would use the one in my root folder for Joomla and add the forum rules to it. Then it would look like this.

Is this overkill?

User-agent: Slurp

Crawl-delay: 30

User-agent: Googlebot
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /forums/index.php?*;wap
Disallow: /forums/index.php?*;wap2
Disallow: /forums/index.php?*;imode
Disallow: /forums/index.php?action=printpage

User-agent: Slurp
Allow: /sitemap.xml$
Allow: /robots.txt$
Allow: /index.php$
Allow: /index.php?topic=*.0$
Allow: /index.php?topic=*.*0$
Allow: /index.php?topic=*.*5$
Allow: /index.php?board=*.0$
Allow: /index.php?board=*.*0$
Allow: /index.php?board=*.*5$
Allow: /forums/sitemap.xml$
Allow: /forums/robots.txt$
Allow: /forums/index.php$
Allow: /forums/index.php?topic=*.0$
Allow: /forums/index.php?topic=*.*0$
Allow: /forums/index.php?topic=*.*5$
Allow: /forums/index.php?board=*.0$
Allow: /forums/index.php?board=*.*0$
Allow: /forums/index.php?board=*.*5$



User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /editor/
Disallow: /help/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /mambots/
Disallow: /media/
Disallow: /modules/
Disallow: /templates/
Disallow: /installation/
Disallow: /forums/administrator/
Disallow: /forums/cache/
Disallow: /forums/components/
Disallow: /forums/editor/
Disallow: /forums/help/
Disallow: /forums/images/
Disallow: /forums/includes/
Disallow: /forums/language/
Disallow: /forums/mambots/
Disallow: /forums/media/
Disallow: /forums/modules/
Disallow: /forums/templates/
Disallow: /forums/installation/
Disallow: /forums/attachments/
Disallow: /forums/Packages/
Disallow: /forums/Smileys/
Disallow: /forums/Sources/
Disallow: /forums/Themes/
Disallow: /forums/index.php?action=activate
Disallow: /forums/index.php?action=admin
Disallow: /forums/index.php?action=calendar
Disallow: /forums/index.php?action=emailuser
Disallow: /forums/index.php?action=findmember
Disallow: /forums/index.php?action=help
Disallow: /forums/index.php?action=helpadmin
Disallow: /forums/index.php?action=login
Disallow: /forums/index.php?action=logout
Disallow: /forums/index.php?action=mlist
Disallow: /forums/index.php?action=modifykarma
Disallow: /forums/index.php?action=pm
Disallow: /forums/index.php?action=post
Disallow: /forums/index.php?action=printpage
Disallow: /forums/index.php?action=profile
Disallow: /forums/index.php?action=recent
Disallow: /forums/index.php?action=register
Disallow: /forums/index.php?action=reminder
Disallow: /forums/index.php?action=search
Disallow: /forums/index.php?action=theme
Disallow: /forums/index.php?action=unread
Disallow: /forums/index.php?action=unreadreplies
Disallow: /forums/index.php?action=verificationcode
Disallow: /forums/index.php?action=who
Disallow: /forums/index.php?theme
Disallow: /forums/archive.php
Disallow: /forums/index.php?action=blog
Disallow: /forums/index.php?action=viewblog
Disallow: /forums/index.php?action=chess
Disallow: /forums/index.php?action=comment
Disallow: /forums/index.php?action=downloads
Disallow: /forums/index.php?action=links
Disallow: /forums/index.php?action=reporttm
Disallow: /forums/index.php?action=recenttopics
Disallow: /forums/index.php?action=mm
Disallow: /forums/index.php?action=sitemap
Disallow: /forums/index.php?action=staff
Disallow: /forums/index.php?action=tags
Disallow: /forums/index.php?action=thankyou
Disallow: /forums/index.php?action=viewkarma
Disallow: /forums/index.php?action=viewers
Disallow: /forums/index.php?f=
Disallow: /forums/index.php?filter
Disallow: /forums/index.php?referredby
Disallow: /forums/Downloads/
Disallow: /forums/index.php?action=arcade;favorites
Disallow: /forums/index.php?action=arcade;sa=highscore
Disallow: /forums/index.php?action=arcade;sa=play;random
Disallow: /forums/index.php?action=arcade;category
Disallow: /forums/index.php?action=arcade;sort
Disallow: /forums/index.php?action=arcade;stats
Disallow: /forums/index.php?action=stats;expand
Disallow: /forums/index.php?action=stats;collapse

User-agent: *
Disallow: /attachments/
Disallow: /Packages/
Disallow: /Smileys/
Disallow: /Sources/
Disallow: /Themes/
Disallow: /index.php?action=activate
Disallow: /index.php?action=admin
Disallow: /index.php?action=calendar
Disallow: /index.php?action=emailuser
Disallow: /index.php?action=findmember
Disallow: /index.php?action=help
Disallow: /index.php?action=helpadmin
Disallow: /index.php?action=login
Disallow: /index.php?action=logout
Disallow: /index.php?action=mlist
Disallow: /index.php?action=modifykarma
Disallow: /index.php?action=pm
Disallow: /index.php?action=post
Disallow: /index.php?action=printpage
Disallow: /index.php?action=profile
Disallow: /index.php?action=recent
Disallow: /index.php?action=register
Disallow: /index.php?action=reminder
Disallow: /index.php?action=search
Disallow: /index.php?action=theme
Disallow: /index.php?action=unread
Disallow: /index.php?action=unreadreplies
Disallow: /index.php?action=verificationcode
Disallow: /index.php?action=who
Disallow: /index.php?theme
Disallow: /archive.php
Disallow: /index.php?action=blog
Disallow: /index.php?action=viewblog
Disallow: /index.php?action=chess
Disallow: /index.php?action=comment
Disallow: /index.php?action=downloads
Disallow: /index.php?action=links
Disallow: /index.php?action=reporttm
Disallow: /index.php?action=recenttopics
Disallow: /index.php?action=mm
Disallow: /index.php?action=sitemap
Disallow: /index.php?action=staff
Disallow: /index.php?action=tags
Disallow: /index.php?action=thankyou
Disallow: /index.php?action=viewkarma
Disallow: /index.php?action=viewers
Disallow: /index.php?f=
Disallow: /index.php?filter
Disallow: /index.php?referredby
Disallow: /Downloads/
Disallow: /index.php?action=arcade;favorites
Disallow: /index.php?action=arcade;sa=highscore
Disallow: /index.php?action=arcade;sa=play;random
Disallow: /index.php?action=arcade;category
Disallow: /index.php?action=arcade;sort
Disallow: /index.php?action=arcade;stats
Disallow: /index.php?action=stats;expand
Disallow: /index.php?action=stats;collapse

User-agent: Twiceler
Disallow: /

User-Agent: W3C-checklink
Disallow: /

User-Agent: MJ12bot
Disallow: /index.php?PHPSESSID
Title: Re: robots.txt location
Post by: lorth on June 30, 2011, 08:50:12 AM
Quote from: hotrod007 on June 30, 2011, 08:00:01 AM
so should they all have the /forums in front of them?
Yes.

Quote from: hotrod007 on June 30, 2011, 08:00:01 AM
Any suggestions on optimizing it?
You only need one section with "User-agent: *".
I am also fairly sure (but you should research this one) that everything after a ? in the URL gets ignored anyway.
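
As a rough way to spot the redundancy, the Python sketch below lists user-agent names that appear in more than one group. It assumes a simplified grouping rule (a new group starts at the first User-agent line that follows a rule line), and the helper names and the abridged sample are made up for illustration; real crawlers may group or merge records differently.

```python
from collections import Counter


def user_agent_groups(robots_txt: str) -> list[list[str]]:
    """Split robots.txt text into groups and return each group's user-agent names."""
    groups = []
    in_header = False  # True while reading consecutive User-agent lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() == "user-agent":
            if not in_header:
                groups.append([])  # previous line was a rule, so start a new group
            groups[-1].append(value.lower())
            in_header = True
        else:
            in_header = False
    return groups


def repeated_agents(robots_txt: str) -> dict[str, int]:
    """Map each user-agent name that heads more than one group to its group count."""
    counts = Counter(name for group in user_agent_groups(robots_txt) for name in group)
    return {name: n for name, n in counts.items() if n > 1}


# Abridged from the file posted earlier in this thread.
sample = """\
User-agent: Slurp
Crawl-delay: 60

User-agent: *
Crawl-delay: 30

User-agent: Slurp
Allow: /index.php$

User-agent: *
Disallow: /cache/

User-agent: *
Disallow: /attachments/
"""

print(repeated_agents(sample))
```

On the abridged sample this reports Slurp twice and * three times. Merging each repeated name into a single group avoids relying on how a particular crawler handles duplicates.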

Also make sure you always use the full path you want to block.
/forums/Themes etc. is OK (there are index.php files in them which redirect to the forum's main page, but well...).
All the

Disallow: /Sources/
Disallow: /Themes/
Disallow: /index.php?action=activate
Disallow: /index.php?action=admin
Disallow: /index.php?action=calendar

lines are not full paths but relative to your forums folder, so they will never match anything anyway.
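
The full-path point can be checked with a small Python sketch. It assumes plain prefix matching only (no * or $ wildcards, no Allow precedence), and `is_blocked` is a hypothetical helper, not a real crawler's matcher. It also compares the rule against the path together with the query string, which is how RFC 9309 describes matching; a crawler that drops the query string would behave differently.

```python
def is_blocked(url_path: str, disallow_rules: list[str]) -> bool:
    """Return True if url_path starts with any Disallow value (plain prefix match)."""
    # An empty Disallow value blocks nothing, so skip it.
    return any(url_path.startswith(rule) for rule in disallow_rules if rule)


# Rules written relative to the forums folder versus relative to the site root.
forum_relative = ["/Sources/", "/Themes/", "/index.php?action=admin"]
root_relative = ["/forums/Sources/", "/forums/Themes/", "/forums/index.php?action=admin"]

for path in ["/forums/Themes/default/index.php", "/forums/index.php?action=admin"]:
    print(path, is_blocked(path, forum_relative), is_blocked(path, root_relative))
```

Both forum URLs come out as not blocked under the forum-relative rules and blocked under the root-relative ones, because every rule is compared from the start of the path on the host.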
Title: Re: robots.txt location
Post by: hotrod007 on June 30, 2011, 10:03:59 AM
Thank you, lorth. I will clean this up and see how it goes.