Simple Machines Community Forum

SMF Support => SMF 2.1.x Support => Topic started by: Mick. on January 25, 2022, 04:33:38 PM

Title: robots.txt; Need your input
Post by: Mick. on January 25, 2022, 04:33:38 PM
Does this look correct to you?  :o

User-agent: *
Disallow: /*action
Disallow: /*topic=*.msg
Disallow: /*topic=*.new
Disallow: /*PHPSESSID
Disallow: /*;
Allow: /$
Allow: /*board
Allow: /*topic
Allow: /*action=forum$
Allow: /*page
Allow: /*.xml
Allow: /*.css$
Allow: /*.js$
Allow: /*.png$
Allow: /*.jpg$
Allow: /*.gif$
Allow: /*sitemap

Sitemap: https://www.idesignsmf.com/sitemap.xml
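
One rough way to sanity-check rules like the ones above is to run them through a small matcher. The Python sketch below assumes the longest-match convention from RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end of the URL, the longest matching rule wins, and Allow wins a tie. The helper names `rule_to_regex` and `is_allowed`, the sample URLs, and the use of raw pattern length as the measure of "longest" are assumptions for illustration, not any crawler's actual code. Only a subset of the rules is included.

```python
import re

# A subset of the rules from the post, as (directive, pattern) pairs.
RULES = [
    ("Disallow", "/*action"),
    ("Disallow", "/*topic=*.msg"),
    ("Disallow", "/*topic=*.new"),
    ("Disallow", "/*PHPSESSID"),
    ("Disallow", "/*;"),
    ("Allow", "/$"),
    ("Allow", "/*board"),
    ("Allow", "/*topic"),
    ("Allow", "/*action=forum$"),
]

def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, path):
    """Return True if path may be crawled under the longest-match rule."""
    best = None  # (pattern length, is_allow) of the best match so far
    for directive, pattern in rules:
        if rule_to_regex(pattern).match(path):
            key = (len(pattern), directive == "Allow")
            if best is None or key > best:
                best = key
    # No matching rule means the path is allowed by default.
    return True if best is None else best[1]

if __name__ == "__main__":
    for url in [
        "/index.php?topic=123.0",
        "/index.php?topic=123.msg456",
        "/index.php?action=profile",
        "/index.php?action=forum",
        "/index.php?topic=123.0;wap2",
    ]:
        print(url, "->", "allowed" if is_allowed(RULES, url) else "blocked")
```

Under these assumptions the `.msg` and `action=profile` URLs come out blocked, while the topic URL with a `;` parameter is still allowed, because `Allow: /*topic` is longer than `Disallow: /*;`. So the shorter Disallow rule may not do what it looks like it does.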
Title: Re: robots.txt; Need your input
Post by: Sesquipedalian on January 26, 2022, 12:54:01 AM
SMF already inserts <meta name="robots" content="noindex"> into the HTML header of every appropriate page. So unless you are doing something unusual, there is no need to configure any allow or disallow rules in robots.txt.
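
To confirm this on a given page, a rough Python sketch like the one below can scan the page's HTML for that tag. The class and function names are made up for illustration, and the sample HTML strings are assumptions, not actual SMF output.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # The content attribute is a comma-separated list of directives.
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]

def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives

if __name__ == "__main__":
    # Hypothetical page source; paste the real source of a forum page here.
    sample = '<html><head><meta name="robots" content="noindex"></head></html>'
    print(has_noindex(sample))
```

Feeding it the source of a profile or search page versus a normal topic page should show which ones are marked noindex.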
Title: Re: robots.txt; Need your input
Post by: Arantor on January 26, 2022, 05:07:06 AM
The theory is that bots won't even request pages that are disallowed in robots.txt, but it's long been the case that bots follow the links anyway and treat it as just a "don't use this for ranking purposes" hint... but the noindex should do that too.
Title: Re: robots.txt; Need your input
Post by: Mick. on January 26, 2022, 06:16:21 AM
Sooooo we don't need it? I've had this file for years but I feel cheated.

Yesterday I was browsing the showcase thread, noticed the number of guests people have on their forums, and snooped on their forum stats. Mind you, some had 300 guests, others had 100 and so on, and they're relatively new sites.

My site is lucky to have 20 guests daily lol, but why? My site is established. 10 years old. Wtf.

I figured this dumb robots file may be preventing my site from being discovered, even though I'm indexed on Google.

Title: Re: robots.txt; Need your input
Post by: Kindred on January 26, 2022, 07:58:53 AM
robots.txt used to mean something...

Google still (sorta) follows the instructions in it...

Many of the more aggressive search engines (like Baidu) completely ignore it now.
Title: Re: robots.txt; Need your input
Post by: Mick. on January 26, 2022, 08:07:02 AM
I renamed it for now to study the difference.