Google Indexing Pages I Don't Want Indexed

Started by forumposters, June 01, 2006, 02:38:50 AM

forumposters

SMF Version: SMF 1.1 RC2
I notice Google has indexed the following two URLs:

http://forumposters.org/forum/index.php?action=help;page=loginout
and
http://forumposters.org/forum/index.php?action=help;page=pm

Neither of these URLs makes sense for Google to index. How can I tell Google not to index these pages?

moviespot

Thanks and regards

forumposters

Would you be so kind as to post a sample robots.txt file that works well for smf?

H

-H
Former Support Team Lead
I recommend:
Namecheap (domains)
Fastmail (e-mail)
Linode (VPS)

forumposters

Awesome.  So, according to that thread this is all I need in robots.txt:

User-Agent: *
Disallow: /index.php?action=search
Disallow: /index.php?action=calendar
Disallow: /index.php?action=login
Disallow: /index.php?action=register
Disallow: /index.php?action=profile
Disallow: /index.php?action=stats
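One thing worth checking before relying on these rules: a Disallow path is matched from the start of the URL path, and the URLs quoted in the first post live under /forum/. The following is a minimal Python sketch of plain prefix matching, not any crawler's actual code; the helper name is_disallowed is made up for illustration.

```python
def is_disallowed(path_and_query, disallow_rules):
    """Return True if any non-empty rule is a prefix of the path plus query."""
    return any(rule and path_and_query.startswith(rule) for rule in disallow_rules)


rules = [
    "/index.php?action=search",
    "/index.php?action=login",
]

# Forum installed in the site root: the rule is a prefix of the URL.
print(is_disallowed("/index.php?action=search", rules))        # True

# Forum installed under /forum/ (as in the URLs quoted earlier): no rule matches.
print(is_disallowed("/forum/index.php?action=search", rules))  # False

# Including the directory in the rule restores the match.
print(is_disallowed("/forum/index.php?action=search;advanced",
                    ["/forum/index.php?action=search"]))        # True
```

If the forum really is served from /forum/, the rules would presumably need that directory in front, e.g. Disallow: /forum/index.php?action=search.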

H

Also:

Disallow: /index.php?action=help*
Disallow: /index.php?action=printpage*
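For readers wondering what the trailing * does: Google documents * as "any run of characters" and a trailing $ as "end of URL", with rules otherwise treated as prefixes. Under that reading a trailing * looks harmless but redundant. The sketch below is a rough Python approximation of that matching, assuming Google-style semantics; other crawlers may behave differently, and rule_to_regex and matches are illustrative names, not part of any library.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regex.

    '*' matches any run of characters, a trailing '$' anchors the end of
    the URL, and everything else is literal. Without '$' the rule acts as
    a prefix match.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(rule, path_and_query):
    # re.match anchors at the start of the string, which gives prefix behaviour.
    return rule_to_regex(rule).match(path_and_query) is not None


urls = [
    "/index.php?action=help",
    "/index.php?action=help;page=loginout",
    "/index.php?action=printpage;topic=1.0",  # hypothetical print-page URL
    "/index.php?topic=1.0",                   # hypothetical topic URL
]
for url in urls:
    with_star = matches("/index.php?action=help*", url)
    without_star = matches("/index.php?action=help", url)
    print(url, with_star, without_star)
```

In this sketch the two columns agree for every URL, which is why the version with and without the trailing * should block the same pages for a crawler that follows these semantics.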

forumposters

I failed to realize that the * at the end of these URLs is essential. I've just added that.
After peeking at your robots.txt file, I noticed you have these two lines:

Disallow: /forum/index.php?*all*
Disallow: /forum/index.php?*msg*

May I ask why you do that?  Wouldn't that prevent Google from spidering the forum posts?

Dannii

Google will still index the pages whose URLs don't contain all or msg.

Also, you can remove the /forum/index.php? and replace that with a * unless you have multiple forums and want to block only one.

Mine has:

User-agent: *
Disallow: *action=admin*
Disallow: *action=chat*
Disallow: *action=help*
Disallow: *action=login*
Disallow: *action=mlist*
Disallow: *action=post*
Disallow: *action=register*
Disallow: *action=search*
Disallow: *action=who*
Disallow: /Themes/
"Never imagine yourself not to be otherwise than what it might appear to others that what you were or might have been was not otherwise than what you had been would have appeared to them to be otherwise."
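A rule wrapped in * on both sides, such as *action=post*, behaves like a substring test on the path and query for crawlers that support wildcards. The Python sketch below shows that reading, including the side effect that a rule can catch more actions than intended. The sample URLs other than the help page are hypothetical, and blocked_by is an illustrative helper. As far as I know, Google's documentation describes rule paths as starting with /, so /*action=post may be the more conservative spelling.

```python
def blocked_by(needles, path_and_query):
    """Return the needles that occur anywhere in the URL's path and query."""
    return [needle for needle in needles if needle in path_and_query]


# The text between the two asterisks of a few of the rules above.
needles = ["action=admin", "action=help", "action=post", "action=who"]

samples = [
    "/forum/index.php?action=help;page=pm",    # help page from the first post
    "/forum/index.php?action=post;board=1.0",  # hypothetical posting form
    "/forum/index.php?action=post2",           # hypothetical: also contains "action=post"
    "/forum/index.php?topic=1.0",              # hypothetical topic page, not blocked
]
for url in samples:
    print(url, blocked_by(needles, url))
```

The third sample is caught by the action=post rule even though it names a different action, which is usually fine here but worth knowing when picking short substrings.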

H

Quote from: forumposters - June 28, 2006, 03:57:02 AM
I failed to realize that the * at the end of these URLs is essential. I've just added that.
After peeking at your robots.txt file, I noticed you have these two lines:

Disallow: /forum/index.php?*all*
Disallow: /forum/index.php?*msg*

May I ask why you do that?  Wouldn't that prevent Google from spidering the forum posts?

Some SMF links could be misinterpreted as duplicate content, so I block individual message links (msg) and links that combine multiple pages (all) ;)
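To make the duplicate-content point concrete, here is a small Python sketch that groups URLs by topic number and then drops the ones the two rules would block. The URL shapes (topic=123.0, topic=123.msg456, topic=123.0;all) are assumed examples of how SMF links can look, and the helper names are made up for illustration.

```python
import re
from collections import defaultdict

PREFIX = "/forum/index.php?"
# Finds the topic number in query strings like "topic=123.0" or "topic=123.msg456".
TOPIC_RE = re.compile(r"[?;&]topic=(\d+)")


def is_blocked(url):
    """Approximate /forum/index.php?*all* and /forum/index.php?*msg*."""
    if not url.startswith(PREFIX):
        return False
    rest = url[len(PREFIX):]
    return "all" in rest or "msg" in rest


def group_by_topic(urls):
    """Map each topic number to the list of URLs that show that topic."""
    groups = defaultdict(list)
    for url in urls:
        found = TOPIC_RE.search(url)
        if found:
            groups[int(found.group(1))].append(url)
    return dict(groups)


urls = [
    "/forum/index.php?topic=123.0",       # first page of the topic
    "/forum/index.php?topic=123.15",      # second page
    "/forum/index.php?topic=123.0;all",   # all pages combined
    "/forum/index.php?topic=123.msg456",  # link to a single message
    "/forum/index.php?topic=77.0",
]
for topic, members in group_by_topic(urls).items():
    crawlable = [u for u in members if not is_blocked(u)]
    print(topic, len(members), "URLs,", len(crawlable), "left after blocking")
```

With these samples, topic 123 is reachable through four URLs, and only the two regular page URLs remain once the all and msg variants are blocked, so the posts themselves stay crawlable.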

forumposters

I still seem to be having problems and I can't figure this out.  Here's my robots.txt file:

User-agent: *
Disallow: *action=admin*
Disallow: *action=help*
Disallow: *action=login*
Disallow: *action=mlist*
Disallow: *action=post*
Disallow: *action=register*
Disallow: *action=search*
Disallow: *action=trader*
Disallow: *action=profile*
Disallow: *action=who*
Disallow: /forum/Themes/
Disallow: /forum/admin/
Disallow: /forum/attachments/
Disallow: /cgi-bin/


For some reason Google is indexing hundreds of urls with *action=trader* in them and I don't want this.

Dannii

Are they still being newly indexed, or are you just seeing the old pages? It takes a long time for a page to be removed from the index.

forumposters

Good question. After taking a closer look, I think it's just old pages that were indexed a couple of months ago, before I added the line

Disallow: *action=trader*
