Print Page? Indexed all over Google?

Started by shawdy, July 07, 2010, 12:46:22 PM

MrPhil

If you use robots.txt to tell Google (and other well-behaved bots) to stay out of certain areas, make sure you've got the right path in the entries. I just posted to another topic on the same subject, where the OP had Disallow: /index.php... entries while their forum was installed in /forum (the entries should be Disallow: /forum/index.php...).
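
To see why the path prefix matters, here is a minimal Python sketch. It assumes plain prefix matching against the URL path plus query string (no wildcards), and the example.com URL is made up for illustration.

```python
from urllib.parse import urlsplit

def is_disallowed(url, disallow_prefixes):
    """Return True if the URL's path (plus query) starts with any Disallow value."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return any(target.startswith(prefix) for prefix in disallow_prefixes if prefix)

# Hypothetical print-page URL for a forum installed in /forum.
url = "http://example.com/forum/index.php?action=printpage;topic=1.0"

# Rule written as if the forum were in the site root: does not match.
print(is_disallowed(url, ["/index.php?action=printpage"]))        # False
# Rule that includes the /forum prefix: matches.
print(is_disallowed(url, ["/forum/index.php?action=printpage"]))  # True
```

A rule is compared from the very start of the path, so an entry without /forum never lines up with a URL that begins with /forum/.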

aw06

Quote from: MrPhil - December 31, 2010, 09:45:06 AM
If you use robots.txt to tell Google (and other well-behaved bots) to stay out of certain areas, make sure you've got the right path in the entries. I just posted to another topic on the same subject, where the OP had Disallow: /index.php... entries while their forum was installed in /forum (the entries should be Disallow: /forum/index.php...).

You know, that might just be it... made changes... will monitor  8)

My question though: if this works, will Google index fewer pages? Or will it just index the actual post pages now?
:: ShopinJA.com Powered by SMF 1.1.19 | Ig-Oh Theme by Koni | 70 Rock Solid Error Free Mods | Many Custom Edits & Tweaks ::
- Host Unlimited Websites - Free Website Builder & Templates - Unlimited Disk Space & Bandwidth

Arantor

What Jim said about using index.php?*;* is actually FINE.

Quote:
/index.php?action=post;topic=389687.0

Making bots no longer spidering posts.

Incorrect. It just means spiders won't get to the reply page, but unless you have guest replying on, they wouldn't anyway.

It will likely stop Google Images from picking up images, however, since image links use a ; in the URL.
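
A rough way to check what Disallow: /index.php?*;* would catch is to imitate the wildcard matching Google documents for robots.txt ('*' for any run of characters, a trailing '$' for end of URL). This Python sketch is an approximation, not Google's actual matcher, and every sample URL except the post one quoted above is made up.

```python
import re

def rule_matches(pattern, path_and_query):
    """Approximate Google-style matching: '*' is a wildcard, a trailing '$'
    anchors the end, everything else (including '?') is literal."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, like a robots.txt rule does.
    return re.match(regex, path_and_query) is not None

rule = "/index.php?*;*"
samples = [
    "/index.php?action=post;topic=389687.0",       # reply page
    "/index.php?action=printpage;topic=389687.0",  # print page (hypothetical)
    "/index.php?topic=389687.0;wap2",              # WAP2 view (hypothetical)
    "/index.php?topic=389687.0",                   # plain topic URL (hypothetical)
]
for url in samples:
    print(url, "->", "blocked" if rule_matches(rule, url) else "allowed")
```

Under these assumptions, only URLs with a ; somewhere after the ? are blocked, so the plain topic URL stays crawlable.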
Holder of controversial views, all of which my own.


JimM

Keep in mind that these kinds of changes may take some time before you see results.
Jim "JimM" Moore
Former Support Specialist

aw06

Quote from: JimM - December 29, 2010, 09:06:31 AM
Replace all this:

User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage


with this:

Disallow: /index.php?*;*

in your robots.txt file.

OK, Google is still indexing print pages and WAP pages... you're saying all I need in robots.txt is Disallow: /index.php?*;* ???

Will that block Google AdSense crawlers as well??

Arantor

Quote: Will that block Google AdSense crawlers as well??

Won't block Adsense but will push all content of your site out of Google - even the regular threads.

aw06

Quote from: Arantor - January 21, 2011, 06:59:39 PM
Quote: Will that block Google AdSense crawlers as well??

Won't block Adsense but will push all content of your site out of Google - even the regular threads.

:-X naw that's not good ... just need to block it from indexing print and wap pages

Xavi-Nena

Is there any way to disable this action, or at least disable it for unregistered members? I know how to remove the button, but I am curious whether there is a way to prevent the actual page from being viewed.

Arantor

There's a mod for it...


aw06

Google is still indexing my WAP and print pages lol... oh well... boo Google

MrPhil

Do you have your robots.txt correctly set up? That is, is your forum in the root (/index.php) or in a lower level (/forum/index.php)? Have you checked that "Googlebot" is the correct agent name, and not something like google-bot? Is there a reason you're doing this only for Google and not excluding other searchbots? Have you confirmed that your forum is generating URLs of the form you're giving in robots.txt, and not in some other "Pretty URLs" form? You may have to give both forms, or at least, the one that a bot would see.

aw06

My forum is not in root... it's in /forum. See my robots.txt:

Quote:
User-agent: Googlebot
# Don't index mobile versions
Disallow: /forum/index.php?*;wap
Disallow: /forum/index.php?*;wap2
Disallow: /forum/index.php?*;imode
Disallow: /forum/index.php?action=printpage

Hmm, should the file go in the forum folder as well?
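
On MrPhil's point about doing this only for Google: rules listed under User-agent: Googlebot apply to that group alone. The Python sketch below is a simplified parser (exact lowercased agent names, Disallow lines only; real crawlers match agents more loosely), and "bingbot" is just an example of some other bot.

```python
def parse_groups(robots_txt):
    """Map each lowercased User-agent name to its list of Disallow values."""
    groups, agents, in_rules = {}, [], False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # rules seen, so a new group starts
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            in_rules = True
            for agent in agents:
                groups[agent].append(value)
    return groups

def rules_for(groups, bot):
    """Pick the bot's own group, else the '*' group, else no rules at all."""
    return groups.get(bot.lower(), groups.get("*", []))

robots_txt = """\
User-agent: Googlebot
# Don't index mobile versions
Disallow: /forum/index.php?*;wap
Disallow: /forum/index.php?*;wap2
Disallow: /forum/index.php?*;imode
Disallow: /forum/index.php?action=printpage
"""
groups = parse_groups(robots_txt)
print(rules_for(groups, "Googlebot"))  # the four Disallow values
print(rules_for(groups, "bingbot"))    # [] because there is no '*' group
```

With the file as posted, a crawler that does not identify as Googlebot finds no group of its own and no User-agent: * fallback, so nothing is disallowed for it.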

Arantor

It has to be in the root; otherwise it will be ignored.
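
As a quick illustration, the robots.txt location can be derived from any page URL by keeping only the scheme and host. A small Python sketch, with example.com standing in for the real site:

```python
from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url):
    """Return the robots.txt location at the root of this page's host."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_txt_url("http://example.com/forum/index.php?topic=1.0"))
# http://example.com/robots.txt
```

A copy at /forum/robots.txt is never requested, no matter where the forum itself lives.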

aw06

Quote from: Arantor - February 27, 2011, 10:44:28 AM
It has to be in the root; otherwise it will be ignored.

OK, well, I have it in the correct place then... I have it in root (/www).

Xavi-Nena

Quote from: Arantor - February 27, 2011, 09:28:08 AM
There's a mod for it...

The mod I found only works for 2.0 RC2; I use 1.1.13. Does anyone know of a workaround to add permissions that prevent guests from viewing the print page action?

Arantor

Just inside the function in Printpage.php, add is_not_guest();

DavidCT

Quote from: JimM - October 23, 2010, 07:16:21 PM
Using that Disallow: /index.php?action=printpage is the best way.  If the spider follows the robots.txt file they will not index the pages.

The problem with this is that Google will still list the links in the index, just with no content, and if you use Webmaster Tools it'll bug you to death listing all the pages it can't crawl on the errors page. The only proper way is to add rel="nofollow" to anything you don't want Google to place in their index. I did this for mine; it took a while to do everything :)  Now the only thing Google indexes is topic=###, though a few bad pages are still indexed until they get removed.

Sign up for Google's "Webmaster Tools". You can remove links and see why robots.txt is/isn't working.
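
If you go the rel="nofollow" route, a small audit can show which links you have missed. This Python sketch uses the standard html.parser module; the UNWANTED markers and the sample HTML are assumptions for illustration, not SMF's actual output.

```python
from html.parser import HTMLParser

# Hypothetical URL fragments that mark pages to keep out of the index.
UNWANTED = ("action=printpage", ";wap", ";imode")

class NofollowAudit(HTMLParser):
    """Collect hrefs of unwanted links that lack rel="nofollow"."""
    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        rel_tokens = (attr.get("rel") or "").lower().split()
        if any(marker in href for marker in UNWANTED) and "nofollow" not in rel_tokens:
            self.missing.append(href)

html = '''
<a href="index.php?topic=1.0">Topic</a>
<a href="index.php?action=printpage;topic=1.0" rel="nofollow">Print</a>
<a href="index.php?topic=1.0;wap2">WAP2</a>
'''
audit = NofollowAudit()
audit.feed(html)
print(audit.missing)  # ['index.php?topic=1.0;wap2']
```

Feed it the HTML of a page as a guest sees it, since that is closest to what a bot would get.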

JimM

Lots of good info in this topic.  If this is solved, please mark it solved by clicking the Mark Topic Solved link at the bottom left.
