Print Page? Indexed all over Google?

Started by shawdy, July 07, 2010, 12:46:22 PM


shawdy

How do I stop or remove the print page feature? All my pages are receiving Google traffic to print pages, which basically display nothing of interest, and there is no way for the user to click a link to go to the homepage.

Please help me stop these being indexed.


mirahalo

There are some mods to enable/disable the print button: http://custom.simplemachines.org/mods/index.php?action=search;basic_search=print but they are for 2.0.


Also, you can make a robots.txt with the following:

Disallow: /index.php?action=printpage


I personally use this:

User-agent: *
Disallow: /Sources
Disallow: /Smileys
Disallow: /Packages
Disallow: /avatars
Disallow: /attachments
Disallow: /Themes
Disallow: /index.php?action=printpage
Disallow: /index.php?action=stats
Disallow: /index.php?action=help
Disallow: /index.php?action=search
Disallow: /index.php?action=mlist
Disallow: /index.php?action=post
Disallow: /index.php?action=profile;area=showposts;u=*
Disallow: /index.php?action=profile;area=showposts;sa=attach;u=*
Disallow: /index.php?wap2
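As a rough way to see what rules like these cover: a plain `Disallow` value with no wildcards is matched as a prefix of the URL path plus query string, counted from the site root. The Python sketch below only illustrates that prefix behaviour. The function name and the sample URLs are made up for the example, and it deliberately ignores `User-agent` groups, `Allow` lines and wildcards.

```python
# Minimal sketch: literal robots.txt Disallow rules act as prefix matches
# against the path + query string. The rules are a subset of the list above;
# the URLs being checked are hypothetical.

DISALLOW = [
    "/Sources",
    "/index.php?action=printpage",
    "/index.php?action=search",
]


def is_blocked(path_and_query, rules=DISALLOW):
    """Return True if any literal rule is a prefix of the requested URL."""
    return any(path_and_query.startswith(rule) for rule in rules)


if __name__ == "__main__":
    for url in (
        "/index.php?action=printpage;topic=123.0",  # print view of a topic
        "/index.php?topic=123.0",                   # normal topic view
    ):
        print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

In this sketch the print view is caught by the `action=printpage` rule while the normal topic URL is left alone, which is the behaviour the original poster is after.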



cicka

Quote from: shawdy - July 07, 2010, 12:46:22 PM
How do I stop or remove the print page feature? All my pages are receiving Google traffic to print pages, which basically display nothing of interest, and there is no way for the user to click a link to go to the homepage.

Please help me stop these being indexed.

If you want to remove the print button, find and either remove or comment out this code in the Display.template.php file:

'print' => array('text' => 465, 'image' => 'print.gif', 'lang' => true, 'custom' => 'target="_blank"', 'url' => $scripturl . '?action=printpage;topic=' . $context['current_topic'] . '.0'),

aw06

When you add Disallow: /index.php?action=printpage to robots.txt, does it then show the actual page, or will it show fewer results?
:: ShopinJA.com Powered by SMF 1.1.19 | Ig-Oh Theme by Koni | 70 Rock Solid Error Free Mods | Many Custom Edits & Tweaks ::
- Host Unlimited Websites - Free Website Builder & Templates - Unlimited Disk Space & Bandwidth

JimM

Using that Disallow: /index.php?action=printpage is the best way. If the spider follows the robots.txt file, it will not index the pages.
Jim "JimM" Moore
Former Support Specialist

aw06

This does not seem to be working; I guess I have to give Google some more time to re-crawl my site.

Aleksi "Lex" Kilpinen

It can take a while for them to update the results, but it should work for the bigger search engines just fine, at least Google follows robots.txt nicely :)
Slava
Ukraini!
"Before you allow people access to your forum, especially in an administrative position, you must be aware that that person can seriously damage your forum. Therefore, you should only allow people that you trust, implicitly, to have such access." -Douglas


aw06

My print pages and WAP pages are still being indexed by Google Search  :-\

This is in my Index.php ... do I have any errors?

User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage


Baby Daisy

To be honest, I find print page more of a nuisance than a solution for a simple static page.

The surest way of keeping bots from viewing printpage is to disable the action completely.

Open ./index.php

Find:
'printpage' =>

Replace with:
// 'printpage' =>

In addition you can disable the button by either editing the array that compiles it in MessageIndex.php, or by simply denying the print permission from all groups (I recommend the latter, a lot easier).
あなたは私のお尻にキスするとき、私はそれを愛する

JimM

Or you can replace all the statements that you have with

Disallow: /index.php?*;*

That will basically disallow anything with a ; in it.

I use that in my robots.txt and don't have an issue with print pages getting indexed.
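To make the effect of that wildcard rule concrete, here is a small Python sketch that translates a robots.txt path rule into a regular expression. It assumes the wildcard semantics Google documents for its crawler (`*` matches any run of characters, a trailing `$` anchors the end of the URL); other crawlers may treat `*` differently or ignore it. The helper names and sample URLs are hypothetical, and this is not how any particular crawler is implemented.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL (Google-style wildcard handling).
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(rule, path_and_query):
    """True if the rule matches from the start of the path + query string."""
    # re.match anchors at the beginning, which gives the prefix behaviour.
    return rule_to_regex(rule).match(path_and_query) is not None


RULE = "/index.php?*;*"

if __name__ == "__main__":
    for url in (
        "/index.php?action=printpage;topic=123.0",
        "/index.php?topic=123.0;wap2",
        "/index.php?action=post;topic=123.0",
        "/index.php?topic=123.0",
    ):
        print(url, "->", "blocked" if is_blocked(RULE, url) else "allowed")
```

Under these assumptions the rule catches every `index.php` URL whose query string contains a `;`, so print, WAP and post URLs are matched, while a plain `?topic=123.0` URL is not.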

aw06

Quote from: JimM - December 28, 2010, 07:52:01 PM
Or you can replace all the statements that you have with

Disallow: /index.php?*;*

Not sure I follow ???

Illori

In your robots.txt file; you don't need code changes to do this.

JimM

Replace all this:

User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage


with this:

Disallow: /index.php?*;*

in your robots.txt file.

Baby Daisy

Quote from: JimM - December 29, 2010, 09:06:31 AM
Replace all this:

User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage


with this:

Disallow: /index.php?*;*

in your robots.txt file.

This would return a false positive:

/index.php?action=post;topic=389687.0

which would stop bots from spidering posts.

Illori

Where do you see that link? I don't see that type of link to a topic/post in my SMF 1.1.12 install or here on this forum.

JimM

My suggestion is that we confine our comments to the OP's issue and not to disagreeing with or analyzing each other's comments.


JimM

I was afraid that would happen.   ;D

I use the statement that I posted for the more active crawlers.  I have not had a problem with print pages being indexed nor a problem of topic pages not being indexed.

The best I can recommend is to try some different things and hopefully you will get a combination that works.

MrPhil

If you use robots.txt to tell Google (and other well-behaved bots) to stay out of certain areas, make sure you've got the right path in the entries. I just posted to another topic on the same subject, where the OP had Disallow /index.php... entries, while their forum was installed into /forum (the entries should be Disallow /forum/index.php...).
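A small Python sketch of that point: if the forum lives in a subfolder, every `Disallow` path needs that folder in front of it, because rules are matched from the site root. The helper name `prefix_rules` is hypothetical, and `/forum` is just the example folder mentioned above.

```python
def prefix_rules(base_path, rules):
    """Prepend the forum's install folder to each Disallow path.

    base_path is the folder the forum is installed in ("/" for the site root).
    """
    folder = base_path.strip("/")
    base = "/" + folder if folder else ""
    return [f"Disallow: {base}{rule}" for rule in rules]


# Rules written as if the forum were installed in the site root.
RULES = [
    "/index.php?action=printpage",
    "/index.php?*;wap2",
]

if __name__ == "__main__":
    print("User-agent: *")
    print("\n".join(prefix_rules("/forum", RULES)))
```

Run with `/forum`, this prints `Disallow: /forum/index.php?action=printpage` and so on; with `/` the rules come out unchanged.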

aw06

Quote from: MrPhil - December 31, 2010, 09:45:06 AM
If you use robots.txt to tell Google (and other well-behaved bots) to stay out of certain areas, make sure you've got the right path in the entries. I just posted to another topic on the same subject, where the OP had Disallow /index.php... entries, while their forum was installed into /forum (the entries should be Disallow /forum/index.php...).

You know, that might just be it. I made the changes and will monitor  8)

My question though: if this works, will Google index fewer pages, or will it just index the actual post page now?

Arantor

What Jim said about using index.php?*;* is actually FINE.

Quote
/index.php?action=post;topic=389687.0

Making bots no longer spidering posts.

Incorrect. It just means spiders won't be getting to the reply page, but unless you have guest replying on, they wouldn't anyway.

It will likely deter Google Images picking up images however, since they use a ; in the link.
Holder of controversial views, all of which my own.


JimM

Keep in mind that these kinds of changes may take some time before you see results.

aw06

Quote from: JimM - December 29, 2010, 09:06:31 AM
Replace all this:

User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage


with this:

Disallow: /index.php?*;*

in your robots.txt file.

OK, Google is still indexing print pages and WAP pages. You're saying all I need in robots.txt is Disallow: /index.php?*;* ???

Will that block Google AdSense crawlers as well?

Arantor

Quote
Will that block Google Adsense Crawlers as well ??

Won't block Adsense but will push all content of your site out of Google - even the regular threads.


aw06

Quote from: Arantor - January 21, 2011, 06:59:39 PM
Quote
Will that block Google Adsense Crawlers as well ??

Won't block Adsense but will push all content of your site out of Google - even the regular threads.

:-X naw that's not good ... just need to block it from indexing print and wap pages

Xavi-Nena

Is there any way to disable this action, or at least to disable it for unregistered members? I know how to remove the button, but I am curious to know if there is a way to prevent the actual page from being viewed.

Arantor

There's a mod for it...


aw06

Google still indexing my wap and print pages lol .. owell ... boo Google

MrPhil

Do you have your robots.txt correctly set up? That is, is your forum in the root (/index.php) or in a lower level (/forum/index.php)? Have you checked that "Googlebot" is the correct agent name, and not something like google-bot? Is there a reason you're doing this only for Google and not excluding other searchbots? Have you confirmed that your forum is generating URLs of the form you're giving in robots.txt, and not in some other "Pretty URLs" form? You may have to give both forms, or at least, the one that a bot would see.

aw06

My Forum is not in root .. it's in /forum ... See my Robots.txt

Quote
User-agent: Googlebot
# Don't index mobile versions
Disallow: /forum/index.php?*;wap
Disallow: /forum/index.php?*;wap2
Disallow: /forum/index.php?*;imode
Disallow: /forum/index.php?action=printpage

Hmm, should the file go in the forum folder as well ?

Arantor

It has to be in the root otherwise it will be ignored.
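To illustrate what "root" means here: crawlers request `/robots.txt` at the top level of the host, whatever folder the forum itself is installed in. The Python sketch below derives that location from any forum URL; the function name and the `example.com` address are placeholders, not taken from this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(forum_url):
    """Return the robots.txt URL for the host that serves forum_url."""
    parts = urlsplit(forum_url)
    # Keep only scheme and host; the path is always /robots.txt at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("http://www.example.com/forum/index.php?topic=1.0"))
```

For a forum under `/forum`, the file a crawler looks for is therefore `http://www.example.com/robots.txt`, not a copy placed inside the `/forum` folder.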


aw06

Quote from: Arantor - February 27, 2011, 10:44:28 AM
It has to be in the root otherwise it will be ignored.

OK, well, I have it in the correct place then; I have it in root: /www

Xavi-Nena

Quote from: Arantor - February 27, 2011, 09:28:08 AM
There's a mod for it...

The mod I found only works for 2.0 RC2; I use 1.1.13. Does anyone know of a workaround to add permissions that prevent guests from viewing the print page action?

Arantor

Just inside the function that's in Printpage.php, add is_not_guest();


DavidCT

Quote from: JimM - October 23, 2010, 07:16:21 PM
Using that Disallow: /index.php?action=printpage is the best way.  If the spider follows the robots.txt file they will not index the pages.

The problem with this is that Google will still list the links in its index, just with no content, and if you use Webmaster Tools it will bug you to death by listing all the pages it can't crawl on the errors page. The only proper way is to add rel="nofollow" to anything you don't want Google to place in its index. I did this for mine; it took a while to do everything :) Now the only thing Google indexes is topic=###, though a few bad pages are still indexed until they get removed.

Sign up for Google's "Webmaster Tools". You can remove links and see why robots.txt is or isn't working.

JimM

Lots of good info in this topic.  If this is solved, please mark it solved by clicking the Mark Topic Solved link at the bottom left.