How do I stop or remove the print page feature? All my pages are receiving Google traffic to print pages, which basically display nothing of interest, and there is no way for the user to click a link to go to the homepage.
Please help me stop these being indexed.
There are some mods to enable/disable the print button: http://custom.simplemachines.org/mods/index.php?action=search;basic_search=print but they are for 2.0.
Also, you can make a robots.txt with the following:
Disallow: /index.php?action=printpage
I personally use this:
User-agent: *
Disallow: /Sources
Disallow: /Smileys
Disallow: /Packages
Disallow: /avatars
Disallow: /attachments
Disallow: /Themes
Disallow: /index.php?action=printpage
Disallow: /index.php?action=stats
Disallow: /index.php?action=help
Disallow: /index.php?action=search
Disallow: /index.php?action=mlist
Disallow: /index.php?action=post
Disallow: /index.php?action=profile;area=showposts;u=*
Disallow: /index.php?action=profile;area=showposts;sa=attach;u=*
Disallow: /index.php?wap2
Quote from: shawdy - July 07, 2010, 12:46:22 PM
How do I stop or remove the print page feature? All my pages are receiving Google traffic to print pages, which basically display nothing of interest, and there is no way for the user to click a link to go to the homepage.
Please help me stop these being indexed.
If you want to remove the print button, find this code in the Display.template.php file and either remove it or comment it out.
'print' => array('text' => 465, 'image' => 'print.gif', 'lang' => true, 'custom' => 'target="_blank"', 'url' => $scripturl . '?action=printpage;topic=' . $context['current_topic'] . '.0'),
When you add Disallow: /index.php?action=printpage to robots.txt, does it then show the actual page? Or will it show fewer results?
Using that Disallow: /index.php?action=printpage is the best way. If the spider follows the robots.txt file, it will not index the pages.
This does not seem to be working. I guess I have to give Google some more time to re-crawl my site.
It can take a while for them to update the results, but it should work for the bigger search engines just fine, at least Google follows robots.txt nicely :)
My print pages and wap pages are still being indexed by Google Search :-\
This is in my Index.php ... do i have any errors ?
User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage
That looks fine for a robots.txt file!
ok, doesn't seem to be working :-[
To be honest I find print page more of a nuisance than a solution to a simple static page.
The ultimate way of keeping bots from viewing printpage is to disable the action completely.
Open ./index.php
Find:
'printpage' =>
Replace with:
// 'printpage' =>
In addition you can disable the button by either editing the array that compiles it in MessageIndex.php, or by simply denying the print permission from all groups (I recommend the latter, a lot easier).
Or you can replace all the statements that you have with
Disallow: /index.php?*;*
That will basically disallow anything with a ; in it.
I use that in my robots.txt and don't have an issue with print pages getting indexed.
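For anyone unsure what that wildcard rule actually catches, here is a rough Python sketch. It assumes the wildcard convention Google documents for robots.txt ('*' matches any run of characters, a trailing '$' anchors the end, matching starts at the beginning of the path). The helper name rule_matches and the sample URLs are made up for illustration; they are not taken from SMF or from any crawler.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query.

    Assumes Google-style wildcards: '*' matches any characters and a
    trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, like a robots.txt rule.
    return re.match(regex, url_path) is not None


PATTERN = "/index.php?*;*"

# Hypothetical SMF-style URLs, for illustration only.
urls = [
    "/index.php?action=printpage;topic=123.0",
    "/index.php?topic=123.0;wap2",
    "/index.php?action=post;topic=123.0",
    "/index.php?topic=123.0",
    "/index.php?board=1.0",
]

for url in urls:
    verdict = "blocked" if rule_matches(PATTERN, url) else "allowed"
    print(f"{verdict:8} {url}")
```

Under that assumption, only URLs containing a ; after index.php? are blocked, so plain topic and board links stay crawlable while print, wap2 and reply links do not.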
Quote from: JimM - December 28, 2010, 07:52:01 PM
Or you can replace all the statements that you have with
Disallow: /index.php?*;*
Not sure I follow ???
It goes in your robots.txt file; you don't need code changes to do this.
Replace all this:
User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage
with this:
Disallow: /index.php?*;*
in your robots.txt file.
Quote from: JimM - December 29, 2010, 09:06:31 AM
Replace all this:
User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage
with this:
Disallow: /index.php?*;*
in your robots.txt file.
This would return a false positive:
/index.php?action=post;topic=389687.0
So bots would no longer spider posts.
Where do you see that link? I don't see that type of link to a topic/post in my SMF 1.1.12 install or here on this forum.
My suggestion is that we confine our comments to the OPs issue and not to disagreeing or analyzing each others comments.
lol . OK .. now I'm Lost :laugh:
I was afraid that would happen. ;D
I use the statement that I posted for the more active crawlers. I have not had a problem with print pages being indexed nor a problem of topic pages not being indexed.
The best I can recommend is to try some different things and hopefully you will get a combination that works.
If you use robots.txt to tell Google (and other well-behaved bots) to stay out of certain areas, make sure you've got the right path in the entries. I just posted to another topic on the same subject, where the OP had Disallow /index.php... entries, while their forum was installed into /forum (the entries should be Disallow /forum/index.php...).
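To see why the path matters, here is a small Python sketch. It assumes a wildcard-free Disallow path is compared as a plain prefix against the URL path starting at the site root. The function name blocks and the sample URL are hypothetical.

```python
def blocks(rule_path: str, url_path: str) -> bool:
    """Plain prefix comparison, assumed here for wildcard-free Disallow paths."""
    return url_path.startswith(rule_path)


# Hypothetical print-page URL for a forum installed under /forum.
print_url = "/forum/index.php?action=printpage;topic=123.0"

root_rule = "/index.php?action=printpage"
forum_rule = "/forum/index.php?action=printpage"

print(blocks(root_rule, print_url))   # False: the rule lacks the /forum prefix
print(blocks(forum_rule, print_url))  # True: the rule matches from the site root
```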
Quote from: MrPhil - December 31, 2010, 09:45:06 AM
If you use robots.txt to tell Google (and other well-behaved bots) to stay out of certain areas, make sure you've got the right path in the entries. I just posted to another topic on the same subject, where the OP had Disallow /index.php... entries, while their forum was installed into /forum (the entries should be Disallow /forum/index.php...).
You know, that might just be it .. made changes ... will monitor 8)
My question though: if this works, will Google index fewer pages, or will it just index the actual post page now?
What Jim said about using index.php?*;* is actually FINE.
Quote: /index.php?action=post;topic=389687.0
Making bots no longer spidering posts.
Incorrect. It just means spiders won't be getting to the reply page but unless you have guest replying on, they wouldn't anyway.
It will likely deter Google Images picking up images however, since they use a ; in the link.
Keep in mind that these kinds of changes may take some time before you see results.
Quote from: JimM - December 29, 2010, 09:06:31 AM
Replace all this:
User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?*;wap
Disallow: /index.php?*;wap2
Disallow: /index.php?*;imode
Disallow: /index.php?action=printpage
with this:
Disallow: /index.php?*;*
in your robots.txt file.
OK, Google is still indexing print pages and wap pages .. you're saying all I need in robots.txt is Disallow: /index.php?*;* ???
Will that block Google Adsense Crawlers as well ??
Quote: Will that block Google Adsense Crawlers as well ??
Won't block Adsense but will push all content of your site out of Google - even the regular threads.
Quote from: Arantor - January 21, 2011, 06:59:39 PM
Quote: Will that block Google Adsense Crawlers as well ??
Won't block Adsense but will push all content of your site out of Google - even the regular threads.
:-X naw that's not good ... just need to block it from indexing print and wap pages
Is there any way to disable this action, or at least disable it for unregistered members? I know how to remove the button, but I am curious to know if there is a way to stop the actual page from being viewed.
There's a mod for it...
Google is still indexing my wap and print pages lol .. oh well ... boo Google
Do you have your robots.txt correctly set up? That is, is your forum in the root (/index.php) or in a lower level (/forum/index.php)? Have you checked that "Googlebot" is the correct agent name, and not something like google-bot? Is there a reason you're doing this only for Google and not excluding other searchbots? Have you confirmed that your forum is generating URLs of the form you're giving in robots.txt, and not in some other "Pretty URLs" form? You may have to give both forms, or at least, the one that a bot would see.
My Forum is not in root .. it's in /forum ... See my Robots.txt
Quote:
User-agent: Googlebot
# Don't index mobile versions
Disallow: /forum/index.php?*;wap
Disallow: /forum/index.php?*;wap2
Disallow: /forum/index.php?*;imode
Disallow: /forum/index.php?action=printpage
Hmm, should the file go in the forum folder as well ?
It has to be in the root otherwise it will be ignored.
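As a quick illustration, this Python sketch works out the one address a crawler would request robots.txt from for any given page. The function name robots_txt_url and the example.com address are placeholders, not anything from this forum.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt address for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical forum page living under /forum.
page = "http://www.example.com/forum/index.php?action=printpage;topic=123.0"
print(robots_txt_url(page))  # http://www.example.com/robots.txt
```

A copy placed at /forum/robots.txt is never at that address, which is why it would be ignored.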
Quote from: Arantor - February 27, 2011, 10:44:28 AM
It has to be in the root otherwise it will be ignored.
OK, well i have it in the correct place then.. i have it in root ... /www
Quote from: Arantor - February 27, 2011, 09:28:08 AM
There's a mod for it...
The mod I found only works for 2.0 RC2 I use 1.1.13. Anyone know of a workaround to add permissions to prevent guests from viewing the print page action?
Just inside the function that's in Printpage.php, just add is_not_guest();
Quote from: JimM - October 23, 2010, 07:16:21 PM
Using that Disallow: /index.php?action=printpage is the best way. If the spider follows the robots.txt file they will not index the pages.
The problem with this is Google will still list the links in the index, just no content with it, and if you use Webmaster tools it'll bug you to death listing all the pages it can't crawl on the errors page. The only proper way is to add rel="nofollow" to anything you don't want Google to place in their index. I did this for mine, it took awhile to do everything :) Now the only thing Google indexes is topic=###, though a few bad pages are still indexed until they get removed.
Sign up for Google's "Webmasters Tools (http://www.google.com/webmasters/)". You can remove links and see why robots.txt is/isn't working.
Lots of good info in this topic. If this is solved, please mark it solved by clicking the Mark Topic Solved link at the bottom left.