Bug: NoIndex why Google does not crawl and rank your forum !

Started by hartiberlin, December 09, 2009, 04:21:03 PM

hartiberlin

We have a very important bug in SMF 2.0RC2 !

There is always the Meta Tag:

<meta name="robots" content="noindex" />

inside the header !

This is why your forum does not get indexed in the Google
search engine.
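
A quick way to confirm whether a page really carries this tag is to scan its HTML source for robots meta tags. The following is only a minimal sketch using Python's standard html.parser module; the class name RobotsMetaFinder, the function robots_directives, and the sample markup are illustrative and do not come from SMF or PortaMx.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also reach this method.
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            parts = attr_map.get("content", "").split(",")
            self.directives.extend(p.strip().lower() for p in parts if p.strip())


def robots_directives(html_text):
    """Return the list of robots directives found in the given HTML text."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return finder.directives


# Hypothetical page header containing the tag reported above.
sample = '<html><head><meta name="robots" content="noindex" /></head></html>'
print(robots_directives(sample))  # ['noindex']
```

Feeding the function the saved source of a board or topic page should show whether noindex is present on that page.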

Can we please have a fix as soon as possible?

Where can it be set to:

<meta name="robots" content="index,follow" />

or

<meta name="robots" content="nofollow" />

By the way, what is better for SEO:

index,follow
or
nofollow

Will your PageRank decrease
if Google indexes all your internal links and links to external sites?
Will this DoFollow method lose you your own PageRank?

Many thanks.

Arantor

Show me a forum where you see it doing that in the HTML source please.
Holder of controversial views, all of which my own.



Arantor

Odd, very odd, since it isn't in a fresh RC2 installation.

Take a look at index.template.php.


hartiberlin

It is a very fresh RC2 installation !

I only use PortaMx and a few other MODs with it,
and I installed it just a few days ago with all the latest
RC2-compatible MODs!

These MODs are installed and enabled:

Pmx HighSlide for attachments      0.12
Sitemap    2.1.0
Ad Managment    2.3.6
PortaMx v0.971-1 upgrade    0.971-1
PortaMx v0.971    0.971
RSS Feeder    1.1.4


feline

I can tell you that this is a mistake in PortaMx on action=forum, and it will be fixed in the next upgrade ...

Fel

hartiberlin

Okay I see...
Well, it would be better to drop the robots meta tag entirely,

as this could be handled much better and more easily via
robots.txt

So could this Meta Tag please be deleted out of SMF completely ?

Or else just include this VBStyle Meta Tag Mod as a default
in SMF 2.0 RC3 or final,
so we will have very good dynamic meta tags which
contain the contents of the postings?

This is still missing, and fixed meta tags for the keywords are not good for
SEO.
We really need to get good SEO features into SMF finally.

Many thanks.

Regards, Stefan.

Arantor

Quote from: hartiberlin on December 09, 2009, 06:01:18 PM
so this could be handled much better and easily via
robots.txt

Not everything can, since the meta tag ALSO manages duplicate content, which you CANNOT UNDER ANY CIRCUMSTANCES manage through robots.txt given its limited specification. Plus, as I already advised, not everyone is able to modify robots.txt.

Quote
So could this Meta Tag please be deleted out of SMF completely ?

You can if you wish; it won't be removed from 2.0.

Quote
just include this VBStyle Meta Tag Mod as a Default
in SMF2.0RC3 or final,

No more features are being added to 2.0 at this time.

Quote
so we will have very good dynamic Meta Tags which
will contain the contents of the postings ?

No more features are being added to 2.0 at this time, as it is 'feature locked'. In any case, meta keywords is next to useless, meta description is passingly useful though the increased performance hit is unexpected and unpleasant.


hartiberlin

I see,
many thanks for the additional information.

I have now fixed it by using the

Meta Data Mod:

http://custom.simplemachines.org/mods/index.php?mod=1871

There you can also set the robots meta tag
to always display index,follow.

This fixes the problem with the robots Meta Tag.

The question now is:
on which pages should this be set to nofollow or noindex?

Many thanks for your hard work.

Regards, Stefan.

Arantor

nofollow should be on most of the non-content links, i.e. things like the reply button. Print page really should be nofollow if it isn't already to avoid duplicate content penalties.

Also anything that guests can't normally see should be nofollowed too.


hartiberlin

How can we set the print page and the WAP2 page
to noindex in robots.txt?

As I have now installed the Meta Tag Mod,
it has all pages set to index,follow,

so I now need to use
robots.txt
to block the
print pages and the WAP2 pages.

Or will this confuse Googlebot, because the meta tag will tell it
to crawl those pages?

Arantor

Quote from: hartiberlin on December 09, 2009, 07:26:55 PM
How can we set the print page and the WAP2 page
to noindex in robots.txt?

You can't. Precisely my point. It *has* to be done via meta tag.

Quote
Or will this confuse Googlebot, because the meta tag will tell it
to crawl those pages?

Yes, it'll crawl it and possibly penalise you for duplicate content which it wouldn't always have done before.
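
The point made above, that noindex for the print and WAP2 pages has to come from the meta tag, can be sketched as a small per-URL policy. This is only an illustration in Python, not SMF or mod code: the names NOINDEX_ACTIONS, query_params and robots_content are hypothetical, and it assumes that WAP2 pages carry a wap2 parameter in the URL, which this thread does not confirm.

```python
import re

# Hypothetical policy: "action" values whose pages should not be indexed.
NOINDEX_ACTIONS = {"printpage"}


def query_params(url):
    """Parse a query string split on ';' or '&' (SMF URLs use ';')."""
    _, _, query = url.partition("?")
    query = query.split("#", 1)[0]  # ignore any fragment
    params = {}
    for pair in re.split(r"[;&]", query):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        params[key] = value
    return params


def robots_content(url):
    """Return the robots meta content this sketch would emit for a URL."""
    params = query_params(url)
    # Assumption: WAP2 pages are marked by a "wap2" parameter.
    if params.get("action") in NOINDEX_ACTIONS or "wap2" in params:
        return "noindex"
    return "index,follow"


print(robots_content("http://www.ruhleben.com/index.php?topic=8.0"))
print(robots_content("http://www.ruhleben.com/index.php?action=printpage;topic=8.0"))
```

With a policy like this the normal topic page stays at index,follow while the print view gets noindex, instead of one fixed value on every page.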


hartiberlin

Hmm,
so is it not a good idea to use the
Meta Data Mod:

http://custom.simplemachines.org/mods/index.php?mod=1871

for this?

Are you sure that you can't forbid it from crawling:

index.php?action=printpage

Arantor

You can try using Disallow: index.php?action=printpage but it isn't always - from experience - heeded properly. Simpler solution is actually the mod I have which would remove the link entirely for guests, meaning no chance of duplicate content.
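
To see how a parser that follows the rules would read such a Disallow line, here is a minimal sketch with Python's standard urllib.robotparser. The robots.txt content is hypothetical, and the path is written with a leading slash, which the rule quoted above omits. It shows only how this one parser interprets the rule; it says nothing about whether a given crawler will heed it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the forum; note the leading slash on the path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php?action=printpage
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="Googlebot"):
    """Return True if this parser would let the agent fetch the URL."""
    return parser.can_fetch(agent, url)


print(allowed("http://www.ruhleben.com/index.php?topic=8.0"))
print(allowed("http://www.ruhleben.com/index.php?action=printpage;topic=8.0"))
```

The rule is a plain prefix match, so it covers every print URL that starts with index.php?action=printpage while leaving the normal topic URLs alone.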


hartiberlin

Well, I just checked the difference between
a normal page and the print page
via
http://www.duplicatecontent.net/

and they are only about 60% similar:

Page A: http://www.ruhleben.com/index.php?topic=8.0
Page B: http://www.ruhleben.com/index.php?action=printpage;topic=8.0

Measure                     Page A             Page B             Similarity
HTML fingerprint            0000afec1f6fa111   0000018010018791   26.67%
HTML distribution value     (see below)        (see below)        7.06%
Total HTML similarity                                             16.86%
Standard text similarity                                          60.84%
Smart text similarity                                             62.21%
Total text similarity                                             61.52%

HTML distribution value, Page A:
43 00 00 00 28 00 00 00 41 00 00 00 00 00 00 01 01 00 00 00 00 00 00 00 00 00 00 00 0e 00 00 00 00 00 00 00 00 00 00 39 08 00 00 00 00 00 00 00
HTML distribution value, Page B:
02 00 00 00 0f 00 00 00 01 00 00 00 00 00 00 01 01 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 01 03 00 00 00 00 00 00 00

===

So the question is whether Google would penalise that.
I have heard that pages need to be 35% different,
and here they are about 40% different.
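
For reference, the arithmetic behind "about 40% different" and one possible way to measure text similarity can be sketched in Python. How duplicatecontent.net or Google actually computes similarity is not known from this thread, so the word-level difflib.SequenceMatcher ratio below is a swapped-in stand-in, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def text_similarity(text_a, text_b):
    """Word-level similarity of two texts as a percentage (0 to 100)."""
    words_a, words_b = text_a.split(), text_b.split()
    return SequenceMatcher(None, words_a, words_b).ratio() * 100


def percent_different(similarity_percent):
    """Convert a similarity percentage into a difference percentage."""
    return 100.0 - similarity_percent


# Total text similarity reported by the tool above.
print(round(percent_different(61.52), 2))  # 38.48

# Toy example: three of four words are shared.
print(text_similarity("a b c d", "a b x d"))  # 75.0
```

So the reported 61.52% total text similarity corresponds to 38.48% difference, which is where the "about 40%" figure comes from.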

Arantor

Google would give, and has in the past given, penalties for duplicate content like that, yes.


hartiberlin

Many thanks for your Printpage Permission mod.
I installed it now.

Could you also do something like this for
the WAP2 content pages ?

Arantor

I suppose I could look into it in about 3 months' time, after I finish this paid-for mod.

