Problem with Canonical PHPSESSID

Started by SirLouen, June 14, 2018, 08:20:57 PM

SirLouen

For some reason, about 5% of the URLs on a client's site are missing a canonical URL when they are requested with a PHPSESSID parameter.

I can't tell what pattern triggers this, because I can't see any difference between these topics and other, nearly identical ones.

For example
https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx
This one doesn't have a canonical pointing to:
https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/

But most of the others do have the correct canonical.
For example:
https://www.forotoc.com/tratamientos-medicinales/flores-de-bach/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx
https://www.forotoc.com/grupos-de-soporte/encuentro-toc-en-vitoria/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx

What could be happening?

Aleksi "Lex" Kilpinen

PHPSESSID is not normally part of SMF's URL structure, and it should not really need any special attention.
You can also tell Google that, using the URL Parameters tool in Google Search Console, if you want.

What you don't want to do is go around posting session IDs, or get them indexed by Google.
Slava
Ukraini!
"Before you allow people access to your forum, especially in an administrative position, you must be aware that that person can seriously damage your forum. Therefore, you should only allow people that you trust, implicitly, to have such access." -Douglas

shawnb61

Quote from: SirLouen on June 14, 2018, 08:20:57 PM
For some reason, about 5% of the URLs on a client's site are missing a canonical URL when they are requested with a PHPSESSID parameter.

I can't tell what pattern triggers this, because I can't see any difference between these topics and other, nearly identical ones.

For example
https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx
This one doesn't have a canonical pointing to:
https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/

Viewing the page source, the canonical appears to be correct:
<link rel="canonical" href="https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/" />

Not sure I see the problem on that page...
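
For anyone who wants to repeat this check on saved page source without reading it by eye, here is a minimal sketch using only the Python standard library. The names `CanonicalFinder` and `find_canonical` are made up for illustration and are not part of SMF or Pretty URLs; the sample HTML is just the tag quoted above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here by default.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = (
    '<html><head><link rel="canonical" '
    'href="https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/" />'
    "</head><body></body></html>"
)
print(find_canonical(page))
```

A page that returns `None` here has no canonical tag at all, which is what a crawler would report for a bare redirect response.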
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

SirLouen

Quote: Viewing the page source, the canonical appears to be correct:

OK, I've found the problem now. My mistake:

When one topic redirects to another, the system keeps carrying the PHPSESSID across the site, so every time the crawler goes through the redirect it records the redirecting link, which (obviously) has no canonical.

I think this happens because of the Pretty URLs plugin.

I tested this by creating a new topic:
index.php?topic=1736.0

When it is moved, it does not redirect; it just creates a new "MOVED" topic with its own ID, and a link to the old topic in the new section.

But with Pretty URLs the topic has an original URL, let's say under /forum-category/subcategory1, so moving it to subcategory2 creates a 301 from /forum-category/subcategory1/post to /forum-category/subcategory2/post.

Therefore the PHPSESSID links get in the way of clean crawling (for some reason the crawler keeps refreshing the PHPSESSID through the whole process).

I noticed this with Screaming Frog. It is not even a problem with Pretty URLs, because the plugin correctly takes into account the need for a 301 redirect.

The problem with this whole system is that big forums with PHPSESSID in their links literally burn SEO crawl budget, so overall performance suffers.

I need to think of a way to make this more efficient... I'm not sure to what degree disabling PHPSESSID would break the forum, so I don't want to take that risk.
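
One way to gauge how much of a crawl is spent on session-ID duplicates is to normalize the crawled URLs offline. The following is only a rough sketch with the Python standard library: `strip_session_id` and `group_by_clean_url` are hypothetical helper names, and the input is assumed to be a plain list of URLs exported from the crawler.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_id(url, param="PHPSESSID"):
    """Return the URL with the session parameter removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def group_by_clean_url(urls):
    """Map each session-free URL to the crawled variants that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_session_id(url)].append(url)
    return dict(groups)


crawled = [
    "https://www.forotoc.com/tratamientos-medicinales/flores-de-bach/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx",
    "https://www.forotoc.com/tratamientos-medicinales/flores-de-bach/",
    "index.php?topic=1736.0&PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx",
]
for clean, variants in group_by_clean_url(crawled).items():
    print(clean, len(variants))
```

Any clean URL with more than one variant is a page the crawler fetched more than once only because of the session parameter.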

Quote: What you don't want to do, is go around posting session ID's, or get them indexed by Google.

Even if they are posted, they should not get indexed, because the pages have canonicals.

shawnb61

Sounds like the initial issue is sorted out, so I'll flag this as solved.

The problem with posting session IDs has more to do with security concerns than SEO concerns. It's a key to a private session.

megon

Hi SirLouen,
I have the same problem with Screaming Frog. How did you manage it?