
Author Topic: Problem with Canonical PHPSESSID  (Read 388 times)

Offline SirLouen

  • Semi-Newbie
  • *
  • Posts: 71
Problem with Canonical PHPSESSID
« on: June 14, 2018, 08:20:57 PM »
For some reason, approximately 5% of the URLs on a client's site don't have a canonical URL when they carry a PHPSESSID parameter.

I can't work out what pattern causes this, because I can't see any difference between these topics and other, nearly identical ones.

For example:
https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx
This one doesn't have a canonical pointing to:
https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/

But most of the others do have the correct canonical.
For example:
https://www.forotoc.com/tratamientos-medicinales/flores-de-bach/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx
https://www.forotoc.com/grupos-de-soporte/encuentro-toc-en-vitoria/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx

What could be happening?
« Last Edit: June 15, 2018, 12:19:58 AM by Aleksi "Lex" Kilpinen »

Offline Aleksi "Lex" Kilpinen

  • A Peculiar Finn
  • Lead Support Specialist
  • SMF Super Hero
  • *
  • Posts: 17,234
  • Gender: Male
  • Don't worry, I'm n00b friendly
    • Aleksi.Kilpinen on Facebook
    • aleksi-kilpinen on LinkedIn
Re: Problem with Canonical PHPSESSID
« Reply #1 on: June 15, 2018, 12:19:23 AM »
PHPSESSID is not normally part of SMF's URL structure, and it shouldn't really need any special attention.
If you want, you can also tell Google that through the URL Parameters tool in Google Search Console.

What you don't want to do is go around posting session IDs, or get them indexed by Google.
A Finnish Support Specialist
 Happily running multiple SMF 2.0 installations.

How you can help SMF

"Before you allow people access to your forum, especially in an administrative position, you must be aware that that person can seriously damage your forum.
 Therefore, you should only allow people that you trust, implicitly, to have such access." -Douglas

Offline shawnb61

  • Support Specialist
  • Sr. Member
  • *
  • Posts: 870
    • sbulen on GitHub
Re: Problem with Canonical PHPSESSID
« Reply #2 on: June 15, 2018, 02:35:43 AM »
Quote
For some reason, approximately 5% of the URLs on a client's site don't have a canonical URL when they carry a PHPSESSID parameter.

I can't work out what pattern causes this, because I can't see any difference between these topics and other, nearly identical ones.

For example:
https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/?PHPSESSID=pcf8fi0ipactrm73xxxxxxxxxxx
This one doesn't have a canonical pointing to:
https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/

Viewing the page source, the canonical appears to be correct:
Code:
<link rel="canonical" href="https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/" />
Not sure I see the problem on that page...
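
For anyone who wants to repeat this check on more than one page, here is a minimal Python sketch using only the standard library. It assumes you already have the page HTML as a string; the names CanonicalFinder and find_canonical are made up for illustration and are not part of SMF or Pretty URLs.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also end up here.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    # The tag quoted above, wrapped in a minimal (hypothetical) page.
    page = (
        '<html><head><link rel="canonical" '
        'href="https://www.forotoc.com/diagnostico-del-trastorno-obsesivo-compulsivo/toc-sexual/" />'
        "</head><body></body></html>"
    )
    print(find_canonical(page))
```

A result of None would mean the page declares no canonical at all, which is the case the original post was describing.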
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Offline SirLouen

Re: Problem with Canonical PHPSESSID
« Reply #3 on: June 15, 2018, 09:34:00 AM »
Quote
Viewing the page source, the canonical appears to be correct:

OK, I've found the problem now. My mistake:

When one topic redirects to another, the crawler follows the redirect with the PHPSESSID still attached, because the whole system keeps carrying the PHPSESSID across the site. The redirecting URL is then listed without a canonical (obviously, since a redirect response has no page to carry one).

I think this happens because of the Pretty URLs plugin.

I tested this by creating a new topic:
index.php?topic=1736.0

When it is moved, SMF does not redirect; it just creates a new "MOVED" topic with its own ID and a link to the old topic in the new section.

But with Pretty URLs the topic has an original URL, let's say /forum-category/subcategory1, so moving it to subcategory2 creates a 301 from /forum-subcategory1/post to /forum-subcategory2/post.

So the PHPSESSID links get in the way of proper crawling (for some reason the crawler keeps refreshing the PHPSESSID through the whole process).

I noticed this with Screaming Frog. It isn't even a problem with Pretty URLs, because the plugin correctly accounts for the need for a 301 redirect.

The problem with this whole system is that big forums with PHPSESSID in their links burn through SEO crawl budget, so overall performance suffers.

I need to think of a way to make this more efficient... I'm not sure how badly disabling PHPSESSID would break the forum, so I don't want to take that risk.
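
To get a rough idea of how much of a crawl goes on session-ID duplicates, something like the following Python sketch could be run over a list of crawled URLs (for example, an export from the crawler). It is only an illustration: strip_session_id and count_session_duplicates are hypothetical helper names, the session IDs and the forum.example host in the sample are invented, and it assumes "&"-separated query strings.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_id(url, param="PHPSESSID"):
    """Return the URL with the session parameter removed from its query."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    # urlencode may re-escape characters, which is acceptable for counting.
    return urlunsplit(parts._replace(query=urlencode(kept)))


def count_session_duplicates(crawled_urls):
    """Map each session-free URL to how many crawled variants point at it."""
    return Counter(strip_session_id(url) for url in crawled_urls)


if __name__ == "__main__":
    # Sample data: session IDs and the forum.example host are made up.
    crawled = [
        "https://www.forotoc.com/tratamientos-medicinales/flores-de-bach/?PHPSESSID=aaa",
        "https://www.forotoc.com/tratamientos-medicinales/flores-de-bach/?PHPSESSID=bbb",
        "https://www.forotoc.com/tratamientos-medicinales/flores-de-bach/",
        "https://forum.example/index.php?topic=1736.0&PHPSESSID=aaa",
    ]
    for clean_url, hits in count_session_duplicates(crawled).items():
        print(hits, clean_url)
```

Any URL with a count above 1 was fetched more than once only because of the session parameter.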

Quote
What you don't want to do, is go around posting session ID's, or get them indexed by Google.

Even if they are posted, they shouldn't get indexed, because the pages have canonicals.

Offline shawnb61

Re: Problem with Canonical PHPSESSID
« Reply #4 on: June 16, 2018, 11:04:56 AM »
Sounds like the initial issue is sorted out, so I'll flag this as solved.

The problem with posting session IDs has more to do with security than with SEO. A session ID is the key to a private session.