Author Topic: mod_rewrite 301  (Read 38843 times)

Offline destalk

  • Sr. Member
  • ****
  • Posts: 797
mod_rewrite 301
« on: December 11, 2005, 12:48:01 PM »
Hi

I want to switch on SEO friendly URLs on one of my forums, which works fine on my server. But then I will be left with thousands of spidered 'dynamic' URLs.

So, does anyone know how to do mod_rewrite in the .htaccess file (or elsewhere), so that I can throw a 301 redirect from all the old dynamic URLs to the new SEO friendly ones? Obviously, I don't want to have to write a separate redirect for each URL.  :-\

Any help, much appreciated.  :) 
« Last Edit: December 11, 2005, 12:51:30 PM by destalk »

Offline JayBachatero

  • SMF Friend
  • SMF Super Hero
  • *
  • Posts: 19,562
  • Gender: Male
    • @jaycreations on Twitter
    • JayBachatero.com
Re: mod_rewrite 301
« Reply #1 on: December 11, 2005, 10:16:25 PM »
« Last Edit: December 11, 2005, 10:20:47 PM by JayBachatero »
Follow me on Twitter

"HELP!!! I've fallen and I can't get up"
This moment has been brought to you by LifeAlert

Offline destalk

Re: mod_rewrite 301
« Reply #2 on: December 11, 2005, 11:57:02 PM »
Quote
It does this automatically.  If you have http://www.simplemachines.org/community/index.php?topic=60194.0 it will go to http://www.simplemachines.org/community/index.php/topic,60194.html.

Hi JayBachatero

Thanks for that. But I'm sorry, that's not quite accurate. Or, at least, it's not what I was after.

What you are talking about is the way that URLs are displayed. When the SEO option is switched on, the displayed URLs are changed  - e.g. http://www.simplemachines.org/community/index.php/topic,60194.html. Which is great.

But if someone navigates to your site via the old dynamic URL (because it is listed somewhere as that) then it still displays that URL - e.g. http://www.simplemachines.org/community/index.php?topic=60194.0.

Because of that, search engines will continue to display the old, incorrect URL, because the response header says 200 OK. What needs to happen is for the old URL to return a 301 HTTP status in the header and redirect to the new SEO friendly URL, so that the search engines know that the new URLs are being used. Otherwise they will just treat it as duplicate content.
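To make the difference concrete, here is a minimal sketch of what the old URL would have to answer with. The function name redirect_headers_for() and the pattern are assumptions made for illustration only; this is not SMF code, and it covers plain topic views only.

```php
<?php
// Hypothetical helper (not part of SMF): work out which headers an old
// dynamic topic URL should answer with once friendly URLs are switched on.
function redirect_headers_for($request_uri, $script_url)
{
	// Only plain topic views such as "index.php?topic=60194.0" are matched.
	if (preg_match('~[?;&]topic=(\d+(?:\.\w+)?)$~', $request_uri, $match))
	{
		return array(
			'HTTP/1.1 301 Moved Permanently',
			'Location: ' . $script_url . '/topic,' . $match[1] . '.html',
		);
	}

	// Anything else is served as usual (200 OK), so no extra headers.
	return array();
}

$headers = redirect_headers_for(
	'/community/index.php?topic=60194.0',
	'http://www.simplemachines.org/community/index.php'
);

foreach ($headers as $line)
	echo $line, "\n";
```

In a real page each returned line would be passed to header() before any output is sent, followed by exit.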

I hope I am being clear. :)

Offline JayBachatero

Re: mod_rewrite 301
« Reply #3 on: December 12, 2005, 12:45:43 AM »
OK, got you now.  I'm afraid that I can't help you with this, but I'm sure you will get your answer from someone else here.

-JayBachatero

Offline destalk

Re: mod_rewrite 301
« Reply #4 on: December 12, 2005, 05:57:48 AM »
I hope so.

Thanks anyway.

Offline destalk

Re: mod_rewrite 301
« Reply #5 on: December 14, 2005, 09:18:41 AM »
Actually, I've just discovered this external forum topic specifically about mod_rewrite and SMF.

http://www.doriat.com/viewtopic.php?p=648#648

I haven't tried the solution yet, but I'll report back if it works, in case anyone is interested.

If anyone has a simpler solution, I'm still interested. :)

Offline Oldiesmann

  • Developer
  • SMF Super Hero
  • *
  • Posts: 24,814
  • Gender: Male
  • Ask me about the function DB :)
    • oldiesmann on Facebook
    • Oldiesmann on GitHub
    • http://www.linkedin.com/in/michaeleshom on LinkedIn
    • @oldiesmann on Twitter
    • Archie Comics Fan Forum
Re: mod_rewrite 301
« Reply #6 on: December 14, 2005, 02:43:35 PM »
That might work, but there are a couple of problems with that code:

1. It doesn't handle the start parameter (the part after the . that tells SMF what page or message we're viewing), so an attempt to access a specific page of a board or topic or a specific post within a topic would result in being redirected to the first page instead of the actual location.

2. It creates additional search engine friendly URLs for the profile and the search, which might not work properly and would result in extra redirection since SMF doesn't output the URLs this way.
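To illustrate point 1: the value after "topic=" or "board=" has two parts, the ID and the start (an offset such as 15, or a marker such as msg532). A redirect that keeps only the ID will always land on the first page. Below is a minimal sketch of the split. split_topic_value() is a hypothetical helper, not an SMF function, and treating a missing start as 0 is an assumption.

```php
<?php
// Hypothetical helper, not part of SMF: split the value of "topic=" or
// "board=" into the numeric ID and the start part that follows the dot.
function split_topic_value($value)
{
	$parts = explode('.', $value, 2);

	return array(
		'id' => (int) $parts[0],
		// Assumption: no start part means the first page.
		'start' => isset($parts[1]) ? $parts[1] : '0',
	);
}

$topic = split_topic_value('125.msg532');
echo $topic['id'], ' / ', $topic['start'], "\n";
```

A redirect that wants to land on the right page or post has to carry both parts over to the new URL.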

This topic came up on another board a couple months ago. Here's the solution I came up with (this one works both ways, so if you ever decide to disable search engine friendly URLs, it will redirect the friendly URLs back to the dynamic ones).

index.php

Find
Code: [Select]
// Check if compressed output is enabled, supported, and not already being done.
Add before that:
Code: [Select]
// Only do this if we're just viewing a board or a topic - "board=" or "topic=" could be there in other situations as well...
if (empty($_REQUEST['action']) && (isset($_REQUEST['board']) || isset($_REQUEST['topic'])))
{
	$request_uri = $_SERVER['REQUEST_URI'];
	$lower_uri = strtolower($request_uri);
	$location = '';

	// Search engine friendly URLs are off: send friendly URLs back to the dynamic ones.
	if (empty($modSettings['queryless_urls']))
	{
		// This is surprisingly simple... Figure out whether it's a board or a topic, and replace a few characters
		if (($pos = strpos($lower_uri, '/board,')) !== false)
		{
			// We're really only interested in what follows "/board,"...
			$string = substr($request_uri, $pos + 7);

			// Replace every "/" with a ";", every "," with an "=" and get rid of ".html"...
			$location = $scripturl . '?board=' . str_replace(array('/', ',', '.html'), array(';', '=', ''), $string);
		}
		elseif (($pos = strpos($lower_uri, '/topic,')) !== false)
		{
			// We only need what's after "/topic,"...
			$string = substr($request_uri, $pos + 7);

			// Again, just replace slashes with semicolons, commas with equal signs and get rid of the .html...
			$location = $scripturl . '?topic=' . str_replace(array('/', ',', '.html'), array(';', '=', ''), $string);
		}
	}
	// Search engine friendly URLs are on: send dynamic URLs to the friendly ones.
	else
	{
		// Still just a simple matter of replacing things, although a bit more work is required for topics...
		if (($pos = strpos($lower_uri, 'board=')) !== false)
		{
			// Get whatever follows the "board="
			$string = substr($request_uri, $pos + 6);

			// Reverse of what we did above - replace semicolons with slashes, and equal signs with commas.
			// str_replace() returns the new string, so the result has to be assigned.
			$string = str_replace(array(';', '='), array('/', ','), $string);

			// Don't forget the .html...
			$location = $scripturl . '/board,' . $string . '.html';
		}
		elseif (($pos = strpos($lower_uri, 'topic=')) !== false)
		{
			// Get whatever follows the "topic="
			$string = substr($request_uri, $pos + 6);

			// Split off the anchor string from the rest of it...
			// (Browsers normally don't send the "#..." part to the server, so this is usually empty.)
			$anchorstring = '';
			if (($anchorpos = strpos($string, '#')) !== false)
			{
				// Isolate the anchor part, then drop it from the string.
				$anchorstring = substr($string, $anchorpos);
				$string = substr($string, 0, $anchorpos);
			}

			// Replace again
			$string = str_replace(array(';', '='), array('/', ','), $string);

			// Add the .html, then put the anchor back on the end.
			$location = $scripturl . '/topic,' . $string . '.html' . $anchorstring;
		}
	}

	// Redirect, and stop here so the page isn't output as well.
	if ($location != '')
	{
		header('HTTP/1.1 301 Moved Permanently');
		header('Location: ' . $location);
		exit;
	}
}
« Last Edit: December 18, 2005, 02:44:49 PM by Oldiesmann »
Michael Eshom
Webmaster / SMF Lead Developer
oldiesmann@simplemachines.org

Offline JayBachatero

Re: mod_rewrite 301
« Reply #7 on: December 14, 2005, 03:05:23 PM »
Moved it to the [[Tips and tricks]] board.

Offline destalk

Re: mod_rewrite 301
« Reply #8 on: December 15, 2005, 02:45:46 AM »
Quote
That might work, but there are a couple of problems with that code:

Thanks Oldiesmann, that's great and seems like a much more elegant solution.

A couple of questions, if you don't mind?

I've noticed that when SE friendly URLs are enabled in SMF, that the pull down menus still point to the default dynamic PHP URLs. Will your solution also redirect the urls generated by the drop-down menus? Although, perhaps, a way to also make the pull down menus generate SE friendly URLs would be a better solution?

Quote
1. It doesn't handle the start parameter (the part after the . that tells SMF what page or message we're viewing), so an attempt to access a specific page of a board or topic or a specific post within a topic would result in being redirected to the first page instead of the actual location.

I suspect that this is intentional on the part of the author, to avoid duplication of content. I think that the idea is that it ensures that the search engines only index one instance of a topic (the beginning, as you said), rather than whatever point in the discussion the SE spider happened to pick up (which could well be an anchor point in the middle of a thread). From that point of view, it's quite clever.

I agree that the search and profiles don't really need to be SE friendly, though.

I'll try your solution though and report back. Will it work in all versions of SMF? (I'm using the Beta 1.1.)

Thanks JayBachatero, for moving this to Tips and Tricks. And thanks again to both of you for the help. :)
« Last Edit: December 15, 2005, 02:48:46 AM by destalk »

Offline destalk

Re: mod_rewrite 301
« Reply #9 on: December 15, 2005, 09:11:36 AM »
Hi Oldiesmann

I am getting the following error when I paste that code in (this is with SEO friendly URLs switched off);

Code: [Select]
Warning: Unexpected character in input: ' in /home/domain/public_html/index.php on line 183

If I switch SE Friendly URLs on, the errors look like this;

Code: [Select]
Warning: Unexpected character in input: ' in /home/domain/public_html/index.php on line 183

Notice: Array to string conversion in /home/domain/public_html/index.php on line 171

Warning: Cannot modify header information - headers already sent by (output started at /home/domain/public_html/index.php:183) in /home/domain/public_html/index.php on line 179

Warning: Cannot modify header information - headers already sent by (output started at /home/domain/public_html/index.php:183) in /home/domain/public_html/index.php on line 180
« Last Edit: December 15, 2005, 09:17:38 AM by destalk »

Offline Oldiesmann

Re: mod_rewrite 301
« Reply #10 on: December 18, 2005, 02:42:28 PM »
Whoops! Mixed up the order of the values being passed to str_replace :)

Fixed the code above.

Offline destalk

Re: mod_rewrite 301
« Reply #11 on: December 18, 2005, 05:53:00 PM »
Thank you, as ever.

Very much appreciated.  :D

Offline destalk

Re: mod_rewrite 301
« Reply #12 on: December 18, 2005, 06:10:45 PM »
Just one minor issue. When the redirect kicks in, it loses the # sign. E.G. it forwards the url to something like this;

http://www.domain.com/index.php/topic,125.msg532.html

When the URL is actually;

http://www.domain.com/index.php/topic,125.msg532.html#msg532

Again, this leaves the possibility of Google having two URLs to deal with.

<---EDIT--->

Also, did you say that this would work the other way around? I.e. SE friendly to dynamic/original URLs? Because I get the following error when I turn the SEO friendly option off;

Fatal error: Call to undefined function: strpost() in /home/domain/public_html/index.php on line 120
« Last Edit: December 18, 2005, 06:20:16 PM by destalk »

Offline Ben_S

  • SMF Friend
  • SMF Super Hero
  • *
  • Posts: 11,702
  • xxx
Re: mod_rewrite 301
« Reply #13 on: December 18, 2005, 06:17:36 PM »
AFAIK google will ignore #'s anyway.
Liverpool FC Forum with 14 million+ posts.

Offline destalk

Re: mod_rewrite 301
« Reply #14 on: December 18, 2005, 06:25:03 PM »
Quote
AFAIK google will ignore #'s anyway.

I was just about to disagree with you, then I went to check. You are quite right. :P

Thanks for that. :)

Offline Oldiesmann

Re: mod_rewrite 301
« Reply #15 on: December 20, 2005, 08:56:13 AM »
The code isn't supposed to drop the anchor string, but if Google ignores them anyway then I guess there's no point in fixing them. Also, the strpost() error is due to a typo. It should be strpos().
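One detail worth knowing here: browsers keep the part after the "#" to themselves and do not send it in the request, so $_SERVER['REQUEST_URI'] never contains it and server-side code has nothing to copy into the redirect. The sketch below uses parse_url() on the example URL from this topic to show which part reaches the server. The variable names are only for illustration.

```php
<?php
// The URL as it appears in the browser's address bar.
$url = 'http://www.domain.com/index.php/topic,125.msg532.html#msg532';
$parts = parse_url($url);

// Only the path (plus the query string, if any) is sent in the request,
// and that is what ends up in REQUEST_URI.
$sent_to_server = $parts['path'] . (isset($parts['query']) ? '?' . $parts['query'] : '');

echo 'Sent to the server: ', $sent_to_server, "\n";
echo 'Kept by the browser: #', $parts['fragment'], "\n";
```

According to the HTTP specification, a browser that follows a redirect whose Location has no fragment should reuse the fragment of the URL it originally requested, although older browsers may not do this.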

Offline destalk

Re: mod_rewrite 301
« Reply #16 on: December 20, 2005, 11:08:50 AM »
Thanks, as ever, Oldiesmann.

Offline SleePy

  • Site Team Lead
  • SMF Master
  • *
  • Posts: 29,928
  • Gender: Male
  • Thats his happy face.
    • jdarwood007 on GitHub
    • @jdarwood on Twitter
    • SleePy Code - My personal site
Re: mod_rewrite 301
« Reply #17 on: December 20, 2005, 05:58:09 PM »
I would like to use this, since our forums can't do this because we don't have the mod we need.

I got it to do stuff like

Quote
RewriteEngine on
RewriteRule ^index.htm index.php [L]
RewriteRule ^forums.htm index.php?action=forum [L]
RewriteRule ^search.html index.php?action=search [L]

But as soon as I add

RewriteRule ^board([0-9.]*).html index.php?board=$1 [L]

I get a 500 Internal Server Error.

Am I doing anything wrong?

In fact, if I add anything else, it breaks:
Quote
RewriteRule ^profile.html index.php?action=profile [L]
RewriteRule ^PrivateMessage.html index.php?action=pm [L]
RewriteRule ^Calendar.html index.php?action=calendar [L]
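A pattern like the board one can be tried out on its own, away from Apache, to see what it captures. The sketch below does that with preg_match() in PHP, with the dot before "html" escaped and the pattern anchored at the end. It only tests the regular expression and says nothing about what causes the 500 error; the server's error log is the place to look for that. The sample file name is made up.

```php
<?php
// Same idea as the board rule, with the literal dot escaped.
$pattern = '~^board([0-9.]*)\.html$~';

// Made-up request path, for testing only.
$request = 'board5.0.html';

if (preg_match($pattern, $request, $match))
	$target = 'index.php?board=' . $match[1];
else
	$target = '';

echo $target, "\n";
```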
« Last Edit: December 20, 2005, 06:01:13 PM by SleePy »
Jeremy D — Site Team / SMF Developer

Offline destalk

Re: mod_rewrite 301
« Reply #18 on: December 25, 2005, 08:54:24 AM »
What would also be really nice is if the PHPSESSID stuff could be avoided when a forum with SEF URLs switched on is first viewed. Google is full of PHPSESSID URLs from my forums, so it is indexing them. Also, the first time that the home page is shown to a user, it still shows the PHP dynamic URLs.

Any ideas welcome.

Offline destalk

Re: mod_rewrite 301
« Reply #19 on: January 06, 2006, 05:47:10 PM »
Hi

This all works fine, apart from one feature. I've noticed that when an email notification is sent to a member, it is still in the old format. And when they click on the link, it redirects them to the beginning of a thread, rather than to the *new* post.

Is there any way to fix this?

Thanks.