Aliasing old VB post URLs to SMF URLs

Started by MrMike, August 24, 2011, 09:09:23 AM


MrMike

I'm converting a VB forum to SMF and I'm wondering if it's possible to alias or redirect the old URLs to the SMF URLs, which are slightly different.

Old VB style URLs:
http://www.domain.com/staff-area/87846-posting-rules.html

Converted SMF URLs:
http://www.domain.com/forum/index.php?topic=87846.0

Is there a way, perhaps with htaccess, to redirect the old URLs to the new URLs? The post IDs are the same and I'm trying to avoid the massive number of 404s that will result when the URLs to 250,000 posts suddenly change.

I'm thinking something like Simple SEF might help to mitigate this somewhat by making the URLs similar.

Maybe use htaccess to direct 404 errors to a php script that'll reformat the URLs and redirect the users to the correct post?

Suggestions would be welcome.

Oldiesmann

It should be quite easy to do that with .htaccess, but unfortunately I'm not very knowledgeable on that subject. I'll see if I can point someone who knows more about mod_rewrite in this direction :)
Michael Eshom
Christian Metal Fans

IchBin™

Hmmm... been a long time since I messed with these things. Maybe try something like this:
RewriteRule ^staff-area/([0-9]+)$   forum/index.php?topic=$1 [R=301,L]

Cross your fingers and hope it works. :D
IchBin™        TinyPortal

MrMike

I coded a simple solution for this. I set my 404 page to 404.php, and inside 404.php I placed this code (crude, but it works). It just silently forwards the user to the topic without any fuss.

<?php
// reformat old-style VBulletin links to SMF-style links

$qstring = $_SERVER['REQUEST_URI'];
$pattern = '|/(\d+)-|';
preg_match($pattern, $qstring, $matches, PREG_OFFSET_CAPTURE, 3);

$topic_id = $matches[1][0];

// make new link
$new_url = "http://www.domain.com/index.php?topic=$topic_id";

print <<<EOM
<html>
<head>
<meta http-equiv="refresh" content="0;url=$new_url">
</head>
<body>
</body>
</html>
EOM;

?>

Oldiesmann

Use this code instead of that big print section... Takes up less space...

header('Location: ' . $new_url);

Ricky.

Quote from: IchBin™ on September 01, 2011, 09:21:37 PM
Hmmm... been a long time since I messed with these things. Maybe try something like this:
RewriteRule ^staff-area/([0-9]+)$   forum/index.php?topic=$1 [R=301,L]

Cross your fingers and hope it works. :D
Above is the best solution; you will not lose traffic either.

IchBin™

Quote from: Ricky. on September 09, 2011, 01:26:43 AM
Quote from: IchBin™ on September 01, 2011, 09:21:37 PM
Hmmm... been a long time since I messed with these things. Maybe try something like this:
RewriteRule ^staff-area/([0-9]+)$   forum/index.php?topic=$1 [R=301,L]

Cross your fingers and hope it works. :D
Above is the best solution; you will not lose traffic either.


I was hoping to know if it worked. lol

MrMike

Yep, using header() is a better, cleaner way to do it.

The HTML stuff was left over from an earlier incarnation: we were going to print out a message about the site changeover with some additional links and info, but we dropped that idea.



Quote from: Oldiesmann on September 09, 2011, 12:48:22 AM
Use this code instead of that big print section... Takes up less space...

header('Location: ' . $new_url);

MrMike

I may play with this and see if I can get it to go.  :D

EDIT: I fiddled with this for a while but couldn't get it going. It looks like it should work.

Quote from: IchBin™ on September 01, 2011, 09:21:37 PM
Hmmm... been a long time since I messed with these things. Maybe try something like this:
RewriteRule ^staff-area/([0-9]+)$   forum/index.php?topic=$1 [R=301,L]

Cross your fingers and hope it works. :D
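
A possible reason the quoted rule never fired: its pattern ends with "$" right after the digits, while the old URLs carry on with a hyphen, a title slug, and ".html", so staff-area/87846-posting-rules.html cannot match it. Below is a minimal sketch of an adjusted rule. It assumes the .htaccess file sits in the web root, the forum lives under /forum/, and every old URL has the shape section/ID-slug.html. It has not been tested against the site discussed in this thread.

```apache
RewriteEngine On
# section/ID-slug.html  ->  /forum/index.php?topic=ID.0
# [^/]+ matches any single section folder (staff-area, application, ...)
RewriteRule ^[^/]+/([0-9]+)-[^/]+\.html$ /forum/index.php?topic=$1.0 [R=301,L]
```

The leading slash on the target keeps the external redirect from being built out of a filesystem path when no RewriteBase is set.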

IchBin™

Did you have "RewriteEngine On"  set? Either way, I guess it doesn't matter since you are up and working. :)

MrMike

Yep, RewriteEngine was on. I'm sure it could be done, but sometimes writing a little bit o' code is the simplest solution.

You can see it working here:
http://www.mgkiller.com/application/305470-resolume-avenue-v3-3-2-a.html (the old VBulletin URL)

...gets re-routed to:
http://www.mgkiller.com/index.php?topic=305470 (the new SMF URL)

MGKiller.com is a large directory of file listings for file-sharing sites like Hotfile.com, Rapidshare.com, WUpload.com, etc.

Currently about 300,000 posts and growing daily. (Most file-sharing sites do NOT let you search for a particular file or name of an upload. MGKiller.com is kind of a solution to let you do that.)

We're using a modified version of the very nice "SilentWave" theme from DZiner Studios.




Quote from: IchBin™ on September 12, 2011, 11:37:35 AM
Did you have "RewriteEngine On"  set? Either way, I guess it doesn't matter since you are up and working. :)

_saiko

I managed to get translation from
'showthread.php?t=<>'
to
'index.php?topic=<>'
using the following in .htaccess:
RewriteEngine on
RewriteCond %{QUERY_STRING} ^t=([0-9]+)$
RewriteRule ^showthread\.php$ /index.php?topic=%1 [R=301,L]


Can't think of how to translate
'showthread.php?p=<post_no>#post<post_no>'
to
'index.php?topic=<topic_no>.msg<post_no>#msg<post_no>'
since vbulletin links don't have the topic_no in them :|

Any ideas?

MrMike

Using htaccess proved problematic for me...I'm fairly sure it could be done, but I found that it was faster, simpler, and more flexible to just write a small "steering" script that intercepted the old URLs and translated them to the new URLs.

I messed with htaccess for a day or two, but was able to write the intercept script in about an hour.



Quote from: _saiko on November 05, 2011, 07:26:24 PM
I managed to get translation from
'showthread.php?t=<>'
to
'index.php?topic=<>'
using the following in .htaccess:
RewriteEngine on
RewriteCond %{QUERY_STRING} ^t=([0-9]+)$
RewriteRule ^showthread\.php$ /index.php?topic=%1 [R=301,L]


Can't think of how to translate
'showthread.php?p=<post_no>#post<post_no>'
to
'index.php?topic=<topic_no>.msg<post_no>#msg<post_no>'
since vbulletin links don't have the topic_no in them :|

Any ideas?

_saiko

Could you share the script?

As I said, I was already able to redirect showthread.php?t=666 to index.php?topic=666
The problem is links pointing to specific posts, such as showthread.php?p=1212#message1212
They don't contain the topic id, so I can't rewrite them to index.php?topic=999.msg1212#msg1212.

Not sure if I explained this clearly...

MrMike

Quote from: _saiko on November 07, 2011, 09:45:04 AM
Could you share the script?

Sure, here you go. One thing that may be different is that VB was using some sort of SEO package to fiddle with the URLs, so they may not be exactly like the ones you have...you may need to do some htaccess magic as well.

All I did was set my 404 Error directive to go to '404.php' and inside '404.php' is this simple bit of code:
<?php

/*
Example VB URL:
http://www.domain.com/application/305470-resolume-avenue-v3-3-2-a.html

Translated SMF URL:
http://www.domain.com/index.php?topic=305470
*/

$qstring = $_SERVER['REQUEST_URI'];

$pattern = '|/(\d+)-|';
preg_match($pattern, $qstring, $matches, PREG_OFFSET_CAPTURE, 3);

$topic_id = $matches[1][0];

// make link
$new_url = "http://www.domain.com/index.php?topic=$topic_id";

header("Location: $new_url");

?>
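
For anyone adapting the script above, here is a hedged variant that also copes with requests containing no topic ID, so that genuinely missing pages still answer with a 404 instead of redirecting to an empty topic. The function name vb_to_smf_url() and the $base parameter are made up for this sketch and are not part of SMF or vBulletin. It assumes PHP 7.1 or newer.

```php
<?php
// Sketch only: vb_to_smf_url() and $base are hypothetical names.

/**
 * Map an old vBulletin-style path such as
 * /application/305470-resolume-avenue-v3-3-2-a.html
 * to the matching SMF topic URL, or return null if no topic ID is found.
 */
function vb_to_smf_url(string $requestUri, string $base = 'http://www.domain.com/index.php'): ?string
{
    // Look only at the path, so a stray query string cannot confuse the match.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    // Digits that follow a slash and are followed by a hyphen are the topic ID.
    if (preg_match('|/(\d+)-|', $path, $matches) !== 1) {
        return null;
    }

    return $base . '?topic=' . $matches[1];
}

// Intended use inside 404.php (left commented so the sketch has no side effects):
// $url = vb_to_smf_url($_SERVER['REQUEST_URI']);
// if ($url !== null) {
//     header('Location: ' . $url, true, 301); // permanent redirect
// } else {
//     http_response_code(404);                // genuinely missing page
// }
// exit;
```

Passing 301 as the third argument to header() marks the move as permanent; a bare Location header sends a 302 by default.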

MrMike

.htaccess is pretty powerful, but there are things it simply cannot do. You might want to think about routing requests to an active page for processing.

Here's something else that might work. This is something I had to do on CODmb.com, a forum for Call of Duty players. It's not an ideal fix, but it works. When we converted the board we didn't want to lose all the current links into the board. So what we did was save the old post/user IDs, detect that pattern, and reroute the request to a php file to build the right URL.

The SMF board on CODmb.com was converted from a bespoke message board with a very different database structure. When we converted the posts and topics we saved all the post and topic IDs from the original table and mapped them to the new post and topic IDs in the SMF messages table.

We saved all this stuff in a new table, so we have a table that has the before and after IDs to the posts.

The old board used 2 files named 'view_topic.php' and 'view_posts.php' to view things. We wrote our own versions of those two files and all they do is look up the old post/topic info, build a new URL for SMF, and send the user to it.

The downside is that those old URLs and the table will persist for years. Even though they aren't valid URLs, they still work. Keeping the files and table there is no big deal, but it's clutter. Like I said, it's not a perfect solution, but it works. :)

Perhaps you could do something similar on your site?

_saiko

Thanks.

What I figured is the following:
Since vb doesn't use topic_id in its URL and uses only the post_id, I'll have to get the topic_id.
The smf_messages table contains both topic_id and post_id, so I'll use the existing SMF database to match the topic_id.

I'll just create a vb2smf_url.php like this:

<?php

/*
URL Translation 
From:
(topic_id  = 1234 is inside thread_id = 12)
http://www.domain.com/showthread.php?p=1234

To:
http://www.domain.com/index.php?topic=12.msg1234
*/

$qstring = $_SERVER['REQUEST_URI'];

$pattern = '|/showthread\.php\?p=(\d+)|';
preg_match($pattern, $qstring, $matches, PREG_OFFSET_CAPTURE, 3);

$post_id = $matches[1][0];

//connect to db and get topic_id for the read post_id
$db = mysql_connect('DB_ADDRESS','DB_USER','DB_PASS') or die("Database error");
mysql_select_db('DB', $db); 
$topic_id = mysql_query('SELECT id_topic FROM `smf_messages` WHERE id_msg = $post_id');

// make link
$new_url = "http://www.domain.com/index.php?topic=$topic_id.msg$post_id";

header("Location: $new_url");

?>


Does this make any sense?
I realize this approach makes an additional connection to the database. How can I optimize the procedure so I can benefit from the already connected session?

MrMike

Quote from: _saiko on November 10, 2011, 08:35:41 AM
Does this make any sense?
I realize this approach makes an additional connection to the database. How can I optimize the procedure so I can benefit from the already connected session?

Yep, this is more or less what we did.

As for the extra connection, I wouldn't worry about it. It's unlikely to be an issue unless you have overwhelming traffic.

I'm not sure the sql you showed there is right, however.
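
Two things in that query look off. The SQL is in single quotes, so PHP never substitutes $post_id into it, and mysql_query() returns a result resource rather than the topic ID itself. The old mysql_* functions were also removed in PHP 7.0. Here is a rough sketch of the lookup using mysqli with a prepared statement. The helper names lookup_topic_id() and smf_post_url() are made up for illustration, and the sketch assumes the default smf_ table prefix.

```php
<?php
// Sketch only: lookup_topic_id() and smf_post_url() are hypothetical helpers.
// Assumes the default smf_ table prefix and the mysqli extension.

/** Return the topic that contains the given message, or null if none is found. */
function lookup_topic_id(mysqli $db, int $postId): ?int
{
    // The ID is bound as an integer, so nothing from the URL is pasted into the SQL text.
    $stmt = $db->prepare('SELECT id_topic FROM smf_messages WHERE id_msg = ?');
    if ($stmt === false) {
        return null;
    }
    $stmt->bind_param('i', $postId);
    $stmt->execute();
    $stmt->bind_result($topicId);
    $found = $stmt->fetch();   // true when a row was read
    $stmt->close();

    return $found === true ? (int) $topicId : null;
}

/** Build the SMF link that jumps straight to one message inside its topic. */
function smf_post_url(int $topicId, int $postId, string $base = 'http://www.domain.com/index.php'): string
{
    return sprintf('%s?topic=%d.msg%d#msg%d', $base, $topicId, $postId, $postId);
}

// Intended use (DB_ADDRESS etc. are the placeholders from the post above):
// $db    = new mysqli('DB_ADDRESS', 'DB_USER', 'DB_PASS', 'DB');
// $topic = lookup_topic_id($db, (int) $post_id);
// if ($topic !== null) {
//     header('Location: ' . smf_post_url($topic, (int) $post_id), true, 301);
// }
```

Splitting the URL building from the database lookup keeps the string formatting easy to check without a live database.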
