Old links with html not working in nginx

Started by spiros, May 29, 2018, 09:26:23 AM


spiros

I have links in this format:

https://www.translatum.gr/forum/index.php/topic,2884.0.html

After switching from Apache to nginx, these links lead to an error page outside the forum.

How can I either add an nginx rule that accepts those links, or run a regex on the database to convert them to the standard format, i.e. from this format:

https://www.translatum.gr/forum/index.php/topic,2884.0.html

to this:

https://www.translatum.gr/forum/index.php?topic=2884.0

GigaWatt

You could probably do it with this script ;). Do a database backup first and leave $doit = 'No' (do a test run first ;)).

Note: It doesn't support wildcards (*).
"This is really a generic concept about human thinking - when faced with large tasks we're naturally inclined to try to break them down into a bunch of smaller tasks that together make up the whole."

"A 500 error loosely translates to the webserver saying, "WTF?"..."

spiros

That is quite a complex script, and I am not sure it will fix the URL style I mentioned.

I had something like this in mind:

UPDATE smf_messages SET body = REPLACE(body, 'oldURL', 'newURL') WHERE ID_BOARD = 1

but I am not sure how to use REGEXP_REPLACE to match the URL style change (I am using MariaDB 10.2).
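
One way to express the URL change as a single regular expression is sketched below in Python, so the pattern can be tried on sample text before touching the database. The function name `to_query_style` is hypothetical, and the pattern assumes the old links always look like `index.php/topic,<topic>.<offset or msgNNN>.html`. MariaDB's REGEXP_REPLACE is PCRE-based, so a similar pattern should carry over, but that is untested here.

```python
import re

# Path-style SMF link: index.php/topic,<topic id>.<offset or msgNNN>.html
# Anchoring on the forum path leaves external URLs ending in .html alone.
OLD_LINK = re.compile(
    r"(translatum\.gr/forum/index\.php)/topic,(\d+)\.(\d+|msg\d+)\.html"
)


def to_query_style(body: str) -> str:
    """Rewrite path-style topic links to the standard ?topic= form."""
    return OLD_LINK.sub(r"\1?topic=\2.\3", body)


if __name__ == "__main__":
    print(to_query_style(
        "https://www.translatum.gr/forum/index.php/topic,2884.0.html"
    ))
```

Because the offset is matched with `\d+`, paged links (.20, .30 and so on) and message links are handled by the same pattern.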

GigaWatt

Try the script I posted with these settings.

$oldURL = 'translatum.gr/forum/index.php/topic,';
$oldDir = '/12345';
$newURL = 'translatum.gr/forum/index.php?topic=';
$newDir = '/12345';
$doit = 'No';


Let it run. It'll probably change the URL you posted to something like this:

https://www.translatum.gr/forum/index.php?topic=2884.0.html

The problem is, how to remove the .html at the end now :S. Hmmm...

Maybe try another run of the script, but do it with these settings.

$oldURL = '.0.html';
$oldDir = '/12345';
$newURL = '.0';
$newDir = '/12345';
$doit = 'No';


Now this should fix the links to topics (ending with .0), but if there are some links that lead to a topic's page (.20, .30, .40), you'll probably have to fix them with more runs :S.

spiros

The problem is that there are also external URLs matching "0.html". These links will be broken.

Also, there are links to messages like these:

https://www.translatum.gr/forum/index.php/topic,696.msg2729.html#msg2729

For the time being, this is the best I could come up with, using 3 queries:

UPDATE smf_messages SET body = REPLACE(body, 'translatum.gr/forum/index.php/topic,', 'translatum.gr/forum/index.php?topic=') WHERE ID_BOARD = 27;
UPDATE smf_messages SET body = REPLACE(body, '.0.html', '.0') WHERE ID_BOARD = 27;
UPDATE smf_messages SET body = REPLACE(body, '.html#msg', '#msg') WHERE ID_BOARD = 27;
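
To see what the three passes do before running them, the same literal substitutions can be replayed on sample text. This is a minimal sketch in Python, not the SQL itself; `apply_passes` and the example.com URL are made up for illustration. Python's `str.replace` substitutes every occurrence of a literal string, which is also how SQL REPLACE behaves.

```python
# The three literal (old, new) pairs from the queries above, in the same order.
PASSES = [
    ("translatum.gr/forum/index.php/topic,", "translatum.gr/forum/index.php?topic="),
    (".0.html", ".0"),
    (".html#msg", "#msg"),
]


def apply_passes(body: str) -> str:
    """Apply each literal substitution in turn, like three chained REPLACE calls."""
    for old, new in PASSES:
        body = body.replace(old, new)
    return body


if __name__ == "__main__":
    samples = [
        "https://www.translatum.gr/forum/index.php/topic,2884.0.html",
        "https://www.translatum.gr/forum/index.php/topic,696.msg2729.html#msg2729",
        "https://www.translatum.gr/forum/index.php/topic,2884.20.html",  # paged link
        "http://example.com/page.0.html",  # hypothetical external link
    ]
    for sample in samples:
        print(sample, "->", apply_passes(sample))
```

In this replay the paged link gets `?topic=` but keeps its `.html` suffix, and the external link loses its `.html`, which are the two cases discussed in this thread.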


GigaWatt

Quote from: spiros on May 29, 2018, 11:34:25 AM
The problem is that there are also external URLs matching "0.html". These links will be broken.

Yeah, but how many of them end with .0.html, not just 0.html? In those rare cases, you could just copy the IDs of those messages (the script reports the message IDs), load them one by one and change them back to the previous value ;).

Quote from: spiros on May 29, 2018, 11:34:25 AM
Also, there are links to messages like these:

https://www.translatum.gr/forum/index.php/topic,696.msg2729.html#msg2729

For the time being, this is the best I could get at with 3 queries:

UPDATE smf_messages SET body = REPLACE(body, 'translatum.gr/forum/index.php/topic,', 'translatum.gr/forum/index.php?topic=') WHERE ID_BOARD = 27;
UPDATE smf_messages SET body = REPLACE(body, '.0.html', '.0') WHERE ID_BOARD = 27;
UPDATE smf_messages SET body = REPLACE(body, '.html#msg', '#msg') WHERE ID_BOARD = 27;


Also doable with the script I posted. The script does what your queries do, but it also reports every message it would change, which gives you a chance to spot any external links that would be altered but shouldn't be.
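
The dry-run idea ($doit = 'No') can also be approximated without the script: compute the new body for each message and list only the ones that would change. A rough Python sketch follows; the `rows` data, the `preview_changes` name and the way messages would be fetched from the database are assumptions, not taken from the script.

```python
from typing import Callable, Iterable


def preview_changes(
    rows: Iterable[tuple[int, str]],
    rewrite: Callable[[str], str],
) -> list[tuple[int, str, str]]:
    """Return (message id, old body, new body) for rows the rewrite would alter."""
    changed = []
    for msg_id, body in rows:
        new_body = rewrite(body)
        if new_body != body:
            changed.append((msg_id, body, new_body))
    return changed


if __name__ == "__main__":
    # Hypothetical (message id, body) pairs standing in for forum messages.
    rows = [
        (1, "see https://www.translatum.gr/forum/index.php/topic,2884.0.html"),
        (2, "no links here"),
        (3, "external: http://example.com/page.0.html"),
    ]
    report = preview_changes(rows, lambda b: b.replace(".0.html", ".0"))
    for msg_id, old, new in report:
        print(msg_id, old, "->", new)
```

Nothing is written back here, so the report can be read through for unwanted hits (such as message 3 above) before any UPDATE is run.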

digger's solution is also good. It only works on a particular server setup, but hey, if you're not planning on changing hosts in the near future, it'll do ;).

spiros

It was fixed with the queries I posted. You are right about .0.html: not many non-SMF URLs end with that. Many thanks for the help!

Just to make it clear (in case someone tries something similar in the future) the ones I posted were a test on a single board only; to apply throughout the forum just drop the

WHERE ID_BOARD = 27

bits.

