Pretty URLs

Started by SMFHacks.com Team, January 31, 2007, 10:56:43 AM

vbgamer45

Community Suite for SMF - Grow your forum with SMF, Gallery,Store,Classifieds,Downloads,more!

SMFHacks.com - Paid Modifications for SMF

Mods:
EzPortal - Portal System for SMF
SMF Gallery Pro
SMF Store SMF Classifieds Ad Seller Pro

distante

Oh! Do they not exist anymore, or are they built into the latest RC5?

vbgamer45

If they existed before, then they are still there. The extras package is a separate package from the Pretty URLs site.

distante

Well, I had a backup with the extras and it worked :P

I now have a philosophical issue: I'm seeing (well, confirming) that my site's URLs with and without index.php are treated by search engines (and Facebook) as two different pages (with duplicate content).

So I was searching for a way to edit the .htaccess so I can redirect index.php to the main site. After a while I found this code:
RewriteCond %{REQUEST_URI} ^/index\.php
RewriteRule ^.*$ http://%{HTTP_HOST} [R=301,L]


It worked... but it trashes the pretty URLs, giving www.forosperuanos.net/?pretty%3baction=search instead of www.forosperuanos.net/search/

If I try to use a simple Redirect 301 from /index.php to / (or the site URL), an infinite redirect loop begins.

Any idea how I can make this change without breaking Pretty URLs?

Regards ;)

vbgamer45

I think there is a mod on the mod site that removes the index.php call. I forget the name of it, though.

distante

Yep, you are right -> Remove Index.Php From URL

But it only removes index.php from the links inside the forum, so if you type index.php in the address it doesn't redirect to the main URL.

I will continue with the search! hehe :P

nend

Quote from: distante on January 31, 2012, 07:39:50 PM
Yep, you are right -> Remove Index.Php From URL

But it only removes index.php from the links inside the forum, so if you type index.php in the address it doesn't redirect to the main URL.

I will continue with the search! hehe :P
Not a mod but a hack to pretty urls. I posted it in this topic a few pages back.
http://www.simplemachines.org/community/index.php?topic=146969.msg3203937#msg3203937

distante

Quote from: nend on January 31, 2012, 09:33:34 PM
Not a mod but a hack to pretty urls. I posted it in this topic a few pages back.
http://www.simplemachines.org/community/index.php?topic=146969.msg3203937#msg3203937

Thanks!

I added the line here:

// Stitch everything back together, clean it up and return
$replacement = isset($context['pretty']['cached_urls'][$url_id]) ? $context['pretty']['cached_urls'][$url_id] : $cacheableurl;
$replacement .= (strpos($replacement, '?') === false ? '?' : ';') . (isset($PHPSESSID[0]) ? $PHPSESSID[0] : '') . ';' . (isset($sesc[0]) ? $sesc[0] : '') . (isset($session_var[0]) ? $session_var[0] : '') . (isset($fragment[0]) ? $fragment[0] : '');
$replacement = preg_replace(array('~;+|=;~', '~\?;~', '~\?#|;#|=#~', '~\?$|&$|;$|#$|=$~'), array(';', '?', '#', ''), $replacement);
//REMOVE index.php <<<<<<<<<<<<<---------------------------
$replacement = str_replace('index.php', '', $replacement);
//-----------------
return $matches[1] . ($addQuotes ? '"' : '') . $replacement . ($addQuotes ? '"' : '');


But I'm still able to reach www.forosperuanos.net/index.php if I type it =\

nend

I didn't know you wanted to remove all traces of index.php.

You can add something similar to the top of index.php:
if (stristr($_SERVER["REQUEST_URI"], 'index.php')) {
header('HTTP/1.0 404 Not Found');
echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
   <title>404: Page Not Found</title>
   <meta http-equiv="Content-type" content="text/html; charset=ISO-8859-1" />
</head>
<body>
<h1>404: Page Not Found</h1>
</body>
</html>';
die();
}


Remember to keep that URL rewrite in Pretty URLs that I showed you, or you will have problems ;)

distante

Oh, but that will cut off external links to mysite/index.php. Maybe I can put a meta refresh to the main site instead!

I will try after I have lunch :P

Thanks!

distante

#6510
OK! Sorry for the delay, I had a problem outside the matrix.

So I added the code in PrettyUrls-Filters and added this in index.php:

// Send a permanent redirect to the site root for any request containing /index.php
if (stristr($_SERVER["REQUEST_URI"], '/index.php')) {
header("HTTP/1.0 301 Moved Permanently");
header("Location: http://www.forosperuanos.net/");
exit();
}


It works like a charm :D!!!
Thanks nend!




Edit!

No, it doesn't work fine. The attachments aren't showing. I need some way to apply this only to index.php on its own, or to create some kind of pretty action for the "index.php?action=dlattach" line (and so on :S)
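One possible way to limit the redirect to index.php on its own is to look at the query string as well as the path. The sketch below is only an illustration: is_bare_index_request() is a hypothetical helper name, not part of SMF or Pretty URLs, and it assumes the forum lives at the site root so the path is exactly /index.php.

```php
<?php
// Hypothetical helper (not part of SMF or Pretty URLs): returns true only
// when the request is for /index.php with no query string at all.
function is_bare_index_request(string $requestUri): bool
{
    // REQUEST_URI is "path" or "path?query"; split it once at the first "?".
    $parts = explode('?', $requestUri, 2);
    $path = $parts[0];
    $query = $parts[1] ?? '';

    // Anything carrying a query (e.g. ?action=dlattach) must be left alone.
    return $path === '/index.php' && $query === '';
}

// Usage sketch for the top of index.php (target URL taken from the post above):
// if (is_bare_index_request($_SERVER['REQUEST_URI'])) {
//     header('Location: http://www.forosperuanos.net/', true, 301);
//     exit;
// }
```

With this check, a request such as index.php?action=dlattach is served normally, and only the plain /index.php address is redirected.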


Babadinho

My smf_pretty_urls_cache table has grown to 385MB in under two weeks since I switched hosts, and it keeps growing. I think it has caused the server overload problem I keep getting every day. Does anyone think this issue with the mod can cause server overload problems? I want to disable the mod, unless anyone has an idea on how to fix this issue. Thanks

vbgamer45

The mod uses a lot of database space. Every link in the forum is cached and saved in the database.

Babadinho

OK.. What I'm actually interested in is whether this can cause overload. I have about 8k topics. If it can, I want to disable it. I had issues with this particular table when I switched hosts: it crashed and couldn't be repaired... After a successful repair, I think it keeps growing. My DB was 500MB before I switched hosts; after the switch, in under two weeks, it's now 800MB.

nend

#6514
Quote from: Babadinho on February 07, 2012, 11:38:53 AM
OK.. What I'm actually interested in is whether this can cause overload. I have about 8k topics. If it can, I want to disable it. I had issues with this particular table when I switched hosts: it crashed and couldn't be repaired... After a successful repair, I think it keeps growing. My DB was 500MB before I switched hosts; after the switch, in under two weeks, it's now 800MB.

The cache is bugged; I just checked it out. It is caching every URL that appears on the site, not only the site's own URLs. Let me put it this way: if you have an API that makes calls back and forth quite a bit, it will cache each unique URL. These unique API URLs are only meant to be used once, so there is no need to cache them. The bad thing is that with some APIs you can generate a few of these URLs on each page load; multiply that by your traffic and you're going to be in trouble sooner or later.  :-\

LOL, the funny thing is that the Pretty URLs cache table is taking up like 75% of my DB space. That makes sense, though, since I use a lot of APIs.

Find this line in /Sources/PrettyUrls-Filters.php
// Cache only the URLs which will fit, but replace $boardurl first, that will help!
if (strlen($url_id) < 256 && strlen($url['replacement']) < 256)
$cache_data[] = array($url_id, str_replace($boardurl, '`B', $url['replacement']));


and replace it with:
// Cache only the URLs which will fit, but replace $boardurl first, that will help!
if (strlen($url_id) < 256 && strlen($url['replacement']) < 256 && stristr($url['replacement'], $boardurl))
$cache_data[] = array($url_id, str_replace($boardurl, '`B', $url['replacement']));


This says that if the replacement doesn't point to our site, then don't cache it. Tonight it looks like I am cleaning up a DB, lol.
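To see the effect of the extra condition in isolation, here is a minimal standalone sketch. is_cacheable_replacement() is a hypothetical name used only for illustration, the example.com board URL is made up, and stripos() is swapped in for stristr() as an equivalent case-insensitive "contains" test.

```php
<?php
// Hypothetical standalone version of the condition above, for illustration.
function is_cacheable_replacement(string $urlId, string $replacement, string $boardurl): bool
{
    // Both values must be shorter than 256 characters, as in the original check.
    if (strlen($urlId) >= 256 || strlen($replacement) >= 256) {
        return false;
    }

    // Only cache replacements that contain the board URL (case-insensitive).
    return stripos($replacement, $boardurl) !== false;
}

// Made-up board URL for the example.
$boardurl = 'http://www.example.com/forum';

// A forum link is cached; a one-off external API link is not.
var_dump(is_cacheable_replacement('`B/index.php?topic=1.0', $boardurl . '/general/hello/', $boardurl));
var_dump(is_cacheable_replacement('http://api.example.org/auth?token=abc', 'http://api.example.org/auth?token=abc', $boardurl));
```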

*edit
Finished cleaning out my DB. Strangely, there were a lot of URLs in there that I don't link to and don't have an API for. I wonder how Pretty URLs is getting hold of these URLs; maybe the referrer or something, but it doesn't seem like it should. There has to be an explanation, so I've got to stop speculating. I started with 16MB used in that area of the DB and ended up with just a little over 1MB.

Pretty URLs is still going to parse these URLs once it runs across them; it just isn't going to cache external URLs. I don't think this will be any extra work, though, because they are mainly generated authorization/API URLs. I think this is more beneficial to the DB, and the impact of doing this is not as bad as the impact that caching every URL is having on the DB.

oOo--STAR--oOo

#6515
Hey nend, you seem to be pretty intelligent, and thanks very much for the feedback on this..
I have noticed some bugs with this too, and you're right about it caching URLs that it shouldn't..

It is slowing my site down in some instances, so after reading this I figured that this is the problem.
Is there any chance the author could update this?

Can I basically empty the cache table, or is that a no-no that will screw everything up?
And then apply your edits so it caches better?
This table alone is 179MB, which is huge... My database is only 400MB in total..
Maybe this could be rewritten to solve these issues?

Edit:
Running the maintenance tasks seemed to have helped ;)
It emptied the table itself.. It's back to a small size now :)
You can't fool a sufficiently talented fool.

http://www.uniquez-home.com
In Design Phase!

Mods I am designing,  No refresh Collapse Categories , Poll Redesign , Pure CSS Breadcrumb , Profile Statuses, Profile Views.

vbgamer45

You can empty it but some urls might change which would be bad for search engines.

nend

VB is right, it is better to clean out all the bad URLs manually. You can drop them all, though, but be aware that the URLs might not be generated the same. The site will not break, you will just get different URLs.
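For a manual cleanup, one way to spot the bad rows is to use the `B placeholder that the mod stores in place of $boardurl: a url_id that does not start with `B was not an on-site link. The sketch below is only an illustration under that assumption. find_external_cache_ids() is a hypothetical helper, and it works on a plain array of url_id strings rather than on the database.

```php
<?php
// Hypothetical helper: given url_id values read from the cache table,
// return the ones that do not start with the `B board-URL placeholder.
function find_external_cache_ids(array $urlIds): array
{
    $external = array_filter($urlIds, function (string $urlId): bool {
        // On-site URLs are stored with $boardurl replaced by `B at the start.
        return strncmp($urlId, '`B', 2) !== 0;
    });

    // Reindex so the result is a plain list.
    return array_values($external);
}

// Example with made-up rows: only the second one is external.
$ids = array('`B/index.php?topic=1.0', 'http://api.example.org/auth?token=abc');
print_r(find_external_cache_ids($ids));
```

It is worth reviewing the list this produces before deleting anything, since the on-site rows are the ones that keep the existing pretty URLs stable.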

oOo--STAR--oOo

Quote from: nend on February 08, 2012, 11:19:17 AM
VB is right, it is better to clean out all the bad URLs manually. You can drop them all, though, but be aware that the URLs might not be generated the same. The site will not break, you will just get different URLs.

Ahh, I'm not bothered about Google.. Google only puts you first if you are the only person in the world with content that no one else has, or if you pay..

It's not open to guests anyway lol.. I basically use Pretty URLs for the titles.. So when you share URLs or paste them to friends they know what they..
I applied your edit too, and the difference is noticeable now that the table has been emptied. I didn't empty the table myself; running the maintenance did...

Thanks for the fix so that it only adds URLs that contain $boardurl, very helpful.

nend

@aljo1985 can you test this out for me? This just disables the cache totally. Just replace the entire pretty_rewrite_buffer function with this. ;)

// Rewrite the buffer with Pretty URLs!
function pretty_rewrite_buffer($buffer)
{
global $boardurl, $context, $modSettings, $smcFunc;

// Remove the script tags now
$context['pretty']['scriptID'] = 0;
$context['pretty']['scripts'] = array();
$buffer = preg_replace_callback('~<script.+?</script>~s', 'pretty_scripts_remove', $buffer);

// Find all URLs in the buffer
$context['pretty']['search_patterns'][] = '~(<a[^>]+href=|<link[^>]+href=|<form[^>]+?action=)(\"[^\"#]+|\'[^\'#]+)~';
$urls_query = array();
$uncached_urls = array();
foreach ($context['pretty']['search_patterns'] as $pattern)
{
preg_match_all($pattern, $buffer, $matches, PREG_PATTERN_ORDER);
foreach ($matches[2] as $match)
{
// Rip out everything that shouldn't be cached
$match = preg_replace(array('~^[\"\']|PHPSESSID=[^;]+|(se)?sc=[^;]+|' . $context['session_var'] . '=[^;]+~', '~\"~', '~;+|=;~', '~\?;~', '~\?$|;$|=$~'), array('', '%22', ';', '?', ''), $match);

// Absolutise relative URLs
if (!preg_match('~^[a-zA-Z]+:|^#|@~', $match) && SMF != 'SSI')
$match = $boardurl . '/' . $match;

// Replace $boardurl with something a little shorter
$url_id = str_replace($boardurl, '`B', $match);

if (substr($url_id,0,7) == 'mailto:')
continue;
if (substr($url_id,0,10) == 'javascript')
continue;

$urls_query[] = $url_id;
$uncached_urls[$url_id] = array(
'url' => $match,
'url_id' => $url_id
);
}
}

// Proceed only if there are actually URLs in the page
if (count($urls_query) != 0)
{
$urls_query = array_keys(array_flip($urls_query));
// Retrieve cached URLs
$context['pretty']['cached_urls'] = array();
/* $query = $smcFunc['db_query']('', '
SELECT url_id, replacement
FROM {db_prefix}pretty_urls_cache
WHERE url_id IN ({array_string:urls})',
array('urls' => $urls_query));
while ($row = $smcFunc['db_fetch_assoc']($query))
{
// Put the full $boardurl back in
$context['pretty']['cached_urls'][$row['url_id']] = str_replace('`B', $boardurl, $row['replacement']);
unset($uncached_urls[$row['url_id']]);
}
$smcFunc['db_free_result']($query);
*/
// If there are any uncached URLs, process them
if (count($uncached_urls) != 0)
{
// Run each filter callback function on each URL
$filter_callbacks = unserialize($modSettings['pretty_filter_callbacks']);
foreach ($filter_callbacks as $callback)
$uncached_urls = call_user_func($callback, $uncached_urls);

// Fill the cached URLs array
$cache_data = array();
foreach ($uncached_urls as $url_id => $url)
{
if (!isset($url['replacement']))
$url['replacement'] = $url['url'];
$url['replacement'] = str_replace("\x12", '\'', $url['replacement']);
$url['replacement'] = preg_replace(array('~\"~', '~;+|=;~', '~\?;~', '~\?$|;$|=$~'), array('%22', ';', '?', ''), $url['replacement']);
$context['pretty']['cached_urls'][$url_id] = $url['replacement'];

// Cache only the URLs which will fit, but replace $boardurl first, that will help!
// if (strlen($url_id) < 256 && strlen($url['replacement']) < 256 && stristr($url['replacement'], $boardurl))
// $cache_data[] = array($url_id, str_replace($boardurl, '`B', $url['replacement']));
}

/* // Cache these URLs in the database
if (count($cache_data) != 0)
$smcFunc['db_insert']('replace',
'{db_prefix}pretty_urls_cache',
array('url_id' => 'string', 'replacement' => 'string'),
$cache_data,
array('url_id'));
*/ }

// Put the URLs back into the buffer
$context['pretty']['replace_patterns'][] = '~(<a[^>]+href=|<link[^>]+href=|<form[^>]+?action=)(\"[^\"]+\"|\'[^\']+\')~';
foreach ($context['pretty']['replace_patterns'] as $pattern)
$buffer = preg_replace_callback($pattern, 'pretty_buffer_callback', $buffer);
}

// Restore the script tags
if ($context['pretty']['scriptID'] > 0)
$buffer = preg_replace_callback("~\x14([0-9]+)\x14~", 'pretty_scripts_restore', $buffer);

// Return the changed buffer.
return $buffer;
}


I would test it but I can't since I am on a grid. I can't get accurate benchmarks since every page load is usually a different system handling the request.

I am thinking URLs can't be cached this way efficiently. Performance I think would be the same or better depending on the situation of the page load. IMHO it looks like too much overhead. ;)
