URL rewriting mod problem: apostrophes become numbers in the URL

Started by vinzbomb, December 19, 2012, 05:50:27 PM

vinzbomb

Hi,

I will try to explain it with my bad English.

When a topic title has apostrophes, like this: The New 'Cold War' Grows As Russian Parliament Passes Anti-US Bill

I get this in the URL: the-new-039cold-war039-grows-as-russian-parliament-passes-anti-us-bill

How can I get rid of these numbers? It makes no sense to use URL rewriting with these numbers in it.


mashby

That's pretty good English actually. :)

Considering that URLs have really no impact on SEO (I am assuming that's why you are using that mod), maybe just uninstall it.

However, if you want to continue using a troublesome and useless mod as that, please ask your question in the support topic for that mod.
Always be a little kinder than necessary.
- James M. Barrie

vinzbomb

Quote from: merry mashby on December 19, 2012, 05:57:07 PM
That's pretty good English actually. :)

Considering that URLs have really no impact on SEO (I am assuming that's why you are using that mod), maybe just uninstall it.

However, if you want to continue using a troublesome and useless mod as that, please ask your question in the support topic for that mod.

Well, every news website actually uses URL rewriting; it's very important for news topics, and for Google too.
I understand what you mean, but it has a direct impact on how people find the news topics.

PHP URLs make no sense for news.

I've read a lot about this, and the conclusion was that I have to use a URL rewrite mod.
Look around the internet: every site with news articles uses URL rewriting, and not only news sites.

It's a small bug, but I don't like it.

Either it's not a myth, or I don't understand anything about SEO.


MrPhil

Is it actually 039, or is it &#039;? The &#039; is the numeric HTML entity for an apostrophe (ASCII code 39). An apostrophe can be an invalid character in a URL, so it should be replaced by %27. The original apostrophe must have been "sanitized" to the HTML entity &#039; at some point, then the &, #, and ; characters were stripped out as part of creating an "SEO" form of the URL. That's simply poor coding on the part of the mod author -- the entire entity &#039; should have been stripped out, as it was a punctuation character. For named entities (e.g., &apos; or &Auml;) you could try to recognize what the resulting character was, and whether to strip it out or try to replace it with something more benign, such as 'Ae' for 'Ä'. It isn't too bad in the ASCII range, but beyond that there are a huge number of accented characters, non-Latin alphabetics, and then all the numeric entity equivalents. However, an entity should never simply have the & and ; stripped off.
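
To make the suspected sequence concrete, here is a minimal sketch in Python. It is not the mod's actual PHP code, which nobody in this thread has seen; the function name `naive_slug` and the exact order of steps are assumptions made for illustration. It escapes the apostrophe to a numeric entity first and only then strips "non-URL" characters, which leaves the digits behind.

```python
import re

def naive_slug(title: str) -> str:
    """Hypothetical reconstruction of the suspected bug, not the mod's real code."""
    # Step 1 (assumed): an "HTML safe" pass turns ' into the numeric entity &#039;
    escaped = title.replace("&", "&amp;").replace("'", "&#039;")
    # Step 2 (assumed): everything except letters, digits, spaces and hyphens
    # is dropped, so "&", "#" and ";" vanish but the digits "039" survive.
    stripped = re.sub(r"[^a-z0-9\s-]", "", escaped.lower())
    # Step 3: runs of spaces/hyphens collapse into a single hyphen.
    return re.sub(r"[\s-]+", "-", stripped).strip("-")

title = "The New 'Cold War' Grows As Russian Parliament Passes Anti-US Bill"
print(naive_slug(title))
# the-new-039cold-war039-grows-as-russian-parliament-passes-anti-us-bill
```

Under these assumptions the output matches the URL reported in the first post, which is consistent with the entity being created before the punctuation is stripped.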

vinzbomb

Quote from: MrPhil on December 19, 2012, 09:18:37 PM
Is it actually 039, or is it &#039;? The &#039; is the numeric HTML entity for an apostrophe (ASCII code 39). An apostrophe can be an invalid character in a URL, so it should be replaced by %27. The original apostrophe must have been "sanitized" to the HTML entity &#039; at some point, then the &, #, and ; characters were stripped out as part of creating an "SEO" form of the URL. That's simply poor coding on the part of the mod author -- the entire entity &#039; should have been stripped out, as it was a punctuation character. For named entities (e.g., &apos; or &Auml;) you could try to recognize what the resulting character was, and whether to strip it out or try to replace it with something more benign, such as 'Ae' for 'Ä'. It isn't too bad in the ASCII range, but beyond that there are a huge number of accented characters, non-Latin alphabetics, and then all the numeric entity equivalents. However, an entity should never simply have the & and ; stripped off.

It's just 039, and it occurs only with '. The weird thing is that it doesn't happen all the time, only with some URLs.

vinzbomb

I can post this problem to the mod maker, but he doesn't give a shi... and doesn't like me; he never answers.

That's why I made this topic in SMF support.

Hoping someone can help fix this.

The database is in UTF-8; I don't know if that is important.

Kindred

The thing is, this area is for SMF support. We do not support custom mods here...

And we, quite honestly, consider the Pretty URLs mod (and most of the URL rewrite mods) to be absolutely useless and problematic. They do **NOT** affect your Google results in any significant way. Google focuses on the CONTENT of the page and the appropriate headings and other tags on the page. URLs and keywords have little to no effect on Google.


That being said, an apostrophe in a URL is not appropriate, so it should be removed.
What it looks like the mod is doing is converting the apostrophe to its HTML entity &#039; and then removing the other non-URL characters, like & and ;. Instead, it should be removing the apostrophe altogether.
Glory to Ukraine

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

MrPhil

I'll bet it happens with <, >, ", and & too. I don't think the database or page encoding matters. Your topic title is apparently being fed to an HTML "sanitizer" call that converts ' " < > & to numeric or named HTML entities. Those entities, when the title is being converted to a URI, get the & and ; removed, but the numbers (or name) are left. If the topic title handed to the SEO code still has the punctuation in it, I would try removing whatever "html safe" routines (e.g., htmlspecialchars()) are being called to "sanitize" the text, and just let the code strip out the punctuation (leaving letters and numbers, with spaces replaced by hyphens).
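
As a rough illustration of that approach, here is a minimal Python sketch (again, not the mod's PHP; the function name `slugify` is hypothetical). It assumes the title may already contain entities by the time it reaches the slug code, so it decodes them back to plain characters first and only then strips punctuation.

```python
import html
import re

def slugify(title: str) -> str:
    """Illustrative only: decode entities before stripping punctuation."""
    # Turn &#039;, &amp;, &lt; etc. back into the characters they stand for.
    text = html.unescape(title).lower()
    # Keep ASCII letters, digits, spaces and hyphens; drop everything else,
    # so the apostrophe disappears as a whole instead of leaving "039".
    text = re.sub(r"[^a-z0-9\s-]", "", text)
    # Collapse runs of spaces/hyphens into single hyphens.
    return re.sub(r"[\s-]+", "-", text).strip("-")

print(slugify("The New &#039;Cold War&#039; Grows As Russian Parliament Passes Anti-US Bill"))
# the-new-cold-war-grows-as-russian-parliament-passes-anti-us-bill
```

Note that this simple version also drops accented and non-Latin letters; mapping something like 'Ä' to 'Ae' would need an extra transliteration step that is not shown here.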

I agree that SEO in the case of news articles is probably a bit more useful than with other pages. While search engines won't care all that much (as @Kindred says), humans may scan the URL to get the title/abstract of the material, to decide whether to bring up the whole page for reading.

mashby

Hmm...not sure news articles have any more use for the URL being human readable. If anything, if I see "click here", I'm likely not going to click it because it's just stupid link text even if the URL was human readable. "Learn more about this new product X" is much more readable and something I'd click on if I was interested regardless of what the URL was.
Quote: I can post this problem to the mod maker, but he doesn't give a shi... and doesn't like me; he never answers.
How about having some patience? Still if it were me, I'd uninstall it. :)

MrPhil

If someone makes a news article with the URI click-here.html, they deserve to be ignored. Granted, a useful link is going to have a full title on the page (so you don't have to look at the URL), but often the visible text just gets truncated due to layout size limitations. Just look around cnn.com or news.google.com, etc. If there's enough text to pique my interest, I might look at the URL as I hover over the link, to see if it's worth loading the whole page. So, while SEO rankings may not be affected by nicely human-readable URLs, a human's final decision on whether to click that link is helped most by a readable URL when sufficient text cannot be given (e.g., news headline lists, product category listings, search engine listings,...). Anyway, that's been my experience.