[3171][SMF 2.X] Titles and Languages UTF-8

Started by .LORD., February 10, 2009, 11:46:38 AM


.LORD.

Hi.

Sorry if this has already been reported, but I searched and didn't find anything.

When using a language that uses special characters and using the UTF-8 pack, the special characters are double converted to HTML entities.

It only happens with the title strings from the language pack.

Example:
Using spanish_es-utf-8

Go to:
index.php
?action=admin
?action=search
?action=pm
?action=profile;area=statistics;u=1

The reason: these strings are already converted to HTML entities in the language pack, and SMF then converts the title strings a second time.

Subs.php
Search:
$context['page_title_html_safe'] = $smcFunc['htmlspecialchars'](un_htmlspecialchars($context['page_title']));

Solution?
Replace:
$context['page_title_html_safe'] = $context['page_title'];

Important: I don't know why this conversion is applied to the title.

I think the reason is:
1.- The forum takes title strings from the language pack (which shouldn't be changed).
2.- The forum takes titles from the site name, board names and topic titles (which should be converted).

The conversion should be done only in this second case.
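
A minimal sketch of that idea, assuming the forum name, board names and topic titles arrive as raw text and the language-pack string already contains entities. The helper build_page_title() is hypothetical and is not part of SMF; only PHP's built-in htmlspecialchars() is used.

```php
<?php
// Hypothetical helper, not part of SMF: encode only the user-supplied part
// of a title and append the language-pack part untouched.
function build_page_title(string $userText, string $langText): string
{
    // Raw user text (forum name, board name, topic title) must be encoded.
    $safeUserText = htmlspecialchars($userText, ENT_QUOTES, 'UTF-8');

    // Assumption: the language-pack string is already HTML-safe.
    return $safeUserText . $langText;
}

echo build_page_title('Tom & Jerry', ' - &Iacute;ndice'), "\n";
// prints: Tom &amp; Jerry - &Iacute;ndice
```

If the user-supplied text is already stored encoded, as the tests mentioned in this thread suggest, the htmlspecialchars() call in the sketch would have to be dropped as well.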

But in my tests, the change I noted above doesn't harm either of the two cases.  :-\

Thanks  :)

karlbenson

That line is necessary. Ampersands and other symbols need to be encoded to be xhtml valid.

Obviously the bug is if some are double-encoded.  I wonder why un_htmlspecialchars is not working to prevent this?

.LORD.

Yes, I have the same question:

1.- Texts such as the forum name, board names and topic titles must be converted to be valid XHTML.

$context['page_title_html_safe'] = $smcFunc['htmlspecialchars'](un_htmlspecialchars($context['page_title']));

2.- Texts from the language pack are already converted, so they shouldn't be converted a second time.

$context['page_title_html_safe'] = $smcFunc['htmlspecialchars'](un_htmlspecialchars($context['page_title']));

Perfect  :D

What's wrong?




But the problem is this.

un_htmlspecialchars decodes only the entities for (&, " , ',  >, <).

And htmlspecialchars converts those characters to HTML entities.

Then:

$txt['forum_index'] = $context['forum_name'] . ' - &Iacute;ndice';

Is "decoded" by un_htmlspecialchars to (unchanged):

$txt['forum_index'] = $context['forum_name'] . ' - &Iacute;ndice';   :(

After that, htmlspecialchars converts it to:

$txt['forum_index'] = $context['forum_name'] . ' - &amp;Iacute;ndice';

un_htmlspecialchars doesn't decode &Iacute;, but htmlspecialchars converts & to &amp;
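
The round trip can be reproduced in plain PHP. This is only a sketch: htmlspecialchars_decode() stands in for SMF's un_htmlspecialchars() on the assumption that both decode just the five special characters, and 'My Forum' is a made-up forum name. The last call shows PHP's optional double_encode parameter as one possible way around the problem, not as the fix SMF adopted.

```php
<?php
$title = 'My Forum - &Iacute;ndice';

// Step 1: decodes only &amp; &quot; &#039; &lt; &gt;, so &Iacute; is untouched.
$decoded = htmlspecialchars_decode($title, ENT_QUOTES);

// Step 2: encodes every "&", including the one that starts &Iacute;.
$encoded = htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');

// Alternative: with double_encode = false, existing entities are left alone.
$encodedOnce = htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8', false);

echo $decoded, "\n";      // My Forum - &Iacute;ndice
echo $encoded, "\n";      // My Forum - &amp;Iacute;ndice
echo $encodedOnce, "\n";  // My Forum - &Iacute;ndice
```

With double_encode set to false a bare ampersand is still encoded, so raw text such as 'Tom & Jerry' would still come out as 'Tom &amp; Jerry'.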




So replacing this line of code fixes the titles from the language packs, but... what about the texts that do need to be converted (topic titles and board names)?

Nothing happens: in my tests, these texts are already converted.

As it stands, this line of code seems to be unnecessary.

WestBoard

Hello, I have this problem too. Is there a solution yet?

karlbenson


janitro

I wanted to add that, if Subs.php is modified as described in the first post, the AEVA mod for embedding videos stops working... :)

Sarge

Just curious, why are you using HTML entities? With a few exceptions (noted in the language files), all strings should contain those special characters in UTF-8 encoding, not as HTML entities.


janitro

Quote from: janitro on March 25, 2009, 11:45:01 PM
I wanted to add that, if Subs.php is modified as described in the first post, the AEVA mod for embedding videos stops working... :)

I'm sorry, AEVA is now working flawlessly with new version 6.1.72 :)

Aleksi "Lex" Kilpinen

I was testing a 2.0 installation last night (again - just for the fun of it :P )
and downloaded the finnish-utf8 language files for it, and installed them to the test forum.

Worked OK, BUT for example the title of the "last unread..." page that shows on the browser's tab was "P&auml;ivittyneet aiheet." instead of "Päivittyneet aiheet"...

So I was wondering if there is any good reason for using &auml; and &ouml; in the language files instead of Ä and Ö saved as UTF-8? Or is this just a mistake, and they were actually saved in the wrong encoding in the first place....

My (and my test users') problems were simply solved by me opening all the language files in Notepad++, converting all &auml; to ä and &ouml; to ö, and then saving the files in UTF-8 (no BOM). :)
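
The same conversion could be scripted instead of done by hand. A rough sketch, assuming the strings use HTML 4.01 named entities; the helper entities_to_utf8() is hypothetical and not part of SMF. It deliberately leaves &amp;, &lt;, &gt;, &quot; and &#039; encoded, since those still matter for valid markup.

```php
<?php
// Hypothetical helper: replace named entities such as &auml; with UTF-8
// characters, but keep &amp; &lt; &gt; &quot; &#039; encoded.
function entities_to_utf8(string $text): string
{
    // Map of character => entity for all HTML 4.01 named entities.
    $table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Flip to entity => character and drop the five markup-significant ones.
    $map = array_flip($table);
    unset($map['&amp;'], $map['&lt;'], $map['&gt;'], $map['&quot;'], $map['&#039;']);

    return strtr($text, $map);
}

echo entities_to_utf8('P&auml;ivittyneet aiheet'), "\n";
// prints: Päivittyneet aiheet
```

Anyone trying this on real language files should work on a backup copy and save the result as UTF-8 without a BOM, as described above.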

Aleksi "Lex" Kilpinen

To add: Can reproduce the same problem on this forum as well,
see screenshot

karlbenson


karlbenson


live627

This needs moved. The associated issue in Mantis is closed.
