UTF-8 in SMF 2.0 RC3

Started by NomadaPT, June 03, 2010, 05:34:51 PM

NomadaPT

Hi,

I recently installed a forum and upgraded to SMF 2.0 RC3 to get the WYSIWYG editor. So far everything has gone fine, but I haven't managed to make it possible to write non-Latin characters (such as Arabic or Greek).
I've been digging around the net for solutions and followed the tutorial in the UTF-8 Readme topic: in cPanel I changed the database to utf8_unicode_ci, and in the forum's Administration, under Maintenance, I ran the utility that converts HTML entities to UTF-8. The practical results were nil: when I try to create a topic with text that uses Unicode characters outside the ANSI range, all I get are smileys or question marks.
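
One possible explanation for the question marks, offered as an illustration and not as a diagnosis: a Latin-1 store cannot represent Greek or Arabic letters, so they get substituted. The sketch below uses Python's codecs as an analogy for that substitution; it is not SMF or MySQL code, and the sample strings are made up.

```python
# Analogy only: what happens when non-Latin text is forced into Latin-1.
greek = "αβγ"
arabic = "سلام"

# errors="replace" substitutes "?" for every character the charset lacks.
greek_latin1 = greek.encode("latin-1", errors="replace")
arabic_latin1 = arabic.encode("latin-1", errors="replace")

# UTF-8 can represent both strings, so they survive a round trip unchanged.
greek_roundtrip = greek.encode("utf-8").decode("utf-8")

print(greek_latin1, arabic_latin1, greek_roundtrip == greek)
```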

I've failed at every level, but simply browsing this forum shows that it is possible (just look at the board names in Language Specific Support). Any suggestions would be more than welcome.

Kays

Hi, are you using UTF-8 language files?

If at first you don't succeed, use a bigger hammer. If that fails, read the manual.

Antechinus

Language files don't affect text typed into post forms or the wysiwyg editor. They only affect SMF text strings. Completely separate issues. :)

Kays

Don't the language files set the encoding for the page?

A link to the site might be nice. :)


Antechinus

Well, on my test site (which is ISO), when I'm testing RTL I can still type in English perfectly well when the site is running only Arabic language files. English characters aren't in the Arabic charset. Must admit I haven't tried it with UTF-8.

NomadaPT

Hi,

I've found the solution in another forum.

For others who may have the same issue: simply changing the database from latin1 to Unicode is not enough. That setting isn't inherited by the individual tables in the DB, so you must manually change the collation field of every table from latin1_swedish_ci (equivalent to ISO 8859) to utf8_unicode_ci. There are more than 100 tables, but once finished it works, or at least it worked for me.
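
For anyone who would rather not click through 100+ tables by hand, here is a minimal sketch that only generates the SQL text, one statement per table. It assumes MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET ... COLLATE ...` form, which also converts the existing column data and so goes a step further than editing the collation field. The helper name `build_conversion_sql` and the example table names are hypothetical; take the real list from `SHOW TABLES` and back up the database first.

```python
# Hypothetical helper: builds one ALTER TABLE statement per table name.
def build_conversion_sql(tables, charset="utf8", collation="utf8_unicode_ci"):
    statements = []
    for table in tables:
        # Backticks quote the identifier; a backtick inside a name is doubled.
        quoted = "`" + table.replace("`", "``") + "`"
        statements.append(
            f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


# Example table names are assumptions, not read from a real database.
example_tables = ["smf_messages", "smf_topics", "smf_boards"]
for statement in build_conversion_sql(example_tables):
    print(statement)
```

The output can be pasted into the SQL tab of phpMyAdmin or run from the mysql client after reviewing it.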

Supplementary info: SMF doesn't automatically assume the right-to-left direction used in Arabic and Hebrew writing. When you type in these languages the text keeps the usual left-to-right Latin direction, so you need to wrap it in the BBCode [ r t l ] *** [ / r t l ]. And because it's not possible to justify the text (which would probably solve the problem), you also need to align the text to the right to keep the normal appearance of these scripts.

To those who tried to help me: my thanks.

Arantor

Quote from: Antechinus on June 05, 2010, 08:13:36 AM
English characters aren't in the Arabic charset.

Yes they are. All of the primary charsets that are ISO encoded should contain the standard ASCII characters in positions 32 through 126.
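
A quick way to check this claim, as a sketch using Python's built-in codecs and not anything from SMF: encode the characters at positions 32 through 126 with several ISO 8859 parts and compare the bytes with plain ASCII. The choice of charsets (Arabic, Greek, Hebrew, Western European) is just an example.

```python
# Printable ASCII occupies code points 32 through 126.
printable_ascii = "".join(chr(code) for code in range(32, 127))

# Python codec names for the Arabic, Greek, Hebrew, and Western European parts.
iso_charsets = ["iso8859_6", "iso8859_7", "iso8859_8", "iso8859_1"]

# For each charset, record whether the encoded bytes match plain ASCII.
results = {}
for name in iso_charsets:
    encoded = printable_ascii.encode(name)
    results[name] = encoded == printable_ascii.encode("ascii")

print(results)
```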
