Simple Machines Community Forum

SMF Support => Language Specific Support => Topic started by: ronread on October 07, 2007, 10:01:06 PM

Title: Using non-Latin languages with UTF-8/unicode
Post by: ronread on October 07, 2007, 10:01:06 PM
Like MANY people in MANY threads, I was extremely vexed by not being able to post a non-Latin language in the forum I'm trying to develop. In my case, I want to use Japanese in posts on an English forum; other people have had trouble with Arabic, Polish, Farsi, etc. They also found that just downloading/installing language packs and messing with admin settings was insufficient.

A lot of the super-erudite advice by language wizards like agridoc frankly went over my neophyte head. Some people mentioned using "Maintenance => Convert forum to utf8"; this may have been an option in 1.1.1, but not in 1.1.4, which only has "Maintenance => Convert HTML entities to UTF8 characters", which again is insufficient.

Then I noted one of agridoc's replies: "Check your database with phpMyAdmin to confirm." Hmm...this sounded intriguing. So I went to cPanel through my host and clicked on phpMyAdmin. This opens up the phpMyAdmin control panel.

First, I selected the database used for my SMF. I made sure that in the main "localhost" settings, "connection collation" was set to utf8_unicode_ci.

Second, I clicked on each table (41 altogether) for the database, in the left margin. This opens up the "Structure" tab. If there were collations listed in the "Collation" column, I clicked "Check all" at the bottom and then the pencil icon for "Change". In the page that opened, I changed every visible collation drop-down menu (most set to latin1_swedish_ci!?) to utf8_unicode_ci.

Third (I could have done this at the same time as the second step if I had been thinking ahead...), I again clicked on each table, but this time opened the "Operations" tab; here, under the "Table Options" section, I changed the "Collation" drop-down (again, mostly latin1_swedish_ci!?) to utf8_unicode_ci, then hit the Go button.
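
For anyone who would rather not click through 41 tables, the second and third steps can probably be expressed as SQL instead. This is only a minimal sketch, assuming MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET` statement and the default `smf_` table prefix; the table list is a hypothetical sample, not the full set. It prints the statements without running them, so they can be reviewed (and the database backed up) first.

```python
# Build (but do not execute) one ALTER TABLE statement per table.
# The table names are an illustrative sample, assuming the default "smf_" prefix.
TABLES = ["smf_messages", "smf_topics", "smf_members"]


def build_convert_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Return MySQL statements that convert each table and its text columns."""
    return [
        f"ALTER TABLE `{name}` CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        for name in tables
    ]


statements = build_convert_statements(TABLES)
for statement in statements:
    print(statement)
```

`CONVERT TO CHARACTER SET` changes the table default and every character column in one statement, which is why a single statement per table covers both the "Structure" and "Operations" changes described above.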

After closing this page, I crossed my fingers and said a prayer to Thor and Ebisu. I opened up my SMF, and the Japanese I had previously tried to post was still "???", but when I wrote a new message in Japanese, voila! Also, Japanese I pasted from some spam e-mail came out perfect. Hugely vexing problem solved!
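
A likely reason the old posts stay as "???" is that the characters were already replaced by literal question marks when they were saved, so there is nothing left to convert. The sketch below uses Python's `latin-1` codec as a stand-in for a latin1 column; that MySQL behaved the same way on this particular forum is an assumption.

```python
# Stand-in for saving Japanese text into a latin1 column:
# characters with no latin1 equivalent are replaced by "?".
original = "日本語"
stored = original.encode("latin-1", errors="replace")

# Reading the stored bytes back gives literal question marks, not the original text.
recovered = stored.decode("latin-1")
print(stored, recovered)
```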

Perhaps my solution isn't the most elegant or simple way to get non-latin/2-byte languages to show, but it worked for me! If there's an easier way, please explain it in SIMPLE terms.

Hope this was helpful to some others.
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: agridoc on October 07, 2007, 11:37:40 PM
ronread, as I posted in Re: Japanese input in English Forum (http://www.simplemachines.org/community/index.php?topic=196431.msg1251059#msg1251059), with SMF you can post and read any language, with or without UTF-8. The difference is that in a non-UTF-8 SMF, characters from non-Latin languages and special characters are stored as entities. Plus, with UTF-8 you can have interfaces in two or more non-Latin languages, which is not normally possible without UTF-8 (there is a workaround, but I don't think it's worth mentioning).
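
To illustrate what "stored as entities" means, here is a small sketch that turns non-Latin characters into numeric character references. It assumes the decimal `&#NNNN;` form; the exact form SMF writes to the database is not confirmed here.

```python
# Text that cannot be represented in plain ASCII is written as
# numeric character references, one "&#NNNN;" per character.
post = "Hello 日本語"
as_entities = post.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(as_entities)
```

The browser turns each reference back into the right character when the page is displayed, which is why such posts can still be read on a non-UTF-8 forum.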

You were correctly advised to convert your forum to UTF-8 with Forum Maintenance - General Maintenance: Convert the database and data to UTF-8.

This corresponds to .../index.php?action=convertutf8. It appears in a non UTF-8 SMF and is present in SMF 1.1.4 and 2.01 Beta 1.

You mention Convert HTML-entities to UTF-8 characters. This appears in an SMF forum already set to UTF-8 in place of Convert the database and data to UTF-8 and corresponds to .../index.php?action=convertentities.
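
Conceptually, converting HTML entities to UTF-8 characters is the reverse step: each numeric reference becomes the real character, which is then stored as UTF-8 bytes. A minimal sketch using Python's standard `html` module, not SMF's actual conversion code:

```python
import html

# A post as it might be stored with entities (hypothetical sample).
stored = "Hello &#26085;&#26412;&#35486;"

# Turn each numeric reference back into the character it stands for.
as_text = html.unescape(stored)

# Each of the three Japanese characters takes three bytes in UTF-8.
utf8_bytes = as_text.encode("utf-8")
print(as_text, len(utf8_bytes))
```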

You said that you changed the collation of the tables in the database. This is not enough. Text fields must also have their collation changed, in every table. Open each table's structure in phpMyAdmin to check them.
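
As a rough illustration of that check, the sketch below filters a list of columns for text fields that still have a non-UTF-8 collation. The rows are hypothetical sample data, written as they might appear in a table's Structure view; columns with no collation (numbers, dates) are not text fields and need no change.

```python
# Hypothetical rows: (table, column, collation).
# Non-text columns have no collation, shown here as None.
COLUMNS = [
    ("smf_messages", "ID_MSG", None),
    ("smf_messages", "subject", "latin1_swedish_ci"),
    ("smf_messages", "body", "utf8_unicode_ci"),
    ("smf_members", "realName", "latin1_swedish_ci"),
]


def columns_needing_change(columns, prefix="utf8"):
    """Return text columns whose collation is not a utf8 one."""
    return [
        (table, column)
        for table, column, collation in columns
        if collation is not None and not collation.startswith(prefix)
    ]


pending = columns_needing_change(COLUMNS)
print(pending)
```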

I don't know what happened with your install or why you didn't have the proper option. From your posts, I assume you don't have an active forum yet.

I would suggest a new "clean" install of SMF in UTF-8 from the beginning.

I wouldn't suggest to anyone doing collation changes, like you did, or other database changes directly unless he knows well what he is doing and, of course, has a recent database backup available.
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: nstarz on October 10, 2007, 12:23:52 AM
I created my forum using Fantastico. I clicked on Convert the database and data to UTF-8 and it said 100% complete. But when posting I still get
???


What else do you suggest?
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: agridoc on October 10, 2007, 01:01:10 AM
nstarz what languages do you use?

From what you write, I understand that old messages display correctly but new messages in a non-Latin language do not. Is that so?

It's quite probable that you haven't added the UTF-8 version of the language(s) that should be used after conversion.

However, also use phpMyAdmin to check the collation of SMF's database tables and of the text fields inside them.

A link would help. Make sure that you have enabled user-selectable language support in Features and Options: Basic Features.
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: vmgus on October 29, 2007, 11:11:21 AM
Hi agridoc,
Can you tell me how to convert latin1_swedish_ci to UTF-8 in phpMyAdmin? I'm very frustrated with the ??? letters on my forum. I'm trying to type Vietnamese. Could you help me step by step? I'm not an expert in HTML and know nothing about forums.
Thanks
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: kanza on November 11, 2008, 11:13:21 AM
Hi, same problem here.
Do I need to set the collations that are blank to UTF-8 too,
or just the ones set to Swedish?
Thanks

And it works fine after changing them all,
but some mods still can't be displayed in the admin panel; it shows blank!
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: giorgi_tsiklauri on January 17, 2009, 01:01:39 PM
Hello,
I want to make my forum support two languages.
I mean, when a user is logged in and wants to write a post, I have seen on many forums that they have a switch option: when you press the "~" button, it changes the input language. I downloaded my language in UTF-8 and installed it, but my forum's posting page still doesn't have this "~" switching option.

How do I enable this?

Thanks.


PLEASE HELP ME
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: agridoc on January 17, 2009, 01:33:26 PM
There are many ways to have language selection, for a quick solution look at Language Drop Down (http://custom.simplemachines.org/mods/index.php?mod=598) mod.

BTW, I saw the previous messages. Fantastico installations have caused some trouble, because Fantastico does a pseudo-UTF-8 install. Sarge has published a solution (http://www.simplemachines.org/community/index.php?topic=166743.msg1151417#msg1151417) for deleting the forced UTF-8 setting. Take care and make a backup before proceeding.

I wrote a guide in Greek, having in mind a guide for all, that would also cover multilingual approaches. You can see a machine translation from Greek here
http://translate.google.com/translate?langpair=el%7Cen&u=http://www.simplemachines.org/community/index.php?topic=285256.0

Be careful: machine translation is not exact.
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: giorgi_tsiklauri on January 17, 2009, 02:21:05 PM
Thanks a lot. I followed it 100% correctly, step by step, but unfortunately it is not what I was asking for.

It created a drop-down menu with the pre-installed languages. Okay, but when I change the language it changes the FORUM content. I didn't ask for this.

I want the user to change his INPUT LANGUAGE.

When he is writing a new post, he should be able to change his input text language.
The way you provided does not do this; it changes the whole forum interface to a different language.
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: agridoc on January 17, 2009, 03:10:23 PM
In a proper SMF installation, language files change the SMF interface and maybe character codepage, if not UTF-8, so that it can be understood and work correctly with specific languages.

You are asking for user keyboard input selection. SMF can't do this as is, and I don't know of any mod that does. A mod could be made for a specific language. Its use is only for those who don't have the proper keyboard layout available when posting.

My Greeklish to Greek mod does the conversion by incorporating a Google gadget. vkot, another Greek moderator, has done something more like what you want, for the Greek language, in a new window. The scope is different; Greeklish is a somewhat more complex problem.

There are pages that do this for the Greek language; there must be some for other languages too.
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: giorgi_tsiklauri on January 17, 2009, 03:17:06 PM
It must be simpler than you are explaining. Sorry, but maybe you didn't understand what I'm talking about.

Go here

www.iveria-tv.ge/forum   and try to post something.

Before you post something you will see a "~" symbol. It indicates the language change,

the INPUT keyboard LANGUAGE.

Thanks anyway.
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: agridoc on January 17, 2009, 03:48:58 PM
Unfortunately I can't use the link; Foxlingo auto-translate sent me to Google Translate, which says that translation from Georgian is not possible.

I understand what you mean. It's not as easy as it looks. Probably they have developed or found some modification and are using it. It's better to ask them.

A language translation is quite different from a mod affecting the keyboard input.
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: giorgi_tsiklauri on January 18, 2009, 03:03:12 PM
I'm talking about this: when you switch this box on, it types GEORGIAN; when it is off, it types ENGLISH.


(https://www.simplemachines.org/community/proxy.php?request=http%3A%2F%2Fs61.radikal.ru%2Fi171%2F0901%2F39%2F1c03d2bda9d1.jpg&hash=78f7f2a2e0f4db1bb097d1d3e3127d70802ba73a) (http://www.radikal.ru)
Title: Re: Using non-Latin languages with UTF-8/unicode
Post by: agridoc on January 19, 2009, 01:37:17 AM
Clicking the box hands control to a script that does the character conversion.

This requires additional code; it possibly uses a JS add-on.
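
The core of such a script is usually just a lookup table from typed Latin keys to the target letters. The sketch below shows that idea in Python for readability; a real forum add-on would run in the browser, and the key map here is a partial, illustrative guess, not the layout that site actually uses.

```python
# Partial, illustrative Latin-to-Georgian key map (not a complete layout).
LATIN_TO_GEORGIAN = str.maketrans({
    "a": "ა", "b": "ბ", "g": "გ", "d": "დ", "e": "ე",
    "i": "ი", "j": "ჯ", "m": "მ", "o": "ო", "r": "რ",
})


def convert_input(text, georgian_on):
    """Map typed Latin letters to Georgian when the switch is on."""
    return text.translate(LATIN_TO_GEORGIAN) if georgian_on else text


print(convert_input("gamarjoba", georgian_on=True))
print(convert_input("gamarjoba", georgian_on=False))
```

With the switch off the text passes through unchanged, which matches the on/off behaviour of the box described above.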

In a similar request, How to enable ARABIC and PERSIAN language along with English? (http://www.simplemachines.org/community/index.php?topic=269459.msg1761992#msg1761992), Oldiesmann answered:
Quote from: Oldiesmann on October 21, 2008, 12:08:15 PM
That's beyond our control. They will need to change the keyboard settings in Windows to do that.

XP: http://tlt.its.psu.edu/suggestions/international/keyboards/winkey.html
Vista: http://www.vistax64.com/tutorials/103844-keyboard-input-language.html

Try contacting the administrators of sites that use this script. It is interesting, and it might inspire and help with similar scripts for other languages.