Another UTF-8 question

Square:
Hi,

SMF 2.0.2
My forum is in UTF-8.
But I think that my database is in UTF-8 BOM. (Is that possible?)
So I checked http://wiki.simplemachines.org/smf/UTF-8_Readme and it says
"Go to Forum Maintenance > Convert the database and data to UTF-8"
but I simply can't find it in the admin menu.
The only thing I found is "Convert HTML-entities to UTF-8 characters" which is not what I need right now.

Thanks in advance. :)

MrPhil:
"UTF-8 BOM" does not make any sense for a database. Where do you see this? BOM is Byte Order Mark, which is a three-byte cow patty (the bytes EF BB BF) dropped at the very beginning of a UTF-8 file by overly "helpful" editors. BOMs should be ruthlessly exterminated wherever they're found in files, but I can't think of any case where you'd find them in the database. In a file, when editing in UTF-8 mode your editor should offer you the option to "save without BOM" -- always use it, and if you can, configure the editor to use that as the default. Or, edit in non-UTF-8 mode and remove the three strange characters at the very beginning of the file.
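
If your editor has no "save without BOM" option, the check can be done by hand. Here is a minimal Python sketch of the idea, not anything SMF ships; the function name strip_utf8_bom and the sample content are made up for illustration:

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical file content: a BOM followed by the start of a PHP file.
sample = codecs.BOM_UTF8 + "<?php echo 'hi';".encode("utf-8")
cleaned = strip_utf8_bom(sample)

print(sample[:3].hex())   # the three BOM bytes
print(cleaned[:5])        # file now starts directly with b'<?php'
```

To use this on a real file you would read it in binary mode ("rb"), pass the bytes through the function, and write the result back in binary mode ("wb"). Keep a backup first.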

If your database is already UTF-8, you shouldn't see the option to convert it to UTF-8, as it's already done. Converting HTML entities to characters is a separate, and optional, step (replacing all &name; and &#nnn; characters by their native UTF-8 codes). It's not mandatory, but will slightly reduce the size of your database (native codes are 2 to 4 bytes, while entities can be 7 or 8 bytes, sometimes more).
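
To make the size difference concrete, here is a small Python sketch that compares an entity with the character it stands for. It only illustrates the byte counts; it is not how SMF's own conversion routine works, and the two sample entities are my own picks:

```python
import html

# Two sample entities: a named one and a numeric one.
for entity in ("&eacute;", "&#8364;"):
    char = html.unescape(entity)            # the native character
    entity_size = len(entity.encode("utf-8"))
    native_size = len(char.encode("utf-8"))
    print(entity, "->", char, ":", entity_size, "bytes vs", native_size, "bytes")
```

Here "&eacute;" takes 8 bytes as an entity and 2 bytes as a native UTF-8 character, and "&#8364;" (the euro sign) takes 7 bytes versus 3.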

Arantor:
Ok, better question... What problems are you having and what needs to be changed about how your forum works, exactly?

Square:
Thanks.

I had SMF 1.1.18 and everything was going well.
I installed 1.1.18 at a new location, restored my database there, and upgraded to 2.0.2. Then I installed my own theme.
Everything is OK, except that non-Latin characters now have problems (some characters show up as question marks).
This is true for both old and new posts.

MrPhil:
When you say "non-Latin" do you mean accented Latin alphabet, or truly non-Latin (Greek, Cyrillic, Hebrew, etc.)? Is this just with the posts, or also with text from language support files (prompts, labels, etc.)? When you installed 1.1.18 (there is no 18; do you mean 1.1.8 or 1.1.16?) and created the database for it, did you specify that it's UTF-8? (I'm not sure the encoding always comes over in the backup.) Was the backup created by phpMyAdmin, or by SMF (bad!)? Anyway, check your database to confirm it's actually UTF-8. Check your forum pages to confirm they're actually using UTF-8 encoding. When you say "question mark", is this a ? within a black diamond, or just a regular question mark? Most browsers use ?-in-diamond to indicate an invalid character encoding, which usually means your database is serving up Latin-1 while you're trying to display the page in UTF-8.
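
The two kinds of question mark can be reproduced in a few lines of Python. This is only a sketch of the general mechanism, with sample strings I picked myself; it does not tell you which case applies to your forum. The second half (a plain "?" coming from a lossy conversion into a character set that cannot hold the character) is my assumption about one common cause, not something confirmed for your setup:

```python
# Case 1: Latin-1 bytes interpreted as UTF-8.
# The lone byte 0xE9 is not valid UTF-8, so the decoder substitutes
# U+FFFD, which browsers draw as a ? inside a black diamond.
latin1_bytes = "café".encode("latin-1")
shown = latin1_bytes.decode("utf-8", errors="replace")
print(repr(shown))

# Case 2 (assumed cause): text forced into a character set that cannot
# represent it. Each unrepresentable character becomes a plain "?",
# and the original character is lost for good.
lossy = "Привет".encode("latin-1", errors="replace")
print(lossy)
```

In the first case the stored data is usually still intact and only the declared encoding is wrong. In the second the characters were destroyed when they were written, so only a clean backup will bring them back.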
