Simple Machines Community Forum

SMF Support => SMF 2.0.x Support => Topic started by: orktown on June 10, 2017, 05:43:34 AM

Title: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: orktown on June 10, 2017, 05:43:34 AM
Hi,

We recently tried to switch to PHP 7 after upgrading to 2.0.14. But when we do that, German umlauts in posts (smf_messages) are broken. German text in menus and general forum strings is still correct, e.g. the German translation for "posts", "Beiträge", is still displayed correctly.

When I manually change the encoding in the browser to UTF-8, the posts become correct again, but the umlauts in menus break. When we switch back to PHP 5 the forum is fine again. So I guess some default, maybe in PHP, has changed? Are messages stored as UTF-8 and later converted?

Forum language: German (ISO-8859-1)
Collation of smf_messages  table: latin1_swedish_ci

Any hints/ideas would be helpful.
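
The symptoms described above (posts broken under ISO-8859-1, menus broken under UTF-8) are what you would expect when a single page mixes bytes in two encodings. A minimal Python sketch of that effect, for illustration only; it does not touch SMF or the database, and the variable names are made up for this example:

```python
# Illustration only: the same word encoded two ways, then read with the wrong charset.
word = "Beiträge"

utf8_bytes = word.encode("utf-8")         # the umlaut is stored as two bytes
latin1_bytes = word.encode("iso-8859-1")  # the umlaut is stored as one byte

# UTF-8 bytes shown on a page declared as ISO-8859-1:
# the umlaut turns into two unrelated characters.
utf8_read_as_latin1 = utf8_bytes.decode("iso-8859-1")

# ISO-8859-1 bytes shown on a page read as UTF-8:
# the single umlaut byte is not valid UTF-8, so a replacement character appears.
latin1_read_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(utf8_read_as_latin1)
print(latin1_read_as_utf8)
```

If the posts look right only when the browser is forced to UTF-8, that suggests the post bytes are UTF-8 while the language files are ISO-8859-1.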
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: Arantor on June 10, 2017, 07:02:33 AM
This changed in PHP 5.4 to be UTF-8 by default. Ideally the forum should be converted to UTF-8; that will be forced in SMF 2.1 anyway.
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: orktown on June 10, 2017, 09:54:47 AM
What's "this"? What changed? The collation/default encoding of the tables?
How should I convert it? I tried to click that convert button somewhere in administration. But it "did nothing". Or at least, I didn't notice anything.

What do you recommend?
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: Arantor on June 10, 2017, 10:38:25 AM
No, the default in the PHP language itself when it talks to your database.

Conversion requires firstly going through your database, then using UTF-8 language files which you are currently not doing.
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: shawnb61 on June 10, 2017, 11:14:35 AM
I think the SMF admin function to convert to UTF8 may help here; it does a good job converting latin1 to utf8 (no matter what you ended up getting stored in that latin1 database).
I'd back up the system first, to be safe. A backup is a hard requirement for this type of activity.
Then convert the database, and choose the proper language files as Arantor said. 

More here:
https://wiki.simplemachines.org/smf/UTF-8_Readme
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: orktown on June 10, 2017, 01:18:41 PM
Quote from: Arantor
"No, the default in the PHP language itself when it talks to your database. Conversion requires firstly going through your database, then using UTF-8 language files which you are currently not doing."

Um, you said there was a change in PHP 5.4, but we switched to PHP 5.6 years ago. Why does it affect us only now, when switching to PHP 7? I also compared the apache2 php.ini files. Apart from some unrelated changes they are nearly identical. In particular, default_charset was UTF-8 in PHP 5.6 too.

Quote from: shawnb61
"I think the SMF admin function to convert to UTF8 may help here; it does a good job converting latin1 to utf8 (no matter what you ended up getting stored in that latin1 database). More here: https://wiki.simplemachines.org/smf/UTF-8_Readme"

I have already tried that admin function; it didn't change anything. I only checked on the frontend and didn't check whether the collation in the DB changed. I also tried changing the collation manually, to no avail.
Maybe the table content is already UTF-8. Then I could just follow the steps in the link, switch the language packs and update all members' language settings. I will try that on a backup.
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: orktown on June 10, 2017, 01:54:21 PM
Ok, that did it!
I have downloaded the -utf8 language packs and then I updated all users. I had to be a bit more careful since we already had some utf8 languages installed and the query in the documentation would have updated them to polish-utf8-utf8 ...   :P

Is it safe to delete the old language files? They have 0 users according to that table in the admin area.
I guess it is best practice to change the collation of the tables too?
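
The "polish-utf8-utf8" pitfall mentioned above comes from appending the suffix unconditionally. A small Python sketch of the safer rule, assuming each member's language setting is a plain name such as "german" or "polish-utf8" and that an empty value means "use the forum default" (both are assumptions; `to_utf8_language` is a hypothetical helper, not part of SMF):

```python
def to_utf8_language(lngfile: str) -> str:
    """Return the -utf8 variant of a language name, leaving other values alone."""
    # Assumption: an empty value means the member uses the forum default.
    if lngfile == "":
        return lngfile
    # Already a UTF-8 language pack: do not append the suffix a second time.
    if lngfile.endswith("-utf8"):
        return lngfile
    return lngfile + "-utf8"


for name in ["german", "polish-utf8", ""]:
    print(repr(name), "->", repr(to_utf8_language(name)))
```

The same condition can be expressed in an update query by skipping rows whose value already ends in "-utf8".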
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: shawnb61 on June 10, 2017, 02:18:20 PM
Can you confirm the tables are no longer latin1_swedish_ci?
They should now be utf8_general_ci.
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: orktown on June 11, 2017, 01:41:35 PM
I have tried the conversion now, but it had an interesting effect: afterwards the umlauts were broken again.
I believe the content of the tables is already UTF-8 and only the collation is wrong. I have seen that before with other software.

I am unsure what the best way to proceed is now. Alter the tables to utf8? Leave them "as is"? I lean toward the first. Well, no need to rush it.
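
One way to test the guess that the content is already UTF-8 is to look at the raw bytes of a few posts that contain umlauts. A rough Python sketch, assuming you can obtain the stored bytes of a post body unconverted (how you fetch them is up to you; `classify` is a hypothetical helper):

```python
def classify(raw: bytes) -> str:
    """Heuristically classify raw column bytes as 'ascii', 'utf8' or 'not-utf8'."""
    # Pure ASCII looks the same in ISO-8859-1 and UTF-8, so it tells us nothing.
    if raw.isascii():
        return "ascii"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes such as a lone 0xE4 are typical for ISO-8859-1 text.
        return "not-utf8"
    return "utf8"


print(classify("Beiträge".encode("utf-8")))
print(classify("Beiträge".encode("iso-8859-1")))
print(classify(b"Hello"))
```

This is only a heuristic: short ISO-8859-1 strings can occasionally form valid UTF-8 by accident, so check several posts rather than one.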
Title: Re: Switching to php7 causes german umlauts to be displayed incorrectly
Post by: shawnb61 on June 11, 2017, 02:27:59 PM
Can you confirm the collation of your tables?

Did you check prior to running the utf8 conversion as I asked? 

Pretty important, as converting utf8 tables to utf8 corrupts data.

(Converting latin1 tables to utf8, even with utf8 content in them, is ok....)
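
A plausible mechanism for this kind of corruption is double encoding: bytes that are already UTF-8 get treated as ISO-8859-1 characters and are encoded to UTF-8 a second time. The Python sketch below only illustrates that general effect; whether SMF's converter behaves exactly this way is not confirmed here.

```python
original = "ä"

# Correct UTF-8 storage: two bytes for the umlaut.
stored = original.encode("utf-8")

# A second "latin1 to utf8" pass over data that is already UTF-8:
# each byte is misread as one ISO-8859-1 character and encoded again.
double_encoded = stored.decode("iso-8859-1").encode("utf-8")

# What a UTF-8 page would now display instead of the umlaut.
displayed = double_encoded.decode("utf-8")
print(displayed)

# As long as nothing was truncated or replaced, the step can be reversed.
repaired = double_encoded.decode("utf-8").encode("iso-8859-1").decode("utf-8")
print(repaired == original)
```

This is why confirming the actual table character set before converting matters, and why the backup is worth having.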