Simple Machines Community Forum

SMF Support => SMF 2.0.x Support => Topic started by: cenourinha on May 21, 2020, 09:55:40 AM

Title: Convert data on database to UTF-8
Post by: cenourinha on May 21, 2020, 09:55:40 AM
Hi there folks!

Recently I converted a forum from IPB 3.3.x to SMF 2.0.x, and now I'm trying to convert the data in the database to UTF-8. In the forum settings I have the language set to English (Character Set = UTF-8) and everything is displayed OK, but when I go to the database via phpMyAdmin, I see the data displayed like this:

This makes me suspect that the data isn't in UTF-8 and needs to be converted. If I go to "Maintenance > Database > Convert the database and data to UTF-8" I see this:

But if I try to convert, the data in the database doesn't seem to be fixed and the forum starts displaying invalid characters. I also tried to run "upgrade.php" and "repairsettings.php", but nothing seems to help.

As you can see, I don't really have a problem at this moment, but I would like to make sure all the data is in UTF-8 in order to prevent future incompatibilities.

Best regards!
Title: Re: Convert data on database to UTF-8
Post by: Diego Andrés on May 21, 2020, 04:48:45 PM
Since it isn't in UTF-8, you are meant to choose a different data character set in the dropdown options (the one the database supposedly uses). I think it defaulted to UTF-8 because your language files are already in UTF-8, but I'm not sure.
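
To illustrate why the source character set chosen in the dropdown matters, here is a minimal Python sketch. It is not SMF's converter; the helper `to_utf8` and the sample text are made up for illustration. It shows a correct conversion when the declared source charset matches the stored bytes, and the double-encoded result you get when bytes that are already UTF-8 are treated as ISO-8859-1, which may be one explanation for invalid characters appearing after a conversion.

```python
# Sketch only: illustrates charset conversion on raw bytes, not SMF's own code.

def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Decode raw bytes using the declared source charset, then re-encode as UTF-8."""
    return raw.decode(source_charset).encode("utf-8")

text = "Olá, coração"  # hypothetical sample text

# Case 1: the stored bytes really are ISO-8859-1, so the conversion is correct.
latin1_bytes = text.encode("iso-8859-1")
converted_ok = to_utf8(latin1_bytes, "iso-8859-1")
print(converted_ok.decode("utf-8"))

# Case 2: the stored bytes are already UTF-8 but are declared as ISO-8859-1.
# Each multi-byte character is re-encoded byte by byte (double encoding).
utf8_bytes = text.encode("utf-8")
double_encoded = to_utf8(utf8_bytes, "iso-8859-1")
print(double_encoded.decode("utf-8"))
```

In the second case the accented characters come out as pairs like "Ã¡" and "Ã§", the typical sign that UTF-8 data was converted a second time.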
Title: Re: Convert data on database to UTF-8
Post by: Sir Osis of Liver on May 21, 2020, 04:59:29 PM
Have you tried changing collation in phpmyadmin?
Title: Re: Convert data on database to UTF-8
Post by: cenourinha on May 21, 2020, 05:21:27 PM
Quote from: Diego Andrés on May 21, 2020, 04:48:45 PM
Since it isn't in UTF-8, you are meant to choose a different data character set in the dropdown options (the one the database supposedly uses). I think it defaulted to UTF-8 because your language files are already in UTF-8, but I'm not sure.

I previously tried converting with "ISO-8859-1" selected as the current charset, but it didn't work.
Title: Re: Convert data on database to UTF-8
Post by: cenourinha on May 21, 2020, 05:23:02 PM
Quote from: Sir Osis of Liver on May 21, 2020, 04:59:29 PM
Have you tried changing collation in phpmyadmin?

The collation is set to "utf8_general_ci" on every table.
Title: Re: Convert data on database to UTF-8
Post by: Sir Osis of Liver on May 21, 2020, 05:41:23 PM
If the characters are displayed correctly on the forum, I don't think you should worry about what's in the database. Those are HTML character entities, and they're being interpreted correctly by the UTF-8 language files.
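
As a small illustration of the point about HTML character entities, the Python sketch below uses a made-up stored value (the string in `stored` is hypothetical, not taken from this forum's database). Numeric entities are plain ASCII, so the stored bytes are identical in ISO-8859-1 and UTF-8, and they decode to the same characters either way.

```python
import html

# Hypothetical stored value: accented letters saved as numeric HTML entities.
stored = "cora&#231;&#227;o"

# The entity text is pure ASCII, so its bytes do not depend on the charset.
same_bytes = stored.encode("iso-8859-1") == stored.encode("utf-8")
print(same_bytes)

# A browser (or html.unescape here) turns the entities back into characters.
decoded = html.unescape(stored)
print(decoded)
```

Text stored this way looks odd in phpMyAdmin but is unaffected by which character set the table claims to use.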
Title: Re: Convert data on database to UTF-8
Post by: cenourinha on May 21, 2020, 05:59:07 PM
I tried to update to SMF 2.1 RC 2 and the characters are now all messed up. I think the "upgrade" process tries to convert to UTF-8.
Title: Re: Convert data on database to UTF-8
Post by: Sir Osis of Liver on May 21, 2020, 06:13:32 PM
Well, yes, the upgrader includes the function ConvertUtf8(), which converts the database to UTF-8, but it first checks whether it's already UTF-8, in which case I don't think it's supposed to run the conversion. In your 2.0 _settings table, is global_character_set set to UTF-8?
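
A setting such as global_character_set only says what the data is supposed to be. To check what the bytes actually are, something like the following Python sketch could be used on exported values. This is not how upgrade.php performs its check; the function name `looks_like_utf8` is an assumption made for illustration.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Return True if the bytes form valid UTF-8 (pure ASCII also qualifies)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# Accented text encoded as UTF-8 passes the check.
utf8_result = looks_like_utf8("coração".encode("utf-8"))
# The same text encoded as ISO-8859-1 is not valid UTF-8.
latin1_result = looks_like_utf8("coração".encode("iso-8859-1"))
# Plain ASCII is valid in both, so it cannot tell the two apart.
ascii_result = looks_like_utf8(b"plain ascii")

print(utf8_result, latin1_result, ascii_result)
```

Note the limitation: rows that contain only ASCII (including HTML entities) pass the check regardless of the original charset, so only rows with accented characters are informative.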
Title: Re: Convert data on database to UTF-8
Post by: cenourinha on May 21, 2020, 06:20:23 PM
Yes, it's set to UTF-8.

Title: Re: Convert data on database to UTF-8
Post by: Sir Osis of Liver on May 21, 2020, 06:38:20 PM
You could disable the utf-8 conversion in upgrade.php, but don't think that's the best idea I'll have today.  Would be better to post this problem in 2.1 support, let the devs have a look at it.
Title: Re: Convert data on database to UTF-8
Post by: cenourinha on May 22, 2020, 11:19:12 AM
I've done a restore and I'm now using SMF 2.0.17. It works perfectly when the language is set to English (utf-8), but when I change it to Portuguese-pt (utf-8), the index page doesn't work for guests and displays a white page. I already tried to debug (display_errors, error_reporting, printing a hello world in index.php), but nothing is displayed. Any clue? I think this may be related to the database characters.
Title: Re: Convert data on database to UTF-8
Post by: Sir Osis of Liver on May 22, 2020, 12:43:22 PM
White page is usually a server error. Anything in the server error log?
Title: Re: Convert data on database to UTF-8
Post by: cenourinha on May 22, 2020, 11:03:52 PM
Nothing was recorded in the server logs. I went to "Admin > Configuration > Edit Languages" and the "Character Set" for "Portuguese Pt" was set to "UTF-8". I changed it to "utf-8" (lowercase) and the problem seems to be fixed.
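
Why the lowercase value made a difference is not established here. One possibility, stated purely as an assumption, is a case-sensitive string comparison somewhere in the code path. The Python sketch below only demonstrates the general pitfall: the two labels name the same encoding but are not equal as strings. The variable names are hypothetical.

```python
import codecs

configured = "UTF-8"  # value found in the language settings
expected = "utf-8"    # hypothetical value some code might compare against

# A plain string comparison treats the two labels as different.
exact_match = configured == expected
print(exact_match)

# Normalizing through the codec registry shows they are the same encoding.
same_codec = codecs.lookup(configured).name == codecs.lookup(expected).name
print(same_codec)
```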

Now, if I go to "Convert the database and data to UTF-8", select "ISO-8859-1" and click convert, I get this error:

Data too long for column 'word' at row 11679
File: /home/webtuga/forum.webtuga.com/Sources/ManageMaintenance.php
Line: 664
Title: Re: Convert data on database to UTF-8
Post by: Sir Osis of Liver on May 22, 2020, 11:27:02 PM
Per this -

https://www.simplemachines.org/community/index.php?topic=572993.msg4054223#msg4054223

you should be able to truncate the _log_search_subjects table (export a backup first), then rebuild the search index after the UTF-8 conversion is complete.
Title: Re: Convert data on database to UTF-8
Post by: Deaks on June 09, 2020, 03:54:16 PM
cenourinha, did the link from Sir Osis of Liver help? It's been a few weeks with no updates. I have marked this as solved until you get back to us. If you are still experiencing the issue, please let us know what you have tried outwith what has been suggested.