Convert data on database to UTF-8

Started by cenourinha, May 21, 2020, 09:55:40 AM

cenourinha

Hi there folks!

I recently converted a forum from IPB 3.3.x to SMF 2.0.x, and now I'm trying to convert the data in the database to UTF-8. In the forum settings, the language is set to English (Character Set = UTF-8) and everything displays fine, but when I look at the database in phpMyAdmin, I see the data displayed like this:

This makes me suspect that the data isn't in UTF-8 and needs to be converted. If I go to "Maintenance > Database > Convert the database and data to UTF-8", I see this:



But if I try to convert, the data in the database doesn't seem to be fixed and the forum starts displaying invalid characters. I also tried running "upgrade.php" and "repairsettings.php", but nothing seems to help.

As you can see, I don't really have a problem at this moment, but I would like to make sure all the data is in UTF-8 in order to prevent future incompatibilities.

Best regards!

Diego Andrés

Since it isn't in UTF-8, you are meant to choose a different data character set in the dropdown options (the one the database supposedly uses). I think it defaulted to UTF-8 because your language files are already in UTF-8, but I'm not sure.

SMF Tricks - Free & Premium Responsive Themes for SMF.

Sir Osis of Liver

Have you tried changing collation in phpmyadmin?
Ashes and diamonds, foe and friend,
 we were all equal in the end.

                                     - R. Waters

cenourinha

Quote from: Diego Andrés on May 21, 2020, 04:48:45 PM
Since it isn't in UTF-8, you are meant to choose a different data character set in the dropdown options (the one the database supposedly uses). I think it defaulted to UTF-8 because your language files are already in UTF-8, but I'm not sure.

I previously tried converting with "ISO-8859-1" selected as the current charset, but it didn't work.

cenourinha

Quote from: Sir Osis of Liver on May 21, 2020, 04:59:29 PM
Have you tried changing collation in phpmyadmin?

The collation is set to "utf8_general_ci" on every table.

Sir Osis of Liver

If the characters are displayed correctly on the forum, don't think you should worry about what's in the database.  Those are html character entities and they're being interpreted correctly by utf-8 language files.
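
To illustrate the point about entities, here is a minimal Python sketch, assuming the stored rows really do contain HTML character entities as described above. The sample string is hypothetical and not taken from the poster's database; `html.unescape` is the standard-library call that turns entities back into characters, which is roughly what a browser does when rendering the page.

```python
import html

# Hypothetical stored value: Portuguese text with accented letters
# kept as HTML entities instead of raw UTF-8 bytes.
stored = "Configura&ccedil;&atilde;o do f&oacute;rum"


def decode_entities(text: str) -> str:
    """Turn named and numeric HTML entities back into characters."""
    return html.unescape(text)


print(decode_entities(stored))  # Configuração do fórum
```

If the data looks like this, the text is plain ASCII in the database and only looks odd in phpMyAdmin because nothing there decodes the entities.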

cenourinha

I tried to update to SMF 2.1 RC 2 and the characters are now all messed up. I think the "upgrade" process tries to convert to UTF-8.
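
One possible explanation for the garbled characters, offered as an assumption rather than a diagnosis: text that is already UTF-8 gets treated as ISO-8859-1 and converted a second time. The Python sketch below reproduces that effect on a short hypothetical sample and shows that this particular kind of damage is reversible. The function names are made up for illustration.

```python
def misread_as_latin1(text: str) -> str:
    """Simulate UTF-8 bytes being interpreted as ISO-8859-1 (Latin-1)."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Undo a single round of the misinterpretation above."""
    return text.encode("latin-1").decode("utf-8")


original = "ção"
garbled = misread_as_latin1(original)

print(garbled)          # Ã§Ã£o
print(repair(garbled))  # ção
```

Each accented letter turning into two characters starting with "Ã" is the typical sign of this kind of double conversion.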

Sir Osis of Liver

Well, yes, the upgrader includes function ConvertUtf8() that converts database to utf-8, but it first checks to see if it's already utf-8, in which case don't think it's supposed to run the conversion.  In your 2.0 _settings table, is global_character_set set to UTF-8?
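
The check described above can be sketched roughly as follows. This is not SMF's actual code (the upgrader is PHP and its real logic may differ); `needs_utf8_conversion` and the plain dict of settings are hypothetical, and the case-insensitive comparison is an assumption of this sketch.

```python
def needs_utf8_conversion(settings: dict) -> bool:
    """Return True when the stored character set is anything other than UTF-8.

    A missing or empty global_character_set is treated as "not UTF-8",
    which is an assumption made for this sketch.
    """
    charset = settings.get("global_character_set", "")
    return charset.strip().lower() != "utf-8"


print(needs_utf8_conversion({"global_character_set": "UTF-8"}))       # False
print(needs_utf8_conversion({"global_character_set": "ISO-8859-1"}))  # True
```

If the setting says UTF-8 while the stored bytes are in another encoding, a check like this would skip or misapply the conversion, so it is worth confirming what the _settings table actually contains.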

cenourinha


Sir Osis of Liver

You could disable the utf-8 conversion in upgrade.php, but don't think that's the best idea I'll have today.  Would be better to post this problem in 2.1 support, let the devs have a look at it.

cenourinha

I've restored the backup and am now using SMF 2.0.17. It works perfectly when the language is set to English (utf-8), but when I change it to Portuguese-pt (utf-8), the index page doesn't work for guests and displays a white page. I already tried to debug it (display_errors, error_reporting, printing a hello world in index.php), but nothing is displayed. Any clue? I think this may be related to the database characters.

Sir Osis of Liver

White page is usually a server error.  Anything in the server error log?

cenourinha

Nothing was recorded in the server logs. I went to "Admin > Configuration > Edit Languages" and the "Character Set" for "Portuguese Pt" was set to "UTF-8". I changed it to "utf-8" (lowercase) and the problem seems to be fixed.

Now, if I go to "Convert the database and data to UTF-8", select "ISO-8859-1" and click convert, I get this error:

Data too long for column 'word' at row 11679
File: /home/webtuga/forum.webtuga.com/Sources/ManageMaintenance.php
Line: 664

Sir Osis of Liver

Per this -

https://www.simplemachines.org/community/index.php?topic=572993.msg4054223#msg4054223

you should be able to truncate _log_search_subjects table (export a backup first), then you can rebuild search index after utf-8 conversion is completed.
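
As a rough sketch of that step, the Python helper below only builds the two SQL statements you could paste into phpMyAdmin; it does not connect to anything. The `smf_` prefix is an assumption (use whatever prefix your install has), and the `_backup` table name is hypothetical. Keeping an in-database copy is an extra safety net here, not a replacement for exporting a proper backup.

```python
import re


def truncate_statements(prefix: str = "smf_") -> list:
    """Build SQL to copy and then empty the search-subjects log table.

    `prefix` is the table prefix of the forum (assumed "smf_" here).
    """
    # Only allow simple identifier characters so the prefix cannot
    # inject extra SQL into the generated statements.
    if not re.fullmatch(r"[A-Za-z0-9_]*", prefix):
        raise ValueError("unexpected characters in table prefix")

    table = f"{prefix}log_search_subjects"
    backup = f"{table}_backup"  # hypothetical name for the safety copy
    return [
        f"CREATE TABLE `{backup}` AS SELECT * FROM `{table}`;",
        f"TRUNCATE TABLE `{table}`;",
    ]


for statement in truncate_statements():
    print(statement)
```

Once the UTF-8 conversion has finished, the search index can be rebuilt from the admin panel as mentioned above, and the backup table can be dropped.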

Deaks

cenourinha, did the link Sir Osis of Liver posted help? It's been a few weeks with no updates.  I have marked this as solved until you get back to us. If you are still receiving the issue, please let us know what you tried outwith what has been suggested.
~~~~
Former SMF Project Manager
Former SMF Customizer

"For as lang as hunner o us is in life, in nae wey
will we thole the Soothron tae owergang us. In truth it isna for glory, or wealth, or
honours that we fecht, but for freedom alane, that nae honest cheil gies up but wi life
itsel."
