Author Topic: Convert data on database to UTF-8  (Read 382 times)

Offline cenourinha

  • Semi-Newbie
  • *
  • Posts: 38
  • Gender: Male
    • WebTuga
Convert data on database to UTF-8
« on: May 21, 2020, 09:55:40 AM »
Hi there folks!

Recently I converted a forum from IPB 3.3.x to SMF 2.0.x, and now I'm trying to convert the data in the database to UTF-8. In the forum settings I have the language set to English (Character Set = UTF-8) and everything is displayed correctly, but when I look at the database via phpMyAdmin, I see the data displayed like this:

This makes me suspect that data isn't in UTF-8 and needs to be converted. If I go to "Maintenance > Database > Convert the database and data to UTF-8" I see this:



But if I try to convert, the data in the database doesn't seem to be fixed and the forum starts displaying invalid characters. I also tried running "upgrade.php" and "repairsettings.php", but nothing seems to help.

As you can see, I don't really have a problem at this moment, but I would like to make sure all the data is in UTF-8 in order to prevent future incompatibilities.

Best regards!

Offline Diego Andrés

  • Customizer
  • SMF Hero
  • *
  • Posts: 3,536
  • Gender: Male
    • DiegoAndresCortes on GitHub
    • @bihgetter on Twitter
    • SMF Tricks - Free & Premium Themes
Re: Convert data on database to UTF-8
« Reply #1 on: May 21, 2020, 04:48:45 PM »
Since the data isn't in UTF-8, you are meant to choose a different data character set in the dropdown options (the one the database supposedly uses). I think it defaulted to UTF-8 because your language files are already in UTF-8, but I'm not sure.


Offline Sir Osis of Liver

  • SMF Super Hero
  • *******
  • Posts: 10,143
  • Hoarding Budweiser in NY
Re: Convert data on database to UTF-8
« Reply #2 on: May 21, 2020, 04:59:29 PM »
Have you tried changing the collation in phpMyAdmin?

Offline cenourinha

Re: Convert data on database to UTF-8
« Reply #3 on: May 21, 2020, 05:21:27 PM »
Quote from: Diego Andrés
Since the data isn't in UTF-8, you are meant to choose a different data character set in the dropdown options (the one the database supposedly uses). I think it defaulted to UTF-8 because your language files are already in UTF-8, but I'm not sure.

I previously tried converting with "ISO-8859-1" selected as the current character set, but it didn't work.
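
One possible reason a conversion with the wrong source character set produces invalid characters: if the stored bytes are in fact already UTF-8 (an assumption; the thread does not confirm it), treating them as ISO-8859-1 and converting again double-encodes every non-ASCII character. A minimal Python sketch of that effect, using a made-up sample word:

```python
# Made-up sample text; not taken from the forum's database.
original = "Informação"

# The bytes as they would be stored if the data is already UTF-8.
utf8_bytes = original.encode("utf-8")

# Misreading those bytes as ISO-8859-1 turns each non-ASCII
# character into two garbled characters.
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled)

# Reversing the misreading recovers the original text, which is a
# quick way to test whether a string was double-encoded this way.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired)
```

If the garbled form matches what the forum shows after a conversion attempt, the data was probably not ISO-8859-1 to begin with.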

Offline cenourinha

Re: Convert data on database to UTF-8
« Reply #4 on: May 21, 2020, 05:23:02 PM »
Quote from: Sir Osis of Liver
Have you tried changing the collation in phpMyAdmin?

The collation is set to "utf8_general_ci" on every table.

Offline Sir Osis of Liver

Re: Convert data on database to UTF-8
« Reply #5 on: May 21, 2020, 05:41:23 PM »
If the characters are displayed correctly on the forum, I don't think you should worry about what's in the database. Those are HTML character entities and they're being interpreted correctly by the UTF-8 language files.
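
To illustrate what HTML character entities in a stored string look like, here is a minimal Python sketch. The sample string and the `has_entities` helper are hypothetical; the actual rows in this database are not shown in the thread.

```python
import html

# Hypothetical example of text stored with numeric HTML entities.
stored = "Informa&#231;&#227;o"

# html.unescape turns entities back into the characters they stand for,
# which is roughly what a browser does when rendering the page.
decoded = html.unescape(stored)
print(decoded)


def has_entities(text: str) -> bool:
    """Return True if unescaping changes the text, i.e. it contains entities."""
    return html.unescape(text) != text


print(has_entities(stored))
print(has_entities(decoded))
```

Text like this is plain ASCII in the database, so it looks odd in phpMyAdmin even though it renders correctly on the forum.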

Offline cenourinha

Re: Convert data on database to UTF-8
« Reply #6 on: May 21, 2020, 05:59:07 PM »
I tried to update to SMF 2.1 RC 2 and the characters are now all messed up. I think the "upgrade" process tries to convert to UTF-8.

Offline Sir Osis of Liver

Re: Convert data on database to UTF-8
« Reply #7 on: May 21, 2020, 06:13:32 PM »
Well, yes, the upgrader includes a function, ConvertUtf8(), that converts the database to UTF-8, but it first checks to see if it's already UTF-8, in which case I don't think it's supposed to run the conversion. In your 2.0 _settings table, is global_character_set set to UTF-8?

Offline cenourinha

Re: Convert data on database to UTF-8
« Reply #8 on: May 21, 2020, 06:20:23 PM »
Yes, it's set to UTF-8.


Offline Sir Osis of Liver

Re: Convert data on database to UTF-8
« Reply #9 on: May 21, 2020, 06:38:20 PM »
You could disable the UTF-8 conversion in upgrade.php, but I don't think that's the best idea I'll have today. It would be better to post this problem in 2.1 support and let the devs have a look at it.

Offline cenourinha

Re: Convert data on database to UTF-8
« Reply #10 on: May 22, 2020, 11:19:12 AM »
I restored a backup and am now using SMF 2.0.17. It works perfectly when the language is set to English (utf-8), but when I change it to Portuguese-pt (utf-8), the index page doesn't work for guests and displays a white page. I already tried to debug it (display_errors, error_reporting, printing a hello world in index.php), but nothing is displayed. Any clue? I think this may be related to the database characters.

Offline Sir Osis of Liver

Re: Convert data on database to UTF-8
« Reply #11 on: May 22, 2020, 12:43:22 PM »
A white page is usually a server error. Is there anything in the server error log?

Offline cenourinha

Re: Convert data on database to UTF-8
« Reply #12 on: May 22, 2020, 11:03:52 PM »
Nothing was recorded in the server logs. I went to "Admin > Configuration > Edit Languages" and the "Character Set" for "Portuguese Pt" was set to "UTF-8". I changed it to "utf-8" (lowercase) and the problem seems to be fixed.

Now, if I go to "Convert the database and data to UTF-8", select "ISO-8859-1" and click convert, I get this error:

Code:
Data too long for column 'word' at row 11679
File: /home/webtuga/forum.webtuga.com/Sources/ManageMaintenance.php
Line: 664
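
The thread does not establish why this error occurs. One plausible contributor is that non-ASCII characters take more bytes in UTF-8 than in ISO-8859-1, and more again if already-UTF-8 data is converted a second time. The Python sketch below only illustrates that growth; the sample word and the 10-unit limit are hypothetical, and whether the real column limit is counted in bytes or characters during conversion is not shown in the thread.

```python
def utf8_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


word = "informação"  # made-up sample, 10 characters long

# Size of the same word under different encodings.
as_latin1 = len(word.encode("iso-8859-1"))
as_utf8 = utf8_length(word)

# The same word after its UTF-8 bytes were wrongly read as ISO-8859-1
# and encoded to UTF-8 again (double encoding).
double_encoded = word.encode("utf-8").decode("iso-8859-1")
as_double = utf8_length(double_encoded)

HYPOTHETICAL_LIMIT = 10  # illustrative only, not the real column size

for label, size in [
    ("ISO-8859-1", as_latin1),
    ("UTF-8", as_utf8),
    ("double-encoded UTF-8", as_double),
]:
    verdict = "fits" if size <= HYPOTHETICAL_LIMIT else "too long"
    print(f"{label}: {size} bytes, {verdict}")
```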

Offline Sir Osis of Liver

Re: Convert data on database to UTF-8
« Reply #13 on: May 22, 2020, 11:27:02 PM »
Per this -

https://www.simplemachines.org/community/index.php?topic=572993.msg4054223#msg4054223

you should be able to truncate the _log_search_subjects table (export a backup first); then you can rebuild the search index after the UTF-8 conversion is completed.