Problem with EMOJI When switching from SMF 1.1.21 on SMF 2.1 RC3

Started by Rusam, March 04, 2021, 08:42:52 PM

Previous topic - Next topic

Rusam

This error is preserved when updating from SMF 1.21 to SMF_2-1-RC3

*  +++  Adding index on total_time_logged_in... done.
+++  Adding index on id_theme... done.
+++  Dropping index on real_name(30) ... done.
+++  Adding index on real_name... done.
+++  Dropping index member_name(30)... done.
+++  Adding index on member_name... done.
Successful.
*  +++  Adding index id_member_msg... done.
+++  Adding index current_topic... done.
+++  Adding index related_ip...Unsuccessful!  Database error message:
Specified key was too long; max key length is 1000 bytes


shawnb61

Make sure your db has a default collation of utf8_general_ci, nothing utf8mb4.

Start over & you should be ok.

Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Rusam

Shawnb61, thank you, it is very valuable.
We even in settings.php row:
$ db_character_set = 'utf8mb4';

Rusam

shawnb61, after converting a table from UTF8MB4 to UTF8, all the emoji that were in messages are transformed into question marks.  :'(

shawnb61

SMF has never supported mb4.  I'm surprised you could get the extended chars in the SMF table to begin with, & curious how you did it. 

SMF would convert them to html entities if you had entered them thru SMF.  That would have been properly converted.

Sounds like you need a script to convert 4-byte chars to entities & retry.

Or...

Modify the DB for all of the indexes that need changing (there's 3) & leave everything mb4.  That might be simpler.  But you are in an unusual configuration atm.

Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Rusam

As I said above, the configuration file records:
$ db_character_set = 'utf8mb4';
instead
$db_character_set = 'utf8';

smf_messages Table in UTF8MB4

Accordingly, the fields of the table also:

body mediumtext


SMF 1.21 is a powerful, easy, simple and fast forum. I am very sorry to part with him.


shawnb61

What are you upgrading from?

Do you need assistance trying either of the two solutions I proposed above?
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Rusam

Dear shawnb61, I am grateful to your participation.

I am interested in the second way. But I did not quite understand what exactly change.

Here are the indices of this table:






I now tried to upgrade to 2.0.18 (to refresh 2.1 (3)))
Work constantly stops on this:
+++  Adding index on member_name... done.
Successful.
*  +++  Adding index id_member_msg... done.
+++  Adding index current_topic... done.
. +++  Adding index related_ip...Unsuccessful!  Database error message:
1071




shawnb61

What version are you upgrading from? 

What version are you upgrading to?

Do you have backups from when you started?

If you are going to attempt a 2.1 upgrade, you're doing this for testing purposes, correct?
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Rusam

Quote from: shawnb61 on March 08, 2021, 12:47:40 AM

If you are going to attempt a 2.1 upgrade, you're doing this for testing purposes, correct?
Of course, I just test on Open Server locally.
On the forum more than 3 million messages.
I am updated with SMF 1.21 to SMF 2.1_RC3.
Update is normal, the forum works.
But many mistakes and non-working areas (in admin, for example).
But now I would like to solve the problem of Emoji.

shawnb61

Just confirming - Your 1.21 db has utf8mb4 & utf8 tables?
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Rusam

Quote from: shawnb61 on March 08, 2021, 09:55:02 AM
Just confirming - Your 1.21 db has utf8mb4 & utf8 tables?
Yes. Pictures that I left - they are not visible? These are SMF 1.21 tables.

shawnb61

Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Rusam

I apologize. Yes, I so call SMF 1.1.21.
I thought it was understandable. Since 1.21 does not exist. :)

shawnb61

Is smf_messages the only table that is utf8mb4?  Or are there more tables currently utf8mb4?
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Rusam

Quote from: shawnb61 on March 08, 2021, 10:58:21 AM
Is smf_messages the only table that is utf8mb4?  Or are there more tables currently utf8mb4?
smf_messages is the only utf8mb4 table.

shawnb61

Your DB is configured in a unique way - utf8mb4 within a 1.1.x SMF DB - that took some effort to make that work. It needs special treatment.  Please follow the instructions below exactly, capitalization is important. 

It may take a couple of tries, since this configuration is so different...

1) Change the DATABASE default collation to utf8_general_ci, either in phpmyadmin or at the command line (not an SMF function).
2) In Settings.php, change $db_character_set to 'utf8'.
3) In the settings table, ensure the setting for 'global_character_set' has a value of 'UTF-8'.
4) Before running the upgrader, change the following two lines in upgrade_2-0_mysql.sql.  We need to shorten certain columns for the messages table to 191 bytes:
      'posterIP' => 'posterIP poster_ip varchar(191) NOT NULL default \'\'',      (~line 250+)
      'poster_ip' => 'poster_ip poster_ip varchar(191) NOT NULL default \'\'',   (~line 2500+)

5) If testing 2.1, also change the following line in upgrade_2-1_mysql.sql.  We need to shorten certain columns for the messages table to 191 bytes:
   upgrade_query("ALTER TABLE {$db_prefix}messages CHANGE poster_ip poster_ip_old varchar(191);");      (~line 2300+)

6) Run the upgrader.
7) AFTER the upgrade is complete, change your $db_character_set in Settings.php back to utf8mb4.


Notes:
* There are pretty significant performance costs associated with using different DB engines within the same database.  I would suggest converting all your tables to InnoDB.

* SMF does not support utf8mb4, it is written for utf8...  The above is a bit of trickery, you will probably find some issues...

* If you want to understand what it would take to fully cut over to utf8mb4, you can read this: https://github.com/SimpleMachines/SMF2.1/pull/6409
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

live627

Quote from: shawnb61 on March 07, 2021, 08:45:40 PM
SMF has never supported mb4.  I'm surprised you could get the extended chars in the SMF table to begin with, & curious how you did it. 
I can answer that. I believe SMF 1.1.21 doesn't have the 4-byte entity fix, so it just send 4-byte  characters (emojis, some  Chinese, etc) straight to the database where the messages table was manually widened to accommodate them.

(The display side was never an issue)

Rusam

Quote from: shawnb61 on March 08, 2021, 01:15:17 PM
1) Change the DATABASE default collation to utf8_general_ci, either in phpmyadmin or at the command line (not an SMF function).
The problem arose at the first stage. I specifically made screenshots in db of the same message before and after collation to UTF8_GENERAL_CI

before collation


after collation

Rusam

Can you split the topic starting from this message? https://www.simplemachines.org/community/index.php?topic=573832.msg4084022#msg4084022

Because the title does not speak about the discussion.
Better new topic with heading "Problem with EMOJI When switching from SMF 1.1.21 on SMF 2.1 RC3":)

Advertisement: