Simple Machines Community Forum

SMF Support => SMF 2.1.x Support => Topic started by: zselby on April 17, 2023, 11:41:07 AM

Title: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 11:41:07 AM
Hi,

I recently upgraded to 2.1.3 from 1.1.x.  Things seemed to go fine, however I'm occasionally seeing the following errors in the log and a few users are getting database errors when trying to post. 

Errors are:

Database Error: Incorrect string value: '\xE2\xEF\xBF\xBD\xEF\xBF...' for column `larrivee_forumtemp`.`smf_messages`.`body` at row 1

and

Database Error: Incorrect string value: '\xC2 ...' for column `larrivee_forumtemp`.`smf_messages`.`body` at row 1

Collation of all tables and columns is utf8_general_ci which I updated via phpMyAdmin.  Only mod installed is EzPortal.  I searched and tried everything I could find, but continue getting these errors.  Any help would be greatly appreciated.
Title: Re: Database Error: Incorrect string value
Post by: Doug Heffernan on April 17, 2023, 12:39:55 PM
Quote from: zselby on April 17, 2023, 11:41:07 AMCollation of all tables and columns is utf8_general_ci which I updated via phpMyAdmin. 

Can you make a screenshot of the structure area of the smf_messages table and post it here?
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 01:36:02 PM
Thanks Doug, image is attached.
Title: Re: Database Error: Incorrect string value
Post by: Doug Heffernan on April 17, 2023, 01:58:13 PM
What about the language, what is the default language set to and what language are those members that get the error using?

Can you try something? Make first a backup of your database and then go to your Acp->Forum Maintenance->Database and run the following option: Convert HTML-entities to UTF-8 characters?

I also noticed that the Type for the body column is set to mediumtext. That should be set to text.
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 02:08:57 PM
Quote from: Doug Heffernan on April 17, 2023, 01:58:13 PMWhat about the language, what is the default language set to and what language are those members that get the error using?

Can you try something? Make first a backup of your database and then go to your Acp->Forum Maintenance->Database and run the following option: Convert HTML-entities to UTF-8 characters?

I also noticed that the Type for the body column is set to mediumtext. That should be set to text.

Only language I have installed in English and all users (I think) are using English.  Is there a place to check that?

I don't have Convert HTML-entities to UTF-8 characters as an option in ACP->forum maintenance->database.  Only two options I have are Optimize All Tables and Convert to Text.
Title: Re: Database Error: Incorrect string value
Post by: Arantor on April 17, 2023, 02:09:59 PM
How *exactly* did you do the conversion to UTF-8? Did you do it manually before the upgrade?
Title: Re: Database Error: Incorrect string value
Post by: Doug Heffernan on April 17, 2023, 02:21:31 PM
Quote from: zselby on April 17, 2023, 02:08:57 PMOnly language I have installed in English and all users (I think) are using English.  Is there a place to check that?

You can check it by going to your Admin Panel->Languages. What does it say under the Character Set for the language that you are using?

Quote from: zselby on April 17, 2023, 02:08:57 PMI don't have Convert HTML-entities to UTF-8 characters as an option in ACP->forum maintenance->database.  Only two options I have are Optimize All Tables and Convert to Text.

It should be there, the last option. Can you post a screenshot of how it looks like at your Admin Panel?

Additionally, check out the following link and make sure that your conversion to UTF-8 has been done properly/thoroughly.

https://wiki.simplemachines.org/smf/UTF-8_Readme#:~:text=Go%20to%20Forum%20Maintenance%20%3E%20Convert,of%20your%20default%20language%20file.
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 02:24:31 PM
Quote from: Arantor on April 17, 2023, 02:09:59 PMHow *exactly* did you do the conversion to UTF-8? Did you do it manually before the upgrade?

To be honest I don't remember.  I vaguely remember converting to UTF-8 via admin when I was running 1.x.x, but maybe that's not possible.  The forum was running 1.x.x for almost 19 years.

I didn't make the change in phpMyAdmin until after the upgrade when I starting seeing these errors.
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 02:28:43 PM
Quote from: Doug Heffernan on April 17, 2023, 02:21:31 PM
Quote from: zselby on April 17, 2023, 02:08:57 PMOnly language I have installed in English and all users (I think) are using English.  Is there a place to check that?

You can check it by going to your Admin Panel->Languages. What does it say under the Character Set for the language that you are using?

Quote from: zselby on April 17, 2023, 02:08:57 PMI don't have Convert HTML-entities to UTF-8 characters as an option in ACP->forum maintenance->database.  Only two options I have are Optimize All Tables and Convert to Text.

It should be there, the last option. Can you post a screenshot of how it looks like at your Admin Panel?

Additionally, check out the following link and make sure that your conversion to UTF-8 has been done properly/thoroughly.

https://wiki.simplemachines.org/smf/UTF-8_Readme#:~:text=Go%20to%20Forum%20Maintenance%20%3E%20Convert,of%20your%20default%20language%20file.

English is the only language listed and the character set is UTF-8.

Screenshot is attached.
Title: Re: Database Error: Incorrect string value
Post by: Doug Heffernan on April 17, 2023, 02:32:51 PM
Quote from: zselby on April 17, 2023, 02:28:43 PMEnglish is the only language listed and the character set is UTF-8.

It appears that you are using a custom theme. Try switching to the default one and see if the Convert HTML-entities to UTF-8 characters will appear then.

Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 02:35:31 PM
Quote from: Doug Heffernan on April 17, 2023, 02:32:51 PM
Quote from: zselby on April 17, 2023, 02:28:43 PMEnglish is the only language listed and the character set is UTF-8.

It appears that you are using a custom theme. Try switching to the default one and see if the Convert HTML-entities to UTF-8 characters will appear then.



Thanks.  I found that suggestion via search the other day and just tried it again.  I'm still not seeing it listed.
Title: Re: Database Error: Incorrect string value
Post by: Doug Heffernan on April 17, 2023, 02:39:09 PM
Quote from: zselby on April 17, 2023, 02:35:31 PMThanks.  I found that suggestion via search the other day and just tried it again.  I'm still not seeing it listed.

That 's very strange. That option should be there.

What about this?

Quote from: Doug Heffernan on April 17, 2023, 02:21:31 PMAdditionally, check out the following link and make sure that your conversion to UTF-8 has been done properly/thoroughly.

https://wiki.simplemachines.org/smf/UTF-8_Readme#:~:text=Go%20to%20Forum%20Maintenance%20%3E%20Convert,of%20your%20default%20language%20file (https://wiki.simplemachines.org/smf/UTF-8_Readme#:~:text=Go%20to%20Forum%20Maintenance%20>%20Convert,of%20your%20default%20language%20file).
Title: Re: Database Error: Incorrect string value
Post by: Arantor on April 17, 2023, 02:42:48 PM
Quote from: Doug Heffernan on April 17, 2023, 02:39:09 PMThat option should be there.

Only if SMF thinks it's in UTF-8 mode. Evidence suggests it does not think this because it's trying to write ISO-encoded content into UTF-8 tables.
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 02:50:22 PM
Quote from: Doug Heffernan on April 17, 2023, 02:39:09 PM
Quote from: zselby on April 17, 2023, 02:35:31 PMThanks.  I found that suggestion via search the other day and just tried it again.  I'm still not seeing it listed.

That 's very strange. That option should be there.

What about this?

Quote from: Doug Heffernan on April 17, 2023, 02:21:31 PMAdditionally, check out the following link and make sure that your conversion to UTF-8 has been done properly/thoroughly.

https://wiki.simplemachines.org/smf/UTF-8_Readme#:~:text=Go%20to%20Forum%20Maintenance%20%3E%20Convert,of%20your%20default%20language%20file (https://wiki.simplemachines.org/smf/UTF-8_Readme#:~:text=Go%20to%20Forum%20Maintenance%20>%20Convert,of%20your%20default%20language%20file).


3. Go to Forum Maintenance > Convert the database and data to UTF-8 (this option will only be available if SMF detects a database version which supports UTF-8).

That's what I believe I did when running 1.1, and I don't have it as an option now.

UPDATE smf_members
SET lngfile = CONCAT(lngfile, '-utf8')
WHERE lngfile != ''


I had not run that query but just did.
Title: Re: Database Error: Incorrect string value
Post by: shawnb61 on April 17, 2023, 02:53:16 PM
Some more questions:
- What is the value of $db_character_set in your Settings.php file?  (It should be: 'utf8')
- What is the value of the 'global_character_set' entry in your smf_settings table?  (It should be: 'UTF-8')
- Are they getting the errors when updating existing posts, or adding new posts?
- Do you know what the source of the material is?

xC2 is not a valid utf8 character.  (Is there any more to that string?)

Neither is  xE2EFBFBDEFBF. 

SMF2.1 will strip out/cleanse invalid utf8 characters when posting, so I don't think these are fresh posts.  My first suspicion is that this invalid data was introduced back in SMF1.x, and these messages are popping up as the records are being updated directly or indirectly somehow.

Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 03:03:25 PM
Quote from: shawnb61 on April 17, 2023, 02:53:16 PMSome more questions:
- What is the value of $db_character_set in your Settings.php file?  (It should be: 'utf8')
'utf8'
Quote- What is the value of the 'global_character_set' entry in your smf_settings table?  (It should be: 'UTF-8')
I don't have a global_character_set in smf_settings
Quote- Are they getting the errors when updating existing posts, or adding new posts?
I don't know for all cases, but in at least one it was replying to a recent post that was created after upgrading.
Quote- Do you know what the source of the material is?
I don't
QuotexC2 is not a valid utf8 character.  (Is there any more to that string?)
No - Database Error: Incorrect string value: '\xC2 ...'
QuoteNeither is  xE2EFBFBDEFBF. 

SMF2.1 will strip out/cleanse invalid utf8 characters when posting, so I don't think these are fresh posts.  My first suspicion is that this invalid data was introduced back in SMF1.x, and these messages are popping up as the records are being updated directly or indirectly somehow.

Title: Re: Database Error: Incorrect string value
Post by: Sesquipedalian on April 17, 2023, 03:08:55 PM
In the first error message, the sequence of bytes indicated by "\xE2\xEF\xBF\xBD\xEF\xBF..." is not valid UTF-8. The substring "\xEF\xBF\xBD" is the sequence of bytes for the Unicode replacement character, "�", but the single byte before it, "\xE2", can never occur alone in a valid UTF-8 string. The database is rejecting the string because it is malformed.

My suspicion is that the values of $context['utf8'] and/or $context['character_set'] somehow have gotten messed up, and that this is causing problems in the sanitize_chars() function in Subs.php. In particular, if these values are incorrect in just the right way, and the input string contains some disallowed control characters, it might be possible that sanitize_chars() could end up trying to use the wrong method to replace those disallowed characters.

@zselby:

1. Can you give us the entire error message from the error log, and especially the backtrace information?

2. Please open the ./Themes/default/languages/index.english.php file and verify that the following line appears, written exactly as shown:

$txt['lang_character_set'] = 'UTF-8';
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 03:13:23 PM
Quote from: Sesquipedalian on April 17, 2023, 03:08:55 PMIn the first error message, the sequence of bytes indicated by "\xE2\xEF\xBF\xBD\xEF\xBF..." is not valid UTF-8. The substring "\xEF\xBF\xBD" is the sequence of bytes for the Unicode replacement character, "�", but the single byte before it, "\xE2", can never occur alone in a valid UTF-8 string. The database is rejecting the string because it is malformed.

My suspicion is that the values of $context['utf8'] and/or $context['character_set'] somehow have gotten messed up, and that this is causing problems in the sanitize_chars() function in Subs.php. In particular, if these values are incorrect in just the right way, and the input string contains some disallowed control characters, it might be possible that sanitize_chars() could end up trying to replace those disallowed characters with "�" rather than "?".

@zselby:

1. Can you give us the entire error message from the error log, and especially the backtrace information?

Error
Type of error
Database
Error message
Database Error: Incorrect string value: '\xF0\xEF\xBF\xBD\xA4\xEF...' for column `larrivee_forumtemp`.`smf_personal_messages`.`body` at row 1
File
/home/larrivee/public_html/forum/Sources/Subs-Post.php
Line
1145
URL of page causing the error
http://www.larriveeforum.com/forum/index.php?action=pm;sa=send2
Backtrace information
#0: smf_db_error()
Called from /home/larrivee/public_html/forum/Sources/Subs-Db-mysql.php on line 494
#1: smf_db_query()
Called from /home/larrivee/public_html/forum/Sources/Subs-Db-mysql.php on line 811
#2: smf_db_insert()
Called from /home/larrivee/public_html/forum/Sources/Subs-Post.php on line 1145
#3: sendpm()
Called from /home/larrivee/public_html/forum/Sources/PersonalMessage.php on line 2560
#4: MessagePost2()
Called from /home/larrivee/public_html/forum/Sources/Subs.php on line 6016
#5: call_helper()
Called from /home/larrivee/public_html/forum/Sources/PersonalMessage.php on line 220
#6: MessageMain()
Called from /home/larrivee/public_html/forum/index.php on line 192

----------------------------------------------------------------------------------------------

Error
Type of error
Database
Error message
Database Error: Incorrect string value: '\xC2 ...' for column `larrivee_forumtemp`.`smf_messages`.`body` at row 1
File
/home/larrivee/public_html/forum/Sources/Subs-Post.php
Line
1897
URL of page causing the error
http://www.larriveeforum.com/forum/index.php?action=post2;start=0;board=5
Backtrace information
#0: smf_db_error()
Called from /home/larrivee/public_html/forum/Sources/Subs-Db-mysql.php on line 494
#1: smf_db_query()
Called from /home/larrivee/public_html/forum/Sources/Subs-Db-mysql.php on line 811
#2: smf_db_insert()
Called from /home/larrivee/public_html/forum/Sources/Subs-Post.php on line 1897
#3: createPost()
Called from /home/larrivee/public_html/forum/Sources/Post.php on line 2380
#4: Post2()
Called from /home/larrivee/public_html/forum/index.php on line 192

Quote2. Please open the ./Themes/default/languages/index.english.php file and verify that the following line appears, written exactly as shown:

$txt['lang_character_set'] = 'UTF-8';

Yes:  $txt['lang_character_set'] = 'UTF-8';
Title: Re: Database Error: Incorrect string value
Post by: shawnb61 on April 17, 2023, 03:18:52 PM
First suggestion is to set global_character_set...
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 03:33:02 PM
Quote from: shawnb61 on April 17, 2023, 03:18:52 PMFirst suggestion is to set global_character_set...

Done, thanks.
Title: Re: Database Error: Incorrect string value
Post by: Sesquipedalian on April 17, 2023, 03:35:16 PM
Thanks.

Hm. I see that "\xEF\xBF\xBD" appears in that error message as well. I suspect that the value of $context['utf8'] is somehow getting set to false on your forum, causing unexpected behaviour in the sanitize_chars() function in Subs.php. The question is, how is that happening?

Quote from: shawnb61 on April 17, 2023, 03:18:52 PMFirst suggestion is to set global_character_set...

In theory, SMF should fall back to the value defined in $txt['lang_character_set'] if $modSettings['global_character_set'] is empty. But yes, if $modSettings['global_character_set'] is missing, that should definitely be fixed, and it might solve this issue.

Unfortunately, @zselby will need to do that in the database directly, because SMF's administration control panel UI doesn't offer a way to do that—it is meant to be done during install/upgrade, and never changed or removed again. Looks like you already did it while I was writing.

At any rate, it might be the case that because $modSettings['global_character_set'] is missing, the value of $context['utf8'] is ending up false where it should be true, thereby causing bad behaviour in sanitize_chars(). But this is all still conjecture at this point.
Title: Re: Database Error: Incorrect string value
Post by: Sesquipedalian on April 17, 2023, 03:42:25 PM
Can you open Subs.php and make the following temporary change, @zselby?

Code (find) Select
function sanitize_chars($string, $level = 0, $substitute = null)
{
global $context, $sourcedir;

$string = (string) $string;
$level = min(max((int) $level, 0), 2);

Code (replace) Select
function sanitize_chars($string, $level = 0, $substitute = null)
{
global $context, $sourcedir;

file_put_contents($sourcedir . '/temp.txt', var_export(array('utf8' => $context['utf8'], 'character_set' => $context['character_set']), true));

$string = (string) $string;
$level = min(max((int) $level, 0), 2);

Once you've done that, try sending a personal message to someone in order to trigger this code. Then you can undo the change. Finally, open the newly created ./Sources/temp.txt file and paste its contents into a reply here. You can delete the file afterward.
Title: Re: Database Error: Incorrect string value
Post by: Sesquipedalian on April 17, 2023, 03:46:44 PM
I need to step away for a while. But if the results of my request to @zselby above indicate any discrepancy between the expected values of $context['utf8'] and $context['character_set'], it will be worthwhile tracking down the cause of that.

In particular, if $context['utf8'] is false, then the wrong code will be called for cleaning up illegal characters in sanitize_chars(). Or if $context['character_set'] is wonky, then the wrong behaviour might happen in the section for cleaning up illegal byte sequences.
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 03:51:24 PM
Quote from: Sesquipedalian on April 17, 2023, 03:42:25 PMCan you open Subs.php and make the following temporary change, @zselby?

Code (find) Select
function sanitize_chars($string, $level = 0, $substitute = null)
{
global $context, $sourcedir;

$string = (string) $string;
$level = min(max((int) $level, 0), 2);

Code (replace) Select
function sanitize_chars($string, $level = 0, $substitute = null)
{
global $context, $sourcedir;

file_put_contents($sourcedir . '/temp.txt', var_export(array('utf8' => $context['utf8'], 'character_set' => $context['character_set']), true));

$string = (string) $string;
$level = min(max((int) $level, 0), 2);

Once you've done that, try sending a personal message to someone in order to trigger this code. Then you can undo the change. Finally, open the newly created ./Sources/temp.txt file and paste its contents into a reply here. You can delete the file afterward.

array (
  'utf8' => true,
  'character_set' => 'UTF-8',
)
Title: Re: Database Error: Incorrect string value
Post by: Sesquipedalian on April 17, 2023, 04:30:52 PM
Quote from: zselby on April 17, 2023, 03:51:24 PMarray (
  'utf8' => true,
  'character_set' => 'UTF-8',
)

Well, that rules out my theory.
Title: Re: Database Error: Incorrect string value
Post by: Sesquipedalian on April 17, 2023, 04:40:47 PM
Have you seen any more errors since adding global_character_set to the settings table?
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 17, 2023, 04:49:38 PM
Quote from: Sesquipedalian on April 17, 2023, 04:40:47 PMHave you seen any more errors since adding global_character_set to the settings table?

I haven't, but I usually only see 1 or 2 a day.  I'll follow up tomorrow with an update.

Thanks to everyone so far for the help.  I really appreciate it.
Title: Re: Database Error: Incorrect string value
Post by: zselby on April 19, 2023, 03:55:02 PM
Quote from: zselby on April 17, 2023, 04:49:38 PM
Quote from: Sesquipedalian on April 17, 2023, 04:40:47 PMHave you seen any more errors since adding global_character_set to the settings table?

I haven't, but I usually only see 1 or 2 a day.  I'll follow up tomorrow with an update.

Thanks to everyone so far for the help.  I really appreciate it.

No errors since setting global_character_set.  Thanks again for the help!