Database Error: Incorrect string value

Started by zselby, April 17, 2023, 11:41:07 AM

Sesquipedalian

Thanks.

Hm. I see that "\xEF\xBF\xBD" appears in that error message as well. I suspect that the value of $context['utf8'] is somehow getting set to false on your forum, causing unexpected behaviour in the sanitize_chars() function in Subs.php. The question is, how is that happening?
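For reference, "\xEF\xBF\xBD" is the UTF-8 byte sequence for U+FFFD, the Unicode replacement character, which usually shows up after invalid bytes were swapped out somewhere along the way. A minimal sketch to confirm that, independent of SMF (the variable names are only illustrative):

```php
<?php
// The three bytes quoted in the error message.
$bytes = "\xEF\xBF\xBD";

// U+FFFD (REPLACEMENT CHARACTER) written as a Unicode escape (PHP 7.0+).
$replacement_char = "\u{FFFD}";

// True if the error message bytes are exactly the encoded replacement character.
$is_replacement_char = ($bytes === $replacement_char);

// preg_match() with the /u modifier only returns 1 when the subject is valid UTF-8.
$is_valid_utf8 = (preg_match('//u', $bytes) === 1);

var_dump($is_replacement_char, $is_valid_utf8);
```

So the bytes themselves are valid UTF-8. MySQL generally reports "Incorrect string value" when the bytes it receives do not fit the character set of the target column or connection.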

Quote from: shawnb61 on April 17, 2023, 03:18:52 PM
First suggestion is to set global_character_set...

In theory, SMF should fall back to the value defined in $txt['lang_character_set'] if $modSettings['global_character_set'] is empty. But yes, if $modSettings['global_character_set'] is missing, that should definitely be fixed, and it might solve this issue.
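Roughly, the fallback I am describing looks like the sketch below. This is a hypothetical helper written for illustration, not SMF's actual code, and it assumes the UTF-8 flag is a plain comparison against 'UTF-8':

```php
<?php
/**
 * Hypothetical helper illustrating the fallback described above.
 * Not SMF's actual code.
 */
function resolve_character_set(array $modSettings, array $txt): array
{
	// empty() is true for a missing key as well as for '' or null.
	$character_set = empty($modSettings['global_character_set'])
		? $txt['lang_character_set']
		: $modSettings['global_character_set'];

	return array(
		'character_set' => $character_set,
		// Assumption: the flag is derived by comparing against 'UTF-8'.
		'utf8' => $character_set === 'UTF-8',
	);
}

// Setting missing: the language file's character set decides.
$missing = resolve_character_set(array(), array('lang_character_set' => 'ISO-8859-1'));

// Setting present: it takes priority over the language file.
$present = resolve_character_set(
	array('global_character_set' => 'UTF-8'),
	array('lang_character_set' => 'ISO-8859-1')
);
```

Under those assumptions, a missing setting combined with a language file that declares a non-UTF-8 character set would leave the flag false.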

Unfortunately, @zselby will need to do that in the database directly, because SMF's administration control panel doesn't offer a way to do it. The setting is meant to be set during install/upgrade and never changed or removed afterward. Looks like you already did it while I was writing.
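For anyone else who needs to do the same, a statement along these lines should work in phpMyAdmin or similar. This is only a sketch: it assumes the default smf_ table prefix, that the settings table uses variable and value columns, and that 'UTF-8' is the value you want. Check those against your own database and take a backup first.

```php
<?php
// Assumption: default table prefix. Adjust to match your Settings.php.
$db_prefix = 'smf_';

// REPLACE (MySQL syntax) inserts the row, or overwrites it if the variable already exists.
$sql = "REPLACE INTO {$db_prefix}settings (variable, value) VALUES ('global_character_set', 'UTF-8')";

echo $sql, "\n";
```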

At any rate, it might be the case that because $modSettings['global_character_set'] is missing, the value of $context['utf8'] is ending up false where it should be true, thereby causing bad behaviour in sanitize_chars(). But this is all still conjecture at this point.
I promise you nothing.

Sesqu... Sesqui... what?
Sesquipedalian, the best word in the English language.

Sesquipedalian

Can you open Subs.php and make the following temporary change, @zselby?

Code (find):
function sanitize_chars($string, $level = 0, $substitute = null)
{
global $context, $sourcedir;

$string = (string) $string;
$level = min(max((int) $level, 0), 2);

Code (replace):
function sanitize_chars($string, $level = 0, $substitute = null)
{
global $context, $sourcedir;

file_put_contents($sourcedir . '/temp.txt', var_export(array('utf8' => $context['utf8'], 'character_set' => $context['character_set']), true));

$string = (string) $string;
$level = min(max((int) $level, 0), 2);

Once you've done that, try sending a personal message to someone in order to trigger this code. Then you can undo the change. Finally, open the newly created ./Sources/temp.txt file and paste its contents into a reply here. You can delete the file afterward.
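One caveat: file_put_contents() without flags overwrites the file on every call, so if sanitize_chars() runs several times in a request, only the last call's values survive. If that turns out to matter, a variation like the following hypothetical helper (not part of SMF) appends each snapshot instead:

```php
<?php
/**
 * Hypothetical debugging helper (not part of SMF): appends the two values to a
 * log file instead of overwriting it, so every call is recorded.
 */
function dump_charset_state(array $context, string $path): void
{
	$snapshot = array(
		'utf8' => $context['utf8'] ?? null,
		'character_set' => $context['character_set'] ?? null,
	);

	// FILE_APPEND keeps earlier entries; LOCK_EX avoids interleaved writes.
	file_put_contents($path, var_export($snapshot, true) . "\n", FILE_APPEND | LOCK_EX);
}

// Example with stand-in values and a throwaway file.
$path = tempnam(sys_get_temp_dir(), 'charset');
dump_charset_state(array('utf8' => true, 'character_set' => 'UTF-8'), $path);
dump_charset_state(array('utf8' => false, 'character_set' => 'ISO-8859-1'), $path);
$log = file_get_contents($path);
unlink($path);
```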

Sesquipedalian

I need to step away for a while. But if the results of my request to @zselby above indicate any discrepancy between the expected values of $context['utf8'] and $context['character_set'], it will be worthwhile tracking down the cause of that.

In particular, if $context['utf8'] is false, then the wrong code will be called for cleaning up illegal characters in sanitize_chars(). Or if $context['character_set'] is wonky, then the wrong behaviour might happen in the section for cleaning up illegal byte sequences.
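To illustrate why the flag matters, here is a generic sketch (not the code in sanitize_chars()) of how the same bytes look when processed byte by byte versus as UTF-8:

```php
<?php
// U+FFFD encoded as UTF-8: three bytes, one character.
$string = "\xEF\xBF\xBD";

// Without /u, PCRE treats every byte as a separate character.
$byte_units = preg_match_all('/./s', $string);

// With /u, the three bytes are read as a single code point.
$utf8_units = preg_match_all('/./su', $string);

var_dump($byte_units, $utf8_units);
```

Any cleanup routine that picks the byte-wise path for UTF-8 text could end up treating parts of a multi-byte character as separate characters.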

zselby

Quote from: Sesquipedalian on April 17, 2023, 03:42:25 PM
Can you open Subs.php and make the following temporary change, @zselby?

array (
  'utf8' => true,
  'character_set' => 'UTF-8',
)

Sesquipedalian

Quote from: zselby on April 17, 2023, 03:51:24 PM
array (
  'utf8' => true,
  'character_set' => 'UTF-8',
)

Well, that rules out my theory.

Sesquipedalian

Have you seen any more errors since adding global_character_set to the settings table?

zselby

Quote from: Sesquipedalian on April 17, 2023, 04:40:47 PM
Have you seen any more errors since adding global_character_set to the settings table?

I haven't, but I usually only see 1 or 2 a day.  I'll follow up tomorrow with an update.

Thanks to everyone so far for the help.  I really appreciate it.

zselby

Quote from: zselby on April 17, 2023, 04:49:38 PM
I haven't, but I usually only see 1 or 2 a day. I'll follow up tomorrow with an update.

No errors since setting global_character_set.  Thanks again for the help!
