Some messages show blank/empty

Started by Guy Verschuere, January 06, 2021, 02:43:24 AM

Guy Verschuere

Hi SMF board members,

It's been a long while since I posted something here, just because everything went smoothly.

Recently though, I discovered an issue in my SMF forum. I can't say exactly when or why it started.
Some messages are not shown. The poster, title, date and quick moderation buttons are all there; only the message body is missing.
The text is still in the database, in the smf_messages table.
I guess something goes wrong in the translation of HTML/BBCode or in word censoring, but I can't figure it out.

There aren't any errors in the log, nor in the nginx error log.

Any idea where to start to solve this? I'm using SMF 2.0.17.

Thanks!

Guy Verschuere

I've had some time to dig into this.
I tried to analyze the problem by copying the body from the database into the message and saving it again, then comparing the original body with the updated body in the database.
It has something to do with character encoding.
For example, UPDATE smf_messages SET body = REPLACE(body, '€', 'â,¬') fixed all messages that contained a €.
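
For anyone wondering where that three-character sequence comes from, here is a minimal Python sketch (not part of SMF). It assumes the bytes are UTF-8 and the tool showing them reads them as Windows-1252. Under that assumption, the middle character is U+201A, a low quotation mark that looks like a comma.

```python
# Assumption: UTF-8 bytes are being displayed by something that expects Windows-1252.
euro = "€"
utf8_bytes = euro.encode("utf-8")       # the euro sign is three bytes in UTF-8
misread = utf8_bytes.decode("cp1252")   # each byte is shown as its own character

print(utf8_bytes.hex(" "))                      # e2 82 ac
print(misread)                                  # â‚¬
print([f"U+{ord(ch):04X}" for ch in misread])   # ['U+00E2', 'U+201A', 'U+00AC']
```

So the same stored text can look like one character or three, depending on which character set the viewer assumes.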

Kindred

Doing a replace may have been a poor idea. The presence of those characters usually indicates a failed or double conversion to UTF-8, and it may have other repercussions: those sequences typically stand for real characters that would have displayed correctly in UTF-8, but that the system got wrong during the conversion (or because of a double conversion). So posts with those characters will probably have missing letters...
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

Guy Verschuere

The posts I tested displayed correctly.
What would have been a better solution?

Aleksi "Lex" Kilpinen

Generally, finding and fixing the cause instead of the symptom is a better long-term solution.
Slava
Ukraini!
"Before you allow people access to your forum, especially in an administrative position, you must be aware that that person can seriously damage your forum. Therefore, you should only allow people that you trust, implicitly, to have such access." -Douglas

How you can help SMF

Guy Verschuere

Could it be caused by running "Convert HTML-entities to UTF-8 characters" again, after it had already been run in the past?

GigaWatt

Based on the description of the problem... not likely.
"This is really a generic concept about human thinking - when faced with large tasks we're naturally inclined to try to break them down into a bunch of smaller tasks that together make up the whole."

"A 500 error loosely translates to the webserver saying, "WTF?"..."

shawnb61

The data appears to be double-encoded, and this can happen if the utf8 conversion is run multiple times.  The utf8 conversion is designed to only run once (settings get saved), but if something bad happens, it can run another time.  And the result is seeing all those nordic font characters in unexpected places (â, ǣ, Æ).
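
To illustrate, here is a simplified Python model of a conversion that runs on text that is already UTF-8. It is not SMF's actual converter; the function names are hypothetical, and it assumes the old character set behaves like Windows-1252.

```python
def convert_once(text: str) -> str:
    """Model one unwanted pass: the UTF-8 bytes are reinterpreted as cp1252.

    Note: a few bytes (e.g. 0x81) are undefined in cp1252, so this simple
    model raises UnicodeDecodeError for some characters.
    """
    return text.encode("utf-8").decode("cp1252")


def undo_once(text: str) -> str:
    """Reverse one pass; return the text unchanged if it does not round-trip."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


clean = "5 € for a café"
garbled = convert_once(clean)

print(garbled)             # 5 â‚¬ for a cafÃ©
print(undo_once(garbled))  # 5 € for a café
print(undo_once(clean))    # 5 € for a café (already clean, left alone)
```

The guard in undo_once is why reversing the whole conversion is usually safer than replacing individual characters: text that was never garbled fails the round trip and is left untouched. Real data can still contain edge cases, so anything like this should be tried on a backup copy first.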

For safety, I would make sure the utf8 conversion has really completed properly.

The basic outline is:
1) make sure you've fully converted to utf8
2) make sure your smf settings are OK
3) if the above are both true, and you're still seeing the unexpected characters, you need to run some sort of data fix

Are all the columns utf8_general_ci?  The easiest way to make sure you're fully converted is to look at the Collation column when you view your database in phpMyAdmin.

The settings to confirm:
- Confirm you have an entry in your Settings.php file for
      $db_character_set = 'utf8';
- Confirm you have an entry in your smf_settings table for:
      variable:  global_character_set
      value: UTF-8

If all tables/columns are utf8_general_ci, and the two settings are set, your utf8 conversion is complete.
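
The three checks above can be written down as a small Python sketch. The function name and the sample values are hypothetical; the collations and the two settings still have to be read from phpMyAdmin, Settings.php and the smf_settings table by hand (or by a query of your own).

```python
def utf8_conversion_complete(column_collations, db_character_set, global_character_set):
    """Return (ok, problems) for the checklist: collations plus the two settings."""
    problems = []
    for column, collation in column_collations.items():
        if collation is None:
            continue  # non-text columns (integers, dates) have no collation
        if collation != "utf8_general_ci":
            problems.append(f"{column}: collation is {collation}")
    if db_character_set != "utf8":
        problems.append(f"$db_character_set is {db_character_set!r}, expected 'utf8'")
    if global_character_set != "UTF-8":
        problems.append(f"global_character_set is {global_character_set!r}, expected 'UTF-8'")
    return (not problems, problems)


# Hypothetical values, as they might be copied from phpMyAdmin and Settings.php.
ok, problems = utf8_conversion_complete(
    {
        "smf_messages.body": "utf8_general_ci",
        "smf_messages.subject": "latin1_swedish_ci",
        "smf_messages.id_msg": None,
    },
    db_character_set="utf8",
    global_character_set="UTF-8",
)
print(ok)        # False
print(problems)  # ['smf_messages.subject: collation is latin1_swedish_ci']
```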

If you are still seeing garbled data & unexpected nordic characters, we may need to look into a more thorough data fix.
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

GigaWatt

Quote from: shawnb61 on January 07, 2021, 11:49:51 AM
If you are still seeing garbled data & unexpected nordic characters, we may need to look into a more thorough data fix.

I thought the problem was not seeing whole posts ???.

shawnb61

Look at his 2nd post above. 

Encoding problems can sometimes appear to be - or be - lost data.  If it's confused at display or store time, it gives up...

Fortunately, it doesn't appear there was data loss here.
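
A small Python sketch of how "gives up" can look like lost data. This is a hypothetical renderer, not SMF's code: it assumes the display step decodes strictly as UTF-8 and returns nothing when that fails.

```python
def render_body(raw: bytes) -> str:
    """Hypothetical display step: strict UTF-8 decode, empty string on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return ""


stored_utf8 = "5 €".encode("utf-8")     # bytes as written after a clean conversion
stored_legacy = "5 €".encode("cp1252")  # bytes left over from the old character set

print(repr(render_body(stored_utf8)))    # '5 €'
print(repr(render_body(stored_legacy)))  # ''
```

In the second case the bytes are still there, which matches a post whose body shows blank while the text is still in the table.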

GigaWatt

Quote from: shawnb61 on January 07, 2021, 07:47:22 PM
If it's confused at display or store time, it gives up...

Mhm, yep, you're right ;).

shawnb61

I'm thinking we need a sticky post with this somewhere...  Seems to come up a lot...
