Major PITA on the utf8 and utf-8 and $db_charset etc.

Started by richardwbb, March 12, 2012, 07:00:36 PM

richardwbb

Hi, I'm stuck again

I found this topic:

http://www.simplemachines.org/community/index.php?topic=443853.0

It explains the issue I have. I was upgrading from 1.x to 2.x; however, a while ago I converted the 1.x DB to UTF-8.

The conversion added this to Settings.php:

$db_character_set = 'utf-8';

Now it produces this error

Unknown character set: 'utf'

And it does this before upgrade.php from 2.x has even run, when all the files are uploaded and ready to be installed.

That wouldn't matter much, but now, no matter what I do, all euro signs are broken, and so are the two dots on the letter e.

I hardly can imagine I'm the only one with this issue and searching yielded 0 results :(
If my post in this topic looks ambiguous to you, then I'm with Murphy's law and General Stupidity. In other words, trial and error.

richardwbb

I was able to narrow it down further. There is a setting needed to tell the browser the page is UTF-8; in index.template.php you will find something similar to this:

<meta http-equiv="Content-Type" content="text/html; charset=', $context['character_set'], '"/>

But where can I set 'character_set'? After upgrading it became ISO-8859-1 again, while it should be UTF-8.

A quick (and bad) fix is replacing the line above in index.template.php with:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

But of course, I need to set character_set somewhere. But how? And where? And what really happens when you edit Settings.php from utf-8 to utf8? Something must have gone wrong somewhere...

richardwbb

MY BAD MY BAD MY BAD

I mixed up the live forum and my test forum, and made a mistake when editing index.template.php to UTF-8.

Hardcoding it does not help.

I am so darn lost now. I remember converting the 1.x forum to UTF-8 because it was recommended, and now I so wish I hadn't.

Please, does anyone know what is going on here? I know the DB is good, the live forum proves it. Also, changing the value in Settings.php from 'utf-8' to 'utf' does make the forum run, but did anyone look at the euro sign or special characters in general?  :o ???

richardwbb

People,

Does anyone understand what I did, and what I am doing to fix it?

This really isn't right. I read that converting the database to UTF-8 was recommended, so I did. Now I know I would have been better off not doing so.

However, in my test environment, with English-utf8 language and the following in Settings.php:

$db_character_set = 'utf-8';

Which was added by the conversion to UTF-8, no matter what I did, upgrading 1.x to 2.0.2 with upgrade.php turned the euro sign into a weird character, and the umlaut went weird too.

So what did I do? I exported the live DB (which shows special characters properly) as ISO-8859-1, removed the $db_character_set = 'utf-8'; line from Settings.php, and upgraded.

Now the unknown character set 'utf' error no longer shows, of course.

Then I checked the special characters again, and the posts had gone bad: all euro signs are gone and there are no special characters anymore. I cannot use this version of the DB, it is f'd up.

My question: how do I get the installer to accept installing in UTF-8????

I am 100% sure this has something to do with the installer/upgrade from 1.x to 2.0.2, but I can find *nothing* about it, nor about how I might convert the DB back to ISO-8859-1. It is mangled that way.

HELP!!!

richardwbb

An example of the problem, where the SMF software mangles the posts. This is not good:

Club Shirts �`� 15,00 per stuk

This is good:

Club Shirts  € 15,00 per stuk

However, deleting the first '�' in the post makes all the contents of that post show, but it also adds � where the special characters are, resulting in a mangled database.
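
A minimal sketch of what is probably happening in the two samples above, written in Python rather than SMF's PHP. It assumes the stored bytes are either single-byte (Windows-1252, which browsers use for the ISO-8859-1 label) or UTF-8; the thread does not confirm which. The helper name `read_as` and the sample text are made up for illustration. Bytes written in one encoding and read as the other give two different kinds of damage:

```python
# Illustration only: how one text looks when its bytes are read with the
# wrong charset. Assumption: the "ISO-8859-1" side is really cp1252,
# which contains the euro sign.

def read_as(text, stored_as, shown_as):
    """Encode text the way it was stored, then decode it the way it is shown."""
    return text.encode(stored_as).decode(shown_as, errors="replace")

sample = "Club Shirts € 15,00 België"

# Stored as single-byte text but shown as UTF-8: each special character
# becomes the replacement character U+FFFD.
single_byte_shown_as_utf8 = read_as(sample, "cp1252", "utf-8")

# Stored as UTF-8 but shown as a single-byte charset: each special
# character becomes two or three odd-looking characters.
utf8_shown_as_single_byte = read_as(sample, "utf-8", "cp1252")

print(single_byte_shown_as_utf8)
print(utf8_shown_as_single_byte)
```

If the first kind of output matches what the forum shows, the page is most likely being declared as UTF-8 while the data is single-byte; the second kind points the other way.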

I can't convert my DB this way to run 2.x SMF, stuck on 1.1.16 here :(

Aleksi "Lex" Kilpinen

Did the problem appear when you upgraded? Did you update the UTF-8 language packs as well? Are you using a UTF-8 language pack?

Converting the database is not exactly recommended, unless you actually need the larger characterset for something, and have not installed the forum as utf-8 originally.
Slava
Ukraini!
"Before you allow people access to your forum, especially in an administrative position, you must be aware that that person can seriously damage your forum. Therefore, you should only allow people that you trust, implicitly, to have such access." -Douglas

How you can help SMF

richardwbb

Hi,

What I just did was convert the DB back to ISO-8859-1, by exporting it in that format. I am not sure whether that actually did anything.

What I do know, narrowing down on the language: whether it is 'language-utf8' or 'language' does not matter for the DB contents.

What did happen concerns the $db_character_set setting in Settings.php. It was added by SMF 1.x after converting the DB to UTF-8. Removing it doesn't seem to make any difference; actually, that whole parameter in Settings.php looks ambiguous.

However, upgrade.php from any 2.x upgrade package for 1.x forums trips on the setting "utf-8". It does seem to accept "utf8", but I do not know what that does. What I do know is that I do not understand the PHP in index.php and so on, which goes on for a good while about something being UTF-8 or not.

What I also know is that I cannot live with the error I described. It is there when the contents of the 2.x upgrade package are extracted and upgrade.php has not run yet, but also after it has run: Unknown character set: 'utf'

Also, I do know it seems to trip on the '-' in 'utf-8', the value that $db_character_set contains.

I am so fuked.

Aleksi "Lex" Kilpinen

Let's start over. You had a 1.1 forum, you converted that DB to utf8, and everything worked correctly - or was it already broken after that?
Did you upload utf8 language packs after the conversion?

richardwbb

You are correct.

Also, the utf8 language files were in place. I use Dutch and keep English for maintenance; I did not see a good reason to use English-utf8, but I tried that as well.

AFAIK, the language files have no influence on the output. I noticed they are in Unix ANSI when uploaded anyway.

Please let me know what you think, before I start blabbering again...  :-X

Aleksi "Lex" Kilpinen

The language files do affect output, but on some languages the coding may remain the same. If you are using utf8, the language pack should be utf8.
If you look at the codepage your browser is using, while viewing the forum, what code page is it trying to use?

richardwbb

Strangely, my 2.0.2 forum is outputting ISO-8859-1 when I have set the db_charset to utf8 instead of the utf-8 that worked on the 1.x forum.

Also, it seems to be the problem I had before (I did not go to UTF-8 voluntarily). Now it looks like the DB holds the correct data, and the ISO-8859-1 setting the template has seems to be what shows the odd characters?

Then I wonder how that is possible. Also, the error I mentioned about the unknown charset doesn't make me feel comfortable, and I tried the available language packs.

What I know for sure: after upgrading from 1.x to 2.0.2, the DB looks like it is in UTF-8 and SMF is still 'stuck' in ISO-8859-1.

It is similar to the issue I had a long time ago, the same weird char in the same place.

On the other hand, forcing the template to UTF-8 by hardcoding it does not change anything.

I won't argue when you say it makes a difference whether the language files are UTF-8 or not; I just haven't got that far yet.

But I wonder, how can I tell the SMF forum to run in UTF-8?

Also, what do you want me to do? Upgrading is not really an option with malformed characters (which instantly fill my PM box with complaints from users), also knowing there is no way back once I go to 2.0.2 without losing posts and PMs.

I am really lost on where to start to pinpoint this issue.

Aleksi "Lex" Kilpinen

Quote from: richardwbb on March 15, 2012, 11:39:18 PM
What I know for sure, after upgrading from 1.x to 2.0.2 the DB looks like it is in UTF-8 and the SMF still 'stuck' in ISO-8859-1
Upgrading should not really affect this at all, so let's try to forget you even did it for now.

Work with me now, and please don't try any random tricks - just do what I ask, and perhaps we can get to the bottom of this.

1) Please open your Settings.php, and make sure to check that the line
$db_character_set = 'utf8';
is inside the opening and closing tags of the file, preferably within the Database Info section of the file.

2) If it is there, make sure it is correct, and no typos in it. The Character set defined here is utf8, NOT utf-8.

3) If it is there, and correct, continue on to make sure you have uploaded UTF8 versions of ALL your language packs, for the SAME version of SMF that you are using currently.

4) If you have, and all language packs should be updated and UTF8, make sure to check your forum settings from Admin -> Configuration -> Languages and please make sure that your Default selection is an UTF8 language, not an ISO language.

5) After all these steps, please let me know of your progress, and if possible give me a link to your forum, preferably to a topic with wrongly coded text on it, so I can check it out.
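
The naming point in step 2 can be sketched as follows. This is a hypothetical helper in Python, not part of SMF: it only illustrates that MySQL spells its character set names without a hyphen (utf8, latin1), while HTML uses labels like UTF-8 and ISO-8859-1. Presumably the error message shows 'utf' because the name gets cut off at the hyphen, but the thread does not confirm that.

```python
# Hypothetical helper, not part of SMF: map a charset label as used in HTML
# ("UTF-8", "ISO-8859-1") to the spelling MySQL uses ("utf8", "latin1").
MYSQL_NAMES = {
    "utf-8": "utf8",
    "utf8": "utf8",
    "iso-8859-1": "latin1",
    "latin1": "latin1",
}

def mysql_charset(label):
    """Return the MySQL charset name for a label, or raise ValueError."""
    key = label.strip().lower()
    if key not in MYSQL_NAMES:
        raise ValueError(f"no known MySQL charset for {label!r}")
    return MYSQL_NAMES[key]

print(mysql_charset("utf-8"))       # the spelling $db_character_set needs
print(mysql_charset("ISO-8859-1"))
```

A value like 'utf' is rejected by this sketch, which mirrors the fact that it is not a usable character set name.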

richardwbb

Hi,

I understand your response and that is really what I tried before asking here.

It is all set, really.

I would give you access to a test forum, if that helps?

Really, I took those steps.

Regards,b

Aleksi "Lex" Kilpinen

I'm not sure if I could do any good with a test forum, because I couldn't simply be sure that the conditions on the live forum and the test forum are alike.
Could you give me a link to the actual live forum we are talking about?

richardwbb

I have sent you a PM, because I'd rather not post a link on the internet.

Aleksi "Lex" Kilpinen

Some of my findings:

On the real live 1.1.16 forum, your forum frontpage works correctly for me, and the source code states

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

And at least my own browser correctly switched to Unicode UTF8. Could not find much unicode there though, to see if they work OK -
But I'm guessing the live forum is not the one having problems at the moment.
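
The same check of the page source can be scripted. This is a minimal Python sketch, assuming the page declares its charset in a meta tag like the one quoted above; the function name `declared_charset` is made up. Note that an HTTP Content-Type header sent by the server takes precedence over the meta tag, and this sketch does not look at headers.

```python
import re

# Matches charset=... inside a <meta> tag, with or without quotes around the value.
META_CHARSET = re.compile(r"""<meta[^>]*charset=["']?([A-Za-z0-9_-]+)""", re.IGNORECASE)

def declared_charset(html):
    """Return the charset named in the first <meta> tag that has one, else None."""
    match = META_CHARSET.search(html)
    return match.group(1).upper() if match else None

page = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>'
print(declared_charset(page))
```

Running it against the source of both the live and the test forum would show whether the two really declare different charsets.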

Sadly I could not check the example topics you sent me, because I couldn't access them as a guest on either of the forums, test or live.

Could you craft me a temporary login ( A simple user, no moderator or admin powers needed ) for both forums, so I can take a look?

richardwbb

Sorry man, I was under the assumption those links would be accessible. I did have to close down some stuff, and I should have checked before sending you a PM. I was already at a different stage of working on my goals, such as getting an outdated and messed-up template working.

Indeed, the live forum is in UTF-8, and you are right that there is not much Unicode around. TBH I don't know what Unicode really is. I tried to learn before, and I know it is not relevant to my issue, but it turned out some people were using characters that displayed as little domino blocks, with two hex numbers above each other.

After asking around, and since this forum is for Japanese car enthusiasts, with users who are just about at the level of being able to use Windows, so to say: it displays Japanese characters for them, while I still see those blocks and really have no clue how to display what they are looking at.

This too is true for MySQL, I know the database is in latin_swedish_ci, not sure why it is swedish, but it means utf8

The sad thing was, the old provider wouldn't let me copy the database myself (phpMyAdmin restrictions), so they put a file in my www space. It was in UTF-8 format, and it took me days to understand what the heck was going on.

Also, the damage had become permanent. It is not so severe for the two links I sent you; those pages are quite static but do contain special chars. But it made users complain instantly about their PMs and about their topics where they offer second-hand goods, using the euro sign.

Also, some people, me not included (I just type as fast as I can), know how to enter all those special characters with Alt and AltGr, I suppose.

Then they read their topic and see a single special char 'converted' to a few characters, as in the word 'Belgium', which in Dutch is 'België'

Which shows as BelgiÃÆ'Įââ,¬â,,¢ÃƒÆ'ââ,¬Â ÃÂ.. right now.

This scares me, since it looks to me like a multiple conversion error, and thus unrepairable.
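
Multiple conversion is not always unrepairable. Here is a minimal Python sketch, assuming the damage came from UTF-8 bytes being misread as Windows-1252 one or more times; `mangle` and `unmangle` are made-up names. It only works while no bytes were lost along the way. Characters already replaced by '?' or '�' cannot be recovered, bytes that have no Windows-1252 mapping do not round-trip, and the sample above may be too damaged for this to help.

```python
def mangle(text, rounds):
    """Simulate the damage: UTF-8 bytes misread as cp1252, `rounds` times."""
    for _ in range(rounds):
        text = text.encode("utf-8").decode("cp1252")
    return text

def unmangle(text, max_rounds=5):
    """Reverse the misreading until the text stops being valid mojibake."""
    for _ in range(max_rounds):
        try:
            candidate = text.encode("cp1252").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # text is no longer (or never was) misread UTF-8
        if candidate == text:
            break  # plain ASCII: nothing left to undo
        text = candidate
    return text

broken = mangle("België € 15,00", rounds=2)
print(broken)
print(unmangle(broken))
```

Anything like this should only be tried on a copy of the data, never on the live database.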

However, that was only for the last 3 months of topics, since I dared to overwrite the current database with a 3-month-old one. I really have no clue how I did that, except that I did not use MySQL INSERT, more like WRITE or something, and it didn't go the way I believed the MySQL manual explained it.

So I say, I am in deep ****** now.

This is because I was able to find a handful of similar topics, but nothing like what happened to me. Reading the MySQL manual doesn't feel helpful here; it is huge, and I don't know where to start, similar to the Apache manual or the PHP manual.

None of the above matters much. However, I did not receive much help (not even a screenshot) from the members for whom the Japanese chars showed properly on their screens but apparently not on mine, even though I posted in the forum asking: what are you posting, strange signs?

Anyway, the above wasn't very on topic, I guess.

However, I have a working live forum (still with some wrong characters, but no one seems to come across those anymore, or not often). I let upgrade.php of 2.0.2 do its job, and what I see is the *same* character problem in the 2 topics I linked to you via PM, where the 'switch' in Settings.php for which charset to use does not work anymore.

I also wasn't able to hardcode the character set in the template file, to at least make it display properly in the browser.

Here I'm 100% lost on what I might have done wrong and also on what I should do.

I have sent you a reg, being a moderator.

Aleksi "Lex" Kilpinen

Quote from: richardwbb on March 19, 2012, 10:09:02 PM
This too is true for MySQL, I know the database is in latin_swedish_ci, not sure why it is swedish, but it means utf8
Actually, latin_swedish_ci (the full name is latin1_swedish_ci) = latin1, the default charset for many hosts, and this is not the same as utf8.
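
A small sketch of how to read such a name, in Python and with a made-up function name: a MySQL collation name starts with the character set it belongs to, so the part before the first underscore tells you the charset. The rest (swedish, general, ci) only describes sorting and comparison rules, not the language of the content.

```python
def charset_of(collation):
    """Return the character set a MySQL collation name belongs to."""
    return collation.split("_", 1)[0]

for name in ("latin1_swedish_ci", "utf8_general_ci"):
    print(name, "->", charset_of(name))
```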

Let me check these out, and I'll try to make some sense in to this. I'll get back to you soon.

Aleksi "Lex" Kilpinen

If you have phpmyadmin, could you check the following from both the databases ( the live one, and the test one ) - Sorry, my phpmyadmin is in finnish, but I'm sure you can recognise the views. :)

Edited to add: Don't change anything - Just check what they say.

richardwbb

Hmm, I should stop doing things from memory. It used to be latin_swedish_ci, and I always wondered why it didn't say English or Dutch.

I have included a screenshot of both the upgraded forum and the live forum.