How i can convert a mixed collation database to UTF-8 ?

Started by Yngwiedis, February 04, 2008, 01:11:27 AM

Previous topic - Next topic

Yngwiedis

You mean which table's field is the latest one that is converted, because the tables itself have again latin1_swedish_ci

So i am not exactly sure. Lets begin from the top to bottom

smf_log_online is the last table with utf8_general_ci inside the fields.
smf_log_polls dont have any field collation.
smf_log_search_messages dont have any field collation.
smf_log_search_results dont have any field collation.
smf_log_search_subjects dont have any field collation.
smf_log_search_topics dont have any field collation.
smf_log_topics dont have any field collation.
smf_membergroups table. The fields of this table are in greek_general_ci

So the latest one is sure between smf_log_online and smf_membergroups tables...
AmityHost.com - The Affordable Host

Sarge

Those tables don't have any collations because they don't have any text fields. int and mediumint are integer (numeric) types. Character sets and collations apply only to text and text-like fields.

Copy the backup database over to the test database again and empty (don't drop) these tables from phpMyAdmin:

smf_log_search_messages
smf_log_search_results
smf_log_search_subjects
smf_log_search_topics

and try the conversion again.

Another member reported the same issue:
http://www.simplemachines.org/community/index.php?topic=209038

Note that SMF conversion to UTF-8 causes the display character encoding to be switched to UTF-8 as well. This means that you will need to install the greek-utf8 language pack as default (there's a way to reset it for all members), and your forum members might have to switch encoding manually to UTF-8 in their browsers.

I should mention that SMF specifies the encoding explicitly in the forum pages, so all modern browsers should switch automatically, unless the user has "frozen" (locked) the encoding to Greek, but I don't think a lot of people do that.

Anyway, if you want to keep the Greek character set (like you have it now), there's another way. Let me know what character set you'd like: Greek or UTF-8. Either of them should work, although UTF-8 is more universal and is better supported by database servers.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

#22
Quote from: Sarge on February 05, 2008, 02:12:45 PM
Those tables don't have any collations because they don't have any text fields. int and mediumint are integer (numeric) types. Character sets and collations apply only to text and text-like fields.

Copy the backup database over to the test database again and empty (don't drop) these tables from phpMyAdmin:

smf_log_search_messages
smf_log_search_results
smf_log_search_subjects
smf_log_search_topics

and try the conversion again.

Another member reported the same issue:
http://www.simplemachines.org/community/index.php?topic=209038

OK i did it and i am trying the conversion again now...

New error...

QuoteColumn 'body' cannot be part of FULLTEXT index
Αρχείο: /home/alliance/public_html/commtest/Sources/Admin.php
Γραμμή: 1329
:(

Quote from: Sarge on February 05, 2008, 02:12:45 PM
Note that SMF conversion to UTF-8 causes the display character encoding to be switched to UTF-8 as well. This means that you will need to install the greek-utf8 language pack as default (there's a way to reset it for all members), and your forum members might have to switch encoding manually to UTF-8 in their browsers.

I should mention that SMF specifies the encoding explicitly in the forum pages, so all modern browsers should switch automatically, unless the user has "frozen" (locked) the encoding to Greek, but I don't think a lot of people do that.

Anyway, if you want to keep the Greek character set (like you have it now), there's another way. Let me know what character set you'd like: Greek or UTF-8. Either of them should work, although UTF-8 is more universal and is better supported by database servers.

You dont have to write so many details. I know all the above things you said :)
AmityHost.com - The Affordable Host

Sarge

Quote from: Yngwiedis on February 06, 2008, 06:52:37 AM
New error...

QuoteColumn 'body' cannot be part of FULLTEXT index
Αρχείο: /home/alliance/public_html/commtest/Sources/Admin.php
Γραμμή: 1329

It looks like you have fulltext search indexing enabled.

Copy the backup database over to the test database again. Just to be sure, also empty the search tables (as mentioned in my previous post).

In the test forum, go to Admin > Search > Search Method tab. Click on the link [remove fulltext index] and switch "Search index" to "No index".

Then try the conversion again. You should be able to recreate the search index afterwards.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

New error again :

QuoteYou have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc desc blob NOT NULL,
CHANGE COLUMN module module tinyblob NOT NULL,
' at line 3
Αρχείο: /home/alliance/public_html/commtest/Sources/Admin.php
Γραμμή: 1329
AmityHost.com - The Affordable Host

Sarge

Quote from: Yngwiedis on February 06, 2008, 09:21:29 AM
New error again :

QuoteYou have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc desc blob NOT NULL,
CHANGE COLUMN module module tinyblob NOT NULL,
' at line 3
Αρχείο: /home/alliance/public_html/commtest/Sources/Admin.php
Γραμμή: 1329

I've never seen that error message before... It looks like the SMF Shop tables.

List of mods installed on the main forum?

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

1.     TinyPortal      0.983      
2.    SMF Shop Lottery    1.0.1    
3.    Google Member Map    0.1    
4.    Enhanced Quick Reply Box 2.0.2    
5.    Additional Membergroups on Profile    1.0
6.    Googlebot & Spiders Mod    2.0.3    
9.    YouTube BBCode    1.9    
10.    Thank-O-Matic    1.1.9
11.    Thank-O-Matic Greek Translation    1.0.1    
12.    Spoiler Tag    0.3.2    
13.    SMFShop    3.0
AmityHost.com - The Affordable Host

Sarge

Can you post the database structure for your main forum? Just the structure, not the data.

To do this, open phpMyAdmin, select your main forum database (do not select any tables) and click on the Export menu on the upper right side:



Uncheck the "Data" checkbox, check the "Save as file" option, select "zipped" for the compression and click the Go button to download the file onto your computer. Verify that it doesn't contain any sensitive data and attach it to your reply here.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Quote from: Sarge on February 06, 2008, 11:56:46 AM
Can you post the database structure for your main forum? Just the structure, not the data.

To do this, open phpMyAdmin, select your main forum database (do not select any tables) and click on the Export menu on the upper right side:



Uncheck the "Data" checkbox, check the "Save as file" option, select "zipped" for the compression and click the Go button to download the file onto your computer. Verify that it doesn't contain any sensitive data and attach it to your reply here.

There is no need to write so many details.
I know how to do it. I am not so n00b :P

The structure of the database is attached.
AmityHost.com - The Affordable Host

Sarge

I have already figured out that you aren't such a n00b ;) but I want to be on the safe side and type it all so that, when someone else has the same problem, I can simply point them to these posts instead of retype everything. So please bear with me. ;)

The 'desc' related error is a known SMFShop issue. Rename it to 'description' and rename it back to 'desc' after the conversion.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

#30
The conversion finished succesfully. After that i change the SMF Shop description field back to desc.
Then i upload the greek-UTF8 language files and i change the default language it in the Server Settings to greek-utf8.
It seems that is working correct.
[EDIT] Not exactly correct... The subjects of the topics are not readable :( [/EDIT]

I have 2 questions now...
The option "Convert HTML-entities to UTF-8 characters" is something i need to use after the conversion ?
The conversion change all the fields collation but not the tables collation. The collation of the tables is latin1_swedish_ci again.
Is this correct ?
AmityHost.com - The Affordable Host

Sarge

Quote from: Yngwiedis on February 08, 2008, 04:17:06 AM
[EDIT] Not exactly correct... The subjects of the topics are not readable :( [/EDIT]

In SMF, every topic gets its subject from the first message (post) for that topic.

Search for smf_messages in the dump file you uploaded and look at the `subject` field definition:

  `subject` tinytext character set utf8 NOT NULL,


The problem is that its not clear whether rows (data) are all in greek, all in utf8, all in latin1, or mixed (some rows in utf8 and/or some in latin1 and/or some in greek). However, most probably it's already in greek.

Try setting that field as greek_general_ci (which is the default collation for the greek character set) and try the conversion again. Do this after copying the backup database over to the test database, emptying the smf_log_search_* tables, renaming `desc` to `description` etc as before.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Thank you very very much Sarge. All is ok now :)

Can you answer my 2 questions now ?
Because i have 2 more forums with mixed collations and character sets to convert...
AmityHost.com - The Affordable Host

Yngwiedis

#33
For a strange reason that i dont know, the quote function is not working in Greek-UTF8 language.
Only in English...  :-\

[EDIT]
Is not only the quote function. All the Ajax functions is not working
Also some old topics with quotes in the message body are semi "vanished".
I check the database and the half messages body are not there !!!
[/EDIT]
AmityHost.com - The Affordable Host

Sarge

Did you check smf_messages in the database? Does it contain the full message bodies? If not, can you compare them to the original contents and see where is the content cut off?

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Yes i check the smf_messages table and the message bodys are not complete there.
The problem is that after the conversion i delete the backup database because i thought that all went ok :(
Is not a big problem anyway, because this is happened to a small amount of messages.
My big problem is the Ajax functions...
AmityHost.com - The Affordable Host

Sarge

If you have any recent backups (which you should), you can import those incomplete messages back into the original database. Since the problem happens with just a few messages, I suspect it's because the sizes of those messages are at least half of the message body size limit. Try increasing the limit in the database and then import one of those incomplete messages. More specifically, increase the type of the `body` column in smf_messages from text (about 65K) to mediumtext (16M), which should be more than enough.

As mentioned in the other topic I linked to above, UTF-8 uses more than 1 byte for some characters. If a message consists mostly of non-English characters, the effective size can easily get doubled (2 bytes are needed for Greek letters in Unicode).

As for the AJAX functions, how do they work with the default theme and English?

I can also take a closer look if you want. You can PM me your forum URL and a test account (regular member), but I would need to be able to select the default theme and English. I think there's a mod by Rudolf to install a theme change permission, and you can apply that to specific membergroups. I don't need admin access. You might want to clear this with the other admins as well. :)

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Quote from: Sarge on February 11, 2008, 10:55:08 AM
If you have any recent backups (which you should), you can import those incomplete messages back into the original database. Since the problem happens with just a few messages, I suspect it's because the sizes of those messages are at least half of the message body size limit. Try increasing the limit in the database and then import one of those incomplete messages. More specifically, increase the type of the `body` column in smf_messages from text (about 65K) to mediumtext (16M), which should be more than enough.

As mentioned in the other topic I linked to above, UTF-8 uses more than 1 byte for some characters. If a message consists mostly of non-English characters, the effective size can easily get doubled (2 bytes are needed for Greek letters in Unicode).

Like i said before, i dont have any backup anymore. Yes i am stupid :(
Anyway i change the body field in smf_messages from TEXT to MEDIUMTEXT for future need.

Quote from: Sarge on February 11, 2008, 10:55:08 AM
As for the AJAX functions, how do they work with the default theme and English?

I can also take a closer look if you want. You can PM me your forum URL and a test account (regular member), but I would need to be able to select the default theme and English. I think there's a mod by Rudolf to install a theme change permission, and you can apply that to specific membergroups. I don't need admin access. You might want to clear this with the other admins as well. :)

Ajax functions are working with the theme i use, but only in english language.
I will send you a PM with the login details.
You can change the language from profile.

AmityHost.com - The Affordable Host

Sarge

There shouldn't be any need to change the field type after the conversion, of course unless you need bigger messages. However, a larger type might cause problems with the search function.

I checked your forum and couldn't find anything at the moment. What exactly is the language pack you're using in your forum? Was the same pack working before?

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

The language pack i use now is Greek-UTF8.
Before the conversion i use Greek.
AmityHost.com - The Affordable Host

Advertisement: