How i can convert a mixed collation database to UTF-8 ?

Started by Yngwiedis, February 04, 2008, 01:11:27 AM

Previous topic - Next topic

Yngwiedis

Hello...

I have a very large database with 2 different characters sets and collations in various tables.
I want to make it compatible with UTF-8 because i have a lot of problems because of the difference in character sets.
I am trying to convert it trought the SMF relative function but is not working.
You now any other way to do it ?

Thank you.
AmityHost.com - The Affordable Host

Sarge

Which character sets and collations?

Is just the database mixed (with some tables in one charset and other tables in another charset) or are the tables mixed too (with some fields in one charset and other fields in the other charset)?

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Character Sets : latin1 and greek
Collations : latin1_general_ci  and greek_general_ci

Both of the situations you said is happening...
AmityHost.com - The Affordable Host

Sarge

What is the character set and collation for the database?

It should be mentioned in phpMyAdmin: select your database (but not the tables) and look at the bottom. It should show something like this:



[edit]
Also click on the tables and check their collations too.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Yes the whole database is latin1_swedish_ci, and some of the tables are in latin1_swedish_ci and some are greek_general_ci
AmityHost.com - The Affordable Host

Sarge

When you go in the tables that use latin1_swedish_ci, are there any fields/columns with collation greek_general_ci?

Also, when you go in the tables that use greek_general_ci, are there any fields/columns with collation latin1_swedish_ci?

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

AmityHost.com - The Affordable Host

Sarge

It's a bit hard to say what exactly you should do without looking at the forum and the database directly, but you can try putting the forum in maintenance mode, then open phpMyAdmin and start switching all tables and fields that are in greek_general_ci to latin1_general_ci, or vice-versa. What's important is that all table and field collations should match the collation of the data itself.

Can you post a link to your forum?

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Quote from: Sarge on February 05, 2008, 01:35:00 AM
Can you post a link to your forum?

My main forum is private. Sorry about this but it has to do with a game community and i cant give you access because if i did the other admins will cut my head off.
I can give you a link to another forum with the exact same problem.
http://community.hellas-alliance.gr/

Right now the data are displaying correctly because i add this query to the index.php
Quotemysql_query("SET NAMES 'greek'");

If i remove it i see only question marks.

If this helps the problem start because another fellow admin of the forum add the query above in the index.php for a reason that i dont know...

Quote from: Sarge on February 05, 2008, 01:35:00 AMIt's a bit hard to say what exactly you should do without looking at the forum and the database directly, but you can try putting the forum in maintenance mode, then open phpMyAdmin and start switching all tables and fields that are in greek_general_ci to latin1_general_ci, or vice-versa.

If i do it with phpMyAdmin i will spend many many hours. There is not another easiest way ?

Quote from: Sarge on February 05, 2008, 01:35:00 AM
What's important is that all table and field collations should match the collation of the data itself.

Sorry but i dont think i understand exactly what you mean  :-\
AmityHost.com - The Affordable Host

Sarge

It's important to know how the data are saved -- in other words, whether posts etc. are saved as greek_general_ci or latin1_swedish_ci (regardless of the collation of the database, tables, fields etc).

Where exactly is mysql_query("SET NAMES 'greek'"); in index.php?

I suggest the following:

Create a test forum in a separate directory and on a separate database. For example, if your forum is in www.yourforum.com, you can create the test forum at www.yourforum.com/test/ and install SMF there, using fresh files from the SMF 1.1.4 Full Install package, without any mods.

Then export the database of your main forum and import it into the database of the test forum, overwriting the original tables for the test forum. Because of the different character sets, you need to use "SET NAMES insert_character_set_here". It's quite possible that "SET NAMES greek" will do the job for both export and import. Then see if the test forum displays everything correctly.

How exactly you do the data export and import depends on the kind of access you have on the server. Do you have shell access? If you do, you should be able to use the mysqldump and mysql commands directly, for speed. If not, you would have to use some export/import tool and specify the character set there. You could use create_backup.php and restore_backup.php, with some necessary modifications.

In any case, get backup copies of the databases before you start. Even backups might not be saved correctly, though, because of different charsets, which is why I suggest working on a test forum first.

Let us know if you need more info. :)

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Quote from: Sarge on February 05, 2008, 08:50:32 AM
It's important to know how the data are saved -- in other words, whether posts etc. are saved as greek_general_ci or latin1_swedish_ci (regardless of the collation of the database, tables, fields etc).

Where exactly is mysql_query("SET NAMES 'greek'"); in index.php?

Right here :

// Connect to the MySQL database.
if (empty($db_persist))
$db_connection = @mysql_connect($db_server, $db_user, $db_passwd);
else
$db_connection = @mysql_pconnect($db_server, $db_user, $db_passwd);
mysql_query("SET NAMES 'greek'");
// Show an error if the connection couldn't be made.
if (!$db_connection || !@mysql_select_db($db_name, $db_connection))
db_fatal_error();



Quote from: Sarge on February 05, 2008, 08:50:32 AM
I suggest the following:

Create a test forum in a separate directory and on a separate database. For example, if your forum is in www.yourforum.com, you can create the test forum at www.yourforum.com/test/ and install SMF there, using fresh files from the SMF 1.1.4 Full Install package, without any mods.

Then export the database of your main forum and import it into the database of the test forum, overwriting the original tables for the test forum. Because of the different character sets, you need to use "SET NAMES insert_character_set_here". It's quite possible that "SET NAMES greek" will do the job for both export and import. Then see if the test forum displays everything correctly.

If i just copy the database to a new one with phpMyAdmin i will have a problem ?

Quote from: Sarge on February 05, 2008, 08:50:32 AM
How exactly you do the data export and import depends on the kind of access you have on the server. Do you have shell access? If you do, you should be able to use the mysqldump and mysql commands directly, for speed. If not, you would have to use some export/import tool and specify the character set there. You could use create_backup.php and restore_backup.php, with some necessary modifications.

In any case, get backup copies of the databases before you start. Even backups might not be saved correctly, though, because of different charsets, which is why I suggest working on a test forum first.

Let us know if you need more info. :)

No i dont have shell access :(

I will install the test forum now and then i will tell you news...

AmityHost.com - The Affordable Host

Sarge

Quote from: Yngwiedis on February 05, 2008, 09:35:36 AM
If i just copy the database to a new one with phpMyAdmin i will have a problem ?

There's only one way to find out... Try and see. :)

I forgot to tell you before, but after you copy/import the database, you need to upload repair_settings.php to the test forum directory and visit it with your browser. Check the database info and click on the blue "Recommended" links in the Paths & URLs section. Delete repair_settings.php immediately after saving, for security reasons.

Quote from: Yngwiedis on February 05, 2008, 09:35:36 AM
No i dont have shell access :(

I will install the test forum now and then i will tell you news...

OK then, let us know how copying the database goes. :)

How big is the database?

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Quote from: Sarge on February 05, 2008, 10:24:51 AM
Quote from: Yngwiedis on February 05, 2008, 09:35:36 AM
If i just copy the database to a new one with phpMyAdmin i will have a problem ?

There's only one way to find out... Try and see. :)

I forgot to tell you before, but after you copy/import the database, you need to upload repair_settings.php to the test forum directory and visit it with your browser. Check the database info and click on the blue "Recommended" links in the Paths & URLs section. Delete repair_settings.php immediately after saving, for security reasons.

OK i did it.

Quote from: Sarge on February 05, 2008, 10:24:51 AM
Quote from: Yngwiedis on February 05, 2008, 09:35:36 AM
No i dont have shell access :(

I will install the test forum now and then i will tell you news...

OK then, let us know how copying the database goes. :)

How big is the database?

The database is about 250 MBs...

The test forum is at : http://hellas-alliance.gr/commtest/index.php
I add again in the index.php the query to show correct the data...
AmityHost.com - The Affordable Host

Sarge

Quote from: Yngwiedis on February 05, 2008, 11:36:34 AM
The test forum is at : http://hellas-alliance.gr/commtest/index.php
I add again in the index.php the query to show correct the data...

Guest access is disabled, so I can't see anything. I noticed that English is still the default language. You might want to add the Greek language pack if you use that in your main forum.

Are the collations for the database of your test forum different or the same as for your main forum? You can check the test database from phpMyAdmin.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Quote from: Sarge on February 05, 2008, 11:54:25 AM
Quote from: Yngwiedis on February 05, 2008, 11:36:34 AM
The test forum is at : http://hellas-alliance.gr/commtest/index.php
I add again in the index.php the query to show correct the data...

Guest access is disabled, so I can't see anything.

Guest access is now enabled.

Quote from: Sarge on February 05, 2008, 11:54:25 AM
I noticed that English is still the default language. You might want to add the Greek language pack if you use that in your main forum.

I will do it in 5 minutes...

Quote from: Sarge on February 05, 2008, 11:54:25 AM
Are the collations for the database of your test forum different or the same as for your main forum? You can check the test database from phpMyAdmin.

The default collation of the test database is : latin1_swedish_ci
All the tables inside the test database have collation : latin1_swedish_ci
Some fields inside the tables hae mixed collations latin1_swedish_ci and greek_general_ci
AmityHost.com - The Affordable Host

Sarge

After installing the Greek language pack as the forum default language, make a backup copy of the database and try to convert it to UTF-8.

Choose windows-1253 from the list for "Data character set".

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

I did it and i receive the following error :

QuoteDuplicate entry '4549' for key 1
Αρχείο: /home/alliance/public_html/commtest/Sources/Admin.php
Γραμμή: 1347
AmityHost.com - The Affordable Host

Sarge

Is "Enable error logging" selected in Admin > Features and Options? If it is, check Admin > Forum Error Log. What do you see there?

If error logging is not enabled, enable it, then copy the backup database over to the test database (you may have to delete the test database first) and try the conversion to UTF-8 again. When you get the error again, go to the error log and tell us what do you see there.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Yngwiedis

Error Logging it was enabled.

The error from there is :

QuoteΕφαρμογή φίλτρου: Προβολή μόνο των ίδιων λαθών
Σφάλμα βάσης δεδομένων: Duplicate entry '4549' for key 1
Αρχείο: /home/alliance/public_html/commtest/Sources/Admin.php
Γραμμή: 1347

P.S. Do you want to translate it for you ?
AmityHost.com - The Affordable Host

Sarge

Can you check in phpMyAdmin and see which table is the last one converted to UTF-8?

I think it's one of the smf_log_ tables...

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Advertisement: