News:

Wondering if this will always be free?  See why free is better.

Main Menu

Illegal mix of collations

Started by shawnb61, June 19, 2015, 01:33:51 PM

Previous topic - Next topic

shawnb61

A couple of lessons, and a couple of questions... 

My forum has been around for years.  I am new to this, trying to help out & learning SMF as I go.   I recently successfully patched the forum to 2.0.10. 

I've been seeing the following error in my error log, associated with some searches:
Database Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation 'like'

Looking in phpMyAdmin, my production DB has a mix of collations throughout:  latin1_swedish_ci, utf8_general_ci, utf8_unicode_ci, and utf8_bin.   The primary SMF tables, such as smf_messages, are all in latin1_swedish_ci. 

I have created two test forums on my PC (built up from EasyPHP, which was... easy...).   One is a complete vanilla install, current 2.0.10 & some of my mods.  This was built as native UTF-8.  The second TEST forum is a clone of my production DB.   

Learned a few lessons...   
  -  My vanilla install has a mix of collations!   Mod installations added tables that were all in latin1_swedish_ci.   So my vanilla DB already has mixed collations.  Yet... I do NOT get the 'illegal mix of collations' error in it no matter what I do.
  -  On my prod clone, I migrated everything to latin1_swedish_ci using a php script I found online & modified.  This went well and everything works great.  There is NO mix of collations on this DB now and it tests OK.   However..   I CAN reproduce the 'illegal mix of collations' error.

Clearly, in my case, this message has nothing to do with mixed collations in the DB, since I can reproduce the error on a forum with no mixed collations, but I do NOT get the error on a forum with mixed collations. 

Where my error message REALLY came from...   After much experimentation, I found I can replicate the 'illegal mix of collations' error by searching for Turkish words, e.g.:
öğretmenim

I.e., in my case, the error message comes from the difference between the search term and the DB, not a discrepancy between the collations for different tables within the DB.

Question 1:  If a table is added by a mod, is it included in core SMF features such as search?   Or do core SMF functions (such as search) only access core SMF tables such as smf_messages?  I'm trying to gauge how safe it is to change all collations, even those set by mods.  One mod created tables in UTF8_bin.   Makes me think that was deliberate, and that changing the collation may change mod behavior in ways I just haven't seen yet in TEST. 

Question 2:  When SMF builds a vanilla DB, shouldn't SMF set the default collation for the DB it just built to be in sync with the tables it just created?  This feels like an SMF bug to me.  This, I think, is where many of us get mixed collations in the DB.  Even after a vanilla SMF UTF8 install, the DB default collation for me remained latin1_swedish (MySQL default???).  Most mods just created tables using the DB default.  (To prove, I de-installed the mods, changed the DB default collation, then reinstalled, and the collations were then consistent.)

Question 3: Before I change the collations in PROD, I'd like to know - in general terms - how safe is it to change collations?   Related: Is it important that they all be in sync?

It is very easy to change all columns & tables to a common collation with a script.  It even tests out well in TEST.  I'm just trying to understand any potential hidden risks before taking the plunge in PROD.
Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

青山 素子

Quote from: shawnb61 on June 19, 2015, 01:33:51 PM
Question 1:  If a table is added by a mod, is it included in core SMF features such as search?   Or do core SMF functions (such as search) only access core SMF tables such as smf_messages?  I'm trying to gauge how safe it is to change all collations, even those set by mods.  One mod created tables in UTF8_bin.   Makes me think that was deliberate, and that changing the collation may change mod behavior in ways I just haven't seen yet in TEST. 

If the modification is using the SMF functions for building things, it should match the collation that SMF is using. You may be running a higher version of MySQL than on the server, and the default collation may be different between the two.

There might also be an issue with using SMF in Unicode/UTF8 mode. I haven't looked much into how exactly it handles things, but there may be some issues in "converting" the tables. SMF should default to latin1 on the table collations for the 2.0 and below editions.


Quote from: rexall9000 on June 19, 2015, 10:47:15 PM
Question 2:  When SMF builds a vanilla DB, shouldn't SMF set the default collation for the DB it just built to be in sync with the tables it just created?  This feels like an SMF bug to me.  This, I think, is where many of us get mixed collations in the DB.  Even after a vanilla SMF UTF8 install, the DB default collation for me remained latin1_swedish (MySQL default???).  Most mods just created tables using the DB default.  (To prove, I de-installed the mods, changed the DB default collation, then reinstalled, and the collations were then consistent.)

It should, but I'm not aware if it does. Oracle has changed a lot in MySQL, and 2.0 hasn't really been fully tested for the absolute newest versions. As far as I can tell, it works fine. I'm also not feeding extended characters into it, so I might not be seeing an issue for that reason.


Quote from: rexall9000 on June 19, 2015, 10:47:15 PM
Question 3: Before I change the collations in PROD, I'd like to know - in general terms - how safe is it to change collations?   Related: Is it important that they all be in sync?

You may "lose" extended characters in that they may become corrupted or change compared to what they are supposed to be. I don't think MySQL converts existing data, so changing the collation may result in weird extended character mappings.

I'd advise that all tables should share the same collation. There are very few reasons to have a mix in a single database, and none of those apply to SMF.

One of the support team or someone else with more experience may offer some better answers.
Motoko-chan
Director, Simple Machines

Note: Unless otherwise stated, my posts are not representative of any official position or opinion of Simple Machines.


shawnb61

Address the process rather than the outcome.  Then, the outcome becomes more likely.   - Fripp

Advertisement: