One of my guys is seeing this error when trying to do utf-8 conversion on 2.0.12 forum -
http://domain.org/index.php?action=admin;area=maintain;sa=database;activity=convertutf8
Database Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'key key varbinary(255) NOT NULL default '',
CHANGE COLUMN member_name mem' at line 2
Apply Filter: Only show the errors from this file
File: /homepages/139/n285406/htdocs/csa/Sources/ManageMaintenance.php
Line: 666
Error is in conversion script. Database is already at utf-8.
Also having a problem displaying special characters. Forum was originally at 2.0.8, this character - ® - displayed correctly. Now at 2.0.12, it displays this - �. Other characters may be affected. Collation is utf8_general_ci.
A database in utf-8 will show black diamonds if the character encoding has been wrong. It is possible the data inside the database itself has to be converted with a converter. On linux a quick way to see if a database is encoded properly is to use iconv.
If you have let MySQL import with INSERT INTO, then a black diamond means the original character encoding has been lost.
It is also possible that messages get truncated, then the specific black diamond isn't there, but at the first incorrect encoded character, the rest of that message has been dropped. From the error you posted this doesn't seem to be the case though. Just be aware that exporting and importing from MySQL even within PHP can result in database-losses.
I wonder if your forum has been set with ISO-8859-15 or UTF-8 encoded translations/ English files. SMF treats differently the database. Also, phpMyAdmin will tell you what encoding it sees with the MySQL. If you can get to the MySQL prompt it can tell what encoding is in use [this can differ from what phpMyAdmin is telling].
i would like to know for sure the original database, without conversion, I mean, will be inserted in MySQL without error. There is no point using SMF for conversion, a proper encoded file can be converted to UTF-8 without problems with iconv.
Also, if you have mangled characters in the database or users with different languages or codepages, you will end up with a UTF-8 file MySQL can or can not INSERT properly.
hth.
Seeing these - ’ - for apostrophe, and these - ® - for ® in sql dump. Both are replaced with � when db is imported. The post is being truncated, apparently at an arbitrary point, most of the text is missing. Happens when db is dumped as sql or gz zip.
This is getting stranger. We found a post on the clean install running on my host that has several paragraphs missing, just the titles of the first three are strung together in one line. What's strange is that if we click 'Modify', the complete post loads into the post editor, and when saved, the missing paragraphs are restored.
Any ideas?
Did they use the SMF UTF-8 conversion tool/option or a 3rd party MySQL editor such as phpmyadmin?
Either way it sounds as though special characters (ie. apostrophe's) were converted to their literal characters as opposed to HTML character codes.
Similar issue regarding posts as to what is described here: SMF Post (http://www.simplemachines.org/community/index.php?topic=247602.msg1602596#msg1602596)
and also possibly what is described here: Stackoverflow Post (http://stackoverflow.com/questions/9407834/mysql-convert-latin1-characters-on-a-utf8-table-into-utf8)
There is also a collation tool located here: Webdev Tool (http://web-develop.ca/index.php?topic=32.0) ... if necessary.
(adjust smf settings table to utf-8 collation with phpmyadmin and then try using v1.0)
AFAIK, the forum was originally set up as utf-8, it's bilingual (english/spanish). Took a quick look at those links, seems it may be possible to fix the backup. Can't do anything to the production database, as it is currently the one and only. The missing text and corrupt characters are all present in the dump, it's complete, just doesn't display correctly, but I can work on that.
Might be a long shot but is there variable global_character_set in settings table with value of UTF-8?
-Reko
From what you have written before my post [this], I can tell you, it isn't arbitrary what is going on, and, somewhere is a setting [or settings], not proper. Also, it looks to me that any damage in the database already, has to be converted manually.
From here I wonder what your tools are, do you have linux or Windows. And, what kind of access do you have, can you get to the MySQL prompt or not? [I assume you are using phpMyAdmin and exported databases. I am not sure if phpMyAdmin will report what MySQL says after importing the database, but it can be something like TRUNCATED 'AC813F' <- and hundreds of those. There is not written what post this was. And this problem I had and this one isn't easy to overcome if you don't know what to type to get useable Google results.
I also learn from you that you still have a proper file [I mean by that, non-truncated, non <?> characters in there. I propose you take a text editor and open the dumped sql file. It can be a good idea, if you are able to reproduce [but I not advise this yet as the way to go from here, since I am not understanding what you have done exactly, and more important what happened before this, how late you noticed something is wrong with the database/ setting etc.] But, writing a personal message to yourself, with the characters you are aware of, those end up truncated or mangled, changing settings, reviewing your old message, and, posting the same characters again, for example but not bound to, 'à á ä ò ó ö è é ë ï', you might learn where things break down. And from this point I ask you, if you are able and willing to post a piece of dumped sql from your text editor here. A hint is that you can find the broken part in the dump file by searching a unique pattern, or just 'é' [but that way you might end up with a part of a old message you have never seen before] and for debugging/ finding the cause of all this, it is better to have found a part in the sql dumped file that you have typed yourself.
And finally, please explain what you are doing now to prevent further damage. I learn from what you have written, that, you are aware of the truncation and I am not 100% sure yet but you will have to repair manually what already is damaged and this can be by a click of a script if you are lucky and if you are not so lucky, then you still need a text-editor and proper conversion tools and quite some time to spend probably.
Quote from: Sir Osis of Liver on November 06, 2016, 10:40:52 PM
Seeing these - ’ - for apostrophe, and these - ® - for ® in sql dump. Both are replaced with � when db is imported. The post is being truncated, apparently at an arbitrary point, most of the text is missing. Happens when db is dumped as sql or gz zip.
’ is a classic sign of a two byte UTF-8 character (i.e. data is physically UTF-8) being interpreted as ISO. When the DB is imported, you're taking that data that is now thought to be ISO and trying to put it into a UTF-8 table. Those bytes are not valid UTF-8 if it's gone through the process of being thought of as ISO because they end up decoded to high-bit characters which when they go back into MySQL look like the second byte of a UTF-8 character without a preceding first-byte of a UTF-8 character, hence the truncation.
Step 1: what is the actual encoding of the data as far as the database is concerned - not what SMF thinks it might be? (Because SMF can and does get this wrong depending on a variety of factors)
The production database is fine, everything displays correctly. The database dump contains all missing text, with scrambled codes for special characters. When imported, it does not display special characters, and blocks of text are missing in display template, but are included in post editor. In phpmyadmin, the production database shows all tables as utf8_general_ci. The bottom (Totals) line shows latin1_general_ci. I don't see any way to get to MySQL prompt, this is a basic hosting package with few bells and whistles. I can edit the dump, there's a script in one of the links posted above that replaces the borked codes with html character codes, but don't know if it will get all of them. Will try that tonight (if I don't run out of beer).
Quote from: Sir Osis of Liver on November 09, 2016, 09:12:11 PM
Ran SHOW FULL COLUMNS FROM smf_members, all text fields are utf8_general_ci, all int fields are NULL. Same in smf_messages.
This is what's in display template -
Saltwater IchSymptomsIdentification
This is what's in post editor -
[b]Saltwater Ich[/b] (Cryptocaryon irritans)
Cryptocaryon irritans (Cryptocaryon) is a fully ciliated, protozoan parasite present in all saltwater environments, which causes a disease known as marine �ich� or marine �white spot� disease in wild and cultured marine fishes at temperatures between 59 and 86�F. Cryptocaryon infections in fish are a significant disease problem for marine aquarists and commercial mariculture worldwide. Cryptocaryon is known to infect many different fish species, although there appear to be differences in susceptibility.
This is what phpmyadmin shows in body field -
[b]Saltwater Ich[/b] (Cryptocaryon irritans)<br />Cryptocaryon irritans (Cryptocaryon) is a fully ciliated, protozoan parasite present in all saltwater environments, which causes a disease known as marine "ich" or marine "white spot" disease in wild and cultured marine fishes at temperatures between 59 and 86°F. Cryptocaryon infections in fish are a significant disease problem for marine aquarists and commercial mariculture worldwide. Cryptocaryon is known to infect many different fish species, although there appear to be differences in susceptibility.
This is what's in the dump -
[b]Saltwater Ich[/b] (Cryptocaryon irritans)<br />Cryptocaryon irritans (Cryptocaryon) is a fully ciliated, protozoan parasite present in all saltwater environments, which causes a disease known as marine “ich†or marine “white spot†disease in wild and cultured marine fishes at temperatures between 59 and 86°F. Cryptocaryon infections in fish are a significant disease problem for marine aquarists and commercial mariculture worldwide. Cryptocaryon is known to infect many different fish species, although there appear to be differences in susceptibility.
The script in Chen Zhen's link does not work because it doesn't find the borked character codes. They're not in the imported database message body, characters are correct there, but they're in the dump. Basically, phpmyadmin is telling me the characters have imported correctly, display template has lost entire paragraph, post editor shows the paragraph with special characters missing.
Quote from: lwiz on November 09, 2016, 12:15:30 PM
Might be a long shot but is there variable global_character_set in settings table with value of UTF-8?
Yes there is.
This is what breaks it - should be "ich", but in body field it's "ich", in dump it's “ichâ€.
In the original production database, it's "ich". Nothing wrong with the dump or import, it's being exported/imported exactly as it is on production forum, except production forum displays it correctly as "ich", imported db breaks it. Both running 2.0.12, on different hosts.
Then the dump is broken somehow. How exactly are you checking the dump's correctness?
so http://www.simplemachines.org/community/index.php?topic=549809.0 and this topic are the same issue/topic? if so please select one topic and lock the other so we dont get confused as to what is going on and what has been done where.
The dump is importing exactly the same characters from production db, as viewed in phpmyadmin. Post displays correctly in production forum, but not in import forum. Same characters in body field. This happens on both hosts.
Quote from: Illori on November 10, 2016, 05:11:02 AM
so http://www.simplemachines.org/community/index.php?topic=549809.0 and this topic are the same issue/topic? if so please select one topic and lock the other so we dont get confused as to what is going on and what has been done where.
Locked the other one.
Production db body field -
Saltwater Ich (Cryptocaryon irritans)<br />C. irritans (Cryptocaryon) is a fully ciliated, protozoan parasite present in all saltwater environments, which causes a disease known as marine "ich" or marine "white spot" disease in wild and cultured marine fishes at temperatures between 59 and 86°F.
Imported db body field -
Saltwater Ich (Cryptocaryon irritans)<br />C. irritans (Cryptocaryon) is a fully ciliated, protozoan parasite present in all saltwater environments, which causes a disease known as marine "ich" or marine "white spot" disease in wild and cultured marine fishes at temperatures between 59 and 86°F.
They are identical. If I replace " and " with ", it breaks at °.
Something isn't identical between the two, clearly.
Yes, but how to find it?
This is an unusual forum, 30mb content consists of 5400 technical articles posted in 5400 topics by approx. 40 contributors, all using one admin account (the only member), over 2 1/2 years. Most, if not all, of the articles were probably written in various word processors. It's also possible some of the posts were copied/pasted from several WP installs the owner also runs. Everything works fine from production database, but it can't be backed up or moved because exports are somehow damaged.
I'd personally get a hex editor out and verify what actually is in that dump because I'm suspect of that.
So the dump is UTF-8 encoded. Seems like whatever is importing the dump is saying the dump is thinking it is ISO which would explain the truncation because if the dump were imported under the guise of being ISO (though internally it isn't) MySQL would try to put it in and choke because it would be expecting bytes that have different meanings.
But that returns to the fact that the post is not truncated in the body field, and loads complete into post editor, except for special characters, which are replaced with �. If it's saved in post editor, missing text is restored to display template. I've exported/imported the db on production forum's host, same phpmyadmin, same problem, and I've duplicated the glitch in my host account.
how are you attempting to export the database? phpmyadmin, hosts control panel or ?
phpmyadmin, also tried MySQLDumper. There is no host backup utility, and it's a basic package that doesn't include SSH.
are you able to run cron jobs? if so you could put in the commandline command there and try that backup.
I suspect a problem with the actual table definition. I had a very hosed SMF installation & DB.
This may help diagnose:
Using phpMyAdmin, I created a full export, all tables, WITHOUT the data (structure only). That enabled me to look at how each table was built in the entire DB - now easily readable in Notepad++. Not only did I have several different collations throughout my SMF install, I even had some tables in different charsets somehow.
This was the only way I could actually see the charset of each individual table to understand the issues I was having. I couldn't find it displayed, at the table level, anywhere in cPanel or phpMyAdmin. These tools all seem to assume your charsets are all consistent...
To fix the charsets:
I had to export the problem tables (with data this time, and with the DROP/CREATE syntax) and manually edit the CREATE command to change the DEFAULT CHARSET clause to DEFAULT CHARSET=utf8. After re-importing the table, everything was fine.
I had very little multi-char data, it was garbled, and when I was done, it remained garbled. HOWEVER... After I did this, all posts going forward were perfect, and I was finally able to enable and provide multi-language support.
How did I get multiple charsets? I don't know. This site was maintained and patched by several people over years, starting with SMF 1.1. None of whom were around anymore.
If you're brave (/silly) enough to try this, do it on a local or test DB a few times first... I created a local working copy of my SMF site specifically for this purpose using EasyPHP. I also have a test copy on my host. By the time I did this change in production, I had executed the scripts and tested the results 2 full times.
I think the basic problem here is that any backup method I use will be based on same mysql function, and that's what's causing the problem. Mysql seems to be having a problem with collation. The db exports and imports exactly the same characters, but SMF does not display them correctly. I can copy/paste text from body field in import db, and this is what you see here -
Saltwater Ich (Cryptocaryon irritans)<br />C. irritans (Cryptocaryon) is a fully ciliated, protozoan parasite present in all saltwater environments, which causes a disease known as marine "ich" or marine "white spot" disease in wild and cultured marine fishes at temperatures between 59 and 86°F.
Display template in this forum is able to display the text in it's entirety, including special characters. The only discrepancy is "ich". I was originally seeing

in both databases, but now it's changed to "ich" in both, for reasons unknown.
I will check the table definitions.
Edit: Checked the dump, all tables are imported with DEFAULT CHARSET=utf8.
I'm fairly certain collation doesn't affect display. Only whether special characters are treated the same for searches, sorts, string compares, etc. (E.g., whether they're case insensitive or not...) You do want collation consistent across your DB, but more for consistency/cpu.
Good news that it's all utf8 already.
First thought... Are there particular languages impacted? MySql has a weird partial implementation of utf8... Some languages need utf8mb4...
Second thought... elsewhere in the stack isn't supporting utf8.... Webserver? Browser?
It's a bilingual forum (english/spanish), but the borked posts I'm working with are english. Don't understand why the body fields from original and imported databases appear identical, but don't work the same, why post editor can show text that display template cannot, and why I can copy text directly from import db, post it here, and it's correct.
Actually, it's not entirely correct, why is it showing <br /> after the title?
Saltwater Ich (Cryptocaryon irritans)<br />C. irritans
I don't think I picked up... Is the behavior is different across two websites?
I would next audit php.ini in both environments to make sure they're consistent.
And your Apache httpd.conf as well...
Both of those config files need utf8 specified...
The original db works fine on production website, but when exported/imported to new db on owner's host or my host and connected to clean 2.0.12 install, posts do not display correctly, missing text and special characters. Same problems on both hosts, so unlikely it's a server-side problem. Don't think I can access php.ini on owner's host, and don't want to make any changes there until I know what's causing the problem. There is currently no way to get an intact db backup, and we've been unsuccessful asking host to copy the db, so there's only the original.
I'm able to determine by trial&error which characters are causing the problems in the posts I've looked at, but it would be very difficult to find/replace all the troublesome characters in 30mb/5400 posts by editing the dump or running sql queries in phpmyadmin.
Arantor seems to think the dump is ok, so the export is good, but something's going wrong with the import, and it's not visible in the imported body field.
I thought smf had a settings dump tool somewhere around here... I would definitely ensure environment alignment...
Comparing db dump from one of my forums with the dump from problem forum, the latter contains following which are not in former -
SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";
SET time_zone = "+00:00";
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
Short version: If your tables were built correctly, I don't think that code is the problem.
You should see code like this as bookends to any phpMyAdmin export, as it saves off some environment variables at the beginning and restores them at the end. The idea is to faithfully import your data without screwing up your environment settings.
My suspicion is still environment...
The utility that displays your environment info is phpinfo.php. You need to get a copy & put it in your smf directory. Run it in both environments. I would first look at the mbstring section, looking for differences.
http://wiki.simplemachines.org/smf/What_is_a_phpinfo()_file
There are a few articles out there that explain configuring your environment for utf8, such as this, that will explain some of the settings:
https://allseeing-i.com/How-to-setup-your-PHP-site-to-use-UTF8
Hope this helps...
I think maybe we're looking at the wrong end of the horse. The dump apparently is utf-8, and is correct as per quick sample in hex editor. It's being imported into same host, same php settings, same phpmyadmin, same mysql, but it's not importing same as db it's exported from, even though they look identical. I've duplicated the problem on different host. The basic problem seems to be some incompatible character codes that were inserted by various word processors (possibly even Mac

), and for reasons unknown display correctly from original db, but are borked on import. But if the dump is valid utf-8, and body fields are identical in original and imported db, why is SMF displaying the text differently?
But if you have confirmed the content matches, byte for byte, it must be how these new environments are configured.
Quote from: shawnb61 on November 13, 2016, 01:38:41 AM
But if you have confirmed the content matches, byte for byte, it must be how these new environments are configured.
I agree with this.
This I have written somewhere else, please read this;
http://www.simplemachines.org/community/index.php?topic=548245.msg3898297#msg3898297
Can you run with phpMyAdmin SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'; on both setups?
Please have a look at this and decide if this is helpful to you or not;
http://www.simplemachines.org/community/index.php?topic=548245.msg3898964#msg3898964
I've noticed the encoding on 'ich' and I suspect that MySQL might be encoding this character. It does for ' This character is for begin and end with MySQL but people put that in their post too and that one is encoded differently. For " that character is preceded with \ to make it appear and for " Ã" so it is preceded with Ã
Note that
""" is not
'"'For the encoding of 'ich' I've seen it somewhere, that is reversible encoding and you should consider trying a conversion script that is helpful.
For the phpMyAdmin export, avoid this if possible. phpMyAdmin exports are not the same as mysqldump exports. For my host, is running DirectAdmin, this has a menu option mysqldump, that dump is litterally MySQL talking. A phpMyAdmin dump is phpMyAdmin talking.
And for most uses phpMyAdmin is sufficient. But you explained that there are fourty users under the same account and those users must have had different keyboard settings, and I wasn't able myself to drill down on this but it appeared to me that different keyboard harware generates different byte sequences for the same character that is displayed and how and where this is handled is also there where you have to take the proper steps for repair, and for correct configuring. And if phpMyAdmin shows on both setups the same settings, maybe it is possible for you to keep using phpMyAdmin, though I have told you it exports differently and you could ask both providers to get the MySQL response to the charset and collation from there you can tell not what is wrong but you will be able to exclude what isn't wrong.
For the proposal of cron, I've never came around/ nor found the need for this but if you can mysqldump with that, I think you should. Atleast till you know for sure phpMyAdmin is exporting everything correctly. And that is important because it would be hard to learn keyboard setups with characters that are not in your native language. First hit with Google but there are better ones; http://www.comentum.com/mysqldump-cron.html
Thanks for the info, will check through all that soon as I have time. Meanwhile, forum owner did another export/import yesterday and new database works fine. Waiting for him to tell me how he imported/exported, as we'd already done this several times, and db was damaged every time.
While waiting to hear from forum owner, I've again dumped the production db as sql and gz zip, and imported on both his host and mine, with same result, both are damaged. I've confirmed that he did successfully import a dump of same db yesterday. No idea how he did it, except that he thought I had fixed it.
Re: the quotes around ich, that's one of the character errors that's causing post text to go missing. I was originally seeing this -

- which won't display here. When I copy it from Notepad, where it's displayed as in the image, you see it here as "ich", which is how it's displayed on production forum, and in the production database and all imported databases. Sadly, I've been tinkering with this over so many variations that I can't remember where I saw it with curly quotes, and can no longer find it. This is what's in the sql dump - “ichâ€.
Why not just create a secondary query to replace all the special characters you are having issues with?
My previous post in this thread pointed to an example that will fix some of them where you just need to figure out your specifics to replace.
Create a multidimensional array with your specifics, then loop through it to replace what is needed in any relevant table.
The problem is to find all the borked characters in 5400+ bilingual topics. It appears forum owner is now able to export/import the original database successfully using default phpmyadmin settings, he's done it twice. Have no idea why he can now do it, and I still can't. Working on it tonight.
Well, here's something new. Am now able to export/import db successfully on production forum's host, doing it same way we've been doing it for days, but same dump imports with same problems on my host. Since the goal here is to move forum to a different host, we're only halfway there. AFAIK, nothing's been changed on production host.
Production host: phpMyAdmin - 4.1.14.8 MySQL - 5.5.50-0+deb7u2-log
Import has been successfully finished, 864 queries executed.
My host: phpMyAdmin - 2.8.0.1 MySQL - 5.6.32-78.1-log
Import has been successfully finished, 857 queries executed.
Do you have VPS or dedicated from your host?
Can you update phpmyadmin to something more recent?
From what you just described it sounds as though updated changes/fixes to the phpmyadmin code may have corrected some issues such as this.
Just for kicks
$string = '“ichâ€';
$multiDim = array(
'’' => '’',
'‘' => '‘',
'â€"' => '—',
'â€"' => '–',
'…' => '...',
'â€' => '”',
'“' => '“'
);
$string = strtr($string, $multiDim);
We're both running on basic host packages, so not much I can do. I don't believe phpmyadmin or mysql versions on production host have changed since we started working on this.
Wondering why the import on my host is running 7 fewer queries from same dump.
I wasn't able to understand out of your [previous] words in a two or three of your posts in this topic, where exactly, that mangled character, showed; and; if this was in the exported database sql file or not.
But, I'll take this literally;
QuoteThis is what's in the sql dump - “ichâ€
So is that a exported file for testing purpose, or is it from the new host, or are you stuck with this and does it show the same on screen on; a. the old host and/ or b. the new host?
If you run ' “ichâ€' it will convert this to '"ich' with this utf8-decoder I posted a link (https://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder) of.
Is there a 'œ' lost somewhere; and if so, did you do that, or was that how it showed in that sql file you have there, and if so, or not so, is there a blank space after the second Euro sign?
I recognize all of the above and while not everything wasn't reversible encoding, I recommend Perl or Python. Without for me knowing more about what characters are mangled I can not be sure on where to start repairing and with what [how].
Your best bet [based on what I learned from you in this topic], is to make sure this is a configuration problem or not. It stops here, I wasn't able to figure out on my own, what is what and how stuck you are with that exported sql you have spoken about.
That's what's in the dump from production database. As per my previous post, we're both able to export/import successfully on production host, but not on my host. Nothing appears to have changed. My host will not update phpmyadmin, but suggested I can install current version in my account and run it from there. No clue so far how to configure it, not a lot of time to tinker with it.
QuoteIf you run ' “ichâ€' it will convert this to '"ich' with this utf8-decoder I posted a link of.
Btw there is supposed to be a hidden character for the curly closing quote which is why it did not show in his post properly.
“ichâ€
should be
"ich"
For testing purposes if you quote the post it may show but copy & paste from the page itself and it will not.
reasoning;I've been thinking what might have happened to the database you have there and I tried to find an example website decoder/ encoder but I couldn't find any that showed working order. That has a reason I'll explain later on.
I believe that what has happened, that Latin-1 encoded database was imported in a, made aware, UTF-8 SMF.
So, what you see is that; '"' became '“' and that '“' is how '"' gets written in a Latin-1 database, that is storing a UTF-8 character.
This means that, on a UTF-8 aware database, that '“' does not become '"' again. This holds that reason I spoke about, there is no such thing as 'wrong' encoding in a UTF-8 database-file. [For this example, that is]
For Linux'ers, it means that iconv can not translate UTF-8 to anything but UTF-8 or up. That is because the computer can't tell that '“' [it stays '“' for this computer] must mean '"'.
to know be forehand;What might give good result [depends if your database has no damage, and this was a single step misconversion], you will need a MySQL database that is in Latin-1 encoding and a SMF forum that is in ISO-8859-1. [That means, not setting the UTF-8 checkbox, what I suspect that has happened].
Then, in the template, the meta-charset has to be 'utf-8'. What? Yes, because, if that browser, expecting something that must be utf-8, bada-bing, '“' becomes '"' again.
Is there something wrong? Not in this example, based on a non-damaged database-file that has only received a single misconversion. Also, keep in your head, I didn't speak about win-1252 encoding. I do not know if there is another step to take here, but when you have imported that database again in a ISO-8859-1 SMF forum with a Latin-1 talking MySQL, you can make sure yourself that this is over for you, by checking *every* post with [previous] misencoding.
And, if that database receives a Latin-1 encoded file, it can't tell that the data had been stored in a UTF-8 aware forum. Actually there is more I should tell you, for example, if that database you got there still is live, you might have different situation. That depends on what you and your college [might] have done.
my advise;For starters, try bigdump.php or use PHP import to your target ISP and MySQL and you might want to try putting a '#" before the '$db_character_set = 'utf8';' in Settings.php of the source SMF forum, to see what happens with the '“' <- BUT DO NOT DO THIS ON A LIVE FORUM IF YOU DON'T FULLY UNDERSTAND WHAT YOU ARE DOING <- because SMF will start encoding differently and then you will have mixed encoding and you are stuck with UTF-8 which doesn't seem to be the case right now and then you are luckier then I was with all this. [And you still have to look in to converting to UTF-8 properly but that will be chapter II [or III if something erronous will show up from this point] for you].
- I hope this is clear to you before you do anything, and if not, feel free to ask me about what I wrote in this post.
What's odd is that export/import now works successfully on production host, afaik nothing's been changed. Last night I moved all of my stuff from my longtime host (20 yrs.) due to the apparent collapse of their support system (enough is enough :P), new host is running phpmyadmin 4.0.10.14. Will try importing the sql dump there, soon as I clean up a few loose ends.
Quote from: Sir Osis of Liver on November 22, 2016, 03:53:02 PM
new host is running php 4.0.10.14.
that version of PHP is no longer supported, i would not use a host that uses a php version without support.
Sorry, that's phmyadmin 4.0.10.14. Didn't solve the problem, import is still borked.
Did you run phpinfo? Do the mbstring settings match on your two environments?
Will check that tonight. I've now imported the same dump on three different hosts, it only imports correctly on the production forum's host, which it did not do previously (that was the original problem).

Matched php settings on my host with production forum's host by downgrading to php 5.5. Both are configured as follows -
PHP Version 5.5.38
mbstring.http_input pass pass
mbstring.http_output pass pass
default_charset no value no value
Import still screwed up on my host.
Previous settings -
PHP Version 5.6.28
mbstring.http_input no value no value
mbstring.http_output no value no value
default_charset UTF-8 UTF-8
Quote from: richardwbb on November 22, 2016, 03:40:13 PM
For starters, try bigdump.php or use PHP import to your target ISP and MySQL and you might want to try putting a '#" before the '$db_character_set = 'utf8';' in Settings.php of the source SMF forum, to see what happens with the '“' <- BUT DO NOT DO THIS ON A LIVE FORUM IF YOU DON'T FULLY UNDERSTAND WHAT YOU ARE DOING <- because SMF will start encoding differently and then you will have mixed encoding and you are stuck with UTF-8 which doesn't seem to be the case right now and then you are luckier then I was with all this. [And you still have to look in to converting to UTF-8 properly but that will be chapter II [or III if something erronous will show up from this point] for you].
Ok, looks like I fixed it.
Settings.php in production forum contains this line -
$db_character_set = 'utf8';
Settings.php in my test forum does not. If I add the line, posts from imported db display correctly. Apparently that's been the problem all along. Don't know why it's in the original forum's settings and not in my test forum settings.
"Even a blind pig occasionally finds a mushroom."
Quote from: Sir Osis of Liver on November 23, 2016, 10:07:38 PM
Settings.php in my test forum does not. If I add the line, posts from imported db display correctly. Apparently that's been the problem all along. Don't know why it's in the original forum's settings and not in my test forum settings.
"Even a blind pig occasionally finds a mushroom."
then your test forum was never converted to UTF-8, so that line was never added to it.
Quote from: Sir Osis of Liver on November 23, 2016, 10:07:38 PM
Ok, looks like I fixed it. Settings.php in production forum contains this line -
$db_character_set = 'utf8';
Settings.php in my test forum does not. If I add the line, posts from imported db display correctly. Apparently that's been the problem all along. Don't know why it's in the original forum's settings and not in my test forum settings.
"Even a blind pig occasionally finds a mushroom."
It looks like that both MySQL configurations are in Latin-1 and it seems so that this setting you menitioned on the production server, never changed [wasn't there], ever. Or there will be mixed encoding. And, for the testing server, it's is to be expected that any post in there with incorrect settings will be overwritten anyway.
You might want to check the exported database anyhow for inconsistencies, but from your responses I sense you won't bother, haha.
Quote from: Illori on November 24, 2016, 06:03:02 AM
then your test forum was never converted to UTF-8, so that line was never added to it.
Illori, can you tell us how a installing SMF forum responds with converting to UTF-8 in regard to this setting?
when you convert to UTF-8, that line is added to the Settings.php file, otherwise it does not exist. otherwise i dont know what that line tells SMF as i am not a coder.
The problems with that are :
1. - Who's to know?
2. - If you run the utf-8 conversion on the imported database, it fails with this error -
You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'key key varbinary(255) NOT NULL default '',
CHANGE COLUMN member_name mem' at line 2
File: /public_html/smf20test/Sources/ManageMaintenance.php
Line: 666
It had already been tried by the OP of the original thread. Haven't had time to try this, but it appears that if you want to import a utf-8 database into a clean install, you have to:
1. - Upload full install package.
2. - Create new database.
3. - Run Install.php to populate db with tables.
4. - Run utf-8 conversion on new database from ACP in new forum.
5. - Use phpmyadmin to drop all database tables.
6. - Import utf-8 database.
If you don't do it exactly in that order, imported db will display incorrectly. The only other way to fix it is to manually
add $db_character_set = 'utf8'; to Settings.php.
Would be a whole lot easier to have a setting in smf_settings that SMF could use to identify the database as utf-8, and handle it correctly.
ummmm....Sir Osis... if you just imported your files and database, you don't have to bother with that, since the Settings.php file from your original site will already HAVE that line. I think, this case is due to your strange desire to use an overly complicated way to backup/restore/move a site and database.
I mean:
Quote
1. - Upload full install package.
2. - Create new database.
3. - Run Install.php to populate db with tables.
4. - Run utf-8 conversion on new database from ACP in new forum.
5. - Use phpmyadmin to drop all database tables.
6. - Import utf-8 database.
can be replaced by
1. - create a blank database on the new host
2. - import the database backup and backed up forum files
3. - run repair_settings.php to adjust the paths and urls.
No, you've missed the point. This discussion began in another thread, OP's forum had been hacked, account suspended, support had jerked him around, and done things unknown to the forum. He wanted to move the forum to a new host, and it was recommended that he do a completely new install, to avoid carrying over any crap in forum files from old to new host. Before attempting the move, he tried it in his current host, did clean install in another directory, exported production db, imported it into new db, that's when the problem showed up. I tried importing it into clean install on my host, same result.
If you do a completely new install, Settings.php out-of-box does not contain $db_character_set = 'utf8';, it's added by running utf-8 conversion on default latin1 database. You can't run the conversion on the imported db, it errors out, and you obviously can't run it on a blank db.
Nothing strange about it, the forum owner is experienced running SMF forums, and Sir Osis has a passing familiarity with the software, and it took a couple of weeks, with input from many folks, to figure this out.
What's most annoying is, when the utf-8 coversion was run on the original db, it added
global_character_set UTF-8 to smf_settings, so the db is identified as being utf-8, but SMF ignores it. Why doesn't SMF use the db setting, rather than Settings.php, to determine the correct collation? That would eliminate this problem altogether.
Quote from: Sir Osis of Liver on November 24, 2016, 04:04:13 PM
What's most annoying is, when the utf-8 coversion was run on the original db, it added
global_character_set UTF-8 to smf_settings, so the db is identified as being utf-8, but SMF ignores it. Why doesn't SMF use the db setting, rather than Settings.php, to determine the correct collation? That would eliminate this problem altogether.
Regarding to what I'm quoting here, I can tell you that this can not be easy, since the way the UTF-8 data is stored is to be decided by how MySQL is configured with the ISP. Or in other words, it wouldn't make it easy to support all possible configurations.
And for the db setting in Settings.php, you will have to click UTF-8 when installing SMF. Arantor also told me that.
However, I can tell from your remarks, that you have been on the same road as I was, only difference, I inherited the SMF installation and database.
On the other hand, I still believe that having a db_character_set setting and a language template ISO-8859-1 that must not be uninstalled [stuff depends on that even with UTF-8], and a SMF that switches encoding, when you enable the UTF-8 template [that you will have to install first, SMF should have the UTF-8 language too and work from there i make of that]
And also, on the other hand [someone else's, haha], it might not be possible for a PHP coded program without root privileges retrieve the proper collation and charset from MySQL, since even a latin 1 database supports all encodings and charsets [and does that very precise, fast]
If MySQL, is configured [be able[ to talk UTF-8 with the client, it doesn't matter what is sent by that client. You only can't run with a ISO-8859-1 language template forum, because that will show Latin-1 encoded UTF-8, tadaa.
And some final words; somewhere the Site Admin has to be told and should gather understanding of what is a MySQL database [or how data looks in that dumpfile], to know about language templates and that db-setting switch. phpMyAdmin's way of reporting how MySQL is collated/ encoded, doesn't make me feel attracted to that. because my, MySQL is reporting that it talks Latin1 with the [any] client and phpMyAdmin is reporting UTF8mb4, and I'm awaiting for it to fail, haha.
In this case, it's basically the same difference. Retrieving
$db_character_set = 'utf8'; from
Settings.php is handled in
Load.php here -
// Most database systems have not set UTF-8 as their default input charset.
if (!empty($db_character_set))
$smcFunc['db_query']('set_character_set', '
SET NAMES ' . $db_character_set,
array(
)
);
Just as easy to do this -
/// Check db for UTF-8 collation
if ($modSettings['global_character_set'] == 'UTF-8')
$smcFunc['db_query']('set_character_set', '
SET NAMES ' . 'utf8',
array(
)
);
... and retrieve the setting from the database, so it's available wherever the database is imported.
But what do I know.
More then I do, with programming I suppose, haha.
I'm not sure on what '$modSettings', means [what it does], and this might work for you. But if it is possible that a user inserts a Latin encoded database dumpfile into a UTF-8 talking MySQL database, all special characters will end up 'ich'.
But that might relate to the situations I've been, because that settings, I saw in the database file, and it was empty. It also wasn't showing ISO-8859-1 and I believe there is no other option for encoding scheme's. [and how on earth did that got lost]
So I would, test the whole database, for non-Latin characters, that is easy, I've done it, then, manually convert it with iconv [possibly also the way it is done with SMF, and don't ask me how this is done on a WAMP server, haha] and then send the required command with phpMyAdmin to MySQL, to change character set and collation of all the tables. <- this doesn't change the content inside the database on any table.
What I don't get, is that people that are programmers, still have been able to create, MyISAM tables while InnoDB is prefered, and, that they are able to, still, put the encoding to Latin1, or 'swedish chef', haha. [or that a UTF-8 running SMF forum, still modifications aren't updating language in the UTF-8 version of that language file]. The latter is easy to overcome though.
And a final point. A UTF-8 running SMF forum, does not, display non-Latin characters properly, with the 'okay users' message inside Settings.php, tadaa.
Quote from: Sir Osis of Liver on November 28, 2016, 09:35:18 PM
In this case, it's basically the same difference. Retrieving $db_character_set = 'utf8'; from Settings.php is handled in Load.php here -
// Most database systems have not set UTF-8 as their default input charset.
if (!empty($db_character_set))
$smcFunc['db_query']('set_character_set', '
SET NAMES ' . $db_character_set,
array(
)
);
Just as easy to do this -
/// Check db for UTF-8 collation
if ($modSettings['global_character_set'] == 'UTF-8')
$smcFunc['db_query']('set_character_set', '
SET NAMES ' . 'utf8',
array(
)
);
... and retrieve the setting from the database, so it's available wherever the database is imported.
But what do I know. 
That will not work as modsettings doesn't exist at that point in the code.
You don't replace the existing code where it is, it has to be added after the array is loaded, about a half page down. Makes no difference if the old code remains, they both do the same thing. Makes more sense (imo) to ask the database if it's utf-8, rather than using Settings.php.
That wouldn't be sufficient for an SMF forum with ISO-8859-1 language template. Then the connection with the database can be utf8 and the contents of the posts and messages will be Latin encoded to that utf8 database [èéëòóöàáä -> èéëòóöà áä]. But I did see it drop something right there and told Arantor about that. And what I told in this thread, that Settings.php itself isn't utf8 aware.
Quote from: Sir Osis of Liver on December 02, 2016, 11:26:10 AM
You don't replace the existing code where it is, it has to be added after the array is loaded, about a half page down. Makes no difference if the old code remains, they both do the same thing. Makes more sense (imo) to ask the database if it's utf-8, rather than using Settings.php.