[4081][2.0 RC2] 2 slashes before apostrophes in PMs with UTF-8

Started by MultiformeIngegno, November 14, 2009, 09:23:16 AM

MultiformeIngegno

Quote from: lorenzone92 on December 03, 2009, 01:08:09 PM
I've tried to "convert" 'em to utf8_general_ci, it says "query completed" but it's not true! They're always latin1!!
Yeah!! I've converted 'em to utf8_general_ci (don't know why I wasn't able to before)!

Now I'll ask my users whether it's solved! ;)
RockCiclopedia (wiki - forum), the whole history of rock, written by you ...
Stay up to date on the world of music with the new "RockCiclopedia Music News" feed!

Norv

Quote from: joec88 on December 05, 2009, 05:10:39 PM
My SMF tables are "utf8_general_ci" but it seems any mods that have created tables are all "utf8_unicode_ci" if that helps at all.

This most likely happens because your database's default collation is "utf8_unicode_ci", so any mod that doesn't specify a collation has its tables created with that default. You might want to change the default collation of the database in phpMyAdmin: select the database and, in the Operations tab, you can see the current default collation and set it to "utf8_general_ci". This won't convert the tables that already use "utf8_unicode_ci", but it will make sure that any future table created without an explicit collation gets the new default.

Converting an existing table may be a little more complicated. To change from "utf8_unicode_ci" to "utf8_general_ci", which share the same character set (utf8), it should work if you change the collation for the table (select the table, go to the Operations tab, and set the new value), and then separately for each text-based column (in a similar way).
Please note that there is NO conversion going on here: you are only setting the value to something else. The data itself is not converted to another charset, but since the charset is "utf8" in both cases, there is no need to.

lorenzone92: Please note that converting an existing table between different character sets, however, is more dangerous. I strongly recommend backing up your entire database before trying, and carefully verifying the result to make sure your data is still displayed properly.
Then, to convert the data itself from another charset to utf8, please do back up your database first, then try:

ALTER TABLE smf_tablename CONVERT TO CHARACTER SET "utf8" COLLATE "utf8_general_ci";

(where "smf_tablename" should be replaced by your table prefix followed by the table name)
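
If there are many tables to convert, the statement above can be generated per table instead of being typed by hand. The following is only a minimal Python sketch: the helper name build_convert_statements, the "smf_" prefix and the two table names are assumptions made for illustration, not values taken from this topic. It only builds the SQL text and does not connect to any database, so back up first and review the output before pasting it into phpMyAdmin.

```python
def build_convert_statements(tables, prefix="smf_",
                             charset="utf8", collation="utf8_general_ci"):
    """Return one ALTER TABLE ... CONVERT TO statement per table name."""
    statements = []
    for table in tables:
        # Backticks quote the identifier, in case a table name is a reserved word.
        statements.append(
            f"ALTER TABLE `{prefix}{table}` "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


# Hypothetical table names, used only to show the shape of the output.
for statement in build_convert_statements(["messages", "personal_messages"]):
    print(statement)
```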
To-do lists are for deferral. The more things you write down the later they're done... until you have 100s of lists of things you don't do.

File a security report | Developers' Blog | Bug Tracker


Also known as Norv on D* | Norv N. on G+ | Norv on Github

b4pjoe

Quote from: Norv on December 06, 2009, 07:18:30 AM
Quote from: joec88 on December 05, 2009, 05:10:39 PM
My SMF tables are "utf8_general_ci" but it seems any mods that have created tables are all "utf8_unicode_ci" if that helps at all.

This most likely happens because your database's default collation is "utf8_unicode_ci", so any mod that doesn't specify a collation has its tables created with that default. You might want to change the default collation of the database in phpMyAdmin: select the database and, in the Operations tab, you can see the current default collation and set it to "utf8_general_ci". This won't convert the tables that already use "utf8_unicode_ci", but it will make sure that any future table created without an explicit collation gets the new default.

Converting an existing table may be a little more complicated. To change from "utf8_unicode_ci" to "utf8_general_ci", which share the same character set (utf8), it should work if you change the collation for the table (select the table, go to the Operations tab, and set the new value), and then separately for each text-based column (in a similar way).
Please note that there is NO conversion going on here: you are only setting the value to something else. The data itself is not converted to another charset, but since the charset is "utf8" in both cases, there is no need to.

Thanks for the info. I have changed everything to "utf8_general_ci" as you explained but I'm still getting the extra backslashes in PM's as shown in my other post here: Extra Backslashes

MultiformeIngegno

Quote from: joec88 on December 06, 2009, 11:40:34 AM
Thanks for the info. I have changed everything to "utf8_general_ci" as you explained but I'm still getting the extra backslashes in PM's as shown in my other post here: Extra Backslashes
Same for me! Now my tables are all utf8_general_ci but the problem remains...

b4pjoe

Quote from: lorenzone92 on December 06, 2009, 12:52:45 PM
Quote from: joec88 on December 06, 2009, 11:40:34 AM
Thanks for the info. I have changed everything to "utf8_general_ci" as you explained but I'm still getting the extra backslashes in PM's as shown in my other post here: Extra Backslashes
Same for me! Now my tables are all utf8_general_ci but the problem remains...

lorenzone92,

I have created a fresh install of SMF 2.0 RC2 and this issue isn't happening on it. I suspect it might be one of the mods I have installed on my production forum. Do you have any of these mods installed?

Edit: I have now installed these mods on my freshly created forum. After installing each mod I checked for the extra backslashes, and even with all of them installed I do not have the problem on the new install of SMF 2.0 RC2.

Norv

Is the new forum UTF8 as well?
Were you always able to reproduce the issue before, or was there anything random about it?
Did you have any errors on the live site?

b4pjoe

Quote from: Norv on December 06, 2009, 04:33:54 PM
Is the new forum UTF8 as well?
Were you always able to reproduce the issue before, or was there anything random about it?
Did you have any errors on the live site?

Yes it is UTF8.

I never had this issue until I upgraded from RC1.2 to RC2. I still have a copy of RC1.2 and I don't have the problem with it.

No errors at all.

Not sure if it will help but I've noticed if I create a new PM and type in

Don't won't can't

and hit the Preview button the text in the PM text box changes to:

Don\'t won\'t can\'t

If I hit preview again it changes to:

Don\\'t won\\'t can\\'t

Every time it is quoted or previewed it adds another backslash right into the PM text box.
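
The pattern described above (one more backslash per preview) is what you would expect if the text were escaped on every round trip and never unescaped again. The Python sketch below only models that behaviour under this assumption; escape_apostrophes and preview are hypothetical names, and this is not SMF's actual code.

```python
def escape_apostrophes(text):
    """Put one backslash in front of every apostrophe."""
    return text.replace("'", "\\'")


def preview(text, rounds):
    """Model `rounds` previews where the escaped text is fed straight back in."""
    for _ in range(rounds):
        text = escape_apostrophes(text)
    return text


print(preview("Don't", 1))  # Don\'t
print(preview("Don't", 2))  # Don\\'t
```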

Norv

Can we have a link to your forum please, and possibly a test account to see the issue? A normal member test account, with the right to send/receive PMs.

b4pjoe

Norv, I have sent you a PM with the login info. Thanks.

Oops: The link is http://www.b4print.com

MultiformeIngegno

Quote from: Skhilled on November 16, 2009, 09:54:42 AM
There is a PM issue: if you use an apostrophe in a word like "it's" and then preview the PM, it will add an escape character ( \ ) to it after sending. Also, every time you preview, it adds an extra escape to the PM.
Any news...?

Norv

I am still unable to reproduce this, no matter how I set up the test forum. Any hint about your configuration, including the language packs installed, may help.

joec88: as you mention, your forum database tables are utf8; however, your forum pages at http://www.b4print.com are ISO-8859-1. Can you please check whether the file Settings.php in your forum directory contains a line like:

$db_character_set = 'utf8';

Then, if you log in to phpMyAdmin and run:

select value from smf_settings where variable = 'global_character_set';

what is the result?
Can you please also tell us which language pack you use?

lorenzone92, if you don't mind, the same questions would apply to you, as well. :)
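
To illustrate why a mismatch between UTF-8 data and ISO-8859-1 pages is worth ruling out, here is a small Python sketch. It has nothing to do with SMF's own code and the sample string is made up; it only shows what happens when UTF-8 bytes are read as ISO-8859-1.

```python
text = "perché"  # made-up sample with one non-ASCII character

utf8_bytes = text.encode("utf-8")          # é is stored as two bytes in UTF-8
misread = utf8_bytes.decode("iso-8859-1")  # each byte is read as one character

print(len(text), len(utf8_bytes))  # 6 7
print(misread)                     # perchÃ©
```

Plain ASCII characters such as the apostrophe are encoded identically in both charsets, so a mismatch like this would garble accented letters rather than add backslashes by itself.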

MultiformeIngegno

Quote from: Norv on December 08, 2009, 03:30:21 PM
joec88: as you mention, your forum database tables are utf8; however, your forum pages at http://www.b4print.com are ISO-8859-1. Can you please check whether the file Settings.php in your forum directory contains a line like:

$db_character_set = 'utf8';

Yes, in Settings.php the character set is utf8
Quote from: Norv on December 08, 2009, 03:30:21 PM
Then, if you log in to phpMyAdmin and run:

select value from smf_settings where variable = 'global_character_set';

what is the result?
I don't think I can run commands; anyway, all my tables are utf8_general_ci.

Quote from: Norv on December 08, 2009, 03:30:21 PM
Can you please also tell us which language pack you use?
Italian UTF 8... :)

Norv

If you can log in to phpMyAdmin, you should be able to run SQL code by entering it in the SQL tab and hitting 'Go'.

Please make sure you have a database backup before running SQL with phpmyadmin.

b4pjoe

Quote from: Norv on December 08, 2009, 03:30:21 PM
I am still unable to reproduce this, no matter how I set up the test forum. Any hint about your configuration, including the language packs installed, may help.

joec88: as you mention, your forum database tables are utf8; however, your forum pages at http://www.b4print.com are ISO-8859-1. Can you please check whether the file Settings.php in your forum directory contains a line like:

$db_character_set = 'utf8';

Then, if you log in to phpMyAdmin and run:

select value from smf_settings where variable = 'global_character_set';

what is the result?
Can you please also tell us which language pack you use?

I do not have a line starting with "$db_character_set" in my Settings.php file

When I run the SQL statement I get:



My language is English

Norv

Please do add the line in your Settings.php.

b4pjoe

Quote from: Norv on December 09, 2009, 01:57:44 AM
Please do add the line in your Settings.php.


I have added the line to Settings.php.

Ran the SQL query again. Same result as before.

Still have the same problem with the PM's getting the backslash added before apostrophes when quoting or previewing.

[SAP]Francis

Guys, you are searching way too far away for nothing. The bug is even present on this very site. Try it the way joec88 said, type something with apostrophes and then press preview, you will see.

Vehicles Forum

Founded By Francis Morissette

b4pjoe

Quote from: [SAP]Francis on December 10, 2009, 01:02:28 PM
Guys, you are searching way too far away for nothing. The bug is even present on this very site. Try it the way joec88 said, type something with apostrophes and then press preview, you will see.

I don't have the issue on this site.

b4pjoe

I got my site to stop adding the backslash. It seems to be user-dependent: if a user has the option "Show WYSIWYG editor on post page by default." checked in their profile, you get the extra backslash in the PM when you preview. Uncheck that box and the problem goes away. See first attachment.

And after doing some testing on the PMs here at this SMF site: if you click the Toggle View button as shown in the 2nd attachment, then type a word with an apostrophe and click preview, you will get the backslash.

Norv

Confirmed! You nailed down the right scenario: the issue is happening when using WYSIWYG. Thank you very much for the report, it will be looked into as soon as possible.