UTF8 charset and pasting

Started by piercej, August 31, 2014, 03:26:36 PM


piercej

Many of my users like to compose lengthier articles in their word processor and paste them into a forum topic. Seemingly, the pasted text often contains bytes that are not valid UTF8, especially 'funny quotes' in ISO8859 or whatever, like 'quoted', and my PostgreSQL database is quite unhappy about that, so it rejects their post.

Is there anything I can do about that on the SMF side?

Interestingly, that worked on this forum. On mine, the post above would have been rejected because the quotes around 'quoted' are not valid UTF8. On my server, that same string gives: ERROR: invalid byte sequence for encoding "UTF8": 0x91... In UTF8 that should be U+2018, or 0xE2 0x80 0x98.
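To make the byte values above concrete, here is a minimal Python sketch. It assumes the pasted quotes arrive as Windows-1252 bytes (0x91 and 0x92), which is an assumption about what the browser sent, not something confirmed in this thread. In strict ISO-8859-1, 0x91 is a control character, so the "funny quotes" are more likely Windows-1252.

```python
# Bytes as they might arrive from a page labelled ISO-8859-1 (assumed input).
raw = b"\x91quoted\x92"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Interpret the bytes as Windows-1252, then re-encode them as UTF-8.
text = raw.decode("cp1252")
utf8 = text.encode("utf-8")

print(is_valid_utf8(raw))   # the lone 0x91 byte is not valid UTF-8
print(hex(ord(text[0])))    # code point of the opening quote
print(utf8[:3].hex(" "))    # its UTF-8 byte sequence
```

The raw bytes fail the UTF-8 check, which matches the PostgreSQL error, while the re-encoded form starts with the three-byte sequence quoted above.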


Hmmmm. This forum has <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> while mine says it's ISO8859-1. How can I change that? Is that coming from the theme templates?
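For anyone wanting to check what charset a page declares without reading the source by hand, here is a rough sketch using only the Python standard library. The helper name declared_charset is made up for illustration, and it only looks at meta tags, not at the HTTP Content-Type header, which a real check should also consider.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        if attr.get("charset"):
            # HTML5 style: <meta charset="...">
            self.charset = attr["charset"].strip()
        elif (attr.get("http-equiv") or "").lower() == "content-type":
            # Older style: content="text/html; charset=..."
            for part in (attr.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.lower() == "charset" and value:
                    self.charset = value.strip()


def declared_charset(html: str):
    """Hypothetical helper: return the declared charset, or None."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


page = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
print(declared_charset(page))
```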

Arantor

Is your forum set to use UTF-8 first of all?

mashby

You could instruct your users to paste their content into something like Notepad and copy and paste from that.
Always be a little kinder than necessary.
- James M. Barrie

Arantor

Actually that wouldn't necessarily fix things depending on what's in use.

It does sound to me as though SMF isn't set up to be using UTF-8 which will cause havoc in either MySQL or PostgreSQL but especially the latter. (Not that PostgreSQL is especially recommended for SMF anyway.)

piercej

Quote from: ⬔ on August 31, 2014, 03:27:48 PM
Is your forum set to use UTF-8 first of all?

I just realized, it's not (ISO8859-1). How can I change that? I'm not seeing it on the admin pages.

Arantor

Admin > Maintenance > Forum Maintenance > Database

Take a backup first (and not from the admin panel, do it from phpPgAdmin or whatever tool you would normally use with your PostgreSQL installation)
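If no graphical tool is handy, one option for a backup taken outside the admin panel is PostgreSQL's own pg_dump. A minimal sketch, assuming pg_dump is on the PATH, that connection settings come from the usual environment or defaults, and that the database is called smf_forum (a made-up name, substitute your own):

```python
import subprocess
from datetime import date


def build_backup_command(dbname: str, outfile: str) -> list:
    """Build a pg_dump command line for a custom-format dump."""
    return ["pg_dump", "--format=custom", f"--file={outfile}", dbname]


def run_backup(dbname: str = "smf_forum") -> str:  # hypothetical database name
    """Run pg_dump and return the name of the dump file it wrote."""
    outfile = f"{dbname}-{date.today().isoformat()}.dump"
    # check=True raises CalledProcessError if pg_dump exits with an error.
    subprocess.run(build_backup_command(dbname, outfile), check=True)
    return outfile


print(build_backup_command("smf_forum", "smf_forum.dump"))
```

A custom-format dump can be restored with pg_restore, so it is worth confirming the file restores into a scratch database before running the conversion.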

piercej

Quote from: piercej on August 31, 2014, 03:35:02 PM
I just realized, it's not (ISO8859-1). How can I change that? I'm not seeing it on the admin pages.


And I found it, in Themes/default/languages/index.english.php

Arantor

Please DON'T change the language file.

piercej

It has $txt['lang_character_set'] = 'ISO8859-1' in it. Where else should I change this instead? When I changed it there, my forum switched to charset="UTF-8", which is what I want.


piercej

Quote from: ⬔ on August 31, 2014, 03:33:55 PM
(Not that PostgreSQL is especially recommended for SMF anyway.)

Why is this? One of the primary reasons I *chose* SMF over some of the alternatives was that it supported PostgreSQL, which I much prefer to MySQL or its forks.

Arantor

Because it's still built for MySQL first and foremost. There is no optimisation for PostgreSQL beyond it working, and every query is still written for MySQL with sufficient hammering to make it fit PostgreSQL.

In addition, there is virtually no-one here that is fluent in how to tune PostgreSQL for SMF (or vice versa).

Kindred

Changing that text string does not actually change your forum to be UTF-8.

Do what Arantor suggested, because you have to upgrade your database as well as the forum display. Finally, you will have to download and install the UTF8 version of the language files.
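A small Python sketch of why relabelling alone is not enough: text that was stored under the old charset keeps its old bytes, so reading it under a UTF-8 label misreads it until the data itself is converted. The sample string is made up, and this illustrates the general principle rather than what SMF's converter actually does.

```python
# Text saved while the forum was labelled ISO-8859-1 (assumed sample data).
stored = "caf\u00e9".encode("latin-1")   # the accented letter is one byte, 0xE9


def read_as(data: bytes, declared: str) -> str:
    """Decode bytes the way a reader trusting the declared charset would."""
    return data.decode(declared, errors="replace")


# Only the label changes: the old bytes are misread.
print(repr(read_as(stored, "utf-8")))

# The data is converted as well: it reads correctly under the new label.
converted = stored.decode("latin-1").encode("utf-8")
print(repr(read_as(converted, "utf-8")))
```

The first read shows a replacement character where the accented letter should be, and the second shows the original text.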

As for Postgre...   Technically, it is supported and works. However, it is not supported by most/many mods and
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

piercej

The database is already in UTF8, so I don't know what I'd have to convert.

So where are these alternate UTF8 language files for English? OK, I see english-utf8 via the 'add language' interface... Wait, am I really supposed to make the whole darn system writable by the webserver? That is SUCH a gaping huge security exposure.


Kindred

Well, yes..  To load new things, you need to make things writable.

You can always change it back later....

But, honestly, it is not a huge gaping security issue unless the server gets compromised in the first place...  Which is not to say that permissions should not be locked down, because they should.
