Author Topic: [3.0] Full UTF8 support  (Read 38725 times)

Offline Suki

  • Customizer
  • SMF Super Hero
  • *
  • Posts: 15,097
  • Kaizoku Jotei
    • MissAllSunday on GitHub
    • SMF mods
[3.0] Full UTF8 support
« on: September 19, 2011, 11:18:43 AM »
Hi all, the devs are considering going with full UTF8 support instead of the current ANSI/UTF8.


We would like to hear all opinions about this. Please try to consider every angle that could influence it, such as server requirements, database sizes, hosting restrictions, etc.


Please share your thoughts on this :)
« Last Edit: January 19, 2012, 04:42:56 PM by Norv »
Look at them. They're just asking for it. Maybe the human race deserves to be wiped out.

Offline Kryzen

  • On Hiatus
  • SMF Hero
  • *
  • Posts: 4,046
  • Gender: Male
    • nedroden on GitHub
Re: Full UTF8 support
« Reply #1 on: September 19, 2011, 11:32:35 AM »
I always use the Latin set, but going full UTF-8 would be a good idea.
Amateur PHP & Java developer
DraiWiki | Project Alpha

Offline live627

  • SMF Friend
  • SMF Hero
  • *
  • Posts: 5,265
  • Gender: Male
  • Cat: Destroy!
    • live627 on Facebook
    • live627 on GitHub
    • live627 on LinkedIn
    • @live627 on Twitter
    • livemods
Re: Full UTF8 support
« Reply #2 on: September 20, 2011, 06:36:38 PM »
It's absolutely a no-brainer. Charset support problems would mostly vanish, large mods wouldn't have to ****** around with including ANSI/UTF languages, and translators would only need to translate into UTF-8 encoding.
Try not to become a man of success, but rather try to become a man of value.
- Albert Einstein

Offline 青山 素子

  • Server Team
  • SMF Super Hero
  • *
  • Posts: 17,022
  • 戦場ヶ原、蕩れ!
    • srvrguy on GitHub
    • @motokochan on Twitter
    • Nekomusume Moe
Re: Full UTF8 support
« Reply #3 on: September 30, 2011, 11:39:30 AM »
I have to agree with this. If it is technically possible, moving to full Unicode support would be beneficial. Most hosting providers have upgraded SMF's dependencies to versions that support this.
Motoko-chan
Director, Simple Machines

Just because it's pouring down doesn't mean we're gonna drown. There's a time when all you can say is let it rain - Mat Kearney (Let It Rain)

Note: Unless otherwise stated, my posts are not representative of any official position or opinion of Simple Machines.


Offline Xarcell

  • SMF Hero
  • ******
  • Posts: 1,684
  • Gender: Male
  • SMF-DP Supporter
Re: Full UTF8 support
« Reply #4 on: October 05, 2011, 01:09:04 PM »
+1

As it was said, a no brainer.

Offline 青山 素子

Re: Full UTF8 support
« Reply #5 on: October 06, 2011, 12:58:08 AM »
Quote from: Xarcell
+1

As it was said, a no brainer.

It certainly wasn't when SMF 2.0 was being designed. Back then, PHP4 was still widely used (still is in some areas...) and proper Unicode support was difficult to come by without a ton of effort. With the huge move to PHP5 and its cleaned-up support, it would be silly not to fully support Unicode in such widely used software.


Offline Dzonny

  • Lead Localizer
  • SMF Super Hero
  • *
  • Posts: 11,617
  • Gender: Male
  • No sleep...
    • dzontra.nikola on Facebook
    • Dzonny on GitHub
    • dzontranikola on LinkedIn
    • @opusteniforum on Twitter
    • Samo opusteno
Re: Full UTF8 support
« Reply #6 on: October 09, 2011, 07:35:49 PM »
+1
It would be easier if new members didn't have to deal with charset problems. Full UTF-8 support would fix many possible issues.

Offline Daniel Hofverberg

  • Senior Translator
  • Sr. Member
  • *
  • Posts: 981
  • Gender: Male
    • Dubbningshemsidan
Re: Full UTF8 support
« Reply #7 on: October 11, 2011, 03:54:45 AM »
Of course it's preferable with full UTF-8 support for those that do want it. However, I do not want to be forced to use UTF-8, as I prefer good old ISO-8859-1.
 

Offline Nightwish

  • Jr. Member
  • **
  • Posts: 101
  • Gender: Male
    • tabSRMM support
Re: Full UTF8 support
« Reply #8 on: October 11, 2011, 04:50:47 AM »
Quote from: Daniel Hofverberg
Of course it's preferable with full UTF-8 support for those that do want it. However, I do not want to be forced to use UTF-8, as I prefer good old ISO-8859-1.

Bad idea.

Seriously, it's 2011. The web should be UTF-8 only. Period. 7/8-bit character sets are a relic of ancient days and should die. They are among the most annoying things a developer has to deal with. Unicode makes things so much easier. Time to move on and forget old habits.
Every program has at least one bug and can be shortened by at least one instruction -- from which, by induction, one can deduce that every program can be reduced to a single instruction that doesn't work.
EoS - SMF-based forum under development.

Offline Fustrate

  • SMF Friend
  • SMF Hero
  • *
  • Posts: 6,474
  • Gender: Male
  • Controller of the rum budget
    • Fustrate on GitHub
    • @Fustrate on Twitter
    • Fustrate
Re: Full UTF8 support
« Reply #9 on: October 11, 2011, 05:03:57 AM »
Quote from: Daniel Hofverberg
Of course it's preferable with full UTF-8 support for those that do want it. However, I do not want to be forced to use UTF-8, as I prefer good old ISO-8859-1.
 
Do you have a reason for preferring it, or do you just not want to change? Honest question.
Steven Hoffman
Former Team Member, 2009-2012

Offline Daniel Hofverberg

Re: Full UTF8 support
« Reply #10 on: October 11, 2011, 05:10:21 AM »
As the rest of my web site uses ISO-8859-1, using UTF-8 for just the forum would cause problems with the integration. That would make SSI.php and other aspects a pain to deal with, unless I changed the character set on the entire site. I also don't see any specific need for UTF-8 on my site, as all the characters I need for the Swedish language are present in Latin1.

As my site consists of close to 1000 pages, moving the entire site over to UTF-8 with no real benefit doesn't sound too appealing...
 

Offline Fustrate

Re: Full UTF8 support
« Reply #11 on: October 11, 2011, 05:12:42 AM »
I believe there are PHP functions for converting between character sets, though I don't use them often enough to remember their names.

Found it, though: http://www.php.net/manual/en/function.iconv.php

You'd just use that on anything coming from the forum, to convert it from UTF-8 to ISO-8859-1.
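
For illustration only, here is a minimal sketch of that conversion step, written in Python rather than PHP. The helper name `utf8_to_latin1` is hypothetical, and turning characters that Latin-1 cannot represent into numeric HTML entities is an assumption made for this example, not something SMF or SSI.php is known to do.

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1 for embedding in a Latin-1 page.

    Hypothetical helper. Characters with no Latin-1 equivalent are emitted
    as numeric HTML entities (an assumption for this sketch) so that the
    result stays displayable in a browser.
    """
    text = data.decode("utf-8")
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# Swedish text survives unchanged, one byte per character.
print(utf8_to_latin1("Hallå världen".encode("utf-8")))
# b'Hall\xe5 v\xe4rlden'

# Curly quotes are outside Latin-1, so they become entities.
print(utf8_to_latin1("“smart quotes”".encode("utf-8")))
# b'&#8220;smart quotes&#8221;'
```

The point of the entity fallback is that nothing is silently dropped: anything the Latin-1 page cannot hold as a raw byte still renders correctly as HTML.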

Offline 青山 素子

Re: Full UTF8 support
« Reply #12 on: October 11, 2011, 06:15:22 PM »
Quote from: Daniel Hofverberg
Of course it's preferable with full UTF-8 support for those that do want it. However, I do not want to be forced to use UTF-8, as I prefer good old ISO-8859-1.

Could be worse, it could be windows-1252 (ugh).


Quote from: Daniel Hofverberg
I also don't see any specific need for UTF-8 on my site, as all the characters I need for the Swedish language are present in Latin1.

Then you shouldn't see a difference, actually. The only possible issues would be with characters outside Latin-1, like "smart quotes" and such. Heck, you could probably send your Latin1 pages as UTF8 without any changes, as long as the non-ASCII characters are written as HTML entities, since those work the same way in either encoding.
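
One caveat on relabelling pages is easy to check: a raw Latin-1 byte for a character such as "ä" is not valid UTF-8, whereas the same character written as an HTML entity is plain ASCII and passes either way. A small sketch, in Python for brevity (the helper name `is_valid_utf8` and the sample strings are made up for this example):

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same word as it might be stored in a Latin-1 page, two ways.
raw_page = "Välkommen".encode("iso-8859-1")          # raw byte 0xE4 for "ä"
entity_page = "V&auml;lkommen".encode("iso-8859-1")  # pure ASCII with an entity

print(is_valid_utf8(raw_page))     # False
print(is_valid_utf8(entity_page))  # True
```

So a page that uses entities throughout could be relabelled as-is, while a page with raw accented bytes would need converting first.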


Offline spiros

  • Language Moderator
  • SMF Hero
  • *
  • Posts: 1,604
  • Gender: Male
  • A different point of view
    • spiros.doikas on Facebook
    • doikas on LinkedIn
    • @greektranslator on Twitter
    • Greek Translation
Re: Full UTF8 support
« Reply #13 on: November 05, 2011, 08:37:13 AM »
I could not agree more. There are so many mods which do not support UTF-8 and one has to hack them in order to work in a UTF-8 forum. This should NOT be happening.

Online Kindred

  • The Mean One
  • Support Specialist
  • SMF Legend
  • *
  • Posts: 55,121
  • Gender: Male
    • Kindred-999 on GitHub
Re: Full UTF8 support
« Reply #14 on: November 05, 2011, 09:59:41 AM »
Well, that would be a mod problem, not an SMF issue...
Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

Offline spiros

Re: Full UTF8 support
« Reply #15 on: November 05, 2011, 10:20:49 AM »
Indeed, but if SMF does not enforce strict guidelines (i.e. mod compatibility with UTF-8), many mod developers are inclined to ignore it.

Offline Nightwish

Re: Full UTF8 support
« Reply #16 on: November 06, 2011, 02:02:20 PM »
Quote from: spiros
I could not agree more. There are so many mods which do not support UTF-8 and one has to hack them in order to work in a UTF-8 forum. This should NOT be happening.

Then trash these mods, period. No mod can be important enough to stand in the way of a modern design, and getting rid of ancient character set support *is* an important part of a modern design.

Seriously, it's 2011. People who still insist on ancient 8-bit character set support must have been sleeping under a rock for the past 10 years. Most modern web applications are UTF-8 only, because it is, by far, the easiest way to support multiple languages.

Offline spiros

Re: Full UTF8 support
« Reply #17 on: November 06, 2011, 03:06:14 PM »
I thought that PHP 6 (what happened to it, by the way?) was meant to be the version that fully supported UTF-8 in its core. If that had happened, there would not have been many excuses left.

Offline Fustrate

Re: Full UTF8 support
« Reply #18 on: November 06, 2011, 06:26:53 PM »
IIRC, they abandoned PHP 6 and moved most of its non-Unicode features into PHP 5.3 and 5.4.

Offline Angelina Belle

  • SMF Friend
  • SMF Hero
  • *
  • Posts: 7,586
Re: Full UTF8 support
« Reply #19 on: November 08, 2011, 08:04:56 PM »
I understand that converting code to work with UTF8 will take some work -- strlen() won't be dependable (depends on the language), etc. How difficult will it be for mod writers to convert their code?
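
To make the strlen() concern concrete: a byte-counting length function and a character-counting one agree on plain ASCII but diverge as soon as multi-byte characters appear. A rough sketch in Python (the function name and sample strings are invented for this example; in PHP the usual character-aware counterpart is mb_strlen()):

```python
def byte_and_char_length(text: str) -> tuple[int, int]:
    """Return (bytes when encoded as UTF-8, number of characters)."""
    return len(text.encode("utf-8")), len(text)


for sample in ("forum", "smörgåsbord", "戦場ヶ原"):
    n_bytes, n_chars = byte_and_char_length(sample)
    print(f"{sample}: {n_bytes} bytes, {n_chars} characters")
# forum: 5 bytes, 5 characters
# smörgåsbord: 13 bytes, 11 characters
# 戦場ヶ原: 12 bytes, 4 characters
```

Any mod code that truncates, pads, or validates text by byte length would therefore need a character-aware replacement once the forum is UTF-8 only.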

If someone is trying to integrate an ISO-8859-1 website with a UTF8 SMF forum, and was using the iconv functions to convert EVERYTHING, what kind of performance hit would that take?

What are the implications of switching to all UTF8 in MySQL tables? Any effect on performance?

I have never made the switch to UTF-8 myself, since I don't feel I really understand the implications.
Never attribute to malice that which is adequately explained by stupidity. -- Hanlon's Razor