"shorten_subject" and UTF-8 problem in SMF 1.1 RC3

Started by badbuta, August 23, 2006, 06:51:33 AM

badbuta

I just upgraded my site to RC3 with Bridge 1.1.6.
The shorten-subject function does not work correctly with UTF-8 (Chinese) characters.
It shows a strange character at the end: "�..."
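
A likely cause of the trailing "�" is a byte-based cut that lands in the middle of a multi-byte UTF-8 character. The sketch below is not SMF's code; the sample subject is made up, and it assumes the mbstring extension is loaded and the file is saved as UTF-8. It only shows the difference between cutting by bytes and cutting by characters:

```php
<?php
// Hypothetical subject line, not taken from the site in question.
// Each of these Chinese characters takes 3 bytes in UTF-8.
$subject = '漫畫討論區';

// Byte-based cut: 4 bytes = one whole character plus one stray lead byte.
$byteCut = substr($subject, 0, 4);

// Character-based cut: 2 whole characters (6 bytes).
$charCut = mb_substr($subject, 0, 2, 'UTF-8');

// The byte-based cut is no longer valid UTF-8, so browsers render "�".
var_dump(mb_check_encoding($byteCut, 'UTF-8'));
var_dump(mb_check_encoding($charCut, 'UTF-8'));

echo $charCut . '...', "\n";
```

The first check reports invalid UTF-8 and the second reports valid UTF-8, which matches the symptom described above.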

You can see the problem on my site:
hxxp:www.coaxialcomic.org/smf/ [nonactive]

I also tried modifying "src/load.php @#192" based on another fix for RC2, but it failed:
From:
return strlen(preg_replace.....

To:
return mb_strlen(preg_replace.....

Any suggestions?

Thank you very much.
-David

Compuart

Hendrik Jan Visser
Former Lead Developer & Co-founder www.simplemachines.org
Personal Signature:
Realitynet.nl -> ExpeditieRobinson.net / PekingExpress.org / WieIsDeMol.Com

badbuta

#2
Sorry, I don't know how to find the exact charset you mentioned.
I am using Lunarpages hosting, which provides phpMyAdmin 2.8.0.2.  The SQL server is MySQL 4.0.25-standard.  The only character set information I can find is what phpMyAdmin shows:


Server variables and settings:

character set: latin1

character sets: latin1 big5 czech euc_kr gb2312 gbk latin1_de sjis tis620 ujis dec8 dos german1 hp8 koi8_ru latin2 swe7 usa7 cp1251 danish hebrew win1251 estonia hungarian koi8_ukr win1251ukr greek win1250 croat cp1257 latin5


The problem appeared after I upgraded from RC2 to RC3.  I remember making some changes to the code and the database, but I have forgotten the details....  :-[

So, could you tell me how to get the information you need? E.g. how to read the character set information from phpMyAdmin, etc.

FYI:
I am using Joomla 1.0.10.  Joomla and SMF are sharing the same database.

badbuta

#3
I finally tried again and patched the source file Load.php.
The key: I had missed setting mb_internal_encoding to UTF-8.
It is working now (at least~~), YEAH!!!  :D

Add the mb_internal_encoding line after line #24, like this:

if (!defined('SMF'))
	die('Hacking attempt...');
mb_internal_encoding("UTF-8");


replace line ~#192 from:

return strlen(preg_replace(\'~' . $ent_list . ($utf8 ? '|.~u' : '~') . '\', \'_\', ' . implode('$string', $ent_check) . '));'),

to:
return mb_strlen(preg_replace(\'~' . $ent_list . ($utf8 ? '|.~u' : '~') . '\', \'_\', ' . implode('$string', $ent_check) . '));'),
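
Why the internal encoding matters: when mb_strlen is called without an encoding argument, it uses the mbstring internal encoding, so a single-byte setting makes it count bytes just like strlen. This is only a minimal sketch with a made-up sample string, assuming the mbstring extension is loaded and the file is saved as UTF-8. It is not part of Load.php:

```php
<?php
// Hypothetical sample subject; any multi-byte UTF-8 text behaves the same way.
$subject = '漫畫討論區';

echo strlen($subject), "\n";                  // counts bytes
echo mb_strlen($subject, 'ISO-8859-1'), "\n"; // single-byte encoding: one "character" per byte
echo mb_strlen($subject, 'UTF-8'), "\n";      // counts UTF-8 characters

// With the internal encoding set, the encoding argument can be omitted.
mb_internal_encoding('UTF-8');
$length = mb_strlen($subject);
echo $length, "\n";
```

This prints 15, 15, 5 and 5. Passing 'UTF-8' explicitly to each mb_* call is an alternative to the global setting if you prefer not to change the encoding for the whole script.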

I understand this is not the best (or a complete) solution.  Still, I hope a new release will handle UTF-8 better ...
