[Maybe solved] SMF 1.01 in UTF-8 encoding, and searching problem

Started by mactable, February 01, 2005, 06:06:06 AM

mactable

Hello,

I have posted a test message at:
http://www.simplemachines.org/community/index.php?topic=26610.0
It contains text in both Traditional and Simplified Chinese.

I am sorry, I am not sure whether you can read that post. Your SMF here is encoded in ISO Western European only, but I can still read what I typed, and I hope you can too.

My long story: I would like to use both Traditional and Simplified Chinese on my SMF, so I chose SMF encoded in UTF-8 with a MySQL database whose default encoding is also UTF-8. SMF in UTF-8 works well for entering and displaying Traditional and Simplified Chinese at the same time, with one problem: I cannot search for some Chinese characters. For example, when I search for "功", SMF returns nothing.

I think the Chinese character "功" is a special case, because there is no problem with other Chinese characters. I am not sure how many special cases like "功" fail to return the correct result.

But I discovered a funny case: if SMF is encoded normally (like your SMF here, in ISO Western European), then whether the MySQL database is encoded in UTF-8 or iso-8859, SMF can input, display and search all Chinese correctly. That would solve my problem, but I am a little confused. I am using Asian languages on my site, so how can SMF handle them correctly even when SMF itself is encoded in ISO Western European? Does it mean SMF handles Asian languages (or UTF-8 formatted text) independently of the MySQL database encoding?

I am sorry, I am very new to SMF, so please advise. Thanks in advance.

P.S. my platform is:
PHP 4.3.10
apache 2
mysql 4.1.8 (database tried both UTF-8 and iso-8859 encoding)
win2003


CapriSkye

Just a note: his problem doesn't happen on my forum, which uses UTF-8 encoding.
The server is running MySQL 4.0.22 with iso-8859 encoding, I believe.

mactable

I dug into "search.php".

I added:
echo $word;

before:
if (empty($modSettings['search_match_complete_words']))

I discovered that when I search for "功", search.php changes my search term into "嚿", which is a totally different character.

Therefore, I tried changing this line from:
$searchArray[$index] = addslashes(strtolower(trim($value)));

to:
$searchArray[$index] = addslashes(trim($value));

After the change, searching for "功" returns the right result.

I am not a PHP expert, so I am not sure whether this modification has any side effects. Could a developer advise?

thx a lot!

P.S. To repeat: most Chinese characters can be searched correctly. Only some characters, such as "功" and "頌", are special cases.
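
The effect described in this post can be reproduced in isolation. The sketch below is not SMF code. simulated_cp1252_lower() is a hypothetical helper that imitates a byte-wise lowercase under a Windows-1252 locale; the actual locale on the poster's server is unknown, so the corrupted bytes here may differ from the "嚿" he saw. ascii_lower() is a hypothetical replacement that only touches A-Z. The type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: emulates what a byte-wise lowercase does under a
// single-byte Windows-1252 locale. Illustration only, not SMF code.
function simulated_cp1252_lower(string $bytes): string
{
    // Windows-1252 upper/lower pairs that live outside the 0xC0-0xDE block.
    $pairs = [0x8A => 0x9A, 0x8C => 0x9C, 0x8E => 0x9E, 0x9F => 0xFF];
    $out = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $b = ord($bytes[$i]);
        if ($b >= 0x41 && $b <= 0x5A) {
            $b += 0x20;                 // ASCII A-Z
        } elseif ($b >= 0xC0 && $b <= 0xDE && $b !== 0xD7) {
            $b += 0x20;                 // accented capitals (0xD7 is the multiplication sign)
        } elseif (isset($pairs[$b])) {
            $b = $pairs[$b];
        }
        $out .= chr($b);
    }
    return $out;
}

// Hypothetical replacement: lowercases ASCII letters only, so the bytes
// 0x80-0xFF that make up UTF-8 multibyte characters pass through untouched.
function ascii_lower(string $s): string
{
    return strtr($s, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

$gong = "\xE5\x8A\x9F"; // UTF-8 bytes of the character 功

echo bin2hex($gong), "\n";                          // e58a9f
echo bin2hex(simulated_cp1252_lower($gong)), "\n";  // e59aff (no longer 功)
echo bin2hex(ascii_lower("SMF " . $gong)), "\n";    // 736d6620e58a9f
```

Every byte of a UTF-8 multibyte character is 0x80 or above, so an ASCII-only lowercase cannot alter it, which is consistent with the search working once strtolower() was removed. Where the mbstring extension is installed, mb_strtolower($value, 'UTF-8') is the usual character-aware alternative.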

[Unknown]


CapriSkye

I thought mbstring might solve this, but I couldn't get it to work on my test forum. Maybe I had it set up wrong? ::)

[Unknown]

Well, you might also try:

setlocale(LC_CTYPE, 'locale for Chinese');

That should set the locale stuff up for strtolower/strtoupper.

-[Unknown]
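
A minimal sketch of how such a call could be wrapped so the locale change stays local to the lowercase step. lower_with_locale() is a hypothetical helper, not SMF code. Locale names are platform-specific, so the example uses the always-available 'C' locale as a stand-in rather than guessing a Chinese locale name. Under 'C', strtolower() only changes ASCII letters.

```php
<?php
// Hypothetical wrapper, not part of SMF: lowercases $s while LC_CTYPE is
// temporarily set to $locale, then restores the previous setting.
function lower_with_locale(string $s, string $locale): string
{
    // Passing '0' only queries the current value without changing it.
    $previous = setlocale(LC_CTYPE, '0');

    if (setlocale(LC_CTYPE, $locale) === false) {
        // The platform does not recognize this locale name; leave the input alone.
        return $s;
    }

    $lowered = strtolower($s);
    setlocale(LC_CTYPE, $previous); // restore whatever was active before
    return $lowered;
}

// "\xE5\x8A\x9F" is the UTF-8 encoding of 功; the 'C' locale leaves it intact.
echo bin2hex(lower_with_locale("SMF \xE5\x8A\x9F", 'C')), "\n"; // 736d6620e58a9f
```

setlocale() returns false when the platform does not recognize the requested name, so it is worth checking the return value before relying on the new locale.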
