[4368] Word censor should disable option to be limited to whole words in UTF-8
Arantor:
The word censor provides the option of restricting censor matches to whole words, or simply matching the word wherever it appears in the text.
When matching whole words, this relies on using the \w marker (or is it \W, I forget, either way it's a PCRE control character), which is fine -- until you're in UTF-8 mode.
This is covered in http://www.simplemachines.org/community/index.php?topic=363219.msg2612779#msg2612779; specifically, I'll requote the paragraph from the PHP manual on the subject:
--- Quote ---Matching characters by Unicode property is not fast, because PCRE has to search a structure that contains data for over fifteen thousand characters. That is why the traditional escape sequences such as \d and \w do not use Unicode properties in PCRE.
--- End quote ---
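Since \w in PCRE does not use Unicode properties, it only recognises ASCII word characters even in UTF-8 mode, so whole-word matching breaks on any word that starts or ends with a non-ASCII letter. The option is therefore unreliable there and should be removed when the forum is in UTF-8 mode.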
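
To make the failure concrete, here is a minimal sketch of the effect. It is written in Python rather than PHP, and Python's re module is not PCRE; the re.ASCII flag is used only as a stand-in for the ASCII-only \w behaviour described in the quote above. The censored word, the sample text and the whole_word_pattern helper are all hypothetical, and the \b-based pattern is an assumption about how a whole-word check might be built, not SMF's actual code.

```python
import re

# Hypothetical censored word and sample text, chosen only for illustration.
WORD = "über"
TEXT = "das ist über alles"


def whole_word_pattern(word, ascii_only):
    """Build a whole-word pattern using \\b on both sides of the word.

    ascii_only=True makes \\w (and therefore \\b) treat only [A-Za-z0-9_]
    as word characters, which approximates the behaviour quoted above.
    """
    flags = re.IGNORECASE | (re.ASCII if ascii_only else 0)
    return re.compile(r"\b" + re.escape(word) + r"\b", flags)


# With ASCII-only \w, "ü" is not a word character, so there is no word
# boundary between the preceding space and "ü" and the match fails.
ascii_hit = whole_word_pattern(WORD, ascii_only=True).search(TEXT)

# With Unicode-aware \w, "ü" counts as a word character and the match works.
unicode_hit = whole_word_pattern(WORD, ascii_only=False).search(TEXT)

print("ASCII-only \\w matched:", ascii_hit is not None)
print("Unicode-aware \\w matched:", unicode_hit is not None)
```

In this sketch the ASCII-only pattern finds nothing while the Unicode-aware one finds the word, which is the same kind of miss the whole-word censor option would suffer on words that begin or end with a non-ASCII letter.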
N. N.:
Tracked.