SMF2.1 Typo/Bug in Subs-Charset.php, misspelled array element name, fix included

Started by NightWolve, May 14, 2022, 12:43:00 PM


NightWolve

Version: SMF 2.1.0+
File: .\Sources\Subs-Charset.php
Find:  $classes['viramas']   // or Line: 680
Fix/Replace: $classes['Virama']


Info/Details:

While testing the new SMF 2.1.2, I saw this in the error logs:
2: Undefined array key "viramas"
./Sources/Subs-Charset.php
Line: 680


So I dug through the source code to find out where this $classes array is populated and why "viramas" is not set at that point in execution.

I was eventually led to .\Sources\Unicode\RegularExpressions.php, where the element names are defined like this:

'Saurashtra' => array(
'All' => '\x{A880}-\x{A8C5}\x{A8CE}-\x{A8D9}',
'Letter' => '\x{A882}-\x{A8B3}',
'Nonspacing_Combining_Mark' => '\x{A8C4}',
'Nonspacing_Mark' => '\x{A8C4}-\x{A8C5}',
'Virama' => '\x{A8C4}',
'Vowel_Dependent' => '\x{A8B5}-\x{A8C3}',
),

Looking at the code in Subs-Charset.php where the bug occurred, I noticed that the other element names match (e.g. Vowel_Dependent, Nonspacing_Mark, etc.), but "viramas" versus "Virama" is the mismatch:

$nonspacing_marks = '[' . $classes['Nonspacing_Mark'] . ']*';
$nonspacing_combining_marks = '[' . $classes['Nonspacing_Combining_Mark'] . ']*';
$zwj_pattern = '\x{200D}(?!' . (!empty($classes['Vowel_Dependent']) ? '[' . $classes['Vowel_Dependent'] . ']|' : '') . '[^' . $classes['All'] . '])';
...
$pattern = $letter . $nonspacing_marks . '[' . $classes['viramas'] . ']' . $nonspacing_combining_marks . '\K' . (!empty($zwj_pattern) ? '(?:' . $zwj_pattern . '|' . $zwnj_pattern . ')' : $zwnj_pattern);
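
To make the mismatch concrete, here is a minimal standalone sketch that reuses the Saurashtra entry quoted above. The helper missing_class_keys() is hypothetical and not part of SMF, and the ?? fallback is only there so the sketch runs without warnings; the real code does a direct lookup, which is what produces the "Undefined array key" entry in the log.

```php
<?php
// Saurashtra entry copied from the post; the other scripts are omitted.
$classes = array(
	'All' => '\x{A880}-\x{A8C5}\x{A8CE}-\x{A8D9}',
	'Letter' => '\x{A882}-\x{A8B3}',
	'Nonspacing_Combining_Mark' => '\x{A8C4}',
	'Nonspacing_Mark' => '\x{A8C4}-\x{A8C5}',
	'Virama' => '\x{A8C4}',
	'Vowel_Dependent' => '\x{A8B5}-\x{A8C3}',
);

// Hypothetical helper (not an SMF function): returns the keys the calling
// code expects but the data array does not define.
function missing_class_keys(array $classes, array $expected): array
{
	return array_values(array_diff($expected, array_keys($classes)));
}

// Key names as used by the code before and after the proposed fix.
$missing_buggy = missing_class_keys($classes, array('All', 'Nonspacing_Mark', 'viramas'));
$missing_fixed = missing_class_keys($classes, array('All', 'Nonspacing_Mark', 'Virama'));

// With the misspelled key the character class ends up empty.
$buggy_class = '[' . ($classes['viramas'] ?? '') . ']';
$fixed_class = '[' . $classes['Virama'] . ']';

var_dump($missing_buggy, $missing_fixed, $buggy_class, $fixed_class);
```

A check like this only illustrates the report; the actual fix is still the one-word key rename given at the top of the post.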


I think this bug went unnoticed because it's only triggered by some rare special characters, such as certain smileys/emoji. I had a member try to use an emoji at the end of his post, which now shows up as garbled text. I assume that's because the conversion failed on the misspelled "viramas" key.

Anyway, figured I'd quickly stop by and report it to the team.

Arantor

Tracked on GitHub #7464.

Thanks for the report, it looks pretty good :)

