Simple Machines Community Forum

SMF Development => Feature Requests => Topic started by: shawnb61 on December 09, 2018, 06:05:29 PM

Title: Enhance name duplication detection to account for homoglyphs
Post by: shawnb61 on December 09, 2018, 06:05:29 PM
SMF's existing name duplication checks can be thwarted using homoglyphs.

For example, these two names are distinct & would both be allowed in SMF: "Mіau!" and "Miau!".

More discussion here:
https://www.simplemachines.org/community/index.php?topic=563837.0

Definition & examples of homoglyphs here:
https://en.wikipedia.org/wiki/Homoglyph
Title: Re: Enhance name duplication detection to account for homoglyphs
Post by: Arantor on December 09, 2018, 06:21:14 PM
The problem is how large the list is. How far down that list do you go?
Title: Re: Enhance name duplication detection to account for homoglyphs
Post by: shawnb61 on December 09, 2018, 06:44:27 PM
Yep. Still, it's worth having that discussion & considering the enhancement.

I think with more sites using utf8 (esp. 2.1) this warrants consideration.
Title: Re: Enhance name duplication detection to account for homoglyphs
Post by: Kindred on December 09, 2018, 09:08:47 PM
Personally, I don't think this is a large enough issue to waste development time on. Seriously, I have seen this sort of issue reported exactly twice in over a decade.
Title: Re: Enhance name duplication detection to account for homoglyphs
Post by: Sesquipedalian on December 18, 2018, 12:31:42 PM
It might be possible to create a mod to do this relatively reliably and without killing the server, but it won't become a standard feature of SMF any time soon.

Such a mod could use an approach similar to the one I used in 2.1's set_tld_regex() function, by periodically downloading and processing this official file (https://www.unicode.org/Public/security/latest/confusablesSummary.txt) to build an array of substitutions that normalize confusable characters and strings. This array would need to be stored somewhere in the database, and one would want to add a column to the members table that records each member's normalized username for use in comparisons.
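
To make the idea concrete, here is a rough sketch in Python rather than SMF's PHP, so treat it as an illustration only and not as mod code. It assumes the line layout used by the sibling file confusables.txt ("source ; target ; type # comment", with hex code points), which is easier to parse than the summary file. The sample lines are hand-written for the demo, and parse_confusables() and skeleton() are hypothetical helper names. The test name assumes the look-alike letter is Cyrillic U+0456.

```python
import unicodedata

# Hand-written sample lines in the confusables.txt layout (an assumption for
# this sketch; a real mod would download and cache the official file).
SAMPLE_CONFUSABLES = """\
# source ; target ; type # comment
0456 ;\t0069 ;\tMA\t# CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I -> LATIN SMALL LETTER I
0430 ;\t0061 ;\tMA\t# CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
"""


def parse_confusables(text):
    """Build a {confusable: replacement} table from confusables-style lines."""
    table = {}
    for line in text.splitlines():
        # Drop trailing comments and skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2:
            continue
        # Each field is one or more space-separated hex code points.
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        table[source] = target
    return table


def skeleton(name, table):
    """Normalize a name: decompose, swap confusables, decompose again."""
    decomposed = unicodedata.normalize("NFD", name)
    replaced = "".join(table.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", replaced)


TABLE = parse_confusables(SAMPLE_CONFUSABLES)

# Stand-in for the proposed "normalized username" column on the members table.
existing = {skeleton("Miau!", TABLE)}

candidate = "M\u0456au!"  # second letter is Cyrillic U+0456, not Latin "i"
is_duplicate = skeleton(candidate, TABLE) in existing
print(is_duplicate)  # True
```

Storing the skeleton next to each member would turn the registration check into a single indexed lookup, so the substitution table is only consulted when a name is created or changed. This sketch does no case folding, which a real duplicate check would presumably want as well.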