Simple Machines Community Forum

SMF Development => Bug Reports => Fixed or Bogus Bugs => Topic started by: Spuds on November 30, 2010, 10:33:26 PM

Title: [4604] Buddy List
Post by: Spuds on November 30, 2010, 10:33:26 PM
I was adding some users to a buddy list and noticed the following ... RC4, this site ;)



Title: Re: Buddy List
Post by: Masterd on December 06, 2010, 04:37:56 AM
Are you using the clean SMF installation?
Title: Re: Buddy List
Post by: Kindred on December 06, 2010, 09:36:12 AM
Masterd, he said, THIS site...

Spuds,

Does it happen on your own or any other site?   
Some of this site's files are a different version from a standard RC4 installation...   So, unless you have noticed it on another RC4 site, this is probably not a real bug.
Title: Re: Buddy List
Post by: Masterd on December 06, 2010, 11:27:41 AM
Quote from: Kindred on December 06, 2010, 09:36:12 AM
Masterd, he said, THIS site...

Sorry, I didn't see that.

Did you try to reproduce the bug in other browsers?
Title: Re: Buddy List
Post by: Masterd on December 07, 2010, 07:06:55 AM
I tried this in Chrome 9.0.597.10 dev and yes, the "&" character was converted to its HTML entity. I was curious, so I tried to add a member "©enK" to my Buddy List, and the "©" character wasn't converted to an HTML entity. (©)
Title: Re: Buddy List
Post by: Nibogo on January 21, 2011, 06:10:20 PM
Confirmed, thanks for the report:

http://dev.simplemachines.org/mantis/view.php?id=4604
Title: Re: [4604] Buddy List
Post by: Masterd on January 22, 2011, 03:11:13 AM
Quote:
Suggested List can't handle items with special characters

Well, it's working with the copyright character.
Title: Re: [4604] Buddy List
Post by: JBlaze on January 22, 2011, 03:51:55 AM
Quote from: Masterd on January 22, 2011, 03:11:13 AM
Well, it's working with the copyright character.
The copyright character is not a special entity that needs escaping. Entities such as &, +, ;, = and # need to be escaped since they are also used in things such as URLs and code.
Title: Re: [4604] Buddy List
Post by: Arantor on January 22, 2011, 04:10:56 AM
...never heard of +, ;, = or # being escaped. The list is <, >, & and under some cases " and '.
Title: Re: [4604] Buddy List
Post by: JBlaze on January 22, 2011, 06:41:38 AM
Quote from: Arantor on January 22, 2011, 04:10:56 AM
...never heard of +, ;, = or # being escaped. The list is <, >, & and under some cases " and '.
Was just throwing out the ones off the top of my head, didn't know if they were right or not :P
Title: Re: [4604] Buddy List
Post by: Masterd on January 22, 2011, 01:11:34 PM
Quote from: JBlaze on January 22, 2011, 03:51:55 AM
The copyright character is not a special entity that needs escaping. Entities such as &, +, ;, = and # need to be escaped since they are also used in things such as URLs and code.

The copyright character is an HTML entity.
Title: Re: [4604] Buddy List
Post by: Arantor on January 22, 2011, 01:13:22 PM
Not in UTF-8 it isn't. But it's not relevant here.

Let me clarify. You can express literally any character in a numeric entity, &#xxxx; format. You can also express many common characters in named entities, e.g. lt, gt, and copy.
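
As a small illustration of the two entity forms, here is a sketch in Python. It uses Python's standard `html` module purely as a convenient decoder; nothing here is taken from SMF's code, and the helper name `numeric_entity` is hypothetical.

```python
import html

def numeric_entity(char: str) -> str:
    """Build the decimal numeric entity for a single character (hypothetical helper)."""
    return f"&#{ord(char)};"

# Any character can be written as a numeric entity...
copyright_numeric = numeric_entity("©")   # "&#169;"
c_caron_numeric = numeric_entity("Č")     # "&#268;"

# ...and some common ones also have a named form.
for entity in ("&lt;", "&gt;", "&copy;", copyright_numeric, c_caron_numeric):
    print(entity, "decodes to", html.unescape(entity))
```

The point is only that an entity is one possible spelling of a character; a literal "©" or "Č" in a UTF-8 page is equally valid and needs no entity at all.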

The issue is an extra call made to htmlspecialchars. This covers specifically the entities mentioned above: < becoming lt, > becoming gt, & becoming amp, and depending how it's called, sometimes " becoming quot and ' becoming apos (or #39) - NOTHING else is affected.
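
To make the "nothing else is affected" point concrete, here is a rough sketch using Python's `html.escape` as a stand-in for PHP's htmlspecialchars. That substitution is an assumption made for illustration: the two functions cover the same five characters, but they spell the single-quote entity differently (`&#x27;` in Python, `&#039;` in PHP). The wrapper name and the "Tom & Jerry" sample are hypothetical; "©enK" and "Čoma" are the names tried earlier in this topic.

```python
import html

def escape_like_htmlspecialchars(text: str, quotes: bool = True) -> str:
    """Escape &, < and >, plus the two quote characters when quotes=True."""
    return html.escape(text, quote=quotes)

samples = ["Tom & Jerry", "<b>", 'say "hi"', "©enK", "Čoma"]
for name in samples:
    escaped = escape_like_htmlspecialchars(name)
    status = "changed" if escaped != name else "untouched"
    print(f"{name!r} -> {escaped!r} ({status})")
```

Only the first three samples change. "©enK" and "Čoma" pass through untouched, which matches what was observed when those names were added to a buddy list.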
Title: Re: [4604] Buddy List
Post by: Masterd on January 22, 2011, 01:25:11 PM
I tried to add a user "Čoma" on some other SMF 2.0 RC4 forum and the letter "Č" wasn't converted to the UTF-8 entity.
Title: Re: [4604] Buddy List
Post by: Arantor on January 22, 2011, 01:28:23 PM
-sigh-

Why would it be converted to an entity? It's a perfectly legal UTF-8 character. It's also not on the list of characters affected by htmlspecialchars.
Title: Re: [4604] Buddy List
Post by: Masterd on January 22, 2011, 01:35:41 PM
As far as I can see, it appears only with the ISO10646 characters.
Title: Re: [4604] Buddy List
Post by: Arantor on January 22, 2011, 02:01:51 PM
It's nothing to do with those characters at all. How many times do I have to explain this?

The ONLY characters affected by this bug are the ones affected by htmlspecialchars, which is limited to <, >, & and sometimes ' and ".

http://php.net/htmlspecialchars

Quote:
The translations performed are:

'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
''' (single quote) becomes '&#039;' only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'

This is run on the members table when new records are made, and it's being done too many times in the auto suggest process.
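
The effect of escaping one time too many can be sketched as follows. This is not SMF's actual code: the function names are hypothetical, and Python's `html.escape` again stands in for PHP's htmlspecialchars as an assumption for illustration.

```python
import html

def store_member_name(raw_name: str) -> str:
    """Escape once when the member record is created (hypothetical)."""
    return html.escape(raw_name, quote=True)

def suggest_with_extra_escape(stored_name: str) -> str:
    """Buggy path: the already-escaped name is escaped a second time."""
    return html.escape(stored_name, quote=True)

def suggest_without_extra_escape(stored_name: str) -> str:
    """Intended path: the stored name is already safe to output as-is."""
    return stored_name

stored = store_member_name("Tom & Jerry")       # "Tom &amp; Jerry"
buggy = suggest_with_extra_escape(stored)       # "Tom &amp;amp; Jerry"
fixed = suggest_without_extra_escape(stored)    # "Tom &amp; Jerry"

# html.unescape decodes one level of entities, roughly what a browser displays.
print("buggy displays as:", html.unescape(buggy))
print("fixed displays as:", html.unescape(fixed))
```

With the extra call, the reader sees a literal "&amp;" in the suggested name instead of "&". A name such as "©enK" contains none of the affected characters, so it looks the same no matter how many times it is escaped.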