Excessive smileys causes SMF 2.0.x (and likely 2.1) to fail parse_bbc

Started by Arantor, October 20, 2014, 07:10:32 PM

Previous topic - Next topic

Arantor

Smiley parsing is handled in parsesmileys by way of an enormous regular expression that matches and handles certain types of replacements.

In the old setup, prior to PHP 5.5, this was handled by an /e expression and this was not problematic in PCRE even when throwing obscene numbers of smileys at it - say, 2000 unique smileys.

With the revised 2.0 and 2.1 code, that many smileys will produce a very large regexp which exceeds the limits set internally in the PCRE engine. Before, it got partially parsed before going to PCRE, but now it creates an atomic regexp, which when compiled appears to exceed the 64KB stack size.

In the test case we were examining, there were 2026 smileys creating a 44KB regular expression, which parses out way beyond 64KB.

Unfortunately there's no hard and fast rule about how many smileys are safe; more smileys with shorter filenames may work under some conditions. If it fails to work and there's a crazy number of smileys, try bringing the number down a bit.

Symptomatically: this manifests as a parse_bbc that fails to work, with all the apparent symptoms of a mismatch of encodings. Posts that have no smileys but do have code blocks will render as expected, and this can be used as a suitable diagnostic test to differentiate between encoding failure and parsing failure.

This should also throw an error on the following line in 2.0.x Subs.php:
$message = preg_replace_callback($smileyPregSearch, 'smileyPregReplaceCallback', $message);

I suspect the same behaviour would apply in 2.1 because even though there's now a use of closures, the failure still applies at the callback stage before closures get involved.

I do not think this is fixable in a permanent way without a complete rewrite of the parsing engine and this is completely out of any scope prior to 3.0 IMNSHO.

margarett

Who has 2000+ smileys in a post? If you do that, you SHOULD get an error :P ;D

Eheheheh D
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Arantor

No, it's not in one post.

2026 smileys configured in the admin panel causes *everything* to break because they end up being compiled into a single massive preg_replace_callback call that amounts to smiley1|smiley2|smiley3|smiley4|smiley5 etc. Even if posts don't *have* any smileys they break because they're pushed through the same parse step.

The only exception is for code blocks, where the content inside a code block is not pushed through that step.

Justyne and I were seriously confused the more we looked at this, because we had more and more symptoms that didn't add up and it looked to start with like a content mismatch but after poking and prodding the DB and various other things, it appeared the DB was fine being UTF-8.

$db_show_debug was our friend however since that also turns off certain error suppressions.

margarett

I'll stick with my thought. If you add 2026 smileys in admin, you deserve to have your forum shut down ;D ;D
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Arantor

Fun thing is, this works fine in 1.1.x because the preg_replace still uses /e. It's only on 2.0.7+ that this breaks.

Kindred

Ugh.....  ok. I say the fix is to limit the max number of smilies to 1000 or even 500...
I think we can do that in the smilies admin...
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

Arantor

In the case we found, 1500 seemed acceptable without breakage.

Yes, you can do it in the smileys admin, but be careful about breaking user expectations.

mashby

Why not go to 11, as in 2025? :) Nigel Tufnel approves of this message.
Always be a little kinder than necessary.
- James M. Barrie

Arantor

Because 2025 breaks as spectacularly as 2026 does ;)

1500 is about the highest we could get away with while still working :(

margarett

And that's 1490 more than absolutely necessary :P

Yet I agree with this
Quote from: Arantor on October 20, 2014, 09:04:54 PM
but be careful about breaking user expectations.
Still, a fixed limit of... 1000?! Would be more than enough for 99.9% of all users. If the limit is easy to increase, better ;)
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Arantor


Gwenwyfar

Quote from: margarett on October 20, 2014, 09:33:07 PM
And that's 1490 more than absolutely necessary :P

Yet I agree with this
Quote from: Arantor on October 20, 2014, 09:04:54 PM
but be careful about breaking user expectations.
Still, a fixed limit of... 1000?! Would be more than enough for 99.9% of all users. If the limit is easy to increase, better ;)
I agree... How would you even name 1000 emotes differently? /emote1, /emote2, etc? None of the members would even bother trying to know or look more than 200 or something  ;D

If its simple to modify, just leave it with a warning saying it may break your forum if you add more.
"It is impossible to communicate with one that does not wish to communicate"

Illori

if we are going to support 1000 or more smileys, we need to consider the layout in 2.1. we are no longer using a popup and IMO what we have currently in 2.1 will not work if we have that many smileys.

https://github.com/SimpleMachines/SMF2.1/issues/2169#issuecomment-51769114

Kindred

umm.... are you sure?   that github does not say that the popup was removed.....
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

Illori

yes i am sure. the popup that did exist in 2.1 before on github is gone and we have this new version that will cover the whole post area if you have too many smileys.

Kindred

that's a problem then....

it's bad UI.

I admit that we don't have to cover edge cases like 500+ smilies...  but we should enable people to have 20-40 smilies without a bad display (and, per your comment here, that is not possible, even though the github thread that you linked to does not mention REMOVING the pop-up)
Removing the pop-up was a poor UI choice and should not have been done...
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

Illori

it was removed prior to that issue being opened/commented on. i had opened a topic about this in the old beta board with id of 527436 if you want to read up further. according to that issue it seems like no devs want to look into reverting back to what we had before...

Justyne

I get the removal from a simplification viewpoint and it is definitely good for forum members as leafing through pages of more is just tedious. I guess admin desire collides with usability here.

If we revive the popup or really want to make the interface better I wonder if categorising smileys is something that can be done. If I really have 200 I might want to split them up like the apple emoji interface for example. All "love" related ones in one place, all the "angry" ones in a another.
Ever tried. Ever failed. No matter. Try Again. Fail again. Fail better.

Illori

i think at this point that is beyond what can be done in 2.1. maybe something to consider for 3.0.

Justyne

Oh, definitely.

It just occurred to me while reading this that navigation tabs might be useful in communities with such a large amount of smileys.
Ever tried. Ever failed. No matter. Try Again. Fail again. Fail better.

Advertisement: