Simple Machines Community Forum

SMF Development => Bug Reports => Fixed or Bogus Bugs => Topic started by: Kolya on April 28, 2017, 06:53:45 PM

Title: Can't post unicode emoji
Post by: Kolya on April 28, 2017, 06:53:45 PM
When trying to post Unicode emoji (as can be copied from here, for example: http://getemoji.com/), the following error is thrown:
Quote
The following error or errors occurred while posting this message:
The message body was left empty.

1. That error is incorrect
2. SMF supports Unicode, or does it?

It breaks in Post.php on validating the input:
if (htmltrim__recursive(htmlspecialchars__recursive($_REQUEST['message'])) == '')
    $context['post_error']['no_message'] = true;


When this check is disabled, posting Unicode emoji becomes possible.
However, the post cannot be previewed, which is a separate bug.
Title: Re: Can't post unicode emoji
Post by: Arantor on April 28, 2017, 07:00:27 PM
Version of SMF? Using UTF-8?

Emoji are tricky, but they should be supported even if your database probably doesn't support them (MySQL's idea of UTF-8 usually doesn't cope with emoji, which is what the utf8mb4 mode is for, though SMF can support them without that).
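
To illustrate the utf8 versus utf8mb4 point: MySQL's legacy utf8 charset stores at most three bytes per character, while emoji need four bytes in UTF-8. Below is a minimal JavaScript sketch that counts the bytes; the helper name utf8ByteLength is made up for this illustration and is not part of SMF.

```javascript
// Hypothetical helper (not part of SMF): number of bytes a string
// occupies once it is encoded as UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

console.log(utf8ByteLength("a"));         // 1 byte  (ASCII letter)
console.log(utf8ByteLength("\u00e9"));    // 2 bytes (e with acute accent)
console.log(utf8ByteLength("\u20ac"));    // 3 bytes (euro sign, still fits legacy utf8)
console.log(utf8ByteLength("\u{1F61B}")); // 4 bytes (emoji, needs utf8mb4 or an entity)
```

Anything that reports four bytes here is a character a plain utf8 column cannot store as-is.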
Title: Re: Can't post unicode emoji
Post by: Kolya on April 28, 2017, 07:08:05 PM
Version: SMF 2.0.13
I've run the function to "Convert HTML-entities to UTF-8 characters" in the past, if that is what you mean.

//Reading further I found that I have this language installed: English   ISO-8859-1   385   en_US
//Installing "SMF 2.0.13 english british-utf8" now
Title: Re: Can't post unicode emoji
Post by: Gluz on April 28, 2017, 08:58:52 PM
That was patched in the 2.0.10 update, as you can see in the changelog:
SMF 2.0.10   April 22 2015
===============================================================================

March 2015
-------------------------------------------------------------------------------
! Forum Maintenance - Topics fails if header is collapsed
! Fix for unsupported UTF8mb4 characters


And as you can see in this post, just the preview is broken:
https://www.simplemachines.org/community/index.php?topic=537329.msg3818330#msg3818330

I remember another thread about something similar, where the team said that they had fixed the posting part, so the Unicode characters save correctly in the database, but somehow they forgot to look at the previews, and that is still broken. It's not a major issue, but it is sometimes annoying that you need to take out the Unicode characters if you want to preview the post.


And now that I see the thread again, it seems that Windows 8.1 and 10 added support for emoji, and they show up as images instead of simple vectors like in Windows 7.


For your problem, maybe the collation of the database is not correctly set to utf8_general_ci or utf8mb4_general_ci as pointed out by Margarett.
Title: Re: Can't post unicode emoji
Post by: Arantor on April 29, 2017, 02:30:17 AM
The code explicitly requires the database not be mb4 by way of converting it to an entity before processing. I know, I'm the one who submitted that to the team. This is what the fix mb4 stuff does in Load.php where $smcFunc is defined.
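
The idea of converting such characters to entities can be sketched as follows. This is only an illustration in JavaScript, not SMF's actual code (which is PHP), and the function name astralToEntities is hypothetical: every code point above U+FFFF, which is exactly the set that needs four bytes in UTF-8, becomes a numeric HTML entity made of plain ASCII.

```javascript
// Illustrative only, not SMF's implementation: replace each code point
// outside the Basic Multilingual Plane with a numeric HTML entity.
function astralToEntities(text) {
  // Array.from iterates a string by code point, so surrogate pairs stay together.
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0);
    return codePoint > 0xffff ? "&#" + codePoint + ";" : ch;
  }).join("");
}

console.log(astralToEntities("\u{1F4D7}test")); // &#128215;test
console.log(astralToEntities("plain text"));    // plain text (unchanged)
```

Because the result contains only characters of three bytes or fewer, it can be stored in a database that is not utf8mb4.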

Does it work as you'd expect here?
Title: Re: Can't post unicode emoji
Post by: Kolya on April 29, 2017, 05:27:33 AM
😛 Yeah, it works here as expected, including the preview.
What can I do now to get the same behavior in my forum? Should I set the database collation to utf8mb4_general_ci? Or install a UTF-8 language pack? Or both?

BTW, I'm writing a Unicode emoji picker in JavaScript. So I will have something to give back to the community.
Thanks for your help.
Title: Re: Can't post unicode emoji
Post by: Arantor on April 29, 2017, 08:13:20 AM
You do not need to change the collation at all.

If you're using the forum with a UTF-8 database but a non-UTF-8 language pack it could easily not work as expected.
Title: Re: Can't post unicode emoji
Post by: Kolya on April 29, 2017, 11:45:13 AM
I changed the language to "SMF 2.0.13 english british-utf8".
Then I looked at a database backup and it uses UTF8 as well.

DROP TABLE IF EXISTS `smf_messages`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `smf_messages` (
  `id_msg` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `id_topic` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `id_board` smallint(5) unsigned NOT NULL DEFAULT '0',
  `poster_time` int(10) unsigned NOT NULL DEFAULT '0',
  `id_member` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `id_msg_modified` int(10) unsigned NOT NULL DEFAULT '0',
  `subject` varchar(255) NOT NULL DEFAULT '',
  `poster_name` varchar(255) NOT NULL DEFAULT '',
  `poster_email` varchar(255) NOT NULL DEFAULT '',
  `poster_ip` varchar(255) NOT NULL DEFAULT '',
  `smileys_enabled` tinyint(4) NOT NULL DEFAULT '1',
  `modified_time` int(10) unsigned NOT NULL DEFAULT '0',
  `modified_name` varchar(255) NOT NULL DEFAULT '',
  `body` mediumtext NOT NULL,
  `icon` varchar(16) NOT NULL DEFAULT 'xx',
  `approved` tinyint(3) NOT NULL DEFAULT '1',
  `edit_reason` tinytext NOT NULL,
  `show_thumbnail` int(1) NOT NULL,
  PRIMARY KEY (`id_msg`),
  UNIQUE KEY `ID_MEMBER` (`id_member`,`id_msg`),
  UNIQUE KEY `topic` (`id_topic`,`id_msg`),
  UNIQUE KEY `ID_BOARD` (`id_board`,`id_msg`),
  KEY `ID_TOPIC` (`id_topic`),
  KEY `participation` (`id_member`,`id_topic`),
  KEY `showPosts` (`id_member`,`id_board`),
  KEY `ipIndex` (`poster_ip`(15),`id_topic`),
  KEY `approved` (`approved`),
  KEY `id_member_msg` (`id_member`,`approved`,`id_msg`),
  KEY `current_topic` (`id_topic`,`id_msg`,`id_member`,`approved`),
  KEY `related_ip` (`id_member`,`poster_ip`,`id_msg`)
) ENGINE=MyISAM AUTO_INCREMENT=111424 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;


Still no luck. The same emoji that work here produce the empty message error on my own board.
I'm running PHP Version 5.5.12 if that is relevant.
Title: Re: Can't post unicode emoji
Post by: Arantor on April 29, 2017, 11:56:57 AM
So you have UTF-8 language files and a UTF-8 database. What does SMF think it's using?
Title: Re: Can't post unicode emoji
Post by: Kolya on April 29, 2017, 01:54:07 PM
Is that a rhetorical question? Where can I look that up?

The forum language is set to "English British (UTF-8)". The pages are delivered with UTF-8 encoding as well. And the detailed version check doesn't show any outdated forum files.
Title: Re: Can't post unicode emoji
Post by: Kolya on April 29, 2017, 02:05:41 PM
I just noticed that it works fine when entering emoji on the quick reply box. But not via the regular editor page. That throws the empty post error.
And you know what? It's the same problem here.

Try pasting the following string to the quick reply box and hit "Preview". Then hit "Preview" again on the editor page.

📗test
Title: Re: Can't post unicode emoji
Post by: Kolya on May 06, 2017, 03:32:45 AM
I don't want to be that triple posting guy, but it would be nice to get some confirmation that you guys acknowledged this as a bug (maybe created a ticket on whatever bugtracker you're using?).

To summarise: On SMF 2.0.13, posting Unicode emoji works from the quick reply (including preview) but not from the regular editor page, which throws the error "The message body was left empty."
Title: Re: Can't post unicode emoji
Post by: live627 on June 07, 2017, 09:09:56 PM
🎂
Title: Re: Can't post unicode emoji
Post by: live627 on June 07, 2017, 09:14:03 PM
That worked on the full editor. I'll need to look into this more later.
Title: Re: Can't post unicode emoji
Post by: lurkalot on June 08, 2017, 03:26:06 AM
This one doesn't 😜 It gives this error on preview here,

The following error(s) occurred while posting this message:

The message body has been left empty.

Doesn't seem to do it in SMF 2.1 beta 3 though. The smiley is from the emoji keyboard of an Apple iPad. Copy and paste it into a post here and try to preview it, or modify and preview it.
Title: Re: Can't post unicode emoji
Post by: Arantor on June 08, 2017, 03:33:04 AM
This should have been fixed in the week or so before 2.1 beta 3 landed.
Title: Re: Can't post unicode emoji
Post by: lurkalot on June 08, 2017, 03:45:00 AM
Quote from: Arantor on June 08, 2017, 03:33:04 AM
This should have been fixed in the week or so before 2.1 beta 3 landed.

I was testing this in the test board here a couple of days ago, after someone posted one on one of my forums, and I immediately thought my forum had gone T/U after moving it to new hosting. Then I found it was causing problems here as well. :)

I then tried it on 2.1 beta 3 and all seems fine there; it previewed OK. It's just 2.0.xx
Title: Re: Can't post unicode emoji
Post by: live627 on June 08, 2017, 03:18:05 PM
It works using Windows 10... I guess then different systems do it differently. I won't be able to fix it.
Title: Re: Can't post unicode emoji
Post by: Kolya on June 18, 2017, 02:23:51 PM
I am on Windows 10 as well. Try copying this string to the quick reply box and hit Preview, then hit Preview again on the full editor: 🙂test
The preview from the quick reply box works, the full editor gives the error.
Title: Re: Can't post unicode emoji
Post by: Gwenwyfar on June 18, 2017, 05:42:50 PM
Quote from: Kolya on April 29, 2017, 02:05:41 PM
I just noticed that it works fine when entering emoji on the quick reply box. But not via the regular editor page. That throws the empty post error.
And you know what? It's the same problem here.

Try pasting the following string to the quick reply box and hit "Preview". Then hit "Preview" again on the editor page.

📗test
Same behavior for me, both here and on my own forum, on Linux Mint/Firefox. I tested with 😜

Quick reply preview ok, full editor preview gives an error.

Edit: Also quick edit gives the same error, full edit is ok.
Title: Re: Can't post unicode emoji
Post by: Gluz on June 18, 2017, 06:34:47 PM
The fix for 2.0.10+ is pretty much the same as for 2.1 beta 3, but it only works if the forum is in UTF-8. I saw it on GitHub and tested it in my test forum; I have already applied it to my live forum as well.

https://github.com/SimpleMachines/SMF2.1/commit/5b73e8dcc330dfb4d87fec54b7a5518a46272fbf

Basically, in /Themes/all_themes_that_have_it/Post.template.php, search for all instances of:
.php_to8bit().php_urlencode()

And delete the .php_to8bit() part. With that, the previews should work.
Title: Re: Can't post unicode emoji
Post by: feline on June 19, 2017, 02:24:41 AM
Better choice ..

replace:
.php_to8bit().php_urlencode()

with:
.html_entity_decode.php_urlencode()

That works perfect ...
Title: Re: Can't post unicode emoji
Post by: Arantor on June 19, 2017, 11:46:58 AM
Don't you want brackets on the decode call?
Title: Re: Can't post unicode emoji
Post by: feline on June 19, 2017, 04:00:43 PM
No .. I have tested this with an Android phone/tablet and an MS Lumia ..
Works perfect ...  ;)
Title: Re: Can't post unicode emoji
Post by: albertlast on June 19, 2017, 04:34:22 PM
What is the motivation for doing this decode/encode stuff?
It costs CPU and traffic.
Title: Re: Can't post unicode emoji
Post by: Arantor on June 19, 2017, 06:13:00 PM
I'm not sure but I genuinely don't see how that snippet works correctly. Maybe it ends up doing the same as the 2.1 equivalent change ;D
Title: Re: Can't post unicode emoji
Post by: Gluz on June 20, 2017, 01:33:13 AM
It doesn't. It throws a JavaScript error about html_entity_decode not being defined if used without brackets, and an error that it is not a function if used with brackets.

That part is failing to do anything.

If you just use the fix for 2.1, it works the same but without errors.
Title: Re: Can't post unicode emoji
Post by: feline on June 20, 2017, 02:01:47 AM
Preview screenshot without html_entity_decode (not correct)

Preview screenshot with html_entity_decode (correct)

More questions?
Title: Re: Can't post unicode emoji
Post by: Arantor on June 20, 2017, 03:49:48 AM
I'm just struggling to understand how a JavaScript property can be called as a function without using the operator to tell it to use it as a function.

In other words, as Gluz said.

Now, if you had the brackets in there as I suggested, I could see it maybe working.
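
The property-versus-call point can be reproduced in a small sketch. Everything in it is a stand-in rather than SMF code: demo_urlencode is a made-up String.prototype helper, assumed here to simply wrap encodeURIComponent, and missing_helper plays the role of a method that was never defined.

```javascript
// Stand-in helper, for illustration only (assumed to wrap encodeURIComponent).
String.prototype.demo_urlencode = function () {
  return encodeURIComponent(this);
};

// Run a chain and report either its result or the name of the error it threw.
function tryChain(label, fn) {
  try {
    return label + ": " + fn();
  } catch (err) {
    return label + ": " + err.name;
  }
}

const text = "\u{1F642}test";

// Defined method, called with brackets: works.
console.log(tryChain("plain call", () => text.demo_urlencode()));
// Undefined property without brackets: the next call is made on undefined.
console.log(tryChain("without brackets", () => text.missing_helper.demo_urlencode()));
// Undefined property with brackets: it is not a function.
console.log(tryChain("with brackets", () => text.missing_helper().demo_urlencode()));
```

Only the first chain returns an encoded string; the other two throw a TypeError before demo_urlencode ever runs.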
Title: Re: Can't post unicode emoji
Post by: Gluz on June 20, 2017, 04:08:43 AM
Does that preview without html_entity_decode have php_urlencode or not?

Where is html_entity_decode defined in SMF? The JavaScript console says it's not defined.

With the GitHub fix it displays the emoji fine in a brand new SMF install, and with your code it displays the emoji but also generates that error in the JavaScript console.
Title: Re: Can't post unicode emoji
Post by: shawnb61 on October 03, 2019, 07:06:44 PM
This was reported a few times & a fix is targeted for 2.0.16 - I am going to close some dupes & keep one open:
https://www.simplemachines.org/community/index.php?topic=569620.0