[4981] [1.x, 2.0] handling MS Smart Quotes

Started by MrPhil, April 25, 2012, 01:16:42 PM


MrPhil

The support boards for all versions of SMF are clogged with reports of certain characters cutting off the rest of a post, or otherwise apparently causing mischief. The root cause of these problems is that people cut and paste text from Microsoft products (especially Word) that contains MS's "Smart Quotes", which are found only in the CP-1252 encoding (byte values 80 through 9F). My proposal is that all incoming text (from TEXT, TEXTAREA, and possibly other input fields) be scanned for Smart Quotes characters (binary), and that any found be replaced by HTML entities. str_replace() might do the job. Here are all the Smart Quotes:


Smart Quote | Glyph | Closest ASCII | UTF-16 value | HTML entity | SQ Description | Reserved Use
80 | € | C= | 20AC | euro | Euro | reserved control
81 | | | | | | reserved control
82 | ‚ | " | 201A | sbquo | Low-"9" opening quotation mark | Break Permitted Here
83 | ƒ | f | 0192 | fnof [1] or 402 | Florin/script f/folder | No Break Here
84 | „ | " | 201E | bdquo | Low-"99" opening quotation mark | Index
85 | … | ... | 2026 | hellip | Ellipsis | Next Line
86 | † | + | 2020 | dagger | Single dagger | Start of Selected Area
87 | ‡ | ++ | 2021 | Dagger | Double dagger | End of Selected Area
88 | ˆ | ^ | 02C6 | circ | Circumflex ^ accent (combining?) | Character Tabulation Set
89 | ‰ | o/oo | 2030 | permil | o/oo per mille | Character Tabulation with Justification
8A | Š | | 0160 | Scaron [1] or 352 | S + caron accent | Line Tabulation Set
8B | ‹ | < | 2039 | lsaquo | Single left angle quote < (guillemet) | Partial Line Down
8C | Œ | OE | 0152 | OElig | OE ligature | Partial Line Up
8D | | | | | | Reverse Line Feed
8E | Ž | | 017D | Zcaron [1] or 381 | Z + caron accent | Single Shift Two
8F | | | | | | Single Shift Three
90 | | | | | | Device Control String
91 | ‘ | ' | 2018 | lsquo | "6" opening quotation mark | Private Use One
92 | ’ | ' | 2019 | rsquo | "9" closing quotation mark/apostrophe | Private Use Two
93 | “ | " | 201C | ldquo | "66" opening quotation mark | Set Transmit State
94 | ” | " | 201D | rdquo | "99" closing quotation mark | Cancel Character
95 | • | * | 2022 | bull | Solid bullet | Message Waiting
96 | – | - | 2013 | ndash | En-dash | Start of Guarded Area
97 | — | -- | 2014 | mdash | Em-dash | End of Guarded Area
98 | ˜ | ~ | 02DC | tilde | Tilde ~ accent (combining?) | Start of String
99 | ™ | (tm) | 2122 | trade | Trademark TM | reserved control
9A | š | | 0161 | scaron [1] or 353 | s + caron accent | Single Character Introducer
9B | › | > | 203A | rsaquo | Single right angle quote > (guillemet) | Control Sequence Introducer
9C | œ | oe | 0153 | oelig | oe ligature | String Terminator
9D | | | | | | Operating System Command
9E | ž | | 017E | zcaron [1] or 382 | z + caron accent | Privacy Message
9F | Ÿ | | 0178 | Yuml | Y + diaeresis/umlaut accent | Application Program Command

Why call this a bug? Because it's been a festering problem for a long time, and really degrades the public image of SMF that it cannot handle something so common as cutting and pasting in Word document text. Just because your average user is too stupid to realize the difference in encodings is the cause of the problem doesn't mean that we can't work around it. It's also very simple to fix -- just define a function to clean the string and call it from wherever SMF takes in user text. Depending on whether BBCode is recognized, and whether HTML entities work, it might be possible to create either a BBCode for each character, or a BBCode to handle generic HTML entities [ent=nnnn] or [ent=name]. Where BBCode is not processed, replace by ASCII character(s).
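
A minimal sketch of such a cleaning function, for illustration only. The name clean_smart_quotes() is hypothetical and not part of SMF. It assumes the incoming text is a single-byte (ISO-8859-1 or CP-1252) string; on a UTF-8 forum the same byte values occur legitimately inside multi-byte characters, so a blind byte replacement like this one would corrupt the text there.

```php
<?php
// Hypothetical helper, not part of SMF: replaces CP-1252 "Smart Quote" bytes
// (0x80-0x9F) with HTML entities. Assumes $text is a single-byte string
// (ISO-8859-1 / CP-1252); do NOT use it on UTF-8 input.
function clean_smart_quotes($text)
{
    // Entities follow the table above; numeric forms are used where the
    // table lists a numeric alternative. Unused bytes (81, 8D, 8F, 90, 9D)
    // are left alone.
    static $map = array(
        "\x80" => '&euro;',   "\x82" => '&sbquo;',  "\x83" => '&#402;',
        "\x84" => '&bdquo;',  "\x85" => '&hellip;', "\x86" => '&dagger;',
        "\x87" => '&Dagger;', "\x88" => '&circ;',   "\x89" => '&permil;',
        "\x8A" => '&#352;',   "\x8B" => '&lsaquo;', "\x8C" => '&OElig;',
        "\x8E" => '&#381;',   "\x91" => '&lsquo;',  "\x92" => '&rsquo;',
        "\x93" => '&ldquo;',  "\x94" => '&rdquo;',  "\x95" => '&bull;',
        "\x96" => '&ndash;',  "\x97" => '&mdash;',  "\x98" => '&tilde;',
        "\x99" => '&trade;',  "\x9A" => '&#353;',   "\x9B" => '&rsaquo;',
        "\x9C" => '&oelig;',  "\x9E" => '&#382;',   "\x9F" => '&Yuml;',
    );

    // strtr() with an array replaces every key in a single pass over the string.
    return strtr($text, $map);
}
```

strtr() is used here instead of str_replace() only because it takes the whole map in one call; either would do for single-byte keys. Where entities are not rendered, the same map could point at the "Closest ASCII" column instead.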

kat

I can't see how anyone could ignore such a well-presented, detailed report, Mr. P.

Joshua Dickerson

Quote from: K@ on April 25, 2012, 01:29:56 PM
I can't see how anyone could ignore such a well-presented, detailed report, Mr. P.
Wow, yes... couldn't agree more. If there was an award for best bug report (without the fix), I think this might be it.
Come work with me at Promenade Group



Need help? See the wiki. Want to help SMF? See the wiki!

Did you know you can help develop SMF? See us on Github.

How have you bettered the world today?

kat

Let's keep our fingers crossed, then, ay? ;)

vbgamer45

I think it is a great idea. I have run into those issues where users post those special characters all the time, and I am for anything that fixes it.
Community Suite for SMF - Take your forum to the next level built for SMF, Gallery,Store,Classifieds,Downloads,more!

SMFHacks.com -  Paid Modifications for SMF

Mods:
EzPortal - Portal System for SMF
SMF Gallery Pro
SMF Store SMF Classifieds Ad Seller Pro

emanuele

A doc file with such entities would help, at least for running tests. ;)

meh...Devs always want more... :P

BTW: it's always MS's fault!!! >:( :P


Take a peek at what I'm doing! ;D




Need support in Italian?

Help us help you: explain your problem clearly. No, "it doesn't work" is not an explanation!!
1) What you do,
2) what you expect,
3) what you get.

Joshua Dickerson


MrPhil

Wow. Maybe I should try complaining about the other things that irritate me to no end!

  • Fix or remove the SMF database backup
  • Really fix the Settings.php being emptied out problem
  • Get rid of the spurious "your database may require an upgrade" warning
  • Make mod install/uninstall either all or nothing -- refuse if ANY manual edits needed (most people ignore the manual edits) -- and validate that all changes were done properly (no more "undefined index" reports)
  • Ensure that version updates do the database update too -- there are far too many systems out there with way back-level databases (even if it's just the smfVersion entry)
  • Make conversion between encodings foolproof -- too many systems end up half-way changed, or a mix of encodings in the database or language files
  • Fix the calendar so that the min/max limits are deltas to the current year (say, CY - 1 through CY + 5, with hard limits 1970 - 2037), or else automatically update the database, so we don't get reports that "I can't enter a 2012 event -- it's too far in the future!"

I consider all of these to be bugs, rather than enhancements, and they should be fixed in SMF 1 too! Properly dealing with this list will greatly reduce the support load, and greatly improve SMF's image.

Antechinus

The rest sound good, but honestly I'm not sure about this one:

"Make mod install/uninstall either all or nothing -- refuse if ANY manual edits needed (most people ignore the manual edits)........................."

MrPhil

Partially installed and partially uninstalled mods are the bane of SMF. Almost every "undefined index" error involves one. My impression is that people tend to accept partial actions and don't understand that they need to go back and manually edit the failed files. Therefore, I think that fully automatic installs/uninstalls should either do 100% of the job, or refuse to do any.

It would be nice to give some assistance on mod installs/uninstalls where SMF can handle some of the files itself, but we need to prevent people from walking off with the job half done. It might be enough to refuse to take the forum out of maintenance mode (or otherwise unlock it) until they have answered "yes" or "no" to each "Did you successfully manually edit file ________?" Maybe add, "do you swear upon your mother's grave..."?? There's probably no automated way to check that the work was done (else it could have been done fully automatically). If they answer "no" to any question, refuse to unlock the forum until they agree to uninstall/reinstall the mod, and can answer "yes" to all. Or, save a backup of each file affected by the mod, and restore all of them to completely roll back the action. If they say "yes" to a question, compare the current file to its backup, to see if something was done. There's all sorts of things that could be done.

If I had to do SMF over from scratch, I would not eval templates either. We always have to tell people to turn off eval and come back with the right error messages. Do you suppose it's possible to automatically turn off eval if an error is detected (and "eval" is in the message), and possibly purge the non-eval message?

Add: A related problem is that the package manager allows you to install a mod twice. As part of the pre-install check, it should see if the target code is already installed, and if so, refuse to proceed. This would eliminate all reports of "duplicate function defined".
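
The all-or-nothing behaviour described above could be sketched as a dry run followed by a commit. This is an illustration under assumptions, not SMF's package manager: the function name apply_edits_all_or_nothing() and the shape of the $edits array are hypothetical, and a write failure during the commit pass is not handled.

```php
<?php
// Hypothetical sketch, not SMF's package manager.
// $edits is a list of array('file' => path, 'search' => text, 'replace' => text).
// Returns true if every edit was applied, false if nothing was touched.
function apply_edits_all_or_nothing(array $edits)
{
    $pending = array();

    // Pass 1 (dry run): compute every result in memory and stop at the first failure.
    foreach ($edits as $edit) {
        $file = $edit['file'];
        $contents = isset($pending[$file]) ? $pending[$file] : @file_get_contents($file);
        if ($contents === false || strpos($contents, $edit['search']) === false) {
            return false; // refuse the whole install; no file has been written yet
        }
        $pending[$file] = str_replace($edit['search'], $edit['replace'], $contents);
    }

    // Pass 2 (commit): back up each original first so the change can be rolled back later.
    foreach ($pending as $file => $contents) {
        copy($file, $file . '.bak');
        file_put_contents($file, $contents);
    }
    return true;
}
```

The same dry run could also look for the replacement text already being present, which would catch the "installed twice" case before anything is written.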

emanuele

Quote from: MrPhil on April 25, 2012, 11:47:36 PM
It would be nice to give some assistance on mod installs/uninstalls where SMF can handle some of the files itself, but we need to prevent people from walking off with the job half done. It might be enough to refuse to take the forum out of maintenance mode (or otherwise unlock it) until they have answered "yes" or "no" to each "Did you successfully manually edit file ________?" Maybe add, "do you swear upon your mother's grave..."?? There's probably no automated way to check that the work was done (else it could have been done fully automatically). If they answer "no" to any question, refuse to unlock the forum until they agree to uninstall/reinstall the mod, and can answer "yes" to all.
And they will simply answer "yes" to all even if they didn't do anything at all...

Quote from: MrPhil on April 25, 2012, 11:47:36 PM
Or, save a backup of each file affected by the mod, and restore all of them to completely roll back the action. If they say "yes" to a question, compare the current file to its backup, to see if something was done. There's all sorts of things that could be done.
And so people will come and complain "the mod doesn't install, I installed it, but the function doesn't show up, etc., etc., etc." negatively affecting SMF's image... ;)

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
Wow. Maybe I should try complaining about the other things that irritate me to no end!
Usually a fix is more appreciated, but a report is as well. :P

Quote from: MrPhil on April 25, 2012, 05:54:38 PM

  • Fix or remove the SMF database backup
http://www.simplemachines.org/community/index.php?topic=474901.0

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
  • Really fix the Settings.php being emptied out problem
I don't remember the "final" decision about this for 2.1...

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
  • Get rid of the spurious "your database may require an upgrade" warning
AFAIR quite difficult...unfortunately.
Unless we get rid of them entirely.

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
  • Make mod install/uninstall either all or nothing -- refuse if ANY manual edits needed (most people ignore the manual edits) -- and validate that all changes were done properly (no more "undefined index" reports)
IMHO: a big no.

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
  • Ensure that version updates do the database update too -- there are far too many systems out there with way back-level databases (even if it's just the smfVersion entry)
AFAIR the SMF version is updated only where changes happen. So if during an update the database doesn't change the version is not incremented. But I may be wrong.

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
  • Make conversion between encodings foolproof -- too many systems end up half-way changed, or a mix of encodings in the database or language files
I would know what is needed to make it foolproof, maybe someone else has an idea?

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
  • Fix the calendar so that the min/max limits are deltas to the current year (say, CY - 1 through CY + 5, with hard limits 1970 - 2037), or else automatically update the database, so we don't get reports that "I can't enter a 2012 event -- it's too far in the future!"
* emanuele thinks about it, but it could need a bit of changes...

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
I consider all of these to be bugs, rather than enhancements, and they should be fixed in SMF 1 too!
It's quite unlikely they will be fixed for 1.x. We can fix some of these in 2.1, but even 2.0 would probably not be updated with the corresponding "fixes".



MrPhil

Quote from: emanuele on April 26, 2012, 05:30:04 AM
And they will simply answer "yes" to all even if they didn't do anything at all...
True, but then they will be caught and their lies exposed when SMF compares the current file against the old backup. If they didn't bother to do any editing, the install/uninstall can then be rolled back, and the user informed why.

Quote
And so people will come and complain "the mod doesn't install, I installed it, but the function doesn't show up, etc., etc., etc." negatively affecting SMF's image... ;)
Well, they were told why the mod didn't install (they didn't make the manual edits). If they want to ****** and moan after that, there's not much we can do. I think it's still better to refuse to install than to allow a half-assed job.

Quote
Usually a fix is more appreciated, but a report is as well. :P
I do offer fixes on many of these -- see my sig, especially "Fixes".


  • Fix or remove the SMF database backup
    Quote
    http://www.simplemachines.org/community/index.php?topic=474901.0
    Just for the record, K@ started that topic after we had a PM gripe session over all the long-festering SMF bugs.

  • Really fix the Settings.php being emptied out problem
    Quote
    I don't remember the "final" decision about this for 2.1...
    I would hope that this would be addressed for all SMF versions, not just 2.1. Come on, guys, the fix is very simple!

  • Get rid of the spurious "your database may require an upgrade" warning
    Quote
    AFAIR quite difficult...unfortunately.
    Unless we get rid of them entirely.
    I offer a fix in my sig. It's probably more complex than need be, but the object is to remember which database levels (smfVersion) are acceptable for the current code version. We have both the code version and the database version available -- it's trivial to keep the information somewhere (new function?) as to what database versions work for the given code version.

  • Make mod install/uninstall either all or nothing -- refuse if ANY manual edits needed (most people ignore the manual edits) -- and validate that all changes were done properly (no more "undefined index" reports)
    Quote
    IMHO: a big no.
    A Big Yes. Something has to be done to keep people from partially installing mods (or uninstalling them) by ignoring warnings that they need to do manual edits.

  • Ensure that version updates do the database update too -- there are far too many systems out there with way back-level databases (even if it's just the smfVersion entry)
    Quote
    AFAIR the SMF version is updated only where changes happen. So if during an update the database doesn't change the version is not incremented. But I may be wrong.
    If you're right, that's a stupid way to do it. If we're going to compare smfVersion to the code level and scare users with ominous warnings about mismatched version, we need to update smfVersion to match the code version. Otherwise, add code (see above) to allow still-working earlier smfVersion.

  • Make conversion between encodings foolproof -- too many systems end up half-way changed, or a mix of encodings in the database or language files
    Quote
    I would know what is needed to make it foolproof, maybe someone else has an idea?
    I'm not sure about this one, but we need some ideas. I see far too many systems where they've got a random mixture of database encoding, language support encoding(s), and display page encodings. For later SMF versions (2.1 or 3.0), perhaps the best solution would be to simply mandate UTF-8 for everything.

  • Fix the calendar so that the min/max limits are deltas to the current year (say, CY - 1 through CY + 5, with hard limits 1970 - 2037), or else automatically update the database, so we don't get reports that "I can't enter a 2012 event -- it's too far in the future!"
    Quote
    * emanuele thinks about it, but it could need a bit of changes...
    The biggest problem would be to ensure that unchanged min/max entries (years) are interpreted as fixed years and not deltas (value >500, treat as year). Perhaps if the max year <= current year + 1, the admin could be prompted to change the values to deltas? It could even be done automatically, with an email to the admin telling what was done.
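
The delta-versus-fixed-year rule for the calendar limits could look roughly like this. It is a sketch under assumptions: resolve_calendar_year() is a hypothetical name, not an SMF function, and the 500 threshold and the 1970 - 2037 hard limits are the ones suggested in this topic.

```php
<?php
// Hypothetical helper, not SMF code. Values above 500 are treated as fixed
// calendar years (unchanged legacy settings); smaller values are treated as
// offsets (deltas) from the current year.
function resolve_calendar_year($value, $currentYear)
{
    $year = $value > 500 ? $value : $currentYear + $value;

    // Hard limits suggested in the thread: 1970 through 2037.
    return max(1970, min(2037, $year));
}
```

With the current year at 2012, a minimum of -1 and a maximum of 5 would give 2011 through 2017, while an old stored value such as 2010 would still be read as the year 2010.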

Quote from: MrPhil on April 25, 2012, 05:54:38 PM
I consider all of these to be bugs, rather than enhancements, and they should be fixed in SMF 1 too!
Quote
It's quite unlikely they will be fixed for 1.x. We can fix some of these in 2.1, but even 2.0 would probably not be updated with the corresponding "fixes".
They're all serious bugs that greatly detract from SMF's base functionality and sully its reputation, and need to be fixed in all release branches. In most cases, acceptable fixes are quite trivial to implement. They may not be the most elegant code, but they do the job to keep SMF from failing.

MrPhil

To add two more to the list before they slip my mind again...

  • Consolidate hard coded permissions in one function, and during configuration find out if 755, 775, or 777 is the necessary permission for SMF to write to a directory (likewise for files 644, 664, or 666). If 777 is absolutely required, see about automatically changing it back to 755 after the upload operation is done.

    SMF still has hard coded 777 directory permissions, which won't even work on some systems (e.g., running suPHP), and are often a security hazard. Let's get permissions straightened out to use the least expansive permissions for any case.
  • With SMF 2 we seem to be getting lots of reports of problems with the SMF cache. It clearly has problems. Let's disable it by default until it can be fixed.

Hmm. how about a third:
  • Adjust the CSS to put some vertical space between list items.

emanuele

Quote from: MrPhil on April 27, 2012, 10:23:42 AM
  • Consolidate hard coded permissions in one function, and during configuration find out if 755, 775, or 777 is the necessary permission for SMF to write to a directory (likewise for files 644, 664, or 666). If 777 is absolutely required, see about automatically changing it back to 755 after the upload operation is done.

    SMF still has hard coded 777 directory permissions, which won't even work on some systems (e.g., running suPHP), and are often a security hazard. Let's get permissions straightened out to use the least expansive permissions for any case.
There are already 2 or 3 topics about this on this board, and it is tracked in Mantis.
There are a couple of ideas (don't remember if here or in some less public board), nothing definitive.

Quote from: MrPhil on April 27, 2012, 10:23:42 AM
  • With SMF 2 we seem to be getting lots of reports of problems with the SMF cache. It clearly has problems. Let's disable it by default until it can be fixed.
I'm aware of 2 problems:
1) the "/" in the keys (reported and tracked, waiting for a fix);
2) the "random" (but not exactly random) broken cached files that *should* be fixed in 2.1, but we cannot be sure until we have it tested in a (multitude of) "real-world" case(s)...

Is there any other problem with the cache?



MrPhil

Quote from: emanuele on April 27, 2012, 11:19:16 AM
There are already 2 or 3 topics about this on this board, and it is tracked in Mantis.
There are a couple of ideas (don't remember if here or in some less public board), nothing definitive.
Fine, so long as progress is made and soon SMF is only using proper permissions (not just hard coded 777). SMF could do some testing at the top of installation, and build a permissions.php file to hold the defines. Something like

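<?php
// Consolidate permissions in one place.

// Files and directories SMF must be able to write to.
define('WRITABLE_FILE', 0664);
define('WRITABLE_DIR',  0775);

// Everything else.
define('GENERAL_FILE', 0644);
define('GENERAL_DIR',  0755);

// Read-only files and directories.
define('RO_FILE', 0444);
define('RO_DIR',  0555);
?>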


and pull this in from Settings.php or something.

Quote
Is there any other problem with the cache?
I don't know of specific problems, except that I see a lot of "file not found" reports in the support board. Emptying out the cache seems to do the trick, but SMF still needs fixing. If it isn't easily fixed, and doesn't speed up the system all that much, I'd just get rid of it.

emanuele

Quote from: MrPhil on April 27, 2012, 12:10:13 PM
SMF could do some testing at the top of installation
Doing tests is one of the proposals, but I'd do them more frequently than just during the installation.

Quote from: MrPhil on April 27, 2012, 12:10:13 PM
Quote
Is there any other problem with the cache?
I don't know of specific problems, except that I see a lot of "file not found" reports in the support board. Emptying out the cache seems to do the trick, but SMF still needs fixing.
Clean SMF without mods?
What cache level?
A list of topics could help in finding a pattern and reduce the possibilities.
The second problem I mentioned (that is actually the most important) should lead to "Parse error"-like messages, but not file not found.

Quote from: MrPhil on April 27, 2012, 12:10:13 PM
If it isn't easily fixed, and doesn't speed up the system all that much, I'd just get rid of it.
Reading your posts I start thinking we should get rid of everything! :P
Just joking, of course. ;)



MrPhil

Quote from: emanuele on April 27, 2012, 01:00:41 PM
Quote from: MrPhil on April 27, 2012, 12:10:13 PM
SMF could do some testing at the top of installation
Doing tests is one of the proposals, but I'd do them more frequently than just during the installation.
My proposal is to test once, during installation, to see what permissions are needed for various directory and file operations. Write out a permissions.php file with the defines and use that everywhere permissions are set. Why would more frequent testing be needed? A host system changing their PHP setup will be a very rare event. In such cases, if the user has problems, they can either manually edit the permissions.php file, or run some utility to rebuild it.

P.S. Make sure that this works properly on Windows servers, too, even if we operate with Unix-style "ugo" permissions.
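
A rough sketch of the install-time test, for directories only. Both function names are hypothetical and not part of SMF. It assumes the PHP process is allowed to chmod() the directory, which will not be true on every host, and it takes the candidate modes 755, 775 and 777 from the list earlier in this topic.

```php
<?php
// Hypothetical helper: tries each candidate mode, least permissive first, and
// returns the first one under which PHP can actually write into $dir.
// Returns false if no candidate works (or if PHP may not chmod the directory).
function least_permissive_dir_mode($dir, array $candidates = array(0755, 0775, 0777))
{
    foreach ($candidates as $mode) {
        if (!@chmod($dir, $mode)) {
            continue; // could not set this mode; try the next candidate
        }
        clearstatcache();

        // Probe with a real write rather than trusting is_writable().
        $probe = $dir . '/.perm_probe_' . uniqid();
        if (@file_put_contents($probe, 'x') !== false) {
            unlink($probe);
            return $mode;
        }
    }
    return false;
}

// Hypothetical helper: renders the WRITABLE_DIR line of a permissions.php file.
function render_permissions_file($dirMode)
{
    return "<?php\n" . sprintf("define('WRITABLE_DIR', 0%o);\n", $dirMode);
}
```

The same probe could be repeated for files with 644, 664 and 666, and rerun from a utility if the forum is moved to another host.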

Quote
Quote from: MrPhil on April 27, 2012, 12:10:13 PM
Quote
Is there any other problem with the cache?
I don't know of specific problems, except that I see a lot of "file not found" reports in the support board. Emptying out the cache seems to do the trick, but SMF still needs fixing.
Clean SMF without mods?
What cache level?
A list of topics could help in finding a pattern and reduce the possibilities.
Anyone interested in pursuing this could search for "cache" without "browser" and find all the reports. They could then contact the member with the original problem and ask for further details.

MrPhil

'Nuther one. A simple fix, but perhaps more of an enhancement:
  • If SMF can't connect to the database, it sends out an email. The wording of this email is confusing to noobs and they post here, asking why SMF (this forum) is trying to connect to their database. The forum title may not be available, but perhaps $mbname or $boardurl from Settings.php could be used in the message to clarify what SMF is trying to connect?

kat

Even easier, just have it say "Your forum cannot communicate with your database."

emanuele

Quote from: MrPhil on April 27, 2012, 05:24:15 PM
My proposal is to test once, during installation, to see what permissions are needed for various directory and file operations. Write out a permissions.php file with the defines and use that everywhere permissions are set. Why would more frequent testing be needed? A host system changing their PHP setup will be a very rare event. In such cases, if the user has problems, they can either manually edit the permissions.php file, or run some utility to rebuild it.
First because SMF already does a lot of checks, secondly because a host changing its PHP setup is rare, but a user changing his host is not so rare. :P

Quote from: MrPhil on April 27, 2012, 05:24:15 PM
Anyone interested in pursuing this could search for "cache" without "browser" and find all the reports. They could then contact the member with the original problem and ask for further details.
http://www.simplemachines.org/community/index.php?action=search2;search=cache+-browser

50 pages...sorry too much for me. I give up.

Quote from: MrPhil on April 28, 2012, 12:06:36 PM
'Nuther one. A simple fix, but perhaps more of an enhancement:
  • If SMF can't connect to the database, it sends out an email. The wording of this email is confusing to noobs and they post here, asking why SMF (this forum) is trying to connect to their database. The forum title may not be available, but perhaps $mbname or $boardurl from Settings.php could be used in the message to clarify what SMF is trying to connect?
$mbname . ': SMF Database Error!', 'There has been a problem with the database!' . ($db_error == '' ? '' : "\n" . $smcFunc['db_title'] . ' reported:' . "\n" . $db_error) . "\n\n" . 'This is a notice email to let you know that SMF could not connect to the database, contact your host if this continues.');

$mbname is actually the first thing on the email.
How could it be more clear?
A link to the forum?
Where? At the end?
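
One possible answer, as a rough sketch only: name the forum in the body as well as the subject, and put the forum address at the end. The function build_db_error_notice() is hypothetical and is not SMF's actual code; $mbname and $boardurl are the Settings.php values mentioned above, and the first sentence of the body borrows the wording suggested earlier in this topic.

```php
<?php
// Hypothetical helper, not SMF's actual code: builds the subject and body of
// the "cannot connect" notice. $db_error is the database's error text and may
// be empty.
function build_db_error_notice($mbname, $boardurl, $db_error = '')
{
    $subject = $mbname . ': SMF Database Error!';

    $body = 'Your forum "' . $mbname . '" cannot communicate with its database.'
        . ($db_error == '' ? '' : "\n" . 'The database reported:' . "\n" . $db_error)
        . "\n\n" . 'Contact your host if this continues.'
        . "\n\n" . 'Forum address: ' . $boardurl;

    return array($subject, $body);
}
```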

