Subs-Post.php: From header body field does not conform to RFC2047

Started by FfdG, February 10, 2013, 08:15:10 AM

FfdG

RFC2047 specifies encoded-word and its usage as follows:

2. Syntax of encoded-words

   An 'encoded-word' is defined by the following ABNF grammar.  The
   notation of RFC 822 is used, with the exception that white space
   characters MUST NOT appear between components of an 'encoded-word'.

   encoded-word = "=?" charset "?" encoding "?" encoded-text "?="


5. Use of encoded-words in message headers

[...]

(3) As a replacement for a 'word' entity within a 'phrase', for example,
    one that precedes an address in a From, To, or Cc header.  The ABNF
    definition for 'phrase' from RFC 822 thus becomes:

    phrase = 1*( encoded-word / word )

                       [...].  An 'encoded-word' that appears within a
    'phrase' MUST be separated from any adjacent 'word', 'text' or
    'special' by 'linear-white-space'.
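
For illustration, a minimal Python sketch of the separation rule quoted above. It only looks at the characters directly next to each encoded-word; the helper names `encoded_word` and `misplaced_encoded_words` and the sample name and address are assumptions made for this example, not part of SMF or of any library.

```python
import base64
import re

# Loose match for an RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=")


def encoded_word(text, charset="ISO-8859-1"):
    """Build a base64 ("B") encoded-word for the given text (hypothetical helper)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?" + charset + "?B?" + payload + "?="


def misplaced_encoded_words(phrase):
    """Return the encoded-words that touch a neighbouring non-whitespace character."""
    bad = []
    for match in ENCODED_WORD.finditer(phrase):
        # Start and end of the string count as a valid boundary.
        before = phrase[match.start() - 1] if match.start() > 0 else " "
        after = phrase[match.end()] if match.end() < len(phrase) else " "
        if not before.isspace() or not after.isspace():
            bad.append(match.group(0))
    return bad


word = encoded_word("J\u00f6rg")
quoted = '"' + word + '" <webmaster@example.org>'   # quotes touch the encoded-word
bare = word + " <webmaster@example.org>"            # separated by white space only

print(misplaced_encoded_words(quoted))  # reports the encoded-word
print(misplaced_encoded_words(bare))    # reports nothing
```

This is a check on adjacency only, not a full RFC 822 phrase parser.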


Subs-Post.php generates the From field with quotation marks surrounding $from_name. Quotation marks are not "linear-white-space", so they are forbidden next to an encoded-word:

$headers = 'From: "' . $from_name . '" <' . (empty($modSettings['mail_from']) ? $webmaster_email : $modSettings['mail_from']) . '>' . $line_break;

This results in a header that is invalid as an encoded-word, for instance:
From: "=?ISO-8859-1?B?...?=" <[email protected]>

Fix: just remove both ".
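
To show what the unquoted form looks like to a receiving program, here is a small Python sketch using the standard library's `email.utils.formataddr`, which emits an encoded display name without surrounding quotes. The functions `build_from_header` and `read_display_name`, the sample name, and the address are assumptions for this example; this is not the SMF code path.

```python
from email.header import decode_header
from email.utils import formataddr, parseaddr


def build_from_header(display_name, address, charset="iso-8859-1"):
    """Build a From line; non-ASCII names become an unquoted encoded-word."""
    return "From: " + formataddr((display_name, address), charset=charset)


def read_display_name(header_line):
    """Parse a From line and decode any encoded-words in the display name."""
    value = header_line.split(":", 1)[1].strip()
    raw_name, _address = parseaddr(value)
    parts = []
    for chunk, chunk_charset in decode_header(raw_name):
        # decode_header yields bytes for encoded-words and str for plain text.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(chunk_charset or "ascii")
        parts.append(chunk)
    return "".join(parts)


header = build_from_header("J\u00f6rg M\u00fcller", "webmaster@example.org")
print(header)
print(read_display_name(header))
```

The round trip recovers the original name because the encoded-word stands alone in the phrase, separated from the angle-addr only by white space.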

Arantor

Interesting. Would you not also need to edit the Reply-To as well?
Holder of controversial views, all of which my own.


FfdG

Maybe, I haven't checked the other headers. Reply-To is set (without quotes) to:
$headers .= $from !== null ? 'Reply-To: <' . $from . '>' . $line_break : '';
The sendmail() calls usually set the "from" parameter to $user_info['email'] or $_POST['y_email']; both are checked, in many duplicated places, against a pattern that accepts only a subset of valid(?) addr-specs:
preg_match('~^[0-9A-Za-z=_+\-/][0-9A-Za-z=_\'+\-/\.]*@[\w\-]+(\.[\w\-]+)*(\.[\w]{2,6})$~', ...)
also without quotes. But I only took a quick look...
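
To get a feel for which addresses that pattern lets through, here is a Python port of the same expression. The `re.ASCII` flag is an assumption made to approximate PCRE's `\w` without the `u` modifier, and the function name `passes_smf_check` and the sample addresses are invented for this example.

```python
import re

# Port of the preg_match pattern quoted above (delimiters "~" dropped).
SMF_ADDR_CHECK = re.compile(
    r"^[0-9A-Za-z=_+\-/][0-9A-Za-z=_'+\-/.]*@[\w\-]+(\.[\w\-]+)*(\.[\w]{2,6})$",
    re.ASCII,
)


def passes_smf_check(address):
    """Return True if the address matches the ported pattern."""
    return SMF_ADDR_CHECK.match(address) is not None


samples = [
    "user@example.org",          # ordinary address
    "o'brien@example.org",       # apostrophe is allowed after the first character
    '"john doe"@example.org',    # quoted-string local-part, valid per RFC 822
    "user!tag@example.org",      # "!" is allowed in an atom, but not by the pattern
    "user@localhost",            # no dot-separated top-level part
]
for address in samples:
    print(address, passes_smf_check(address))
```

None of the accepted forms can contain a quotation mark or white space, which is consistent with the Reply-To line being built without quotes.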

Arantor

Funny, in the code I'm looking at, $from_name is used rather than $from, which would imply it is applicable to Reply-To as well.


FfdG


Arantor

Ah, I'm not looking at 2.0.4 or 1.1.16 (or 1.1.18 as that's the current 1.1.x maintenance release)

I do know that I changed it from $from to $from_name some time ago from a separate bug report I'd had on it for that header.


FfdG

Quote from: Spuds on February 13, 2013, 12:12:13 PM
Interesting for sure .... did the presence of those "'s cause an issue with a mail client or is this strictly a spec conformance issue?
Yes, it causes an issue in my mail notifier, because the email package of Python 3.2.3 treats them as a 'word' rather than as an encoded-word.

Quote: I believe in that spec the "'s were optional, used to treat the text as an atom
RFC822:

     authentic   =   "From"       ":"   mailbox  ; Single author
                 / ( "Sender"     ":"   mailbox  ; Actual submittor
                     "From"       ":" 1#mailbox) ; Multiple authors
                                                 ;  or not sender


     mailbox     =  addr-spec                    ; simple address
                 /  phrase route-addr            ; name & addr-spec


     route-addr  =  "<" [route] addr-spec ">"


     phrase      =  1*word                       ; Sequence of words
     word        =  atom / quoted-string


     atom        =  1*<any CHAR except specials, SPACE and CTLs>
     quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
                                                 ;   quoted chars.

Did you spot the difference? An atom allows neither SPACE nor " (a special), and a quoted-string requires proper escaping. You may add the quotes, but only _before_ the encoding.
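
The atom versus quoted-string distinction can be seen with plain ASCII display names in Python's standard `email.utils.formataddr`, which quotes a name only when it contains specials and escapes embedded quotes. The wrapper `mailbox`, the names, and the address are assumptions for this sketch; it illustrates the grammar above and is not SMF code.

```python
from email.utils import formataddr

ADDRESS = "webmaster@example.org"  # placeholder address for the example


def mailbox(display_name):
    """Format 'phrase route-addr' for an ASCII display name."""
    return formataddr((display_name, ADDRESS))


# Two atoms separated by white space: no quotes needed.
print(mailbox("John Doe"))        # John Doe <webmaster@example.org>

# "," is a special, so the whole name becomes one quoted-string.
print(mailbox("Doe, John"))       # "Doe, John" <webmaster@example.org>

# Quotes inside a quoted-string must be escaped as quoted-pairs.
print(mailbox('John "JD" Doe'))   # "John \"JD\" Doe" <webmaster@example.org>
```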

Quote: As an aside I took another test though iconv_mime_decode From: "=?ISO-8859-1?B?...?=" <[email protected]> (=?ISO-8859-1?Q?a?=) and I would have expected it to honor the encoded cname (in strict) but it does not. Probably another sub clause somewhere about that as well ....
I'm not sure which rule you refer to (RFC2822):

from            =       "From:" mailbox-list CRLF
mailbox-list    =       (mailbox *("," mailbox)) / obs-mbox-list
obs-mbox-list   =       1*([mailbox] [CFWS] "," [CFWS]) [mailbox]
mailbox         =       name-addr / addr-spec
name-addr       =       [display-name] angle-addr
angle-addr      =       [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr
obs-angle-addr  =       [CFWS] "<" [obs-route] addr-spec ">" [CFWS]
addr-spec       =       local-part "@" domain
local-part      =       dot-atom / quoted-string / obs-local-part
domain          =       dot-atom / domain-literal / obs-domain
display-name    =       phrase

Arantor

Unfortunately the email spec is so badly misimplemented that clients invariably do just suck it up either way.

I've implemented it and will see if people have issues with it, I think.


FfdG

Today there was another quirk in a "Summary of posts awaiting approval at..." notification:
Subject: =??B?...?=
From: "=??B?...?=" <[email protected]>

The empty charset in the encoded-word, together with "Encoding: 7bit", breaks decoding of the Subject and From headers. Our forum still uses ISO-8859-1 (or -15?) and therefore contains characters above 127. It is not only my mail notifier that complains; the web interface of my fairly popular mail provider does too (it also fails to decode the quoted From).
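
A receiver can spot this kind of header before trying to decode it. The following Python sketch flags encoded-words whose charset field is empty or whose encoding is neither B nor Q; the function name `encoded_word_problems` and the sample payload are assumptions for this example.

```python
import re

# Split an encoded-word into its three fields, allowing empty fields so that
# malformed words such as "=??B?...?=" are still found.
ENCODED_WORD = re.compile(
    r"=\?(?P<charset>[^?]*)\?(?P<encoding>[^?]*)\?(?P<text>[^?]*)\?="
)


def encoded_word_problems(header_value):
    """List problems found in the encoded-words of one header value."""
    problems = []
    for match in ENCODED_WORD.finditer(header_value):
        if not match.group("charset"):
            problems.append("empty charset in " + match.group(0))
        if match.group("encoding").upper() not in ("B", "Q"):
            problems.append("unknown encoding in " + match.group(0))
    return problems


print(encoded_word_problems("=??B?SGk=?="))            # empty charset reported
print(encoded_word_problems("=?ISO-8859-1?B?SGk=?="))  # no problems
```

Without a charset the decoder has no way to know how to map bytes above 127 to characters, which matches the broken Subject and From shown above.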

Arantor

Hmm.

Well, I've just committed the original matter - RFC2047 compliance - to SMF itself (originally I was talking about it on a fork)... but I've never encountered the second issue. I can see that the subject is pushed through mimespecialchars() and the From is broken because of the " in there as you rightly pointed out.

But I can't imagine why the subject is broken - what charset were you using? Was it definitely ISO-8859-1 or not? (The code behaves differently depending on whether you're using ISO-8859-1 or something else in non-UTF-8 mode.)

