Closed Bug 87653 Opened 23 years ago Closed 12 years ago

Message body contents are not displayed when Content-Type header is folded, doesn't handle whitespace (boundary="abc [CRLF] xyz"[CRLF] is specified, but --abcxyz is used for boundary line in mail)

Categories

(MailNews Core :: MIME, defect)

x86
Windows NT
defect
Not set
major

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: naving, Assigned: DAntrushin)

References

(Blocks 1 open bug)

Details

(Keywords: dataloss, helpwanted, testcase, Whiteboard: [patchlove][has draft patch])

Attachments

(1 file, 2 obsolete files)

From Bugzilla Helper:
User-Agent: Mozilla/4.7 [en]C-AOLNSCP (WinNT; U)
BuildID: 2001-06-25-04-trunk
I have a message in my inbox for which the body does not get displayed in the message pane. If you do View | Message Source you can see the source. Also, it works fine on 4.x. I can send the message to the person who will work on this bug. It also happens on 2001062004.
Reproducible: Always
Steps to Reproduce:
1. Select the message
Actual Results: The contents are not displayed.
Expected Results: The contents should be displayed.
If I fwd the message it gets displayed.
Attached file: Body of an email that shows blank (deleted)
I can confirm that this bug is still happening in Mozilla 0.9.3. I added an attachment containing a complete email that is exhibiting this problem. Hopefully that will help fix the bug.
This is because the Content-Type header of that message is folded (in terms of RFC 822): Content-Type: multipart/alternative; boundary="=_alternative 0011E5AD86256AC0_=" According to RFC 822, the (unfolded) boundary value should be "=_alternative 0011E5AD86256AC0_=" (the CRLF and all spaces at the beginning of the next line are replaced with a single space), while in Mozilla it is "=_alternative 0011E5AD86256AC0_=" (the extra spaces are not removed). This is a bug in MIME_StripContinuations (mozilla/mailnews/mime/src/mimehdrs.cpp). But there is yet another problem with that header, and I can't find an answer in the RFCs yet: the first line has whitespace at the end: ..."=_alternative<SPACE><CRLF> What should be done with the SPACE? I believe that spaces at the end of a line should be trimmed, but this is a quoted string, so what should be done with this trailing space? If we don't remove it, the message still will not be displayed (we'll have a double space after the word 'alternative', but the separator actually used has only one space). If we remove that whitespace unconditionally... is it safe? Does anyone here have the Lotus Notes mailer? If so, could you please send me (adu@sparc.spb.su) a small message with an attachment? P.S.: I think that platform/OS should be All; this is not a Windows-only bug :-)
Attached patch: remove rfc822's line continuations (obsolete) (deleted)
Did I understand rfc2822 right: folding whitespace ([*WSP CRLF] 1*WSP) is semantically equivalent to just whitespace (even inside of a quoted string)? In that case:
Index: mimehdrs.cpp
===================================================================
RCS file: /cvsroot/mozilla/mailnews/mime/src/mimehdrs.cpp,v
retrieving revision 1.52
diff -u -r1.52 mimehdrs.cpp
--- mimehdrs.cpp 2001/09/28 20:07:43 1.52
+++ mimehdrs.cpp 2001/11/06 12:50:49
@@ -807,7 +807,10 @@
   /* p2 runs ahead at (CR and/or LF) */
   if ((p2[0] == nsCRT::CR) || (p2[0] == nsCRT::LF))
   {
-    p2++;
+    p2++;
+    while (nsCRT::IsAsciiSpace(*p1)) p1--;
+    while (*p2 && nsCRT::IsAsciiSpace(*p2)) p2++;
+    if (*p2) *p1++ = ' ';
   } else {
     *p1++ = *p2++;
   }
Or even just like this (isn't it too risky?):
  if (nsCRT::IsAsciiSpace(*p2))
  {
    p2++;
    while (*p2 && nsCRT::IsAsciiSpace(*p2)) p2++;
    if (*p2) *p1++ = ' ';
  } else {
    *p1++ = *p2++;
  }
In any case, breaking a line in the middle of a quoted string is a bad idea on the part of the Lotus Notes mailer, I think :-)
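The in-place collapse that the patch above is aiming for can be sketched in standalone C++. This is only an illustration under assumptions: `CollapseFolds` and `IsWsp` are hypothetical names, it works on a `std::string` rather than on Mozilla's buffers, and it assumes the input is a single header field, so every CR or LF belongs to a fold. As far as I can tell, the patch tests the character at the write position (`*p1`) rather than the last character already written; the sketch looks at `s[out - 1]` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not the real MIME_StripContinuations.
inline bool IsWsp(char c) { return c == ' ' || c == '\t'; }

// Collapses every fold (optional spaces/tabs, a run of CR and/or LF,
// then spaces/tabs) into a single space, editing the buffer in place.
std::string CollapseFolds(std::string s) {
  std::size_t out = 0;  // write position (p1 in the patch)
  std::size_t in = 0;   // read position (p2 in the patch)
  while (in < s.size()) {
    if (s[in] == '\r' || s[in] == '\n') {
      // Skip the line break and all whitespace that follows it.
      while (in < s.size() &&
             (s[in] == '\r' || s[in] == '\n' || IsWsp(s[in]))) {
        ++in;
      }
      // Drop whitespace already written before the break. Look at the
      // last written character, never at the write position itself.
      while (out > 0 && IsWsp(s[out - 1])) {
        --out;
      }
      // Emit one space unless the break ends the header field.
      if (in < s.size()) {
        s[out++] = ' ';
      }
    } else {
      s[out++] = s[in++];
    }
  }
  assert(out <= s.size());
  s.resize(out);
  return s;
}
```

With this reading, a value folded as `abc<SP><CRLF><SP><SP>xyz` comes out with exactly one space between `abc` and `xyz`, which is the behaviour whose correctness is debated in the rest of this bug.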
Blocks: 234547
Bug 226502 may be related.
Product: MailNews → Core
has patch, needs owner :) will be challenging to find dupes. bug 317263?
Assignee: mscott → nobody
Severity: critical → major
Component: MailNews: Backend → MailNews: MIME
QA Contact: esther → mime
Summary: Message body contents are not displayed. → Message body contents are not displayed when Content-Type header is folded, doesn't handle whitespace
Product: Core → MailNews Core
ran testcase, still fails.
Whiteboard: [patchlove][needs owner]
Attachment #56241 - Flags: superreview?(bienvenu)
Attachment #56241 - Flags: review?(bienvenu)
this looks like the right thing to do, actually. But we need an hg patch to start with, and one that doesn't have tabs, etc.
Comment on attachment 56241 (remove rfc822's line continuations): we should also have the while and if clauses on their own lines. I can try this in my own tree and see what happens. We'd also want a test case for this.
Flags: in-testsuite?
Comment on attachment 56241 (remove rfc822's line continuations): the patch doesn't apply, and if I fix it to apply, and then fix it to compile by using NS_IsAsciiWhitespace, it still doesn't work. In fact, this code doesn't seem to get hit.
Attachment #56241 - Flags: superreview?(bienvenu)
Attachment #56241 - Flags: superreview-
Attachment #56241 - Flags: review?(bienvenu)
Attachment #56241 - Flags: review-
Keywords: helpwanted
My suspicion is that you'd want to fix MimeHeaders_get to strip continuations correctly.
Comment on attachment 56241 (remove rfc822's line continuations): Obsoleting the patch due to rejected review.
Attachment #56241 - Attachment is obsolete: true
Denis, any chance of an updated patch?
(In reply to comment #15)
> Denis, any chance of an updated patch?
We won't be hearing from Denis; his address bounces.
Keywords: testcase
I lost the password for my old account, so I could not update it with my new email address. I'm surprised this bug wasn't fixed in the 8 years since I left active work on Mozilla. :-) I cannot promise an updated patch anytime soon; to make one, I will have to learn how to develop Mozilla again.
Attached patch: stip line continuations (obsolete) (deleted)
MimeHeaders_get is not used to strip continuations for the boundary parameter. In MimeMultipart_initialize() we have:
  ct = MimeHeaders_get (object->headers, HEADER_CONTENT_TYPE, PR_FALSE, PR_FALSE);
  mult->boundary = (ct
                    ? MimeHeaders_get_parameter (ct, HEADER_PARM_BOUNDARY, NULL, NULL)
                    : 0);
And in MimeHeaders_get_parameter we have:
  rv = mimehdrpar->GetParameterInternal(header_value, parm_name, charset,
                                        language, getter_Copies(result));
It is nsMIMEHeaderParamImpl::GetParameterInternal that strips continuations improperly:
  // if the parameter spans across multiple lines we have to strip out the
  // line continuation -- jht 4/29/98
  nsCAutoString tempStr(valueStart, valueEnd - valueStart);
  tempStr.StripChars("\r\n");
  *aResult = ToNewCString(tempStr);
  NS_ENSURE_TRUE(*aResult, NS_ERROR_OUT_OF_MEMORY);
  return NS_OK;
The attached patch fixes the problem. Note, however, that I haven't hacked on Mozilla for the last 8 years, so this patch most likely won't be acceptable as is :-)
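To make the effect of stripping only "\r\n" concrete, here is a rough standalone sketch. `GetQuotedParam` is a hypothetical, heavily simplified stand-in and not the real nsMIMEHeaderParamImpl::GetParameterInternal: it assumes a double-quoted value, a case-sensitive parameter name, and no backslash escapes, and it does not guard against the name matching inside another parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the value of a quoted parameter with CR
// and LF removed. Every other character, including the spaces on either
// side of a fold, is left untouched, mirroring StripChars("\r\n").
std::string GetQuotedParam(const std::string& header,
                           const std::string& name) {
  assert(!name.empty());
  const std::string key = name + "=\"";
  const std::size_t start = header.find(key);
  if (start == std::string::npos) return std::string();

  const std::size_t valueStart = start + key.size();
  const std::size_t valueEnd = header.find('"', valueStart);
  if (valueEnd == std::string::npos) return std::string();

  std::string value;
  for (std::size_t i = valueStart; i < valueEnd; ++i) {
    if (header[i] != '\r' && header[i] != '\n') value += header[i];
  }
  return value;
}
```

For a value folded as `abc<SP><CRLF><SP>xyz`, this returns `abc`, two spaces, then `xyz`, so a body that uses a delimiter with a single space (or none) no longer matches.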
Denis, please attach a patch that excludes ogg stuff - I don't think ogg relates to MIME, does it? :) Also, please do a `hg diff` or `hg export` from the comm-central folder, not the comm-central/mozilla folder. After that, you should be set to request review from bienvenu. I could get the patch to work once I worked around these. :)
Also, assigning to Denis.
Assignee: nobody → DAntrushin
Status: NEW → ASSIGNED
Whiteboard: [patchlove][needs owner] → [patchlove]
(In reply to comment #6)
> Did I understand rfc2822 right: folding whitespace
> ([*WSP CRLF] 1*WSP) is semantically equivalent to just whitespace
> (even inside of quoted string)? In that case:
My understanding of the folding/unfolding of message headers defined by RFC 2822 is as follows.
- Folding: insert a [CRLF] before a (single) WSP. Unfolding: remove the [CRLF] before a WSP (the single WSP should be kept).
- The [CRLF] for folding can be inserted before any WSP in a message header, although the RFC recommends inserting it at a WSP that delimits higher-level syntax, e.g.
  Content-Disposition: attachment; filename="abc xyz.txt"
  => attachment;[CRLF] filename="abc xyz.txt"
  instead of attachment; filename="abc [CRLF] xyz.txt"
- Interpretation of a folded message header should be done after unfolding.
So I think the message header has to be interpreted as follows.
> Content-Type: multipart/alternative; boundary="=_alternative 0011E5AD86256AC0_=".
Denis Antrushin, what do you mean by "folding whitespace"? Is this bug about making Tb tolerate a message header that was wrongly created by a bug in an old mailer or an old mail server? The quirk for this bug would cause a real problem with the following VALID header, which has spaces in the boundary delimiter.
> Content-Type: multipart/xxx; boundary=" ... ...[CRLF] ..."[CRLF]
> (boundary delimiter line)
> [CRLF]-- ... ... ...[CRLF]
Please note that a space is a valid character in a boundary delimiter.
> http://tools.ietf.org/html/rfc2046
> boundary := 0*69<bchars> bcharsnospace
> bchars := bcharsnospace / " "
> bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
>                  "+" / "_" / "," / "-" / "." /
>                  "/" / ":" / "=" / "?"
AFAIK, the following quirk exists: "remove the WSP for folding and the following spaces" in name="abc[CRLF] def.txt" in the Content-Type: header. (I don't know about filename in Content-Disposition:.) AFAIR, the reason for the quirk was that such headers were generated by MS's software. So I'm not opposed to implementing a quirk for this bug's case. However, breakage in the VALID header case above should be taken care of, because the quirk for this bug would apparently make Tb violate the RFC for the VALID header above. Note: the quirk on the name parameter won't produce a real problem, because it is a quirk on a file name. Is the quirk for this bug still required? Do many mailers still send mail with this bug's header? Note: the header was generated by a Beta of the first version of Lotus Notes in 2001.
> X-Mailer: Lotus Notes Build M10_08082001 Beta 3 August 08, 2001
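The RFC 2046 grammar quoted above translates fairly directly into a check. The following is a sketch only; `IsBcharNoSpace` and `IsValidBoundary` are illustrative names and not part of any Mozilla API. It encodes `0*69<bchars> bcharsnospace`: between 1 and 70 characters, spaces allowed anywhere except in the last position.

```cpp
#include <string>

// bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" / "," /
//                  "-" / "." / "/" / ":" / "=" / "?"
bool IsBcharNoSpace(char c) {
  const bool digit = c >= '0' && c <= '9';
  const bool alpha = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
  static const std::string kPunct = "'()+_,-./:=?";
  return digit || alpha || kPunct.find(c) != std::string::npos;
}

// boundary := 0*69<bchars> bcharsnospace
bool IsValidBoundary(const std::string& boundary) {
  if (boundary.empty() || boundary.size() > 70) return false;
  for (char c : boundary) {
    if (c != ' ' && !IsBcharNoSpace(c)) return false;
  }
  // The final character must not be a space.
  return boundary.back() != ' ';
}
```

Under this check a boundary with embedded or leading spaces is valid, which is the reason a blanket "remove spaces" quirk could break conforming mail.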
(In reply to comment #21)
> My understanding of folding/unfolding of message header defined by RFC 2822 is
> as follows.
> - Folding : Insert a [CRLF] before a(single) WSP
Section 2.2.3 says:
  The general rule is that wherever this standard allows for folding white space (not simply WSP characters), a CRLF may be inserted before any WSP.
Section 3.2.3 defines folding whitespace (FWS) as
  FWS = ([*WSP CRLF] 1*WSP) / ; Folding white space
        obs-FWS
  obs-FWS = 1*WSP *(CRLF 1*WSP) ; obsolete FWS
I.e., there may be any amount of whitespace before the CRLF and at least one whitespace character after it.
> Unfolding: Remove [CRLF] before a WSP (the single WSP should be kept)
> - [CRLF] for folding can be inserted before any WSP in message header,
Section 2.2.3:
  Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP.
but in Section 3.2.3 we read:
  Runs of FWS, comment or CFWS that occur between lexical tokens in a structured field header are semantically interpreted as a single space character.
Does it mean that the _whole_ FWS is interpreted as a single space?
> Denis Antrushin, what do you mean by "folding whitespace"?
This is a term from RFC 2822.
> Is this bug for tolerance of Tb with such wrongly created message header
> by bug of old mailer or old mail server?
I have no idea.
> Is quirks by this bug still required? Do many mailer still send mail of this
> bug's header?
> Note: The header was generated by Beta of first version of Lotus Notes
> in 2001.
> > X-Mailer: Lotus Notes Build M10_08082001 Beta 3 August 08, 2001
I have no idea, either. I've been out of Mozilla development for 8 years and was surprised to see this bug still open and to be asked for an updated patch :-) Also, the two bugs mentioned in this report (234547 and 317263) look like different issues to me.
Thanks for pointing out the RFC text. As you say, the question is whether (i) / (ii) below are true or false.
(i) Spaces in a quoted string are "folding white space".
(ii) Spaces in a quoted string are "Runs of FWS, comment or CFWS that occur between lexical tokens in a structured field header".
In any case, (A) below should be interpreted after it has been converted to (B) by unfolding.
> (A) Content-Type: multipart/xxx; boundary=" ... [CRLF] ..."[CRLF]
> (B) Content-Type: multipart/xxx; boundary=" ... ..."[CRLF]
Because I have seen many instances of Content-Type: aa/bb; name="xx[CRLF] yy.ext"[CRLF] in bugs, and I have not seen "RFC violation" mentioned in such bugs, I think (i) is true. Because the space is inside a quoted string, which is a single token (semantically the same as a word), I think (ii) is false. However, if { number of mails with (P) >> number of mails with (Q) } && { number of mails with (Q) is negligible } && { number of mails with (P) is still not so small }, the quirk for this bug is practically acceptable.
                             Boundary in Content-Type:          Used boundary
(P) This bug's case        : boundary="abc [CRLF] xyz"[CRLF]    --abcxyz
(Q) Sample in Comment #9   : boundary=" abc [CRLF] xyz"[CRLF]   -- abc xyz
(R) Apparently valid one   : boundary=" abc xyz"[CRLF]          -- abc xyz
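Whether a given unfolding yields a usable boundary comes down to whether the body's delimiter lines match it exactly. A small sketch of that comparison, assuming the line has already had its CRLF removed; `IsDelimiterLine` is a hypothetical name, and the tolerance for trailing whitespace follows my reading of the transport-padding rule in RFC 2046:

```cpp
#include <cstddef>
#include <string>

// True when `line` is "--" + boundary, optionally followed by "--"
// (closing delimiter) and then optional trailing spaces or tabs.
bool IsDelimiterLine(const std::string& line, const std::string& boundary) {
  const std::string open = "--" + boundary;
  if (line.compare(0, open.size(), open) != 0) return false;

  std::size_t pos = open.size();
  if (line.compare(pos, 2, "--") == 0) pos += 2;  // closing delimiter
  while (pos < line.size() && (line[pos] == ' ' || line[pos] == '\t')) {
    ++pos;
  }
  return pos == line.size();
}
```

In case (P), no standard unfolding of the header produces `abcxyz`, so `--abcxyz` only matches if a quirk removes the spaces; in case (R), the same quirk would turn a valid boundary into one that no longer matches.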
Dennis, will you be following up on the draft patch? current procedure is https://developer.mozilla.org/En/Developer_Guide/How_to_Submit_a_Patch
Whiteboard: [patchlove] → [patchlove][has draft patch]
Summary: Message body contents are not displayed when Content-Type header is folded, doesn't handle whitespace → Message body contents are not displayed when Content-Type header is folded, doesn't handle whitespace (boundary="abc [CRLF] xyz"[CRLF] is specified, but --abcxyz is used for boundary line in mail)
I can update the patch, but what about the concerns expressed in comments #21 and 23? If the root of the problem is a broken Lotus mailer and a fix could break valid messages, do we want to fix it? The bug is 8 years old, seems not to have caused much trouble to anyone except the submitter, and has no votes. Also note that a fix for this bug won't fix bug 234547.
(In reply to comment #25)
> I can update the patch, but what about concerns expressed in comments #21 and
bienvenu / dmose, ping &/or thoughts?
Comment on attachment 398222 (stip line continuations): Adding review flags to get this onto David's radar.
Attachment #398222 - Flags: superreview?(bienvenu)
Attachment #398222 - Flags: review?(bienvenu)
I see the same problem. With the latest Thunderbird 3.0 (and previous 2.x versions), a MIME boundary containing commas (and other non-alpha chars) doesn't get decoded. The MIME message is just shown inline as text, not decoded. It comes from someone using a Nokia E65 phone.
2 messages okay:
boundary="EPOC32-z1G82Qp+YQsMY44N0z3DsrBDJ5L2s9mKj+Ykc140ms_jxhkn"
boundary="EPOC32-Lx73B+hVbJ5wSRpXTlLKyjwD8YWVf1hg8bHk+'Tssd4tJHbT"
5 messages not okay (note the one without a comma in the boundary; is a - at the end of the boundary a problem?):
boundary="EPOC32-GQ-K-kR,QDS7Gvzz5VbDXdBbM477'73Sc_wWCYc,nmD9fKgf"
boundary="EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6"
boundary="EPOC32-0VSK1,9DqPMqGctWX_PyNxx,2+FmXKn5b1DM9_K7yclDZlst"
boundary="EPOC32-yMNqRYYmV'jRmXlDTj7rtscB8H9mtYrjHfxb0JFc7VyYK47-"
boundary="EPOC32-WnvGdp2s03kDPHxjvTW,khRc1,4_C-q1BWv_lMZdg6_-CwBs"
No. I'm wrong. It's not the same problem. Sorry! The problem I see seems to be because the email headers have a blank line inside the To: header line. Email readers Thunderbird/Evolution/Outlook/mutt display the message body starting after the blank line in the To: line, and the MIME message type is not detected. nokia + hotmail munpack extracts attachments okay.
>From muh Tue May 11 16:00:42 2010
Return-path: <muh@meh.mah>
Envelope-to: muh@meh.mah
Delivery-date: Tue, 11 May 2010 16:00:42 +0100
Received: from meh.hotmail.com ([meh.mah.meh.moo]) by dspsrv.com with esmtp (Exim 4.71) (envelope-from <muh@meh.mah>) id 1OBqwr-0008WG-4B for muh@meh.mah Tue, 11 May 2010 16:00:42 +0100
Received: from meh ([meh.mah.meh.moo]) by meh.hotmail.com with Microsoft SMTPSVC(meh.mah.meh.moo); Tue, 11 May 2010 08:00:39 -0700
X-Originating-IP: [meh.mah.meh.moo]
X-Originating-Email: muh@meh.mah
Message-ID: <muh@meh.mah>
Received: from [meh.mah.meh.moo] ([meh.mah.meh.moo]) by meh.hotmail.com over TLS secured channel with Microsoft SMTPSVC(meh.mah.meh.moo); Tue, 11 May 2010 08:00:24 -0700
From: muh@meh.mah
Reply-to: muh@meh.mah
To: <muh@meh.mah>, <muh@meh.mah>, <muh@meh.mah>,
X-OriginalArrivalTime: 11 May 2010 15:00:26.0865 (UTC) FILETIME=[B0A6DE10:01CAF11A]
Date: 11 May 2010 08:00:26 -0700
<muh@meh.mah>, <muh@meh.mah>, <muh@meh.mah>, <muh@meh.mah>, <muh@meh.mah>,<muh@meh.mah>, <muh@meh.mah>, <muh@meh.mah>, <muh@meh.mah>, <muh@meh.mah>
Subject: Does this mean
Date: Tue, 11 May 2010 16:00:19 +0100
Message-ID: <muh@meh.mah>
X-Mailer: EPOC Email Version 2.10
MIME-Version: 1.0
Content-Language: i-default
Content-Type: multipart/mixed; boundary="EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6"
This is a MIME Message
--EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
text of message muhed mehed and mahed too
--EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6
Content-Type: image/jpeg
Content-Disposition: attachment; filename="11052010.jpg"
Content-Transfer-Encoding: base64
/9j/4RusRXhpZgAASUkqAAgAAAAIAA8BAgAGAAAAbgAAABABAgAEAAAARTYz
ABIBAwABAAAAAQAAABoBBQABAAAAdAAAABsBBQABAAAAfAAAACgBAwABAAAA
AgAAABMCAwABAAAAAQAAAGmHBAABAAAAhAAAAKoBAABOb2tpYQAsAQAAAQAA
.
.
9pceYAFO7jp7VV1NAkxKD7wzg1E04yEmmivbSOjZwRnqPWr28FVPbINataXI
vpY//9k=
--EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6--
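The behaviour described in this comment follows from the usual rule that the header block ends at the first empty line. A minimal sketch of that split, assuming CRLF line endings throughout; `SplitMessage` and `SplitAtFirstBlankLine` are illustrative names only, and the addresses in the usage note are made up:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct SplitMessage {
  std::string headers;  // everything up to and including the last header CRLF
  std::string body;     // everything after the first empty line
};

// Splits a raw message at the first CRLF CRLF. If there is no empty
// line, the whole input is treated as headers.
SplitMessage SplitAtFirstBlankLine(const std::string& raw) {
  const std::string blankLine = "\r\n\r\n";
  const std::size_t pos = raw.find(blankLine);
  if (pos == std::string::npos) {
    return SplitMessage{raw, std::string()};
  }
  SplitMessage result{raw.substr(0, pos + 2),
                      raw.substr(pos + blankLine.size())};
  // Only the CRLF that forms the empty line itself is dropped.
  assert(result.headers.size() + 2 + result.body.size() == raw.size());
  return result;
}
```

If an empty line appears in the middle of a folded To: header, everything after it, including the Content-Type header, lands in `body`, so the multipart structure is never detected.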
Comment on attachment 398222 (stip line continuations): sorry for the delay. This is core necko code, so I can't technically review it.
Attachment #398222 - Flags: superreview?(cbiesinger)
Attachment #398222 - Flags: superreview?(bienvenu)
Attachment #398222 - Flags: review?(cbiesinger)
Attachment #398222 - Flags: review?(bienvenu)
I am noticing something similar in bug #574155, where the Content-Type appears to have a line feed and spaces before the data, i.e.: Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document; name="DocumentName.docx" This causes TB to display a binary representation of the attachment in the message body; also, when opening the attachment, notepad.exe is used. The message in question was sent from SquirrelMail 1.4.19; messages sent from TB itself do not have the line feed + space issue. A bug has been filed with SM as I am not sure where the issue is. -Ron
I believe Dennis was more on track with his reading of the RFC. I think WADA has an incorrect understanding of what to do with spaces after the CRLF in a fold. The [CRLF] and any following whitespace should be treated as a single space. It's more clear (with examples that speak to this issue) in RFC 822, section 3.1.1. I realize that has been obsoleted by 2822, but I think that section is still relevant and puts this question to rest. @Ron - this bug is a pretty good indicator that your issue is with Thunderbird
On more detailed reading, I think the problem is that the RFCs are simply unclear and contradictory. RFC 822 (section 3.1.1) is contradictory wherein its examples show that you can add any number of spaces after the CRLF and it is supposedly identical to a single space, but then it goes on to state that "Unfolding is accomplished by regarding CRLF immediately followed by a LWSP-char as equivalent to the LWSP-char." RFC 2822 is just as contradictory, as section 2.2.3 states "Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP." However, the definition of FWS is: ([*WSP CRLF] 1*WSP), which implies that a fold is any trailing spaces on the first line, the CRLF and any spaces after that (which should all be removed when unfolding). I tend to think that the implied meaning is that unfolding is accomplished by removing the CRLF and all spaces both before and after it, but I'm not sure how many clients do this. SquirrelMail does not. WADA believes Thunderbird should not. Who knows.
(In reply to comment #33)
FYI. Mail data attached to comment #0:
> X-Mailer: Lotus Notes Build M10_08082001 Beta 3 August 08, 2001
> Date: Thu, 6 Sep 2001 22:15:29 -0500
The following is Comment #4 by Denis Antrushin on 2001-11-02.
> According to rfc822, (unfolded) boundary value should be
> "=_alternative 0011E5AD86256AC0_="
> (CRLF and all spaces at the beginning of next line are replaced with single space),
> while in mozilla it's
> "=_alternative 0011E5AD86256AC0_="
> (extra spaces are not removed.
RFC 2822:
> Request for Comments: 2822 QUALCOMM Incorporated
> Obsoletes: 822 April 2001
> Category: Standards Track
"Lotus Notes Build M10_08082001 Beta 3 August 08, 2001" looks to have used RFC822 for folding, with the bug of an "excess space just before the [CRLF] inserted for folding". "RFC822 folding/unfolding or RFC822 folding/unfolding" was possibly an option in Lotus Notes, because Lotus Notes has an option for "Return-Receipt-To:" or "Disposition-Notification-To:". Mozilla at 2001-11-02 apparently applied RFC2822 instead of RFC822 to unfolding. RFC2822 defines the folding pattern produced by RFC822 and refers to problems in RFC822 folding.
> 4.2. Obsolete folding white space
> In the obsolete syntax, any amount of folding white space MAY be
> inserted where the obs-FWS rule is allowed. This creates the
> possibility of having two consecutive "folds" in a line, and
> therefore the possibility that a line which makes up a folded header
> field could be composed entirely of white space.
> obs-FWS = 1*WSP *(CRLF 1*WSP)
paul@squirrelmail.org, your knowledge of header folding/unfolding looks to be based on RFC822. Please note that a boundary line of --=_alternative0011E5AD86256AC0_= is definitely an RFC violation on the mail sender's side even if RFC822 unfolding is applied,
> RFC822 : boundary="=_alternative 0011E5AD86256AC0_="
> RFC2822 : boundary="=_alternative 0011E5AD86256AC0_="
although the apparent bug of the old Lotus Notes Beta is "adding a space before the [CRLF] for RFC822 folding".
For the quirk for this bug: if the pattern is like the following, an automatic quirk of "apply RFC822 unfolding" plus a quirk of "remove space(s) just before the [CRLF] for RFC822 folding" is possible.
> Content-Type: xxx/yyy; boundary="abcdefg[SP][CRLF]
> [SP] ... [SP][CRLF] <== RFC violation, because space only line is invalid.
> [SP] ... [SP][Non-SP-chars]";[SP][CRLF]
> [SP] ... [CRLF]
However, with the following it's impossible to know which folding was used, and it's impossible to know whether the space before the [CRLF] is a valid one or garbage from a mailer bug.
> Content-Type: xxx/yyy; boundary="abcdefg[SP][CRLF]
> [SP] ....[SP][Non-SP-chars]"[CRLF]
If "RFC822 unfolding" + "quirk for space(s) just before [CR]" is still required, I think a folder option like the following one (for wrong charset) is better.
> [?] Apply default to all messages in the folder (individual message
> character encoding settings and auto-detection will be ignored)
My questions are:
- Is "any number of added spaces after [CRLF] in RFC822 folding" really applicable to RFC822 folding within quoted text, such as the value of the boundary parameter?
- Is "application of RFC822 unfolding" still required for very old mails?
- Is the quirk of "remove space(s) just before [CRLF] for RFC822 folding" still required?
- If a quirk is still required, can it be "remove any spaces in the boundary parameter" after RFC2822 unfolding?
  Folder Properties: [?] Remove space(s) in boundary parameter of multipart for tolerance of headers folded by RFC822 folding.
With this kind of quirk, it can be applied to the bug 234547 case too. I guess the (number of modern mailers that use a space in the boundary line) is far smaller than the (number of buggy mailers that produce problems like bug 234547).
Reply to comment #34
> "Lotus Notes Build M10_08082001 Beta 3 August 08, 2001" looks to have used
> RFC822 for folding, with bug of "excess space just before inserted [CRLF] for
> folding".
No, I think that's an incorrect interpretation of that header. I think they were trying to follow RFC2822. See below.
> paul@squirrelmail.org, your knowledge about header folding/unfolding looks
> based on RFC822.
Why would you say that when I in fact quoted both 822 and 2822? Please read with care.
> > RFC822 : boundary="=_alternative 0011E5AD86256AC0_="
> > RFC2822 : boundary="=_alternative 0011E5AD86256AC0_="
You put these here like this is the unquestioned way to unfold. My point is that there is NOT a clear definition of how to unfold: depending on the section of the RFC you are reading (either 822 OR 2822), you can make a case that all white space around a CRLF should be removed or that only one white space after the CRLF should be removed (or replaced with a single space). So I believe that your claim about how to unfold could be argued to be wrong (in fact I think your interpretation of RFC822 unfolding IS wrong). As I see it, these are the possible interpretations of how to unfold that header, depending on how you read the RFCs:
RFC822 : "=_alternative 0011E5AD86256AC0_=" ("CRLF WSP ==> WSP"; per section 3.1.1, last sentence of 2nd to last paragraph)
RFC822 : "=_alternative 0011E5AD86256AC0_=" ("CRLF 1*WSP ==> WSP"; per section 3.1.1 examples)
RFC2822 : "=_alternative 0011E5AD86256AC0_=" ("CRLF WSP ==> WSP" ("CRLF is invisible"); per section 2.2.3 and most of section 3.2.3)
RFC2822 : "=_alternative 0011E5AD86256AC0_=" ("*WSP CRLF 1*WSP ==> WSP"; per last paragraph of section 3.2.3)
However, because the RFCs are both contradictory, I can't say which is right or which is wrong. My FEELING is that the "CRLF is invisible" approach is best in that it allows the recipient to respect what the sender was doing with multiple spaces (as long as the spaces aren't fluff). Otherwise, intentional spacing near a fold gets munged. In the specific case of this old Lotus Notes header, the extra spaces are in fact fluff, which is THEIR problem IMO. In that sense, I would say that this bug is INVALID and Thunderbird's current behavior is RFC-CORRECT.
> although apparent bug of old Lotus Notes Beta is "adding a space before [CRLF]
> for RFC822 folding".
This is NOT a bug in the sense that you suggest! I believe this was probably INTENTIONAL. That space is NOT "added." It is NOT "garbage" as you suggest. The space at the end of the line before the fold corresponds to the actual space that is in the real boundary string that is used later. The fold happens after the space. I think they left it on the end of the line before the fold so that it would not be removed by greedy unfolding (removal of all white space after CRLF). It is part of the boundary string. They seem to assume that unfolding would be "remove CRLF 1*WSP", which is, as far as I can tell, a misunderstanding of section 2.2.3, specifically "Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP," where they assumed removal of the CRLF AND the WSP, even though I believe the RFC is saying only the CRLF gets removed (which is further backed up in section 3.2.3). HOWEVER, their assumption/misunderstanding of section 2.2.3 ends up being a semi-valid way to read RFC2822, in that it gives you the same result as what is described by the last paragraph of section 3.2.3: "Runs of FWS, comment or CFWS that occur between lexical tokens in a structured field header are semantically interpreted as a single space character." Given that the date on the Lotus Notes version in use is after the release of RFC2822, this is a plausible explanation.
> - "any number of added spaces after [CRLF] in RFC822 folding" is really
> applicable to RFC822 folding within quoted text as value of boundary
> parameter?
1) RFC822 section 3.1.1 is indeterminate on this point
2) I believe Lotus was following RFC2822, not RFC822
> - "application of RFC822 unfolding" is still required for very old mails?
That seems like a rat's nest; moreover, it's probably not possible to detect the difference, especially considering that some clients might be adding extra spaces on purpose or for fluff.
> - quirks of "remove space(s) just before [CRLF] for RFC822 folding" is still
> required?
You misunderstand what Lotus was doing. This is not a "quirk" or a "bug" per se. They were trying to follow RFC2822. I think this concern is unfounded and should be dropped. Keep in mind that although you assume there is only one way to unfold in a RFC-2822-compliant manner, this is not necessarily the case. So there is still another open question as to what the best way to unfold per RFC2822 is. My gut feeling is that Thunderbird is already doing the right thing. (My prior comment @Ron thus has to be taken back; SquirrelMail has a small bug if this is the case.) It would be interesting to see an additional header added to tell recipients how to unfold:
X-HEADER-UNFOLDING: Remove CRLF
X-HEADER-UNFOLDING: Remove CRLF WSP
X-HEADER-UNFOLDING: Remove CRLF WSP+
X-HEADER-UNFOLDING: Remove WSP+ CRLF WSP
X-HEADER-UNFOLDING: Remove WSP+ CRLF WSP+
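The competing readings discussed above differ only in how much whitespace around the CRLF is consumed, which is easy to see side by side. The sketch below is illustrative: `UnfoldPolicy` and `Unfold` are hypothetical names, the three policies are my paraphrase of the interpretations listed in this comment, and the section references in the code comments are the ones cited in the thread.

```cpp
#include <cassert>
#include <regex>
#include <string>

enum class UnfoldPolicy {
  kRemoveCrlf,      // CRLF WSP        -> WSP (RFC 2822 section 2.2.3)
  kCollapseAfter,   // CRLF 1*WSP      -> SP  (RFC 822 section 3.1.1 examples)
  kCollapseAround,  // *WSP CRLF 1*WSP -> SP  (RFC 2822 section 3.2.3)
};

// Unfolds one header field according to the chosen reading. Only a
// CRLF that is followed by a space or tab is treated as a fold.
std::string Unfold(const std::string& field, UnfoldPolicy policy) {
  switch (policy) {
    case UnfoldPolicy::kRemoveCrlf:
      return std::regex_replace(field, std::regex("\r\n(?=[ \t])"), "");
    case UnfoldPolicy::kCollapseAfter:
      return std::regex_replace(field, std::regex("\r\n[ \t]+"), " ");
    case UnfoldPolicy::kCollapseAround:
      return std::regex_replace(field, std::regex("[ \t]*\r\n[ \t]+"), " ");
  }
  assert(false && "unknown policy");
  return field;
}
```

For a value folded as `abc<SP><CRLF><SP><SP>xyz`, the three policies leave three, two, and one space respectively between `abc` and `xyz`, so a sender and a receiver that pick different readings will disagree about the boundary string.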
Comment on attachment 398222 (stip line continuations): Denis, I couldn't get the patch to apply cleanly as-is; were the liboggplay changes for FreeBSD intended to be in this patch as well? Moreover, the nsMIMEHeaderParamImpl.cpp file now seems to be in the mozilla/netwerk/mime/ directory. Up for a new patch? :)
Attachment #398222 - Attachment is obsolete: true
Attachment #398222 - Flags: superreview?(cbiesinger)
Attachment #398222 - Flags: review?(cbiesinger)
I don't think a Thunderbird patch should be what you want. As I said in my last (long) comment #35, "I would say that this bug is INVALID and Thunderbird's current behavior is RFC-CORRECT." It is my opinion that if you find this "problem" with Thunderbird viewing emails, you need to contact the authors of the email client that was used to compose the offending message.
This bug ended up in Wayne's TheList, hence I am reviewing it, but looking over the content it seems to me that this really needs a ruling about whether a fix would even be accepted. :squib and/or :jcranmer, could you review this and make some comments about whether it makes sense to fix this or not?
What is the intent of the RFC? Quoting from 5322, 2.2.3:
  The general rule is that wherever this specification allows for folding white space (not simply WSP characters), a CRLF may be inserted before any WSP. [...] Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP.
The confusion comes from RFC822's ambiguous definition for folding, the BNF for FWS, and this paragraph in 3.2.3:
  Runs of FWS, comment, or CFWS that occur between lexical tokens in a structured header field are semantically interpreted as a single space character.
If we concern ourselves solely with the legal definitions in RFC 2822 and 5322 (they are, to my knowledge, equivalent), then the interpretation that comes out is that continuations should be dealt with by simply stripping CRLF (string.replace(/\r|\n/g, ''), basically). The intention in section 3.2.3 is to remind us that spaces and comments in structured headers are merely separators for the actual tokens (think whitespace and comments in C or C++) and have no semantic meaning whatsoever. FWS can occur within a lexical token, namely quoted strings, but as is mentioned earlier, folding requires that CRLF be stripped. Given that I see several instances in RFC 5322 which state emphatically that CRLF (within FWS) is semantically invisible, it seems to me that the intent is that you could preprocess headers by deleting all CRLF from the header and there would be no semantic difference. Now, if there were clear evidence that CRLF + WSP* -> SP is the more common assumption and is necessary for compatibility, I would be persuaded to implement it. The lack of duplicates, votes, or complaining comments on this bug suggests to me that its impact is relatively minor, though.
Wayne: jcranmer says, which I tend to agree with, "The lack of duplicates, votes, or complaining comments on this bug suggests to me that its impact is relatively minor, though." So why is this in The List?
(In reply to Kent James (:rkent) from comment #40)
> So why is this in The List?
Most likely because it had some of the attributes we are looking for: testcase, above-average severity, good discussion. Plus a draft patch... it looked ready to roll. Beyond that, why I would have chosen it instead of other MIME bugs would have been highly subjective, perhaps even random, especially given that it's highly unlikely I read all of the bug.
It's been two months since I made comment 39, and no evidence has been forthcoming that this bug is worth fixing. In lieu of such information, I am marking this as WONTFIX.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #36)
> Comment on attachment 398222
> stip line continuations
>
> Denis, I couldn't get the patch to apply cleanly as-is, were the liboggplay
> changes for FreeBSD intended to be in this patch as well?
>
> Moreover the nsMIMEHeaderParamImpl.cpp file now seems to be in the
> mozilla/netwerk/mime/ directory.
>
> Up for a new patch? :)
Would it make sense to refresh (un-bit-rot?) this patch, and try again? I don't see why not. Age of the bug is not enough to disregard it.