Closed Bug 574989 Opened 14 years ago Closed 8 years ago

Malformed multipart/alternative message with text/html part first only displays text/plain

Categories

(MailNews Core :: MIME, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 49.0

People

(Reporter: mark, Assigned: bugzilla)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 5 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100611 Firefox/3.6.4 ( .NET CLR 3.5.30729; .NET4.0C) Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100608 Thunderbird/3.1 If a message is multipart/alternative with a text/html alternative part first followed by the text/plain alternative part, Thunderbird displays only the text/plain part regardless of the View -> Message Body As setting. Granted Thunderbird's behavior is strictly compliant with sec. 5.1.4 of RFC 2046, but given that the text/html followed by text/plain sub-part ordering within a multipart/alternative part is an obvious and correctable error and given that MUAs[1] exist that create these messages, I think Thunderbird should be more flexible in this situation. [1] At least some versions of Lotus Notes do this. The particular message that prompts this bug report is a reservation confirmation letter from alaskaair.com. Reproducible: Always Steps to Reproduce: 1. Open the following message for example: Return-Path: <mark@msapiro.net> X-Original-To: mark@ms2.msapiro.net Delivered-To: msapiro_mark@sbh16.songbird.com Received: from msapiro.net (msapiro.net [68.183.193.239]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sbh16.songbird.com (Postfix) with ESMTP id 9E92269008C for <mark@ms2.msapiro.net>; Sat, 26 Jun 2010 12:53:11 -0700 (PDT) Received: from [127.0.0.1] (helo=msapiro) by msapiro.net with smtp (Exim 4.69) (envelope-from <mark@msapiro.net>) id L4N0KN-0002FS-8M for mark@ms2.msapiro.net; Sat, 26 Jun 2010 12:53:11 -0700 MIME-Version: 1.0 From: Mark <mark@msapiro.net> To: MARK@MSAPIRO.NET Date: 26 Jun 2010 12:08:14 -0700 Subject: Test mail Content-Type: multipart/alternative; boundary=--boundary_130_69e35cfc-9aec-494a-b58c-5441f10e5534 Message-Id: <20100626012214.389F56900AA@sbh16.songbird.com> X-GPC-MailScanner-ID: 9E92269008C.AF4B6 X-GPC-MailScanner: Found to be clean X-GPC-MailScanner-SpamCheck: not spam, SpamAssassin (cached, score=-3.049, required 5, autolearn=not spam, BAYES_00 -0.75, HTML_MESSAGE 0.00, RCVD_IN_DNSWL_MED -2.30) X-GPC-MailScanner-From: mark@msapiro.net X-Spam-Status: No ----boundary_130_69e35cfc-9aec-494a-b58c-5441f10e5534 Content-Type: text/html; charset=utf-8 <html> <head></head> <body> This is <b>the HTML</b> part </body> </html> ----boundary_130_69e35cfc-9aec-494a-b58c-5441f10e5534 Content-Type: text/plain; charset=utf-8 This is the plain text part ----boundary_130_69e35cfc-9aec-494a-b58c-5441f10e5534-- 2. Select View -> Message Body As -> Original HTML Actual Results: The text/plain part is displayed Expected Results: The text/html part is displayed
(In reply to comment #0) > I think Thunderbird should be more flexible in this situation. I don't think so. A way to see all parts in multipart/alternative is sufficient for malformed multipart/alternative mail, which is very rare compared with number of mails user daily receives. If ad-hoc(per mail, one time) option and/or Tb-wide option to see multipart/alternative as multipart/mixed is provided, user can check contents via other than View/Message Source. (1) Change multipart/alternative to multipart/mixed internally. (2) Add text/plain part which states "multipart/mixed mode for m.p/alternative" as first part internally. This is usefull too in "different contents in text/plain and text/html" case. This approach can be applied to not-used parts in multipart/related which is other kind of malformation of mail structure produced by buggy mailers. Tb-wide option: [ ] Show multipart/alternative as multipart/mixed. ad-hoc, per mail, one time option: Button to see in alternative or mixed if alternative exists. It's similar concept to "ad-hoc show in HTML button" for Tb-wide View/Message Body As/Plain Text setting, which is requested in other bug.
Severity: normal → enhancement
OS: Windows XP → All
Hardware: x86 → All
Version: unspecified → Trunk
Component: Message Reader UI → MIME
Product: Thunderbird → MailNews Core
QA Contact: message-reader → mime
Not ad-hoc way, but new "All Body Parts" of View/Message Body As, which is global setting, has been implemented by Bug 602718. The "All Body Parts" mode is mode which shows any multipart/xxx as if multipart/mixed, so it's applied to multipart/alternative. Because of this functionality, text/plain and text/html part is probably shown at same time, and both parts are shown as attachment at attachment pane, and, as a result of sort out/clean up by bug 351224, any part can be detached/deleted. Does the new "All Body Parts" fulfill your requirement? Or requesting flexible way to choose part in multiple/alternative? Or requesting feature like Bug 301441?
(In reply to WADA from comment #3) > > Does the new "All Body Parts" fulfill your requirement? It is acceptable to me. I can think of a pathological case that might still not allow a 'correct' rendering - i.e. multipart/alternative multipart/related text/html image/jpeg (referenced by the HTML part) text/plain I think current Tbird will display only the text/plain part regardless of View -> Message Body As selection. If I understand View -> Message Body As -> All Body Parts it will display the text/html and the image/jpeg separately instead of displaying the image as referenced in the HTML. However, this is still better than current, and anyway, I doubt that such a message structure actually would be found in the wild.
> multipart/alternative > multipart/related > text/html > image/jpeg (referenced by the HTML part) > text/plain > I think current Tbird will display only the text/plain part regardless of View -> Message Body As selection. Yes, you are right. Because "Original HTML" and "Simple HTML" is currently mode of "capable to process both html and text", Tb always selects last(most preferred) text/plain part according to RFC. To improve this, next is required in multipart/alternative processing; (1) Start with "capable to process html only" mode when "... HTML" is requested. As html only mode, text/html part is selected. (2) If no text/html part exists, or nothing is written in text/html part, or text/html part is space only, fall back to "capable to process text only" mode, and select text/plain part Tb curremtly shows text converted from text/html part when "Plain Text"(Bug 253830). So, similar change on "Plain Text" is also needed to avoid such bug. (1) Start with "capable to process text only" mode when "Plain Text" is requested. As text only mode, text/plain part is selected. (2) If no text/plain part exists, or nothing is written in text/plain part, or text/plain part is space only, fall back to "capable to process html only" mode, and select text/html part, then do one of next; - Convert to text and show as text, because "Plain Text" is requested Any above change is never RFC violation. Above are RFC compliant behaviour. > If I understand View -> Message Body As -> All Body Parts it will display > the text/html and the image/jpeg separately instead of displaying the image > as referenced in the HTML. Yes. It's because "All Body Parts" is currently mode of "show any multipart/xxx as multipart/mixed" in order to provide generic way to access/detach/delete any part of any malformed or wellformed mail.. To improve malformed multipart/related mail handling, Bug 674473 for multipart/related is already requested. I feel enhancement of "All Body Parts", say "multipart/related+mixed+alternative mode", is better. - Even if multipart/mixed, resolve cid: url as if multipart/related mail and render it. Show all parts as attachment normally. - Even if multipart/related, show non-displayed/non-referred part as attachment, or show all parts as attachment, after render multipart/related normally. (should be capable to choose which, in case of html mail consists of many (small image parts?) - Select and render appropriate part in multipart/alternative normally. Show non-selected part in multipart/alternative as attachment, or show all parts in multipart/alternative as attachment.
This data-loss bug is still present (4 years after reporting. It manifests in Outlook Calendar invites showing up as a blank message in TB with no indication that the Calender include has been ignored (not even turned into an attachment). This fits into the basic rule of thumb ... Be conservative in what you send out - Outlook fails this, since shouldn't be using Multipart/alternative Be liberal in what you accept - TB fails this by ignoring the incorrectly formated, but recognizable message.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Removing myslef on all the bugs I'm cced on. Please NI me if you need something on MailNews Core bugs from me.
RFC 1341 says that the last part should be displayed: https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html 7.2.3 The Multipart/alternative subtype. Clearly this sample does not comply. I'll leave this bug open since the behaviour of TB could be improved here to handle incorrect messages better. This is clearly an enhancement.
Attached patch Prioritize alternative parts better (obsolete) (deleted) — Splinter Review
This patch should fix the issue.
Attachment #8758914 - Flags: review?(mozilla)
Comment on attachment 8758914 [details] [diff] [review] Prioritize alternative parts better Review of attachment 8758914 [details] [diff] [review]: ----------------------------------------------------------------- I'll look at it in detail later, but here are some nits. ::: mail/test/mozmill/message-window/test-view-plaintext.js @@ +94,5 @@ > > + // 6) plain and HTML parts reversed in order > + checkSingleMessage("./test-alt-plain-HTML-reversed.eml", "Plain Text", "HTML Body"); > + // 7) 3 alt. parts with 2 plain and 1 HTML part > + checkSingleMessage("./triple-alt.eml", "PLAIN TEXT VERSION", "HTML VERSION"); Can you use "Plain Text" and "HTML Body" for consistency. Where is the message? You need to run "hg add" to add a new file. Also, please stick to the already existing naming: test-triple-alt.eml in this case. @@ +100,2 @@ > // Now some cases that don't work yet. > + // 7) multipart/related with embedded multipart/alternative We have two cases 7) now.
Attached patch Prioritize alternative parts better (obsolete) (deleted) — Splinter Review
New patch with revised test
Attachment #8758914 - Attachment is obsolete: true
Attachment #8758914 - Flags: review?(mozilla)
Attachment #8758927 - Flags: review?(mozilla)
Assignee: nobody → bugzilla
Status: NEW → ASSIGNED
Comment on attachment 8758927 [details] [diff] [review] Prioritize alternative parts better Review of attachment 8758927 [details] [diff] [review]: ----------------------------------------------------------------- Slowly getting to it ... ;-) ::: mailnews/mime/src/mimemalt.cpp @@ +454,5 @@ > + /* TODO: > + If the user prefers plaintext and if this is a multipart that > + contains a text/plain part (and it is not a converted/downgraded > + text/html part) then the priority should be a bit less than > + PRIORITY_HIGHEST, but above PRIORITY_NORMAL. Do we still need this comment or is this done now? You have PRIORITY_HIGH in between those two now. Am I missing something?
No that comment still stands. I have added a priority between PRIORITY_HIGHEST and PRIORITY_NORMAL (as you said), but this comment is not about which specific priority to use. It is about that multipart messages that contain text/plain should have a bit higher priority than multipart messages that does not contain text/plain when the user prefers plaintext. I still do not know how that can be done, so I let that comment stand.
Just to clarify further, the multipart messages I am taking about here, are sub-parts of the multipart/alternative we are processing in this context.
Could you please add a test for multipart/alternative text/plain multipart/related text/html image text/sanitizer-log (or any other text/xxx). Just copy one of the existing tests and add the text/xxx. This structure multipart/alternative text/plain text/html text/sanitizer-log already works as per bug 101719 comment #27. Also, would you mind changing the boundaries to boundary="------------alternative" etc. as I have done in my tests since I find this better readable than boundary="----=_NextPart_000_0771_01C1466B.F8E13BC0"
Attached patch Prioritize alternative parts better (obsolete) (deleted) — Splinter Review
added new test, and changed existing test
Attachment #8758927 - Attachment is obsolete: true
Attachment #8758927 - Flags: review?(mozilla)
Attachment #8759791 - Flags: review?(mozilla)
Please feel free to add to the tests yourself.
Comment on attachment 8759791 [details] [diff] [review] Prioritize alternative parts better Review of attachment 8759791 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for your work. It looks good, let's go with it. Please address the review comments. Mostly nits (you understand "nit", right?). I'm puzzled by the "rtf" case you added. There is very little RTF support in all of Mozilla (well, I added the ability to paste RTF to the editor and get a JS notification for it). I'm 99.9995% sure that no RTF will ever be displayed. As for the previous comment of making changes to your patch. Impossible, since then I couldn't review my own work ;-( I'm sure I get the new patch from you soon. Please note that one Monday there is branch day, so we step up the version. Therefore it would be good to land this before Monday. Sorry for the small delay in the review (but same-day reviews are very rare). ::: mailnews/mime/src/mimemalt.cpp @@ +378,5 @@ > an earlier alternative, or both. > */ > priority_t priority = PRIORITY_UNDISPLAYABLE; > + bool prefer_plaintext = false; > + nsIPrefBranch *prefBranch; Please move into the if clause. @@ +416,5 @@ > + > + /* (with no / in the type) */ > + if (!PL_strcasecmp(content_type, "text")) { > + if (prefer_plaintext) { > + /* When in plain text view, a plain text part is what we want */ Full stop. @@ +419,5 @@ > + if (prefer_plaintext) { > + /* When in plain text view, a plain text part is what we want */ > + return PRIORITY_HIGH; > + } > + /* We normally prefer other parts over the unspecified text type */ Full stop. @@ +435,5 @@ > + } > + /* Because the html and the text part may be switched, > + * or we have an extra text/plain added by f.ex. a buggy virus checker, > + * we prioritize text/plain lower than normal. > + */ According to out coding standard, comments should look like this: // /* foobar */ /* * foobar */ Now, this module tramples all over these rules and uses /* foobar more foobar */ or /* foobar more foobar */ Let's not introduce yet another variation. @@ +442,5 @@ > + > + if (!PL_strncasecmp(text_type, "html", 4) || > + !PL_strncasecmp(text_type, "enriched", 8) || > + !PL_strncasecmp(text_type, "richtext", 8) || > + !PL_strncasecmp(text_type, "rtf", 3)) { Where did "rtf" come from? You know that this is RTF, right: {\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}} {\colortbl ;\red255\green0\blue0;} \uc1\pard\sa200\sl276\slmult1\cf1\lang9\f0\fs22 red} For all I know, we don't handle RTF at all. So please remove. @@ +446,5 @@ > + !PL_strncasecmp(text_type, "rtf", 3)) { > + return PRIORITY_NORMAL; > + } > + > + /* We prefer other parts over unknown text types */ Full stop. @@ +460,5 @@ > + only a part of the message, but if there are multiple multipart > + sub-parts then one that contains a true text/plain should be chosen > + over a multipart that does not contain a true text/plain.) > + But I do not know how to do this. - Terje Bråten. > + */ The comment terminator should move to the left. See comment above. Frankly, I'd prefer to remove the comment since I don't see the added value. We have a more or less intricate system of attributing priority, so if that doesn't work, we'll revisit it. But I guess the purpose of the bug is to make it work ;-)
> For all I know, we don't handle RTF at all. So please remove. We might handle it in the future, no? When this function runs we have already established that we can display this content inline. The purpose of that if statement is to stop those cases to fall through to the "return PRIORITY_TEXT_UNKNOWN;" statement. I just added anything in that if statement that could be relevant, and should have priority normal if it is ever implemented.
Attached patch Prioritize alternative parts better (obsolete) (deleted) — Splinter Review
New patch after review
Attachment #8759791 - Attachment is obsolete: true
Attachment #8759791 - Flags: review?(mozilla)
Attachment #8759897 - Flags: review?(mozilla)
> New patch after review Great, thanks for all the changes, but the RTF needs to go. I haven't got time now to research it, but as far as I know, there will never be RTF support, see for example bug 78458 comment #16. What happens with your code if a message contains text/plain and text/rtf and HTML display is selected? That selects the text/rtf part or not?
Maybe not, I tried. But I find the "RTF" misleading since it surely doesn't work. If one day (which will never come) it should get supported, we will add the statement there. I could tolerate a comment if you wish, but "enriched" and "richtext" are pretty obvious there, don't you think?
(In reply to Terje Bråten from comment #22) > When this function runs we have already established that we can display this > content inline. Sorry, I missed that line before trying it. As you know, the display technology in Thunderbird comes from Mozilla core and I don't see that they will ever be able or willing to render RTF given that they ferociously removed it.
Attached patch Prioritize alternative parts better (obsolete) (deleted) — Splinter Review
Added comment to the "if" with "rtf" in it. Also added "calendar" to text/... parts that should have normal priority.
Attachment #8759897 - Attachment is obsolete: true
Attachment #8759897 - Flags: review?(mozilla)
Attachment #8759958 - Flags: review?(mozilla)
Given that we have this test on line 383: if (clazz && clazz->displayable_inline_p(clazz, sub_hdrs)) { this makes sure only parts that are displayable inline will be considered. If "text/rtf" ever passes this test, it should have priority normal and not priority text_unknown. Therefore I would like rtf to be in the "if" test at line 448, because it is a part of the mime standard, and thus (even if it is just in theory) could one day be implemented in thunderbird.
Fix English grammar error in comment.
Attachment #8759958 - Attachment is obsolete: true
Attachment #8759958 - Flags: review?(mozilla)
Attachment #8759960 - Flags: review?(mozilla)
Comment on attachment 8759960 [details] [diff] [review] Prioritize alternative parts better I'm not 100% convinced, but let's go with it. In the future please put r=xxx, where xxx is the reviewer. No need to submit a new patch, I will fix the checkin comment and than land this later today.
Attachment #8759960 - Flags: review?(mozilla) → review+
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 49.0
Thank you for getting it in. Finally, a 6 year old bug is fixed. ;-)
6 years is nothing. We have and ongoing competition of who fixes the oldest bug. I think I win with bug 26734: 2000 bug 117236: 2001 bug 209189 bug 232021 bug 233705 bug 253830: 2004 - you fixed that.
Depends on: 1334937
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: