Closed
Bug 448198
Opened 16 years ago
Closed 8 years ago
format=flowed: Reply to non-f=f message containing certain characters breaks quoting
Categories
(Core :: DOM: Serializers, defect)
Core
DOM: Serializers
Tracking
()
VERIFIED
WORKSFORME
People
(Reporter: mcow, Unassigned)
References
(Blocks 2 open bugs)
Details
(Keywords: dataloss, regression, Whiteboard: [no l10n impact][needs updated patch with unit tests])
Attachments
(1 file, 3 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review |
A plain-text but not format=flowed message which contains one of the characters
& < >
(call these "entity" characters) directly adjacent to a space (before or after) will be misquoted on reply, if you are set up to send f=f mail. A space character before the entity will be deleted; a space after the entity will be doubled.
Steps to reproduce:
1) Turn format=flowed off (set mailnews.send_plaintext_flowed to False)
2) Compose a plain-text message and send it to yourself. Put this text in the message:
============
Ben & Jerry
Ben < Jerry
Ben > Jerry
Ben ; Jerry
Ben ' Jerry
Ben &Jerry
Ben <Jerry
Ben >Jerry
Ben 'Jerry
Ben ;Jerry
Ben& Jerry
Ben< Jerry
Ben> Jerry
Ben' Jerry
Ben; Jerry
============
2) Turn format=flowed back on (set the pref back to True).
3) When the message arrives, reply to it (in plaintext). Examine the quoted text.
Actual results:
3) Spaces next to [&<>] chars are mangled:
============
> Ben& Jerry
> Ben< Jerry
> Ben> Jerry
> Ben ' Jerry
> Ben ; Jerry
>
> Ben&Jerry
> Ben<Jerry
> Ben>Jerry
> Ben 'Jerry
> Ben ;Jerry
>
> Ben& Jerry
> Ben< Jerry
> Ben> Jerry
> Ben' Jerry
> Ben; Jerry
============
Expected results:
3) Spaces are preserved:
============
> Ben & Jerry
> Ben < Jerry
> Ben > Jerry
> Ben ' Jerry
> Ben ; Jerry
>
> Ben &Jerry
> Ben <Jerry
> Ben >Jerry
> Ben 'Jerry
> Ben ;Jerry
>
> Ben& Jerry
> Ben< Jerry
> Ben> Jerry
> Ben' Jerry
> Ben; Jerry
============
This is trunk-only, not seen on 2.0. I'm not sure, but I'm guessing this is fallout from one of Andriy's patches, probably the one at bug 155622.
Comment 1•16 years ago
|
||
Thank you Mike. I will check whether this is the regression caused by my patches.
Comment 2•16 years ago
|
||
Mike, you are right - this is the regression caused by my patch to bug 261467. Though, i can't say now which of the changes there caused this regression and why. TBC...
Comment 3•16 years ago
|
||
(In reply to comment #2)
> the regression caused by my patch to bug 261467.
sorry - not bug 261467, but bug 215068.
Updated•16 years ago
|
Assignee | ||
Updated•16 years ago
|
Product: Core → MailNews Core
Comment 4•16 years ago
|
||
It seems that this patch fixes the bug.
Updated•16 years ago
|
Attachment #332360 -
Flags: review?(mscott)
Updated•16 years ago
|
Assignee: nobody → andrit
Whiteboard: [has patch]
Updated•16 years ago
|
Attachment #332360 -
Flags: review?(mscott) → review?(bienvenu)
Updated•16 years ago
|
Target Milestone: --- → Thunderbird 3.0b2
Updated•16 years ago
|
Whiteboard: [has patch] → [has patch] [patch needs testing]
Comment 5•16 years ago
|
||
duh, can I even review this, since it's part of content? Bz, can you review this, or recommend someone who can?
Comment 6•16 years ago
|
||
Bz, see previous comment, thx!
Comment 7•16 years ago
|
||
I can probably do that, given a sufficiently clear explanation of what's going on (e.g. why just those characters are affected).
Comment 8•16 years ago
|
||
It's a long story, but i'll try to remember... :)
Here spaces are deleted by stringpart.Trim(" ", PR_FALSE, PR_TRUE, PR_TRUE); and appended by mCurrentLine.Append(PRUnichar(' ')); (see patch). It happens only on quoted text (i.e. preformatted) which is handled by first “major codepath” in nsPlainTextSerializer::Write() (see comment there).
While debugging this I noticed, that for some reason, when handling the text Write() receives not the full string line (for example, like “Ben & Jerry”) but it is called thee times with following arguments: “Ben ” – first, “&” symbol second and “ Jerry” and the end. That’s why space after Ben is trimmed and one space before Jerry is stuffed (code recognize it as start of new line). And this happens only with &, > and < symbols, in HTML-editing mode and with quoted text.
So to protect from such situations I added additional checks: to not trim if it is preformatted text in HTML-mode (IsInPre()) and to stuff space if it is not quotation (mCiteQuoteLevel == 0).
Now while explaining I recognized the flow of this patch. It won’t trim EOL spaces of Preformatted HTML text and thus lines will be jointed in format flowed capable viewers. What if to add the same check for quotation for trimming? I’m not sure we have to trim EOL spaces in quoted text.
Thanks for forcing me to explain things :)
Comment 9•16 years ago
|
||
Here is new patch.
Attachment #332360 -
Attachment is obsolete: true
Attachment #332360 -
Flags: review?(bienvenu)
Comment 10•16 years ago
|
||
Comment on attachment 336995 [details] [diff] [review]
patch2.0
thx for the new patch. requesting review from bz - if that line is well over 80 chars, it might want to be wrapped, and we might prefer !mCiteQuoteLevel.
But I have a question - does your patch still work if you reply to the first reply in your test case, where the mCiteQuoteLevel will be 1?
Attachment #336995 -
Flags: review?(bzbarsky)
Comment 11•16 years ago
|
||
(In reply to comment #10)
> But I have a question - does your patch still work if you reply to the first
> reply in your test case, where the mCiteQuoteLevel will be 1?
David, what do you mean?
Comment 12•16 years ago
|
||
I'll be honest; I really don't understand all the bits here well enough to review this... I don't even have a good feel for what this code is trying to do, much less what the edge cases are.
Comment 13•16 years ago
|
||
I tried the patch, but unfortunately it doesn't seem to work.
The simple way to see the bug is to hit reply - plain text composition - on a bug mail, then notice the space wrongness around the < and >. (On both sides.)
E.g. for the previous comment.
original: (todo: 200+ items)<bzbarsky@mit.edu> 2008-09-26
reply: (todo: 200+ items)<bzbarsky@mit.edu> 2008-09-26
patched: (todo: 200+ items)<bzbarsky@mit.edu> 2008-09-26
Comment 14•16 years ago
|
||
(In reply to comment #12)
> I'll be honest; I really don't understand all the bits here well enough to
> review this... I don't even have a good feel for what this code is trying to
> do, much less what the edge cases are.
Boris, thank for for honesty :) Did you have a chance to read my comment #8 where I tried to explain the bits?
Comment 15•16 years ago
|
||
Yes, and then you posted a new patch different from the previous one...
Also, what about comment 13?
Comment 16•16 years ago
|
||
Boris, in comment #8 i tried to explain the problem with patch1.0, that's why patch2.0 appeared. About comment 13 - no idea. The patch2.0 works for me, see screenshot attached. Magnus, could you provide more details? Thank you.
Comment 17•16 years ago
|
||
The details are pretty simple: apply the patch, hit reply on any bug mail and look at the --- Comment #16 from line.
Comment 18•16 years ago
|
||
Magnus, at what to look?
Comment 19•16 years ago
|
||
In the quote the line looks like
> --- Comment #18 from Andriy Tkachuk<andrit@ukr.net> 2008-09-29
Although it should be
> --- Comment #18 from Andriy Tkachuk <andrit@ukr.net> 2008-09-29
Comment 20•16 years ago
|
||
Ah, right. I see now. Thanks.
Well, it should be fixed probably from another side. First, it should be investigated, why it works in following way:
(In reply to comment #8)
> While debugging this I noticed, that for some reason, when handling the text
> Write() receives not the full string line (for example, like “Ben & Jerry”) but
> it is called thee times with following arguments: “Ben ” – first, “&” symbol
> second and “ Jerry” and the end. That’s why space after Ben is trimmed and one
> space before Jerry is stuffed (code recognize it as start of new line). And
> this happens only with &, > and < symbols, in HTML-editing mode and with quoted
> text.
Updated•16 years ago
|
Attachment #336995 -
Attachment is obsolete: true
Attachment #336995 -
Flags: review?(bzbarsky)
Updated•16 years ago
|
Attachment #340916 -
Attachment is obsolete: true
Comment 21•16 years ago
|
||
Presumably because the HTML escaping needed means breaking up the string and feeding it in a bit at a time.
Comment 22•16 years ago
|
||
(In reply to comment #20)
> > And this happens only with &, > and < symbols, in HTML-editing mode and with
> > quoted text.
As appeared, not only. Replying to notification mail from Bugzilla in plain-text mode also affected.
Updated•16 years ago
|
Whiteboard: [has patch] [patch needs testing] → [needs new/working patch]
Comment 24•16 years ago
|
||
As reported in bug 456053, reply to f=f plain-text messages unflows the quote, which appears to be related to the problem here. Andriy has posted a patch in that other bug, which includes a patch (update?) for this bug as well, but that was 2008-09-22, thus may be obsoleted by now with the discussion here.
Comment 25•16 years ago
|
||
Andriy: any update on this? Would be good to get this moving asap, especially as it affects core code.
Comment 26•16 years ago
|
||
Sorry guys - pretty tough schedule now...
Comment 27•16 years ago
|
||
Magnus, please try this patch. Pay attention also that following bugs are still fixed with this patch: bug 125928, bug 215068, bug 261467. Thank you!
Comment 28•16 years ago
|
||
Comment on attachment 353384 [details] [diff] [review]
patch3.0
I'll run with this patch, but we need to get a module owner to review it...
Comment 29•16 years ago
|
||
Tried the patch, the previous problem is gone, however
I sent myself
<><><><>
< > < > < >
><><><
> < > < > <
Replying I get
> <><><><>
> < > < > < >
> ><><><
> > < > < > <
... so I think there's still some minor problem there. Thanks for working on this!
Comment 30•16 years ago
|
||
Andriy, any idea about the problem Magnus saw? In particular, I guess there's an extra space after the '>' on the last two lines.
Comment 31•16 years ago
|
||
moving to b3, though we'd certainly take a patch for b2
Target Milestone: Thunderbird 3.0b2 → Thunderbird 3.0b3
Comment 32•16 years ago
|
||
Since this is a pure Gecko bug, in addition to fixing previously mentioned bug, it will also be necessary to write unit tests. I can imagine two possible ways for this get approved for the appropriate Gecko branch:
1) A new patch with unit tests gets posted, reviewed, and landed on the trunk quickly. We might then be able to convince the 1.9.1 drivers to allow this fix as a ridealong if there's another release candidate for Firefox 3.5 after RC1.
2) This happens more slowly, and we land it after Firefox 3.5 ships (for 1.9.1.1). There's some risk that Thunderbird could ship before 1.9.1.1 does, however.
Andriy, what sort of time are you likely to be able to devote to this soonish?
Component: Composition → Serializers
Flags: blocking-thunderbird3+
Product: MailNews Core → Core
QA Contact: composition → dom-to-text
Whiteboard: [needs new/working patch] → [needs updated patch with unit tests] [tb3needs]
Target Milestone: Thunderbird 3.0b3 → ---
Updated•15 years ago
|
Flags: in-testsuite?
Updated•15 years ago
|
Status: NEW → ASSIGNED
Comment 33•15 years ago
|
||
(In reply to comment #32)
> 2) This happens more slowly, and we land it after Firefox 3.5 ships (for
> 1.9.1.1). There's some risk that Thunderbird could ship before 1.9.1.1 does,
> however.
Well, this obviously missed that milestone and won't make 1.9.1.2 either.
So, what's the fallback if a fix doesn't come in? Backing out the patch for
bug 215068 again or living with its issues? Sounds like picking the lesser
of two evil... :-\
Comment 34•15 years ago
|
||
I'd be somewhat surprised if the Gecko folks allowed us to back out something that old without tests as well, so we may just need to push forward and get this fixed somehow.
Andriy, would you be able to make some time to address the problem in comment 29 and put together some unit tests in the near future?
Keywords: dataloss
Updated•15 years ago
|
Whiteboard: [needs updated patch with unit tests] [tb3needs] → [needs updated patch with unit tests] [tb3needs][no l10n impact]
Updated•15 years ago
|
Whiteboard: [needs updated patch with unit tests] [tb3needs][no l10n impact] → [no l10n impact][needs updated patch with unit tests] [tb3needs]
Comment 35•15 years ago
|
||
Andriy and I had a short email exchange, and he said that the code is tricky enough that he has given up on trying to unregress this. He recommends backing out bug 215068.
I notice that philor only has one blocker, so I'm hoping he'll be interested in picking this up, figuring out whether a backout is actually practical at this point, and either driving that or proposing another strategy...
Assignee: andrit → philringnalda
Reporter | ||
Comment 36•15 years ago
|
||
Please don't back out. This bug is annoying, but not as annoying as the bug that the patch fixed.
Comment 37•15 years ago
|
||
I see two possible strategies:
One: I drop everything else, patches and reviews both, and spend all my time from now to shipping on first learning about f=f, and then tracking down every fix we've made to this code, so I know what output we want from what input so I can stick it into tests, then fill out the rest of the tests for things we got right in the first place and haven't yet broken, then learn the code well enough to patch it, then find someone willing to review it even though in the past pretty much nobody has been willing to do so, and then in the end find myself unwilling to lie to the 1.9.1 drivers, so that I have to tell them that this is in fact an API that already shipped, but because I like us better than I like a hypothetical extension which is using that API and already working around the broken bits, I want to change that shipped API on a stable branch and break that hypothetical extension, at which point they (better) say no.
Two: we drop the tb3needs, accept that we already slipped this bug late last spring when 1.9.1 first froze, and start out right now to persuade the people who already know about f=f and care about this bug to work up the testcases (without needing to write the automated tests, just needing to create a single comprehensive list saying that "- --" in format X quoted into format Y must be output as "(whatever it should be)" and in format Y quoted into format Y must be whatever else, for all the tens or hundreds of cases), so that we can actually fix it for whatever version of Tb we next ship off a Gecko that isn't yet frozen.
(Yeah, there's a third strategy, where we either fork the serializer or do our own post-processing workarounds, but I'm not persuaded that this bug, which has garnered zero duplicates over the course of almost two years, actually calls for that sort of really terrible idea.)
Assignee: philringnalda → nobody
Status: ASSIGNED → NEW
Comment 38•15 years ago
|
||
Phil's reasoning seems sound. This code feels fragile to touch at this late date anyway, so removing tb3needs.
Whiteboard: [no l10n impact][needs updated patch with unit tests] [tb3needs] → [no l10n impact][needs updated patch with unit tests]
Comment 39•15 years ago
|
||
Could there be any relationship with <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=519165"> bug 519165 ?
Comment 40•15 years ago
|
||
Note that not only replies are affected, bug 541978 states a reproducible problem related to the issue here when composing a message from scratch and
a downconversion from HTML to plain text occurs upon sending...
Comment 43•14 years ago
|
||
(In reply to comment #0)
> Actual results:
> 3) Spaces next to [&<>] chars are mangled:
> ============
> > Ben& Jerry
> > Ben< Jerry
> > Ben> Jerry
> > Ben ' Jerry
> > Ben ; Jerry
Plain Text mode composition utilizes <pre> style internally.
As bug 541978 and bug 597181 is HTML/<pre> case with format=flowed, phenomenon becomes obvious. (Note: Options/Format/Plain and Rich(HTML) Text is required to see text/html part due to Bug 414299.)
Check result with Tb 3.1.7 on Win-XP.
> Content-Type: multipart/alternative; boundary="..."
>
> --...
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> P< Q&&...&& X> Y
>
> --...
> Content-Type: text/html; charset=UTF-8
>
>(snip, three spaces between words)
> <pre>P < Q &&...&& X > Y
> </pre>
Phenomenon looks:
- spaces before character entity is removed.
- a space is added after word which starts with the character entity.
Blocks: 597181
Comment 44•14 years ago
|
||
Why the heck is this bug unfixed after 2.5 years?
Mangling messages is not an option. I'm raising the severity of this. If someone wants to drop it back down, they can add a comment explaining why.
Severity: normal → major
Comment 45•14 years ago
|
||
> Why the heck is this bug unfixed after 2.5 years?
Because there's no one around who understands the code involved, and no one has stepped up to learn it and fix it, other than Andriy. And so far he hasn't written a correct patch yet.
If you're willing to work on this, that's great: you're in as good a position to do it as anyone else is.
Comment 49•13 years ago
|
||
Here is a thunderbird extension from Paolo Bonzini that you can install into "~/.thunderbird/<profile name>/extensions" while waiting for this bug to ever get fixed:
http://people.redhat.com/pbonzini/format-flawed.tar.gz
"It just disables format=flowed when replying to non-format=flowed messages. It's ugly as hell, but it does the job."
Comment 50•13 years ago
|
||
A partial workaround is to select the original text, copy it to the clipboard, click reply, remove the quoted text and then insert as citation (Ctrl-Shift-O). Empty lines after lines already quoted in the original text seem to be dropped, but spaces around &, < and > are preserved.
Comment 51•13 years ago
|
||
Re comment #50: Is this what you really want end-users to do?
Comment 52•13 years ago
|
||
(In reply to David E. Ross from comment #51)
> Re comment #50: Is this what you really want end-users to do?
I'm just an end-user, and I won't jump through these hoops every time I reply to an email. I just used it as a last resort when commenting on a diff.
It's interesting that Thunderbird apparently has the ability to do the right thing built in, though. Perhaps this code path can be used to implement a solution or at least a band-aid fix?
Updated•12 years ago
|
Summary: f=f: Reply to non-f=f message containing certain characters breaks quoting → format=flowed: Reply to non-f=f message containing certain characters breaks quoting
Comment 53•11 years ago
|
||
Seems to work in Thunderbird 24.
Comment 54•11 years ago
|
||
Not for me. When I do selective quoting (and only then!), this still happens, e.g. with URLs. _Very_ annoying. (BTW, the _very_ would have been affected as well from this bug if this was a mail :-)
Comment 55•11 years ago
|
||
Indeed, while the original testcase works now for me, I can still reproduce this with a different testcase (and independent of selective vs. full quoting). For example:
==================
foo
- bar
+ baz
pig
==================
is all misaligned after replying. My extension needs an update for TB24, too (gPrefBranch => Services.prefs).
Comment 56•8 years ago
|
||
I produced myself message with *no* f=f with the content described on comment #0 and comment #55.
I did full and partial replies with f=f and can't see a problem. That's using today's Daily.
Should this still be an issue, please open a new bug with a reproducible example.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•