Comment 9

•

21 years ago

*** Bug 199540 has been marked as a duplicate of this bug. ***

Updated

•

21 years ago

OS: Windows 98 → All

Hardware: PC → All

Comment 10

•

21 years ago

Attached patch proposed patch (obsolete) (deleted) — Details — Splinter Review

This patch replaces GetNextLocalLine() with GetNextLocalLogicalLine(), which doesn't use ReadLine() but Read(), and then gets a logical line from the buffer using RemoveCRLF(). RemoveCRLF() is able to distinguish line breaks within a folded header from a line break at the end of the header line, so we also get multi-line headers correctly.

Although GetNextFilterLine() already extracted multi-line headers correctly, I also added RemoveCRLF() to GetNextFilterLine() to remove the line breaks in folded headers. That's because of RFC 2822, 2.2.3: "Each header field should be treated in its unfolded form for further syntactic and semantic evaluation."

RemoveCRLF() may look complicated and costly, but most of the constructs are also in nsRandomAccessInputStream::readline() and the functions it calls, so the new behaviour will not be noticeably slower than the old.

An existing problem is now also fixed: a line that couldn't be read completely (a line longer than the buffer, 512 bytes at the moment, that doesn't contain at least one CR or LF) led to wasTruncated == true and stopped Mozilla from reading more of the message.

I also made some changes in MatchArbitrary() to skip the PL_strncasecmp() call if buf starts with CR or LF (i.e. is the header terminator).

Updated

•

21 years ago

Assignee: mscott → ch.ey

Status: NEW → ASSIGNED

Updated

•

21 years ago

Attachment #141319 - Flags: review?(bienvenu)

Comment 11

•

21 years ago

Comment on attachment 141319 [details] [diff] [review] proposed patch

Mostly looks good, thx for doing this. I have some nits about the variable and method names in RemoveCRLF, and a question. cp and tp aren't useful variable names to me. I'm guessing that cp is charPtr, but I'm not sure what tp is. Doesn't RemoveCRLF potentially remove multiple CRLFs? In which case it should be called RemoveCRLFs. Or maybe CoalesceMultilineHeader(). If the next line starts with multiple spaces or tabs, I don't see us removing all the spaces, just the leading space. And I don't see us replacing the tab with a space. I could be misreading, however...

Comment 12

•

21 years ago

Comment on attachment 141319 [details] [diff] [review] proposed patch

Well, cp and tp were taken from readline, which was originally used. I guess tp means target pointer. I don't know an intelligent name because these pointers are simply the first and second pointer. Only in the memmove is it clear that cp is the destination and tp the source. Coalesce - I had to look it up in my dictionary. :) But yes, that name would be ok.

This patch doesn't remove a single blank or tab, and that was intended. I don't know if the spaces and tabs are always just folding or if there can be leading ones for other reasons. But replacing all spaces and tabs by a single space is simple. Shall I?

One thing I just noticed is that GetNextLine() is also used for MatchBody() and that my changes screwed this up completely. It was necessary to add a condition and move another.

Attachment #141319 - Attachment is obsolete: true

Attachment #141319 - Flags: review?(bienvenu)

Comment 13

•

21 years ago

Currently, lines that are part of a folded header but start after byte bufSize (512) of the header entry aren't recognized as part of the header entry. I could work around this in MatchArbitraryHeader() with some effort, but I'm not sure it is worth the cost and trouble, since headers that long are very uncommon. Any thoughts?

Comment 14

•

21 years ago

Christian, I think you're right that hdrs > 512 bytes are an edge case that might not be worth handling right now. Yes, I figured those var names were from copied code, but this is still new code.

This comment made me think we should do the replacing of multiple tabs/spaces: "RFC822/2822 is precise that the additional lines with leading whitespace are exactly equivalent to stringing the whole thing out with a single space separating the "lines."" But thinking about it more, it just says "equivalent". However, if you wanted a filter to match "Discussion list for DDD, the GNU graphical"..., it would be easier to write the filter if we made it into: "Discussion list for DDD, the GNU graphical" So I think it would be a good idea.

Comment 15

•

21 years ago

> Yes, I figured those var names were from copied code but this is still
> new code.

What about cp == beforeSkip and tp == behindSkip? As I wrote, I don't know intelligent names for them.

> So I think it would be a good idea.

Ok, done.

Comment 16

•

21 years ago

ok, cp points to the linebreak, and tp starts off at the line break and then gets advanced to the start of the next line. Perhaps these variables are hard to name because their meaning shifts around, which doesn't make the code easy to read. Let me see if I can rewrite this to make more sense.

Comment 17

•

21 years ago

How about this? I haven't tested it, but I think it does what the old code does, and it's more readable.

+PRInt32 nsMsgBodyHandler::RemoveCRLF(char *s)
+{
+  PRInt32 skipped = 0;
+
+  char *eol = strpbrk(s, "\n\r");
+  char *nextLine;
+  while(eol && eol != s)
+  {
+    nextLine = eol + 1;
+    if ((*eol == '\n' && *nextLine == '\r') || (*eol == '\r' && *nextLine == '\n'))
+      nextLine++; // possibly a pair.
+
+    // add chars we will skip
+    skipped += nextLine - eol;
+
+    // next line begins with white space
+    // and contains a line break (what means we have a whole line)
+    if(*nextLine == ' ' || *nextLine == '\t')
+    {
+      memmove(eol, nextLine, strlen(nextLine) + 1); // delete the line breaks
+      eol = strpbrk(nextLine, "\n\r"); // jump to the next line break
+    }
+    else
+    {
+      *eol = '\0';
+      break;
+    }
+  }
+
+  return skipped;
+}

Comment 18

•

21 years ago

Hm, yes, that's fine. But
> eol = strpbrk(nextLine, "\n\r");
is dangerous, since nextLine is a few bytes into the next line after the memmove(). I'd use
eol = strpbrk(eol, "\n\r");

And
> if ((*eol == '\n' && *nextLine == '\r') || (*eol == '\r' && *nextLine == '\n'))
is just copied, but
> if (*eol == '\r' && *nextLine == '\n')
should also be safe. Or am I missing something?

And another new question. MatchArbitraryHeader() is only called for non-standard headers. So e.g. Subject: or To: lines that are extracted using other functions do in fact contain all data from multiple lines, but also the CRLF and all leading whitespace. Do you think that's an issue we should care about? Applying CoalesceMultilineHeader() to the data in RowCellColumnToCharPtr() (which delivers data to e.g. GetSubject() called in ProcessSearchTerm()) would work. But apart from Subject:, I don't think anyone searches for a string across line borders (for Subject, CoalesceMultilineHeader() could also be applied to the subject string in ProcessSearchTerm() before MatchRfc2047String()).

Comment 19

•

21 years ago

> I'd use
> eol = strpbrk(eol, "\n\r");

Ah, you're right, because the old line terminators are gone.

>> if (*eol == '\r' && *nextLine == '\n')
> should also be safe. Or am I missing something?

I wondered about that too; it seems safe to me.

RowCellColumnToCharPtr() is too low a level, since it can be used for arbitrary string values other than headers. It should either be in GetSubject(), GetTo(), etc., or in the caller. I lean towards the caller of GetSubject(), your last suggestion.
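For readers following the exchange in comments 17 to 19, the loop with both agreed changes applied (resuming the scan at eol after the memmove, and testing only for a CR LF pair) might look roughly like the sketch below. This is an illustration, not the patch that was attached: the free-function form, the name CoalesceMultilineHeader (borrowed from the suggestion in comment 11) and the use of int instead of PRInt32 are assumptions made so the example compiles on its own.

```cpp
#include <cstring>

// Hypothetical standalone variant of the routine sketched in comment 17.
// Line breaks that are followed by a space or tab (a folded header) are
// removed in place; the string is terminated at the first line break that
// really ends the header. Returns the number of line-break characters
// consumed.
int CoalesceMultilineHeader(char *s)
{
  int skipped = 0;
  char *eol = std::strpbrk(s, "\n\r");

  while (eol && eol != s)
  {
    char *nextLine = eol + 1;
    if (*eol == '\r' && *nextLine == '\n')
      nextLine++;  // CR LF pair

    // count the line-break characters we step over
    skipped += static_cast<int>(nextLine - eol);

    if (*nextLine == ' ' || *nextLine == '\t')
    {
      // Folded line: delete the line break, keep the leading whitespace.
      std::memmove(eol, nextLine, std::strlen(nextLine) + 1);
      // The continuation text now starts at eol, so resume the scan there.
      eol = std::strpbrk(eol, "\n\r");
    }
    else
    {
      *eol = '\0';  // real end of this header
      break;
    }
  }

  return skipped;
}
```

Like the patch under review, this keeps the whitespace that starts each continuation line; collapsing it to a single space, as discussed in comment 14, would be a separate step.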

Comment 20

•

21 years ago

In order to call CoalesceMultilineHeader() from ProcessSearchTerm() we have to create a new nsMsgBodyHandler object, or the function has to be moved outside any class. The first is slower and more complex, the latter is not such good style.

Comment 21

•

21 years ago

Problem still in Thunderbird version 0.6+ (20040428)

Comment 22

•

21 years ago

Right, since no patch has been checked in, why should it be different? Unfortunately the patch from bug 197166 broke my existing draft, so it will take another few weeks to get a working patch.

Comment 23

•

21 years ago

Spam Assassin attaches some pretty long headers in verbose mode--over 512 characters.

Comment 24

•

21 years ago

yeah, I know - even though I broke Christian's patch, I did fix our handling of lines over 512 bytes...

Comment 25

•

20 years ago

*** Bug 252115 has been marked as a duplicate of this bug. ***

Comment 26

•

20 years ago

*** Bug 148612 has been marked as a duplicate of this bug. ***

Comment 27

•

20 years ago

Would this bug be a blocker for bug 240924? What about the case of bug 87653, where a MIME boundary is getting folded and then, on unfolding, has whitespace in the middle of the boundary text? (It's some other mail program doing that rather bogus folding.)

Comment 28

•

20 years ago

See comments #18 and #19 about using this patch for GetSubject() too. It doesn't look like it's a good idea to use it for non-search/-filter work.

Comment 29

•

20 years ago

*** Bug 243479 has been marked as a duplicate of this bug. ***

Comment 30

•

20 years ago

Adding keywords to summary for easier searching.

Summary: Filter or Search: do not handle multi-line headers correctly → Filter or Search: do not handle multi-line (wrapped, folded) headers correctly

Jeremy Faulkner

Comment 31

•

20 years ago

This mishandling of headers also manifests in improper thread display when the In-Reply-To: header is wrapped.

<quote>
In-Reply-To: Your message of "Sun, 07 Nov 2004 20:39:07 +1100."
        <20041107093907.GK79646@cirb503493.alcatel.com.au>
</quote>

Do the proposed patches address this manifestation of the bug as well?

Evan Prodromou

Comment 32

•

20 years ago

Let me just say: this is an amazingly rookie bug. I don't think there's another RFC822 parser on the planet that doesn't handle continued lines in headers correctly and automatically. A freshman compsci student who wrote RFC822 header-parsing code that didn't take continued lines into account would get an F. How such a parser got into production code is beyond me. That this is *even an issue* makes me extremely concerned; that it's been left open for TWO+ YEARS is simply mind-boggling.

Myk Melez [:myk] [@mykmelez]

Comment 33

•

20 years ago

(In reply to comment #32)
> Let me just say: this is an amazingly rookie bug. I don't think there's
> another RFC822 parser on the planet that doesn't handle continued lines in
> headers correctly and automatically.

This is no excuse for the bug, but let me say that the affected code is not the one that handles headers like Date, From, To and the MIME headers. It's search code that only looks for the header name at the start of a line. This search is done block-wise, and that is IMHO the main problem.

> A freshman compsci student who wrote RFC822
> header-parsing code that didn't take continued lines into account would get an
> F. How such a parser got into production code is beyond me.
>
> That this is *even an issue* makes me extremely concerned; that it's been left
> open for TWO+ YEARS is simply mind-boggling.

It's great to have such a professional here and we always appreciate people who know how everything works. We'd be happy if you'd let us partake in your wisdom when you fix the bug. Thanks for helping.
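To make the distinction in this reply concrete: a search that looks for the header name at the start of a line only sees the whole value if it also gathers the continuation lines that follow. A minimal sketch of such a lookup is shown below. FindHeaderValue and StartsWithHeaderName are hypothetical names chosen for the illustration, not functions from the Mozilla tree, and the sketch works on a complete in-memory header block rather than block-wise as the real code does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Case-insensitive check that `line` starts with `name` followed by ':'.
static bool StartsWithHeaderName(const std::string &line, const std::string &name)
{
  if (line.size() <= name.size() || line[name.size()] != ':')
    return false;
  for (std::size_t i = 0; i < name.size(); ++i)
    if (std::tolower(static_cast<unsigned char>(line[i])) !=
        std::tolower(static_cast<unsigned char>(name[i])))
      return false;
  return true;
}

// Hypothetical helper: return the value of the first header called `name`,
// with its continuation lines appended (line breaks dropped, leading
// whitespace kept). Returns an empty string if the header is absent.
std::string FindHeaderValue(const std::string &headers, const std::string &name)
{
  std::string value;
  bool inHeader = false;
  std::size_t pos = 0;

  while (pos < headers.size())
  {
    std::size_t end = headers.find('\n', pos);
    if (end == std::string::npos)
      end = headers.size();
    std::string line = headers.substr(pos, end - pos);
    if (!line.empty() && line.back() == '\r')
      line.pop_back();
    pos = end + 1;

    if (line.empty())
      break;  // blank line: end of the header section

    if (inHeader)
    {
      if (line[0] != ' ' && line[0] != '\t')
        break;        // the next header starts here
      value += line;  // continuation line of the header we want
    }
    else if (StartsWithHeaderName(line, name))
    {
      inHeader = true;
      value = line.substr(name.size() + 1);
    }
  }

  return value;
}
```

Stopping at the first line returns only the text before the first fold, which is the symptom reported throughout this bug.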

jwq

Comment 34

•

20 years ago

Some extra information on folded headers which I hope will be useful. Some email clients fold the Content-Type header field for each keyword-value pair after the first, thus:

Content-Type: text/plain;
        format=flowed;
        charset="iso-2022-jp";
        reply-type=original

Noteworthy, perhaps, because folding of the Content-Type is performed regardless. Having read the comments, may I suggest that the Summary for this bug be modified to read "Search matches text only on the first line of multi-line (wrapped, folded) header fields"?

Updated

•

20 years ago

Product: MailNews → Core

Comment 35

•

20 years ago

Cases where it failed before are now working correctly in Thunderbird 1.0. Huge thanks to whoever fixed it.

Comment 36

•

20 years ago

(In reply to comment #35) > Cases where it failed before are now working correctly in Thunderbird 1.0. I'm not seeing this working. Could you provide specific examples? For instance, I tried a search on Content-Type, contains, "boundary" -- the only hits have unfolded Content-Type headers.

Comment 37

•

20 years ago

I have the following in an email header: X-Spam-Status: No, score=-102.6 required=6.5 tests=BAYES_00,NO_REAL_NAME, USER_IN_WHITELIST autolearn=no version=3.0.1 In Thunderbird 0.6 I could search for "WHITELIST" in X-Spam-Status and it would not find it. Now it does.

Comment 38

•

20 years ago

(In reply to comment #37)
> In Thunderbird 0.6 I could search for "WHITELIST" in X-Spam-Status and it would
> not find it. Now it does.

Doesn't work for me in 1.0. Example:

X-Spam-Status: No, score=-3.595 required=5
        tests=BAYES_00,NO_REAL_NAME,SPF_HELO_PASS,SPF_PASS
        autolearn=no version=3.0.2

Searched for "BAYES_00" in X-Spam-Status and the message was not found. Searching for "No" or "required" worked, though.

Comment 39

•

20 years ago

By any chance does the term "whitelist" appear in any other header? See bug 209488 comment 4 -- false positives occur when searching using "contains" and other headers have the search string. I don't use whitelisting, but if I search X-Spam-Status for "HTML" I get a bunch of hits -- I believe because Content-Type also has "html". But if I search for "TBODY" or "REAL_NAME", the spams containing those strings in X-Spam-Status don't show a hit.

Comment 40

•

20 years ago

I checked several more examples, including a case where "autolearn" was on the 4th line. It works rock solid. Did you delete your old TB installation before installing the new one? The version I'm running is 1.0 20041206.

Comment 41

•

20 years ago

(In reply to comment #40) > Did you delete your old TB installation before installing the new one? The > version I'm running is 1.0 20041206. Yes. First time through I uninstalled and renamed the remaining directory before installing the new version. Just to be sure, I just uninstalled, wiped the folder, and reinstalled. No change. It's also 1.0 20041206. I have not wiped my profile, however, because I expect that would be a serious pain to rebuild.

Comment 42

•

20 years ago

Just to make sure we're on the same page: I'm selecting a folder, say Inbox, and then doing a control-Shift-F to display the "Search Messages" dialog. I'm selecting a custom entry of X-Spam-Status, the operator "contains" and then a string of "autolearn" (not in quotes). It finds emails where the string autolearn appears on the second line or later of the X-Spam-Status header. I don't have a clue what the difference is. It used to fail for me and I've been checking whether it was fixed or not with every new Thunderbird Release (I think it worked at 0.9 too.) FWIW, I'm running Windows 2000.

Comment 43

•

20 years ago

(In reply to comment #38) > Searched for "BAYES_00" in X-Spam-Status and the message was not found. > Searching for "No" or "required" worked, though. Will "Bayes_00" and "bayes_00" work?

Comment 44

•

20 years ago

(In reply to comment #42) > Just to make sure we're on the same page: I'm selecting a folder, say Inbox, and > then doing a control-Shift-F to display the "Search Messages" dialog. I'm > selecting a custom entry of X-Spam-Status, the operator "contains" and then a > string of "autolearn" (not in quotes). Exactly the same. > FWIW, I'm running Windows 2000. Same here. (In reply to comment #43) > Will "Bayes_00" and "bayes_00" work? No, though I can't imagine why they would if an exact match fails.

Comment 45

•

20 years ago

(In reply to comment #44)
> (In reply to comment #43)
> > Will "Bayes_00" and "bayes_00" work?
> No, though I can't imagine why they would if an exact match fails.

To clarify: your problem is NOT a "case"-related problem (a bug is already open for that with 'begins with'). As you say in comment #38, your problem sounds like an example of this bug.

> X-Spam-Status: No, score=-3.595 required=5
> tests=BAYES_00,NO_REAL_NAME,SPF_HELO_PASS,SPF_PASS
> autolearn=no version=3.0.2
> Searched for "BAYES_00" in X-Spam-Status and the message was not found.
> Searching for "No" or "required" worked, though.

You say the following in comment #42.

> I'm selecting a folder, say Inbox, and then doing a control-Shift-F to display the "Search Messages" dialog.
> I'm selecting a custom entry of X-Spam-Status, the operator "contains" and then a string of "autolearn" (not in quotes).
> It finds emails where the string autolearn appears on the second line or later of the X-Spam-Status header.

Does "Ctrl+Shift+F" find "BAYES_00" or "tests="? Does a message filter find "tests="?

These questions are to clarify:
- whether a general search algorithm problem is involved or not
- whether this is a message-filter-only problem or not (was the problem in search, which the summary mentions, already resolved?)

Comment 46

•

20 years ago

(In reply to comment #45)
> Does "Ctrl+Shift+F" find "BAYES_00" or "tests="?

No.

> Does message filter find "tests="?

No.

Ctrl-Shift-F, then searching in the header X-Spam-Status, the "contains" option, and then a string that only ever appears in the second or third lines of the folded header does not find any messages. Searching the exact same header with the same options for something that appears in the *first* line of the folded header DOES return results. The same is true of message filters.

Comment 47

•

20 years ago

(In reply to comment #46) Slight correction on filters: It does seem to work *as mail comes in*, but not when I run the filter through Tools->Run Filters on Folder. I didn't delete my test filter, and a message came in and tripped it. (The action was setting a label. I then reset the label to None, ran the filter from the menu, and this time the same message did *not* trip the filter.) I have no idea why this would make a difference, but it's there in my filter log.

Comment 48

•

20 years ago

(In reply to comment #47) > (In reply to comment #46) > Slight correction on filters: It does seem to work *as mail comes in*, but not > when I run the filter through Tools->Run Filters on Folder. Do you use "Global Inbox"? If yes, read thru next FAQs, http://kb.mozillazine.org/Thunderbird_:_FAQs_:_Global_Inbox http://kb.mozillazine.org/Thunderbird_:_FAQs_:_Filters then clarify what/where/when problem occurs, please.

Comment 49

•

20 years ago

(In reply to comment #48) > Do you use "Global Inbox"? No. > then clarify what/where/when problem occurs, please. This is a filter on a particular account, and I did all my testing on the inbox for that account. (In fact, if I were using the global inbox, the FAQ you referenced suggests I would see the *opposite* problem - i.e. filters working manually, but not on incoming mail) And of course, it doesn't work with searches either.

jwq

Comment 50

•

20 years ago

A little testing on the same email account has convinced me that searching the header contents in a Thunderbird POP account setup and a Thunderbird IMAP account setup yields different results. The POP and IMAP accounts point to the same email account on the same server and should yield the same results. Searching for the unique receiving server name in the 'Received' header, I see that:

the POP account only matches on the first line of a folded Received header
the IMAP account matches on any line of a folded Received header

Perhaps this explains the difference in results observed above? Kevin, Mike and Kelson: what kind of accounts are you using, POP or IMAP?

Comment 51

•

20 years ago

(In reply to comment #50) > Kevin, Mike and Kelson: what kind of accounts are you using, POP or IMAP? I'm using POP.

Comment 52

•

20 years ago

(In reply to comment #50)
> the POP account only matches on the first line of a folded Received header
> the IMAP account matches on any line of a folded Received header

This seems to explain Kelson's result in comment #47.

> Slight correction on filters: It does seem to work *as mail comes in*, but not
> when I run the filter through Tools->Run Filters on Folder.

"Filter for incoming mail on POP3" is done while the mail is being received, so it becomes the same process as search on IMAP, and there is no problem.
- The header is analyzed on receipt, and the internal header data is possibly unfolded.
But for a "manual filter" or "search" on locally saved mail, the header data is folded in the mail folder file.

Comment 53

•

20 years ago

AHA! Things didn't get fixed with a new version of Thunderbird, but when I switched from POP3 to IMAP. When I search on IMAP folders it works. When I search on Local folders, it fails.

Comment 54

•

20 years ago

*** Bug 283759 has been marked as a duplicate of this bug. ***

Adam Guthrie

Comment 55

•

19 years ago

*** Bug 274753 has been marked as a duplicate of this bug. ***

Adam Guthrie

Comment 56

•

19 years ago

*** Bug 302396 has been marked as a duplicate of this bug. ***

Comment 57

•

19 years ago

I confirm this is still happening on Thunderbird 1.0.6 on Windows 2000, using POP3. Could someone please increase the priority and/or severity? It's very annoying.

Comment 58

•

19 years ago

This bug cannot be a blocker for Tb 1.5, since all released versions of Mozilla mail&news and Tb have this problem, but I hope to see Tb 1.5 with this bug fixed. I'm very tired of answering "it's a very old bug" on BBS, and of closing bugs as DUP of this bug :-)

Severity: normal → major

Frank

Comment 59

•

19 years ago

I can confirm that it is still occurring on TB trunk nightly 20050713.

To be SURE that it is because of the newline characters, I copied a test message to two subfolders of one folder. I then exited TB, modified the folder file for *one* of the folders and stripped the newline characters from a particular Received header. Then I deleted the .msf file for that TB folder.

I then started TB and did a search for unique characters that were in the 'second line' of that Received header. TB found the message I modified to have NO newlines (because the characters were now in the 'first line') but did NOT find the one that still had the newlines.

Good luck

Comment 60

•

19 years ago

One simple solution is to keep an unfolded version of the mail header in the local mail file, as Frank has proven. The single-line length limit of SMTP/POP/IMAP does not apply to a line in a file (only the file size limit of the OS/file system and the buffer size limit of the implementation do), although care is required when forwarding mail or moving it to an IMAP server in order not to violate the standards. The biggest problem is the difficulty of fallback or compatibility. Can older versions of Moz mail/Tb read and handle an unfolded mail file properly? How about other mail import programs or mailers?

Comment 61

•

19 years ago

I don't think it's important to check how other mail clients handle these headers; the parser simply should comply with the RFC. I'm seeing the same problem with the Subject field. I don't know if it's because of the line breaks or tabs, but it looks bad.

jwq

Comment 62

•

19 years ago

It appears that the Subject: field is searched fully even if it is wrapped. Nevertheless, if the matching string straddles a line break then the match will fail. That is, if the search string is "the string" and it occurs in the Subject: field as "the\n string" then the match will fail. This bug also, naturally, affects saved search folders (which is where I just noticed it, again).
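The straddling case described above can be reproduced outside Thunderbird. The sketch below is an illustration only: UnfoldHeaderValue and HeaderValueContains are hypothetical helpers, and unfolding is done in the RFC 2822 sense of dropping a line break that is immediately followed by a space or tab.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: RFC 2822 style unfolding. A CR LF (or bare LF) that
// is immediately followed by a space or tab is dropped; the whitespace
// itself is kept.
std::string UnfoldHeaderValue(const std::string &raw)
{
  std::string out;
  out.reserve(raw.size());

  for (std::size_t i = 0; i < raw.size(); ++i)
  {
    std::size_t breakLen = 0;
    if (raw.compare(i, 2, "\r\n") == 0)
      breakLen = 2;
    else if (raw[i] == '\n')
      breakLen = 1;

    std::size_t next = i + breakLen;
    if (breakLen > 0 && next < raw.size() &&
        (raw[next] == ' ' || raw[next] == '\t'))
    {
      i = next - 1;  // skip the line break; the whitespace is copied next turn
      continue;
    }
    out.push_back(raw[i]);
  }

  return out;
}

// Search the unfolded value instead of the raw, folded text.
bool HeaderValueContains(const std::string &raw, const std::string &term)
{
  return UnfoldHeaderValue(raw).find(term) != std::string::npos;
}
```

With the value "the\n string", a plain substring search for "the string" fails, while HeaderValueContains finds it. A fold that uses a tab would still not match a search term typed with a space, which is the whitespace question raised in comment 14.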

Comment 63

•

19 years ago

Is someone working on this?

Comment 64

•

19 years ago

I took a look at the source code. I don't know much about it, but my best guess is that the problem is inside mailnews/mime/src, probably mimehdrs.cpp:MimeHeaders_get. It seems that function adds ",CR-LF-TAB" at each line break, and whatever uses that function needs to parse the header field. I would say the whole code needs a redesign, but for now I would say MimeHeaders_get shouldn't add those line breaks, or whatever is being used to pass header fields to the UI should parse them accordingly. I think it's nsMimeHeaders::ExtractHeader. Any thoughts?

David A. Cobb

Reporter

Comment 65

•

19 years ago

re: "Any thoughts?" Why is there any need to preserve the line-breaks in the headers at all? If I'm correctly interpreting RFC-?(2822?), the headers should simply be "normalized" by replacing everything from the last non-blank on line (n) up to the first non-blank on line (n+1) by a single blank (x#20). I'm not aware of any of the storage mechanisms that require short lines. So, why not simply normalize each header as it is received, and store it that way.
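The normalization rule proposed in this comment (everything from the last non-blank on one line up to the first non-blank on the continuation line becomes a single blank) can be sketched as follows. NormalizeFoldedHeader is a hypothetical name, and the sketch assumes that only a line break followed by a space or tab counts as a fold, so that the break ending a header is left alone.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace each fold (optional trailing blanks, a line
// break, and the leading blanks of the continuation line) by one space.
// Line breaks that are not followed by a space or tab are copied unchanged.
std::string NormalizeFoldedHeader(const std::string &raw)
{
  const std::string blanks = " \t";
  std::string out;
  std::size_t i = 0;

  while (i < raw.size())
  {
    // Look ahead past any blanks: is the next thing a line break?
    std::size_t j = raw.find_first_not_of(blanks, i);
    if (j != std::string::npos && (raw[j] == '\r' || raw[j] == '\n'))
    {
      std::size_t k = j;
      if (raw[k] == '\r' && k + 1 < raw.size() && raw[k + 1] == '\n')
        ++k;
      ++k;  // k is now the first character of the next line

      if (k < raw.size() && (raw[k] == ' ' || raw[k] == '\t'))
      {
        // Continuation line: emit one space and jump to its first non-blank.
        std::size_t nextText = raw.find_first_not_of(blanks, k);
        out.push_back(' ');
        i = (nextText == std::string::npos) ? raw.size() : nextText;
        continue;
      }
    }
    out.push_back(raw[i]);
    ++i;
  }

  return out;
}
```

Applied to the List-Id style example from comment 14, a value folded after "DDD," with a tab and extra spaces on the next line comes out as "Discussion list for DDD, the GNU graphical". Whether the stored copy of a message should be rewritten this way, rather than only the text handed to search and filters, is the compatibility question raised in comment 60.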

Comment 66

•

19 years ago

(In reply to comment #65)
> So, why not simply normalize each header as it is received, and store it that
> way.

In fact I dug a little bit more, and what I found is that the MimeHeaders_get function is only called at some point in the message display, not in the search and not in the folder display. So it seems there is code that handles folded header fields in message display, search, and folder display for each of IMAP, POP3 and local. It's different in each of these, and none of them works properly except message display, AFAIK. So it's really ugly.

I would say a single message header normalization routine is the way to go; it's just that I don't know where to put it, or how each of the relevant modules should call it. I tried IRC and the mailing lists to find someone who knows the code better, but no one answered.

Comment 67

•

19 years ago

*** Bug 330597 has been marked as a duplicate of this bug. ***

Comment 68

•

18 years ago

the fix in bug 338310 has fixed this problem except for the case where the search term spans lines. I'll leave this bug open for that problem.

Summary: Filter or Search: do not handle multi-line (wrapped, folded) headers correctly → Filter or Search: do not handle multi-line (wrapped, folded) headers correctly when search term spans lines

Comment 69

•

18 years ago

(In reply to comment #68) > the fix in bug 338310 has fixed this problem except for the case where the > search term spans lines. I'll leave this bug open for that problem. And what about the subject being displayed incorrectly on the main window? It is also a problem because related to the parsing of multiple lines.

Magnus Melin [:mkmelin]

Comment 70

•

18 years ago

*** Bug 356156 has been marked as a duplicate of this bug. ***

Andrey G. Sergeev

Comment 71

•

18 years ago

I can confirm that this bug is in place in Thunderbird 1.5.0.9 (Windows/20061207). Example: I need to search messages with the " 2006 " substring in the Received: header. The search results include messages with single-line Received: only, just like this one Received: by beetle.zenon.net (Postfix, from userid 400) id AFB945286; Wed, 30 Nov 2005 14:16:24 +0300 (MSK) but not these multi-line Received: from pc-170-40-215-201.cm.vtr.net (pc-170-40-215-201.cm.vtr.net [201.215.40.170]) by bird.zenon.net (Postfix) with ESMTP id 2A94A2CBC1; Thu, 15 Dec 2005 19:32:02 +0300 (MSK) Received: by fly.zenon.net (Postfix, from userid 400) id 0647E7E6; Fri, 16 Dec 2005 21:31:27 +0300 (MSK) Received: from SERVER2.computery.ru ([213.85.58.145] verified) by backend4.aha.ru (CommuniGate Pro SMTP 4.2.10) with SMTP id 72147738 for andris@aernet.ru; Mon, 14 Nov 2005 16:23:45 +0300 and the similar ones.

Comment 72

•

18 years ago

have you tried 2.0? The fix in that bug is not in 1.5.0.9, afaik.

Andrey G. Sergeev

Comment 73

•

18 years ago

Not yet but i'll try with the different installation of TB 2.0 because 2.0 is still in alpha stage. Thanks for the tip, David.

Comment 74

•

18 years ago

2.0 beta 2 came out last week.

Comment 75

•

18 years ago

And it still has the same bugs.

From: "Felipe Contreras" <felipe.contreras@foobar.com>
To: felipe.contreras@foobar.com
List-Id: Discussion list for breakline, the breakline <break.org>
Subject: This is a test that is testing

This is a one line message.

/usr/sbin/sendmail felipe.contreras@foobar.com < testmail.txt

The filter on the second line of the Subject works, but not on List-Id. And the subject has weird characters when the character after the line break is a tab. This is on IMAP.

Comment 76

•

18 years ago

Attached image Bad subject line (deleted) — Details

Comment 77

•

18 years ago

The bad-subject-line problem is bug 271312 (TB, bug 240924 for the suite). Per bug 184490, custom headers (which List-ID is) under IMAP get filtered OK in arriving mail, but fail when the filter is run "after-the-fact." MailViews (the dropdown list at the top of the thread pane) also fail for custom headers under IMAP. In these cases, the failure will occur whether the text is on the first line of the header or one of the folded lines after. Which is not to say that arriving-mail, IMAP, custom-header filtering on folded headers is working 100% correctly, but I haven't had a chance to test that yet.

Comment 78

•

18 years ago

Right. Maybe there should be a bug that supersedes all these bugs: "Headers' parser is ****". From what I could see in the code, each backend type (IMAP, POP3, etc.) has its own way to handle the messages, which is far from ideal. Anyway, I'll check if the List-ID works when the text to filter is in the first line, and at arrival or after-the-fact. It seems I won't be using Thunderbird for another couple of years. Sorry if this seems like negative feedback, but I tried to solve it and I got no feedback... I don't think these kinds of things should be happening.

Nobody; OK to take it and work on it

Comment 79

•

18 years ago

All the code (imap, news, pop) use the exact same header parser. They all have different ways of fetching headers, but that's because the protocols are different, which should be obvious on the face of it.

Keith S.

Comment 81

•

18 years ago

FWIW, I can confirm that this problem still exists in TB 2.0.0.0 final -- at least on incoming mail or after the fact for an IMAP account. (Haven't had a case to reproduce on POP.)

Russell Odom

Comment 82

•

17 years ago

I'm seeing this in Thunderbird 2.0.0.6 (20070811) when using the Quick Search box to search subject lines which are folded. A direct copy from part of the subject line (where the text is folded) in the message pane, and pasted into the box, fails to find the message. This raw header...

Subject: this is a long subject line

...is displayed in the message pane as one line. Copy and paste various bits into the Quick Search box:

* "this is a" -> match
* "a long subject" -> FAIL
* "subject line" -> match

Magnus Melin [:mkmelin]

Updated

•

17 years ago

QA Contact: laurel → backend

Whiteboard: [see comment 68]

Updated

•

16 years ago

Product: Core → MailNews Core

Karsten Düsterloh

Updated

•

16 years ago

Summary: Filter or Search: do not handle multi-line (wrapped, folded) headers correctly when search term spans lines → Filter or Search: does not handle multi-line (wrapped, folded) headers correctly when search term spans lines

Oliver Meyer

Comment 84

•

16 years ago

Version 2.0.0.19 (20081209)

I just experienced this problem when applying a subject filter on my INBOX. The message contains a single subject line spanning multiple lines in the message source. The Subject is displayed in a single line everywhere, but the "subject"-"contains"-filter does not match the message.

Source:
Message-ID: <23984203.01231276489500.JavaMail.lm@PCCM2>
Subject: JDialog Server Build N-trunk on PCCM2 is finished. (Build successful)
MIME-Version: 1.0

Filter:
Subject contains: "trunk on PCCM2 is finished. (Build successful)"

Updated

•

15 years ago

Blocks: qfasfailtracker

IU

Comment 88

•

15 years ago

Attached file test case: Spam email with victim addresses obscured (obsolete) (deleted) — Details

The title may not be entirely accurate. Sometimes filtering fails -- even when the search term does not wrap -- as in the case of the following: I occasionally get spam with non-Western character encoded subject or sender information. I try to filter these out by looking for the '?=' that precedes the text. This fails. Here's a scenario:

1. Download and unzip the attached spam message
2. Drag-and-drop the extracted message to your Shredder Inbox
3. Create a message filter as follows:
   a) Click Tools --> Message Filters...
   b) Select the Inbox where you dropped the message and choose New...
   c) Filter name: test
   d) Apply filter when: Check Mail or Manually Run
   e) Match all of the following: {Subject} {begins with} ?=
   f) Perform these actions: Delete Message
   g) Click OK
4. Highlight the filter you've just added and click "Run Now"
5. Nothing happens.

So then, I guess this bug isn't going to be fixed, given it's been languishing now for well over 7 years. :-(

Comment 89

•

15 years ago

(In reply to comment #88)
> test case: Spam email with victim addresses obscured
> e) Match all of the following: {Subject} {begins with} ?=
> The title may not be entirely accurate.

Subject: header of the attached mail. ({CRLF}==0x0D0A)
> Subject: =?GB2312?B?1MLIscqxztLP68Tjo6zUwtSyyrHO0sTuxOOjrM7ewtvUwtSy1MLIsaOsztK1?={CRLF}
> =?GB2312?B?xNDEyOfEx7rjucWyu7HktcTUwrnixKzErLXEzqrE49ejuKMu16PW0Mfvvdq/?={CRLF}
> =?GB2312?B?7MDW?={CRLF}

Decoded subject begins with:
> 月缺时我想你，...

Matching is executed after decoding of Base-64 with GB2312.
=> Your comment #88 is an absolutely invalid report and is absolutely irrelevant to this bug.
The title of this bug is absolutely accurate.
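As a rough sketch of the point made in comment 89: an RFC 2047 encoded-word has the shape "=?charset?encoding?payload?=", so the raw header begins with "=?" rather than "?=", and the filter compares against the decoded text anyway. The struct and both functions below are hypothetical names for illustration, not Thunderbird code. Base64 decoding and GB2312 conversion are deliberately left out, and neither the encoding letter nor the payload is validated.

```cpp
#include <string>

// Parts of one RFC 2047 encoded-word: "=?charset?encoding?payload?=".
struct EncodedWord {
  std::string charset;   // e.g. "GB2312"
  char encoding = '\0';  // 'B' or 'Q' in RFC 2047; not validated here
  std::string payload;   // still encoded; decoding is out of scope
};

// Returns true if `s` starts with `prefix` (case-sensitive).
bool beginsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Split a single encoded-word token into its three parts.
bool parseEncodedWord(const std::string& token, EncodedWord& out) {
  const std::string::size_type n = token.size();
  if (n < 8 || !beginsWith(token, "=?") || token.compare(n - 2, 2, "?=") != 0)
    return false;
  const std::string::size_type q1 = token.find('?', 2);  // '?' that ends the charset
  if (q1 == std::string::npos || q1 == 2 || q1 + 2 >= n - 2 || token[q1 + 2] != '?')
    return false;
  out.charset = token.substr(2, q1 - 2);
  out.encoding = token[q1 + 1];
  out.payload = token.substr(q1 + 3, n - 2 - (q1 + 3));
  return true;
}
```

With the last encoded-word quoted above, `beginsWith("=?GB2312?B?7MDW?=", "?=")` is false, which is one reason the filter from comment 88 cannot match even against the raw header.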

IU

Comment 90

•

15 years ago

(In reply to comment #89)
> Matching is executed after decoding of Base-64 with GB2312.
> => Your comment #88 is absolutely invalid report and is absolutely irrelevant
> to this bug.
> The title of this bug is absolutely accurate.

Whoa. Somebody needs some fresh air.

riedel

Comment 93

•

14 years ago

Here is an example that could be relevant for a lot of people: I have a sieve filter that moves all messages where "X-Spam-Level" contains "***" into a spam folder. Running a search for "X-Spam-Level" doesn't contain "***" gives me quite a lot of hits. So for people without a sieve-capable IMAP server, server-side spam filtering won't really work. This is IMHO clearly somehow related to this bug, but "search term spans lines" does not apply here. There seems to be a multi-line header earlier in the message that breaks all custom header searches after it. The behavior seems a little inconsistent to me, however, because sometimes headers are found even in the presence of multi-line headers. Should I open a new bug, or is it possible to handle all header parsing issues relating to search in this bug, i.e. to remove "when..." from the title of this bug? Or is there already a bug covering this? At least it is not part of Bug 519202. BTW: Is any work still being done on this? As stated before, it seems that header parsing is quite messed up, with quite a few side effects, and nothing is listed as "depends on". This quite irritates me...

Assignee

Comment 94

•

14 years ago

Wayne asked me to look at this. I'll add it to my ASSIGN list to keep it on my radar screen, but if anyone else feels inspired to work on this (as if!) don't be dissuaded. Changing the component to Search.

Assignee: ch.ey → kent

Component: Backend → Search

QA Contact: backend → search

Comment 95

•

14 years ago

xref:
- Bug 338761 - searching body of emails for text doesn't match word-wrapped text
- Bug 353746 - [mozTXTToHTMLConv] Structured text recognition should span line breaks
- Bug 5351 - [mozTXTToHTMLConv] MIME linkifying code should cross linebreaks

Comment 96

•

14 years ago

another encounter with this (and Bug 338761) - <insert adjectives>. gloda gets it right - or at least, it returns the right results.

Keywords: testcase

Whiteboard: [see comment 68] → [see comment 68][datalossy]

Allan Macdonald

Comment 97

•

14 years ago

Hello, I wish to add to the chorus: I am seeing this behaviour when I try to run a filter on existing messages sitting in the inbox, and the filter just isn't catching the message. Therefore, as I see it, the component should still include "filter". I initially searched for this issue in Google and ended up (mistakenly) posting my 2 cents' worth in this forum topic: http://forums.mozillazine.org/viewtopic.php?f=31&t=344189&start=0 Unfortunately, I should have read that topic more carefully; a review of the previous posts there ended up leading me here. To avoid duplication, please read that post for some additional info I wish to add. There you will find a detailed description of a case of the behaviour I have encountered. I have included a copy of the headers in the email (sanitized) and a copy of my filter rules file. I hope this helps. Thanks.

Assignee

Comment 99

•

14 years ago

Attached patch Fix for arbitrary headers (obsolete) (deleted) — Details — Splinter Review

This works for strings split over multiple lines in arbitrary headers, but not for standard headers (like subject). Perhaps we need to split this bug in two to deal with the separate cases. I also decided to accumulate received headers, so that if there are multiple received headers you can match on a string in any of them. Not sure if that is the right behaviour or not - comments? I'll ask for a review in a day or two.
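To make the "accumulate received headers" idea concrete, here is a small sketch under stated assumptions. It is not the code from the attached patch: `accumulateHeader` and `sameName` are hypothetical names, the header block is assumed to arrive as one physical line per element without line terminators, and repeated occurrences are joined with a single space.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Case-insensitive comparison of header names (ASCII only).
static bool sameName(const std::string& a, const std::string& b) {
  if (a.size() != b.size()) return false;
  for (std::string::size_type i = 0; i < a.size(); ++i) {
    if (std::tolower(static_cast<unsigned char>(a[i])) !=
        std::tolower(static_cast<unsigned char>(b[i])))
      return false;
  }
  return true;
}

// Collect the value of every occurrence of header `name` into one string.
// Continuation lines (starting with a space or tab) are appended to the
// header they belong to, so a term may span a fold or sit in any occurrence.
std::string accumulateHeader(const std::vector<std::string>& lines,
                             const std::string& name) {
  std::string value;
  bool inWanted = false;  // true while inside an occurrence of `name`
  for (const std::string& line : lines) {
    if (line.empty()) break;  // blank line: end of the header block
    if (line[0] == ' ' || line[0] == '\t') {
      if (inWanted) value += line;  // keep the folding whitespace
      continue;
    }
    const std::string::size_type colon = line.find(':');
    inWanted = colon != std::string::npos && sameName(line.substr(0, colon), name);
    if (inWanted) {
      if (!value.empty()) value += ' ';  // separator between occurrences
      std::string::size_type start = colon + 1;
      while (start < line.size() && (line[start] == ' ' || line[start] == '\t')) ++start;
      value += line.substr(start);
    }
  }
  return value;
}
```

A single search over the returned string then matches a term in any Received header, including one split across a fold. Whether joining with a space is the right separator is a design choice of this sketch, not something the patch description states.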

Assignee

Comment 100

•

14 years ago

Attached patch Fix for both search and db parsing (obsolete) (deleted) — Details — Splinter Review

Certain values, including the important subject, are parsed in nsParseMailbox.cpp and not in the search. This patch also fixes folded whitespace there. I need to review it some more myself before I ask for a review since this is such critical code.

Attachment #527425 - Attachment is obsolete: true

Assignee

Comment 101

•

14 years ago

Attached patch Patch for review (obsolete) (deleted) — Details — Splinter Review

Let's get this reviewed. Two issues to consider: 1) I accumulate received headers in search to allow a single search to look for values in any received header 2) I fix a line of code which I assume was supposed to allow the parser to detect a header with an improper space before the colon. These are both optional if you think these are bad ideas.

Attachment #413431 - Attachment is obsolete: true

Attachment #528137 - Attachment is obsolete: true

Attachment #528457 - Flags: review?(dbienvenu)

Assignee

Updated

•

14 years ago

Whiteboard: [see comment 68][datalossy] → [see comment 68][datalossy][has patch for review]

Comment 102

•

14 years ago

Accumulating received headers seems like the right thing to do. Not sure about 2) - I'll look at that more closely.

Comment 103

•

14 years ago

a couple nits:

+ // Should we allow an incorrect space after a header name? It seems like
+ // that is what this code was supposed to do.
+ while (end > buf && (*(end - 1) == ' ' || *(end - 1) == '\t')) end--;

that code hasn't worked in a very long time, if ever, so I think it should just go.

+ PRBool isContinuationHeader = searchingHeaders ? NS_IsAsciiWhitespace(buf.CharAt(0))
+ : false;

PR_FALSE, not false.

Comment 104

•

14 years ago

Comment on attachment 528457 [details] [diff] [review] Patch for review r=me, modulo the aforementioned nits.

Attachment #528457 - Flags: review?(dbienvenu) → review+

Assignee

Comment 105

•

14 years ago

Attached patch patch to checkin (obsolete) (deleted) — Details — Splinter Review

with nits fixed.

Attachment #528457 - Attachment is obsolete: true

Assignee

Comment 106

•

14 years ago

Attached patch No, THIS is the patch to checkin (deleted) — Details — Splinter Review

Attachment #529119 - Attachment is obsolete: true

Assignee

Comment 107

•

14 years ago

Comment on attachment 529120 [details] [diff] [review] No, THIS is the patch to checkin Checked in http://hg.mozilla.org/comm-central/rev/7b75008cb771

Serge Gautherie (:sgautherie)

Assignee

Updated

•

14 years ago

Status: ASSIGNED → RESOLVED

Closed: 22 years ago → 14 years ago

Flags: in-testsuite+

Resolution: --- → FIXED

Whiteboard: [see comment 68][datalossy][has patch for review] → [see comment 68][datalossy]

Target Milestone: --- → Thunderbird 3.3a4

Updated

•

14 years ago

status-seamonkey2.1: --- → ?

Serge Gautherie (:sgautherie)

Comment 108

•

14 years ago

Comment on attachment 529120 [details] [diff] [review] No, THIS is the patch to checkin http://hg.mozilla.org/releases/comm-2.0/rev/7e66634e0e20

Serge Gautherie (:sgautherie)

Updated

•

14 years ago

status-seamonkey2.1: ? → ---