Closed Bug 381588 Opened 18 years ago Closed 17 years ago

Junk filter duplicating messages when connections cached set to 1 plus non-inbox folders configured to get checked for new messages

Categories

(Thunderbird :: General, defect)

x86
All
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: darose, Assigned: Bienvenu)

References

Details

(Keywords: fixed1.8.1.15, regression)

Attachments

(1 file)

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20070517 Firefox/2.0
Build Identifier: version 2.0.0.0 (20070501)
Ever since I upgraded to Thunderbird 2.0, the junk filter has been duplicating messages. When I check my mail in the morning, I see multiple copies of the same spam message in the junk folder. Initially I thought "Well, I guess some spammer is sending me multiple copies now ... oh joy". But now I don't think that's actually what's happening. The other day a couple of legitimate messages wound up in the junk folder - and they got duplicated too. So I'm pretty sure that this is a bug in Thunderbird. And it's definitely something that just started happening recently, so I'm guessing it's related to the upgrade to 2.0.
By the way, before you ask me, the answer is: Yes, these messages are definitely duplicates. They have exact same message ID, headers, date/time, etc. Plus, I just watched the junk filter in action as it went through my in box marking things as spam. I made a note of the subject of one of the messages before it got marked as spam and moved to my junk folder. When I then went to the junk folder to look for it, that message had been duplicated.
The problem occurs on TBird 2.0 - on both Linux and Windows (the problem happens on either one) - connecting to an IMAP mailbox (on courier-imap). Only other thing of note here is that I have the TBird account settings set to not use the local junk folder; I have the junk folder set to be one of the IMAP folders.
Anyone have any ideas what could be causing this? Or how to fix it? It's REALLY irritating.
Reproducible: Always
Steps to Reproduce:
1. Check my mail
2. Watch the junk filter move the junk messages out of the Inbox.
Actual Results: Go to the junk folder and see the junk messages present, but duplicated.
Expected Results: Go to the junk folder and see the junk messages present, but not duplicated.
Version: unspecified → 2.0
Checked the junk filter log? You don't have any other forgotten filter that marks things as junk, do you? Other than that, duplicates often indicate the .msf file for the folder is messed up, so you can try deleting that. <http://kb.mozillazine.org/Profile_folder>
Just turned on the junk filter log. Will check it next time some spam comes in. I definitely don't have any filter marking things as junk. All my filters just do a "move to folder" action. (i.e., filtering msgs from mailing lists into their own folders.) And I tried deleting the .msf file, but the problem still remains. Looks like a bug to me. I'm surprised that other people aren't seeing this too.
So does anybody have any ideas as to cause/fix/workaround for this bug? It's extremely irritating! Frankly, I'm pretty surprised that there aren't lots of other people being affected by this, as it seems to me to have clearly been a bug introduced in TB2.0. But if the reason for that is because of something unique to my setup, I'd like to identify what that is and get this addressed ASAP. Any help anyone can offer here would be much appreciated! TIA.
Any (normal) filters turned on?
Yes. I've got loads of filters set up to filter the ham messages (which nearly all come from one of the several dozen mailing lists I'm on) into separate mailboxes.
My guess is the filters are involved. I had some odd behaviour with a normal filter moving spam in combination with the built-in detection... I'm using the imap mark-as-deleted model, and that filter somehow made two of each spam end up in Junk mail - one of which was marked for deletion.
Hmmmm ... I'm a bit confused then. What are you suggesting that I do to workaround this? I don't know if I have imap mark-as-deleted set - or how to turn it off if it is. But anyway, frankly I don't think that's even relevant, because I don't think that's the problem. I'm getting way more than 2 of each spam showing up in my junk box. I often get a dozen duplicates - or even more. (Totally anecdotal here, but it seems like the number of duplicates might have some correlation to the number of spam messages that are arriving in my inbox. When I check my inbox after, say, 1 hour, and there's only a handful of new spam messages that have arrived, then I might see only 2 copies of the spams in my junk folder. If I've been away for a weekend and there's several hundred new spams, then I'll start to see dozens of copies of each in my junk folder - i.e., a total of thousands of messages in my spam folder.) Anyway, let me say again something that seems to be getting ignored here: THIS IS A BUG! This was not happening prior to 2.0. I am not making this up! So given that a bug was introduced in the junk filter code for 2.0, perhaps a good approach to finding out what the problem is here is to see what changed in the junk filter code between 1.5 and 2.0. Makes sense, right? Thoughts anyone?
Oh I believe you, why would you make it up? But that doesn't help unless we can pin down what is causing it... What you possibly can do is check if the filters are causing it (and which). In Options > Privacy > Anti-virus, does it make a difference if you allow anti-virus programs to quarantine messages or not?
OK, thanks. Just getting a bit frustrated - sorry! Anyway, I don't have that option enabled. (I'm running on Linux, so no need for anti-virus.) I assume you're not suggesting that I turn it on; but rather, if it was on to turn it off.
Yeah, on the other hand, shouldn't hurt to try that way either...
I'll try it when I get a chance. That said, it doesn't sound like it's likely the problem. (Again, I don't think it's an issue of what settings I have enabled - since I haven't changed my settings in ages - but rather, what changed in the s/w.) Anyone have any other ideas what might be happening here?
FYI, I came back to look at this bug again, since its behavior is getting increasingly intolerable. (Thunderbird's duping resulted in over 8000 emails in my junk folder this morning.) I finally checked the junk filter log. Not sure what it's telling me, other than what I think I already know: that T'bird is somehow duping my emails:
Detected junk message from "Logan Rivera" <santbergen.com@kingdirt.com> - She will love you more than any other guy at 04/13/2001 04:13:14 PM
moved message id = 000701c7d27a$a934c780$0100007f@ffoyxke to imap://darose@10.1.0.1/INBOX/Possible Spam
Detected junk message from "Logan Rivera" <santbergen.com@kingdirt.com> - She will love you more than any other guy at 04/13/2001 04:13:14 PM
moved message id = 000701c7d27a$a934c780$0100007f@ffoyxke to imap://darose@10.1.0.1/INBOX/Possible Spam
... (repeated 13 times)
Detected junk message from "Mandy Corbett" <qrskhbkbx@rrgroup.in> - Relax and take the time at 07/30/2007 05:33:29 AM
moved message id = 65941242898726.E683EADD57@M4WTV to imap://darose@10.1.0.1/INBOX/Possible Spam
Detected junk message from "Mandy Corbett" <qrskhbkbx@rrgroup.in> - Relax and take the time at 07/30/2007 05:33:29 AM
moved message id = 65941242898726.E683EADD57@M4WTV to imap://darose@10.1.0.1/INBOX/Possible Spam
... (repeated 15 times)
Does anyone have *any* idea what the problem is here?!?!? This is a really horribly irritating bug! TIA
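For anyone wanting to quantify the duplication from a junk filter log like the one quoted above, the following is a minimal sketch, not a Thunderbird tool. It assumes each move is logged as "moved message id = <id> to <folder>", as in the excerpt; the function names and the shortened sample text are hypothetical and only for illustration.

```python
import re
from collections import Counter

# Assumed log shape, taken from the excerpt in this bug:
#   moved message id = <message-id> to <folder URI>
MOVE_RE = re.compile(r"moved message id = (\S+) to ")


def count_moves(log_text):
    """Count how many times each message id was logged as moved."""
    return Counter(MOVE_RE.findall(log_text))


def duplicated_moves(log_text):
    """Return only the message ids that were moved more than once."""
    return {mid: n for mid, n in count_moves(log_text).items() if n > 1}


# Hypothetical sample built from the two message ids quoted in the bug.
SAMPLE_LOG = (
    "moved message id = 000701c7d27a$a934c780$0100007f@ffoyxke "
    "to imap://darose@10.1.0.1/INBOX/Possible Spam\n"
    "moved message id = 000701c7d27a$a934c780$0100007f@ffoyxke "
    "to imap://darose@10.1.0.1/INBOX/Possible Spam\n"
    "moved message id = 65941242898726.E683EADD57@M4WTV "
    "to imap://darose@10.1.0.1/INBOX/Possible Spam\n"
)

if __name__ == "__main__":
    for message_id, moves in duplicated_moves(SAMPLE_LOG).items():
        print(f"{message_id}: moved {moves} times")
```

A message id that appears more than once here means the client logged the move repeatedly, which is a different situation from the server delivering several copies of the same spam.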
Sigh. No responses. I'm going to bump this up to major priority. It's a real irritation. 13,000 messages in my spam folder every Monday morning! WTF?!?!? Methinks it's time for me to start looking into other email clients ...
Severity: normal → major
OS: Linux → All
an imap protocol log will tell you if you're getting multiple instances of the same spam messages: http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap
I'm not.
David, I can only suggest one thing, disabling your filters, and leaving just the junk mail controls running, and see what the effect is. You haven't really returned back here saying whether certain suggestions worked or did not work. I would suggest testing this particular suggestion and respond back, at least this way, when you respond back saying whether certain suggestions worked or failed to change the outcome, that it provides a bump to the bug, instead of leaving it for a few weeks between responses. It also *seems* obvious that this behaviour is either related to your specific configuration, or possibly with relation to your configuration & your imap server. I also noticed that you did not provide the exact version of thunderbird that you are running... 2.0.0.6 is the latest to my knowledge (update your firefox if you haven't already while you're at it). One other thought that has come to mind, is trying a clean installation of thunderbird on another PC if the prior suggestion doesn't provide any useful results. Tyler.
I have indeed tried every suggestion offered (including: checked the junk filter log, delete the .msf file, toggled the quarantine messages setting, checked the imap protocol log) and commented back on their results (none have solved the problem). I'll try your suggestion of disabling the filters too, when I have a moment, and see if that has any effect. As far as whether this pertains to my specific configuration, that's certainly possible. I'll try to focus my thoughts that way and see if it turns up anything (e.g., that the issue is my filters, a problem with using imap, a problem with the courier-imap server specifically, etc.) What seems obvious to *me*, though, is that this problem stems from the upgrade to TBird 2.0. The problems started then. As far as which version I'm using, I'm currently seeing this in 2.0.0.6 (on both Linux and Windows), but as I said I've been seeing this with every 2.0 version. As far as performing a clean installation, I'll try that as well when I have a moment. I currently experience the problem with 3 different installations of TBird on 3 different machines (2 Linux, 1 Windows), but all of those were originally 1.x installations where the software was later upgraded to 2.0. Thanks very much for the suggestions. It is greatly appreciated. Hopefully one of the things you suggested here will solve the problem.
I'd still like to see the protocol log
I didn't post it because a) it's massive (118MB), and b) it contains the full text of numerous legitimate personal emails. Not sure how to post it for others to look at without at least addressing issue #2. Ideas?
I was thinking you could zip it up and e-mail it to me. You could snip out the personal e-mails first. Is this from a session where you just log on and retrieve new mail without reading it? The only part of the log I care about is the beginning through the part where we've finished moving the messages to your Spam folder. That will include the parts where we fetch messages to see if they're junk - you could snip those, if they're not too numerous.
Attached patch avoid the dups (deleted) — Splinter Review
this is a bit of a band-aid for the core problem, but it's a lot easier than fixing what seems to be going on, if I'm reading the log correctly, which I'll put off describing until David R confirms my guess that TB is configured not to cache IMAP connections.
Assignee: mscott → bienvenu
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #280769 - Flags: superreview?(mscott)
Attachment #280769 - Flags: superreview?(mscott) → superreview+
David somehow got the number of connections cached set to 1 (the default is 5). He's also got several folders configured to get checked for new messages, presumably because of server-side filters. This combination is causing contention over the one cached connection - we select the inbox and download new headers. When we see that there are new messages, we try to analyze them for junk status, and move them to the junk folder. At the same time, we try to check other folders for new messages, but we have to queue those urls because they're waiting for the one connection to come free. We tend to round-robin the queued urls, so that when the url that downloads new headers from the inbox is finished, we let the url that checks a non-Inbox folder for new messages run; then we run the url that tries to download a new inbox message to check it for junk status, then a url to check another non-Inbox folder for new messages; then the url to move the junk messages from the inbox to the spam folder, etc. This causes us to select the INBOX three or more times, instead of just re-using the cached connection. I believe the selecting of the INBOX multiple times is causing us to count the junk messages multiple times. I haven't verified this in the code, but it seems likely. Ideally, we would try to run the inbox urls more in sequence in this case, instead of interleaving other urls for other folders, but we're not optimized for the single connection case, nor do we want to starve the other urls...
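The interleaving described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not Thunderbird's actual scheduler: a single connection that must re-select a folder whenever the next url targets a different one, strict alternation between inbox and non-inbox urls, and hypothetical folder names ("Lists/A", "Lists/B") and function names.

```python
from collections import deque

# Hypothetical url queues: (folder, action) pairs modelled on the comment.
INBOX_URLS = [
    ("INBOX", "download new headers"),
    ("INBOX", "fetch message for junk check"),
    ("INBOX", "move junk to spam folder"),
]
OTHER_URLS = [
    ("Lists/A", "check for new messages"),
    ("Lists/B", "check for new messages"),
]


def interleave(inbox_urls, other_urls):
    """Alternate inbox and non-inbox urls, a simplified round-robin."""
    inbox, other = deque(inbox_urls), deque(other_urls)
    order = []
    while inbox or other:
        if inbox:
            order.append(inbox.popleft())
        if other:
            order.append(other.popleft())
    return order


def count_selects(order, folder="INBOX"):
    """Count how often one shared connection has to select `folder`."""
    selected = None  # folder currently selected on the single connection
    selects = 0
    for url_folder, _action in order:
        if url_folder != selected:
            selected = url_folder
            if url_folder == folder:
                selects += 1
    return selects


if __name__ == "__main__":
    print("interleaved:", count_selects(interleave(INBOX_URLS, OTHER_URLS)))
    print("in sequence:", count_selects(INBOX_URLS + OTHER_URLS))
```

In this model the interleaved order selects the INBOX once per inbox url, while running the inbox urls back to back selects it only once. Whether the repeated selects are what actually causes the double counting is the guess stated in the comment, not something this sketch verifies.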
the patch is checked in on the trunk.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
(In reply to comment #22)
> David somehow got the number of connections cached set to 1 (the default is 5).
> I believe the selecting of the INBOX
> multiple times is causing us to count the junk messages multiple times.
Woo hoo! Thanks for figuring this out, David. That was indeed the problem. The reason why my cached connection count was set to 1 is because I started receiving errors a while back about Thunderbird running out of cached connections, so I changed the setting to 1 - a bad choice, apparently. Apparently, as you pointed out to me in a private email, my IMAP server (courier-imap) only allows a max of 4 connections by default. So I've now changed the cached connection setting to 4 instead of 5, and it seems to have mostly taken care of the problem.
Note: I say "mostly" because I still am seeing an occasional duplication of messages in my junk folder, but it's *MUCH* much less than before. Still, it probably shouldn't be happening at all. (Not to mention that setting TBird to use only 1 cached connection shouldn't cause this behavior either.) So IMO, David's patch to fix this should still go into the code.
Anyway, thanks again for the help, David, in finding and fixing this extremely irritating bug!
Should we take this fix for branch too?
David B., Is cached connections=1 or junk processing _required_ for this condition to occur? IOW, might this happen with connections at 2, or connections=1 plus normal filtering?
David Rosenstrauch, Is this a regression? Your comment 0 seems to imply this didn't happen in v1.5
(In reply to comment #25)
> Should we take this fix for branch too?
hard to say how widespread the issue is since most people don't mention their number of cached connections. If this isn't a regression then this might be a fix for some who comment in older bugs like bug 196066.
Summary: Junk filter duplicating messages → Junk filter duplicating messages when connections cached set to 1 plus non-inbox folders configured to get checked for new messages
(In reply to comment #26)
> David B.,
> Is cached connections=1 or junk processing _required_ for this condition to
> occur? IOW, might this happen with connections at 2, or connections=1 plus
> normal filtering?
>
> David Rosenstrauch, Is this a regression? Your comment 0 seems to imply this
> didn't happen in v1.5
1) I currently have connections set to 4 (since courier-imap seems to screw up at 5), and although the duplicating behavior is mostly gone, I still see it occasionally. (Usually only when I start up TBird and start trying to view emails before it's finished the spam filtering.)
2) Correct - I only started seeing this problem in 2.0.
Wayne, I can't say definitely that those two conditions are necessary, but they make the problem much much more likely.
Comment on attachment 280769: avoid the dups
this has been on the trunk for a long time now, and could help with bug 193325 as well.
Attachment #280769 - Flags: approval1.8.1.15?
Comment on attachment 280769: avoid the dups
Approved for 1.8.1.15, a=dveditz for release-drivers
Attachment #280769 - Flags: approval1.8.1.15? → approval1.8.1.15+
landed on 1.8.1.15 branch
Keywords: fixed1.8.1.15
noting this is a regression going from 1.5 to 2.0 - though we don't know the cause. Is "1.8 2006-07-14 13:45 part of fix for 342912" too easy a target?
Keywords: regression
Wayne, it's possible that's indirectly true.
As I reported in bug 167173, this problem arises in TB v3.0, so it looks like your fixes have been lost.
This problem persists (Thunderbird 38.5.1, Windows).