Closed Bug 170555 Opened 22 years ago Closed 22 years ago

Create Junk folder and add ability to purge messages from junk folder

Categories

(MailNews Core :: Backend, enhancement)

x86
Windows 2000
enhancement
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: naving, Assigned: naving)

Details

Attachments

(3 files, 4 obsolete files)

As we discussed, I have written code so that we create junk folder on demand when we are trying to filter messages to junk folder. This is for both pop and imap. Also I added ability to purge messages based on age as threshold and I tied this with auto-compact as you suggested. I think we need to do this just once per session because we age junk folder in days. patch coming.
Attached patch proposed fix (obsolete) (deleted) — Splinter Review
See comments above
what if you leave the app running for multiple days?
instead of duplicating the code in nsMsgDatabase::PurgeMessagesOlderThan, could you re-use it? You'll probably need to restructure the code a little so that you can call nsMsgDatabse::GetMessagesToPurge(PRUint32 daysToKeepHdrs, PRBool keepUnreadMessagesOnly, nsMsgKeyArray *keysToPurge) and then the callers get to decide what to do with the keys - for your code, call DeleteMessages on the folder, for the retention settings code, call DeleteMessages on the folder.
Do you really think we need to purge more than once per session.
Attached patch patch addressing duplication of code. (obsolete) (deleted) — Splinter Review
Severity: normal → enhancement
Status: NEW → ASSIGNED
My first take would be that we should purge once per day, which might be a lot less than once per session if you have lots of sessions in a day, or might be multiple times per session if you leave the client running...but if that's too hard to code, then I guess we could do it once per session.
We would have to store the timestamp of last purge if we want to do it only once per day. I think once per session is good enough because I think on an average people would start their mail app atleast once per day. Scott, what do you think ?
there an issue that I think might be being overlooked: for imap, you need to select the folder on the server in order to age messages - iterating over the db won't give you any headers unless the folder has been opened and the messages filtered into it downloaded to the db. So, in order to correctly age the spam folder by iterating through the db, you need to update it by doing a select. This can be a moderately expensive operation if there are lots of headers to download so avoiding doing it every session might be worthwhile.
Do we need to do a select ? I think people will go to their junk folder to see if there are any other msgs besides junk. I think we should not be doing a select and deleting msgs because it doesn't even give them a chance to look at their junk messages. It is dangerous. This is something like deleting from underneath us!
I think if we say we're going to age the junk folder, we should age the junk folder - if we don't age it, it will grow without bounds if the user doesn't open it. And we'll only age away messages that were put in the junk mail folder 10 days since you last opened the junk mail folder. If we decide that you have to open the junk mail folder to age it, then we should age it when the user goes to another folder or shuts down. And finally, for pop3, aren't you going to age the messages away without opening the folder?
I think once every 24 hours or greater (depending on the last time they logged on) would be preferable. I also think we can't depend on the fact that they will select their junk mail folder through the UI.
I think Navin has a point that auto-aging messages the user hasn't seen might be undesirable, unless we warn the user when they set the pref. Is auto-aging on by default, or does the user have to turn it on?
we should make it off by default. I guess if someone goes on vacation for a month we probably don't want to delete mail they may not have seen.
I don't think we should do a select and delete it because it is drastic action no matter what.
if we are going to warn the user then it is ok.
Attached image Junk Mail Controls dialog (deleted) —
Plan was that users could decide if/what folder they wanted junk msgs moved to. And if/when they wanted junk messages auto deleted.
Auto delete is Off by default. User must manually turn it on first.
thanks for the screen shot, jglick. I'll be working on the UI changes as part of bug #169638
If UI isn't doable, let me know and we'll need to discuss what needs to be changed.
jglick: that dialog barely fits at 800x600 (into a web browser), and it has way too many controls and way too much text for me to handle, and I spend lots of time reading large amounts of both. I'd suggest: 1. ditch most of the documentation setences, if people need them they can click help. 2. change the off..highest into a select. 3. rename them. "lowest" and "highest" are useless, "tolerant" and "aggressive" are slightly better. 4. drop "When messages are ..." 5. Color junk mail <light grep |v> 6. drop display an indicator. always display one, either in the thread position or perhaps in some col which the user can turn off. 7. We could combine the two move items, it'd buy space Move to: <junk mail on Foo |v> |junk mail on bar | |... | |junk mail on local folders | |----------------------------| |other >| +----------------------------+ I know we haven't in other places, but i think that's a bug (it costs us a lot of space) 8. move always accept to the very top. (there are a few reasons, one is that you're trapping grayspace by putting it where you did, another is that of all the options it's one of the most important, it's also not remotely destructive whereas move and delete are) [x] Always accept mail from <personal address book|v> <Aggressively|v> filter inbound mail for junkmail ... 9. The title should say Junk Mail Controls for <accountprettyname> 10. Shouldn't it be < _View_ junk mail log >? -- As for deleting mail, I think it might be best if we wrote something like: [ ] Automatically delete old junk mail after [ 9] days of continuous use. continuous use would have to be defined, but basically if I only use mozilla three days out of twenty then the threshold wouldn't be met. Leaving mail open would not meet the definition either. Reading a few messages or opening multiple folders would. Last usage consideration. Suppose I get 1 junk message a day every day for a month. And suppose I use mozilla for days 20..28 of that month. On day 28 junkmail identified between 1 and 19 would be deleted but junk mail received on day 20 would not be automatically deleted until the next day I use mozilla.
as far as the purging code is concerned, one easy way to solve the "select imap spam folder before you can purge" problem is to use SEARCH to do the purging - just construct a search term that gives you messages older than XX days, listen for the results, and when the search is done, delete the result messages. Search automatically updates the folder before doing the search, and is async so you don't have to worry about blocking the UI... The alternative is to, in the imap case, when you've decided that you need to run an auto-age, run a select url, register yourself as a listener, and when you get notified that the select url has finished, run the purge code from the db...if the user happens to have selected the spam folder, and that has triggered the auto-purge (which would be a highly unlikely event, I believe), you could either just NOT do the auto-age, or still run the select url but *after* the ::UpdateFolder select, which would cause the second select to block until the first one was finished. However, I'm guessing you're going to try to shove this into the already overloaded ::OnStopRunningUrl, and add yet another bit of state to nsMsgImapFolder. I'd encourage you not to do that.
when is the junk mail auto-aging likely to run? Is it most likely to run when the user starts up at the beginning of the day and selects their inbox? Should we worry about the effect on start-up performance?
the search approach sounds nice. I guess i'd suggest that aging happen after the user idles for a while. The problem is that a user might open mail, do stuff, quit mozilla. so we can't really do it that way. How about aging five minutes after the user loads mail and on a low priority thread with a status window which pops up somewhere. The dialog should probably have stop and details buttons. details would bring up a query list showing all the mails scheduled to be deleted.
If you look at the existing code auto-compact never gets called on start-up because we wouldn't have been running any url (imap)! For pop we don't do it until after we have opened the database. I was thinking of adding a bit of state to the folder (a boolean mDeletingJunkMessages) so that we know that we know that we have to delete junk messages after downloading junk messages (imap) or reparsing junk folder (local). Both are updating folder actions so just call deleteJunkMessages from nsMsgDBFolder::OnStopRunningURL when mDeletingJunkMessages is true. This boolean is also needed to ensure that we don't go into infinite loop because UpdateFolder called AutoPurge then again we call UpdateFolder from AutoPurge before purging.
yes, that was the kind of approach I was hoping to avoid. I'll have to think about it for a while.
don't we load the inbox on mail startup, and doesn't that run an imap url? use QueryElementAt here: + nsCOMPtr<nsISupports> support = getter_AddRefs(m_connectionCache->ElementAt(i)); + connection = do_QueryInterface(support); I'm also not sure why auto compacting of folders and auto-purging of the junk mail folder have been forced together like this. Is it for conceptual reasons, or just ease of implementation? Isn't that causing some of the problems you need the boolean to solve? It also looks to me like auto compact for imap only runs when we're offline - is that correct?
When we are loading inbox on start-up we are not running any other url on Inbox at that time, so AutoCompact doesn't get called. But it does get called if you click on GetNewMessages twice very fast or some other url (not select) is running on a folder and you happen to select that folder. I think a better place of moving this auto-compact call would be in nsMsgDBFolder::OnStopRunningURL when we happen to end updating folder. That would definitely ensure AutoCompact getting called and we could remove the call from all child implementations (imap, pop, news) I was tying both of them because they are both cleaning up methods and both have to be performed after some time interval. I had another idea of simplyfying purging junk folder: Only purge if you happen to select junk folder and time interval thing is met. Since we are actually not purging but just deleting, that saving space argument doesn't hold and users will some time or other select the junk folder to see if there is anything besides spam ? agree.
if you're offline, we'll never call ::OnStopRunningUrl. I think your life would be easier if you broke these two things apart (auto-compact and auto-purge). I believe you're painting yourself into a corner by combining them, since they're not called at the same time.
QA Contact: gayatri → laurel
I'm about update the interface. You'll want to use this from your code: + /** + * built from moveTargetMode, actionTargetAccount, actionTargetFolder + */ + readonly attribute string spamFolderURI; it builds up the uri for you, based on those attributes. the folder is not guaranteed to exist, but that's ok, since your code creates it. let me know if you have questions.
QA Contact: laurel → gayatri
QA Contact: gayatri → laurel
I've checked in, so you'll want to fix to use the new attribute. if you're also working on setting / clearing the junk mail folder flag, you'll want to do that from nsMsgIncomingServer::SetSpamSettings().
Attached patch proposed fix (obsolete) (deleted) — Splinter Review
This patch includes the junk folder creation part from last patch. What has changed is how we purge junk folder. As discussed w/ david, I created a purgeService that goes and purges junk folder at intervals of one day. I store the last purge time in the folder cache. I see no point storing this info in the db (msf file) at this point. So basically purges are scheduled when we load accounts. We iterate for servers that can have spam settings and add it to purgeArray. If two purges happen to be at the same time then I delay it by 5 mins. For actual purging I have used search and created term AgeInDays etc... Cavin, David, can you review ? thx.
+#define oneDay 86400000000 +#define delay 300000000 //minimum delay between two consecutive purges these names need to follow the normal naming conventions for #defines, which is all upper case. The names could also be more descriptive ONE_DAY_IN_MILLISECONDS MIN_DELAY_BETWEEN_PURGES. perhaps you were copying the style in nsMsgDBFolder.cpp, #define oneHour 3600000000 but that is also wrong. copyright date should be 2002 + * The Initial Developer of the Original Code is + * Netscape Communications Corporation. + * Portions created by the Initial Developer are Copyright (C) 1999 + * the Initial Developer. All Rights Reserved. + * wouldn't it be cleaner here to use the Equals method on junkFolderURI? + nsXPIDLCString junkFolderURI; + spamSettings->GetSpamFolderURI(getter_Copies(junkFolderURI)); + + *aIsJunkFolder = aCaseInsensitive ? !PL_strcasecmp(junkFolderURI.get(), aFolderURI) : !PL_strcmp (junkFolderURI.get(), aFolderURI); + return NS_OK; changing the comment doesn't make this OK :-) The imap url is not thread safe, so this routine should not be called from the UI thread. -// this method is only called from the imap thread +// this method is only called from the imap thread AND the UI thread... NS_IMETHODIMP nsImapUrl::CreateServerSourceFolderPathString(char **result) { NS_ENSURE_ARG_POINTER(result); + nsAutoCMonitor(this); AllocateServerPath(m_sourceCanonicalFolderPathSubString, kOnlineHierarchySeparatorUnknown, result); return NS_OK; as far as just using the folder cache is concerned, that goes against all the other code that uses the folder cache. We're supposed to have it such that removing the folder cache and restarting works as if you hadn't removed the folder at all (except that startup will be slower), because everything will get regenerated from the db's. I know this was a convenient shortcut for you, but I'm not sure it's the right thing to do.
I think Equals is not there for nsXPIDLCString. It is for nsCAutoString. Also If it has Equals, I think I had problems w/ caseInsensitive comparisons. I made it threadsafe by adding a monitor, look at similar methods below. { NS_ENSURE_ARG_POINTER(result); + nsAutoCMonitor(this); lastPurgeTime is not that important that it has to be recreated for folder cache from dbs because we are purging at interval of one day. We can just start fresh i.e purge and create it again.
I don't like this code going directly to the folder cache. No other code does that (other than the code that has to be involved with the folder cache). No other code should know about the folder cache. If it was too much bother to go through the nsIMsgFolder and the db, instead of just bypassing the nsIMsgFolder and DB entirely, you should try to make it easier for client code to set properties in the db via the nsIMsgFolder. You could add get/set methods to the db, Get/SetFolderProperty(in string propertyName, in string propertyVal), and have those methods get/set the string property on the nsIDBFolderInfo. Then, you'd just need to add an attribute to nsIMsgFolder for the last purge time stamp.
the problem is that other routines in nsImapUrl can access/change data that gets changed in nsImapUrl::CreateServerSourceFolderPathString. Thread-safety isn't just making it so that only one person can call that method at one time. Thread-safety also involves making sure that any data access in that method can't be access by any other method/thread at the same time. The multiple connection code is in the wrong place. It's supposed to go in nsImapProtocol::CanHandleUrl(). We talked about this before, but if you look at that routine, it has two out parameters, canRunUrlImmediately and canRunButBusy. If a connection is creating a folder and another request comes along that tries to do an online copy or move into that folder, then CanHandleUrl should return canRunButBusy as true, and canRunUrlImmediately as false. Hope this helps.
nsImapProtocol::CanHandleUrl() will not know about other imap connections!
As far as multiple connection code is concerned, I still think it is the right place. So lets consider the simplest case, where we have just 2 connections one imap inbox connection and other connection which is creating junk folder, so now the copyUrl (filter moves) wants a connection so that it can be run. We iterate through the connection cache and if we find Inbox connection first it can easily handle this copyUrl. The code in nsImapProtocol::CanHandleURL doesn't have any way to know that destination folder is being created on a separate connection thread. Also for copyURL to be run we need the connection to be inSelected state so it can never be run on the other connection which is creating junk folder, so we will always hit Inbox connection. Scott, your comments would be appreciated to resolve this ? thx.
If you'd look at the code a little more closely, you'd see that we iterate over the connections. If a connection says it has to handle the url, but it's busy, then we queue the url. If it says it can handle the url and it's not busy, that means it's in the selected state for the desired folder. So, the inbox connection is not going to say that it can handle the url (if you write the code correctly so that it doesn't!), and your reasoning and conclusion are wrong. Of course, if you don't change CanHandleUrl as I said that you should, then it won't work, but that's really not good argument on your part. I can say that any approach isn't going to work if I write the code incorrectly! What you've done is add an extra loop looking through all the connections. The way the routine is designed and implemented is that it already loops through all the connections calling CanHandleUrl. If you fix CanHandleUrl to handle this case, then everything will work as designed.
CanHandleUrl approach will not work. I suggest you code it and try yourself. I have spent lot of time in coding this piece and this is the right way. Regarding looping one more, it is much more cleaner not to put in that other loop. I originally put in that existing loop but it was turning out to be more messy because of the several conditions already there.
OK, I give up; you've convinced me: doing it in the connection code was a very dumb idea and I'm very sorry I suggested it in the first place. The code is very sensitive and if anything goes wrong, the consequences are dire. The code needs to be clean and maintainable as possible, and having this special case code there turned out to be just a bad idea. So, just create it when the user picks a spam folder, like you do for creating a normal filter with a new folder. Or check in your code and I'll go clean it up later.
ok, CanHandleUrl is out but it would have never worked. I'll attach a patch for just purging junk folder. Creating junk folder is going to be driven from UI which has to be sorted out.
Attached patch proposed fix (obsolete) (deleted) — Splinter Review
I have added Get/SetStringProperty to get lastPurgeTime. I have added comments, rest of the patch remains same as last one.
Attachment #100431 - Attachment is obsolete: true
Attachment #101179 - Attachment is obsolete: true
David, can I get review ? thx.
Comment on attachment 101483 [details] [diff] [review] proposed fix I think you can get rid of the init method on your purge service interface if you have init get called automatically when the purge service is first invoked. You can do this by using the nifty marcro: NS_GENERIC_FACTORY_CONSTRUCTOR_INIT in nsMsgFactory. 2) tabbing issues in nsMsgPurgeService.h 3) The Get/SetStringProperty stuff on a db folder looked good to me.
Scott, that Init() is needed because if you do a turbo shutdown the services are not freed and when we start back up we have to call Init explicitly. We already do that from account manager. This is something we do for biffManager also.
this is wrong: + +//there is no spam settings for news folders - atleast for now +NS_IMETHODIMP +nsNntpIncomingServer::GetSpamSettings(nsISpamSettings **aSpamSettings) +{ + return NS_ERROR_NOT_IMPLEMENTED; //no assertion because generic code will not know +} + +NS_IMETHODIMP +nsNntpIncomingServer::SetSpamSettings(nsISpamSettings *aSpamSettings) +{ + return NS_ERROR_NOT_IMPLEMENTED; //no assertion because generic code will not know +} we are going to allow spam settings for all protocol types except none. (movemail, imap, pop3 and news) note, all those show up in the account picker for the junk controls. the difference with news is, you can move messages. I'm working on fixing the junk controls UI to disable the move UI for news. I haven;t read the full patch yet, but you can assume that for a news spam settings, the boolean value for purge will never be true. I'll take care of that.
a style nit: you can save yourself from if () { if () { if () { if () { by doing this: + //Only add it if it hasn't been added already. + if (serverIndex != -1) + return NS_OK; + nsCOMPtr <nsISpamSettings> spamSettings; + nsresult rv = server->GetSpamSettings(getter_AddRefs(spamSettings)); + NS_ENSURE_SUCCESS(rv,rv); + + PRBool purgeSpam; + rv = spamSettings->GetPurge(&purgeSpam); + NS_ENSURE_SUCCESS(rv,rv); + if (!purgeSpam) + return NS_OK; and so on. if you can avoid the nested if style, please do. it's hard to read.
I also removed path exists check and changed GetDBFolderInfoAndDB for imapFolder to do the right thing i.e open an existing db, not create one.
Attachment #100443 - Attachment is obsolete: true
Attachment #101483 - Attachment is obsolete: true
so you know that no code relies on GetDBFolderInfoAndDB to actually create the .msf file?
Comment on attachment 101606 [details] [diff] [review] patch addressing seth's comments ok, sr=bienvenu if we go back to using exists for now, instead of changing GetDBFolderInfoAndDB, just to be safe.
Attachment #101606 - Flags: superreview+
Comment on attachment 101606 [details] [diff] [review] patch addressing seth's comments r=mscott
Attachment #101606 - Flags: review+
fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Even beyond the Linux bustage (the nsInt64 constructors should probably be |explicit| to prevent that type of problem, although I'm not sure which compiler is right), there will probably be bustage on some of the ports because you can't initialize a 64-bit int based on a numeric literal, since not all ports have 64-bit ints. That's why we have prlong.h to begin with, and nsInt64 is an extension of that.
Comment on attachment 101773 [details] [diff] [review] patch to fix bustage on some platforms >+static const nsInt64 gOneDayInMilliseconds = 86400000000; >+static const nsInt64 gMinDelayBetweenPurges = 300000000; You can't do this since you can't use numeric literals this way. The portable way would probably be: static const PRInt64 gOneDayInMilliseconds = LL_INIT(20, 500654080); /* 86400000000 */ static const PRInt64 gMinDelayBetweenPurges = LL_INIT(0, 300000000); (fortunately 86400000000 mod (2^32) < (2^31), or else there wouldn't be a portable way given the current (slightly incorrect, unless I'm misreading) prlong.h macros) Do check my math... >+ nextPurgeTime += gOneDayInMilliseconds; Then all the things like this would have to be: nextPurgeTime += nsInt64(gOneDayInMilliseconds);
- purgeEntry->nextPurgeTime += MIN_DELAY_BETWEEN_PURGES * (mPurgeArray->Count()+1); //let us stagger them out by 5 mins if they happen to start at same time + for (PRInt32 i=0; i < mPurgeArray->Count()+1; i++) + purgeEntry->nextPurgeTime += nsInt64(gMinDelayBetweenPurges); //let us stagger them out by 5 mins if they happen to start at same time This looks like you're just incrementing the same value by the same amount multiple times.
Oh, never mind, that's what you meant to do, I think.
(But you could use the stuff in prlong.h to do it a much faster way.)
Update- Using trunk build 20030121 on winxp we are creating the Junk mail folder for IMAP POP and Web type accounts when the user enables Junk Mail controls (see bug 181397 for test scenarios). However I haven't tested the purging based on age yet which I believe is what the fix for this bug is.
Using trunk build 20030218 on winxp, macosx and linux purging of messages older than the value placed in the purge preference is working. It seems to purge 5 minutes after launching, haven't tested the intervals if app left open for long periods of time. Example, app left open 30 minutes after initial purge and did not purge again. Can someone tell me what the interval of purge are if the app is left opened for days (is it 24 hrs as suggested in comment 11). Spin off bugs will be logged for wording of purge feature and problems purging mail that is labeled as Not Junk but resides in the designated Junk folder.
Purging works, however spinoff bug 194090 where purging puges "not Junk" message too that reside in the Junk folder. this is verified
Status: RESOLVED → VERIFIED
QA Contact: laurel → esther
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: