Closed
Bug 856286
Opened 12 years ago
Closed 10 years ago
Same filename(storeToken) of Mbox/cur/nnnnnnnn is used for different UID/messageKey by "multiple mail copy from maildirstore/IMAP folder to maildirstore/IMAP/Offline-Use=On folder"
Categories
(MailNews Core :: Database, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: World, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: dataloss, Whiteboard: [maildir])
+++ This bug was initially created as a clone of Bug #753624 +++
This bug is spin-off of bug 753624 comment #12.
[Build ID]
> Mozilla/5.0 (Windows NT 5.1; rv:22.0) Gecko/20100101 Thunderbird/22.0a1
Same filename(storeToken) of Mbox/cur/nnnnnnnn is used for different UID/messageKey by "multiple mail copy from maildirstore/IMAP folder to maildirstore/IMAP/Offline-Use=On folder".
A duplicate cur file name is produced even by the simplest "multiple mail copy".
(1) MboxA: maildirstore, IMAP/Offline-Use=Off. "cur" is not used.
512 mails are held: mail-1, mail-2, ..., mail-512
Each mail has a unique Subject: and a unique Message-ID:.
Note: Mails are crafted mails for testing of other bugs.
Message-ID: and Subject: are incremented by 1 for each mail.
Date: header starts from Epoch time.
Time stamp in Date: is incremented by 1 sec for each mail.
(2) MboxB: maildirstore, IMAP, Offline-Use=On. cur directory is used.
Copy all 512 mails in MboxA to MboxB. (in same IMAP account)
Because MboxA is Offline-use=Off, mail data is fetched.
At thread pane : All 512 mails are shown as expected.
In MboxB/cur : Only 484 files are created.
mail-491 : messageKey=488
mail-456 : messageKey=489
Message pane display : mail-491 shows content of mail-456
storeToken = 1364617150945000 is used for both mails.
content of MboxB/cur/1364617150945000 : data of mail-456
Dump data of msgDBHdr of mail-491
> [messageKey] = 488
> [statusOffset] = 0
> [messageOffset] = 0
> [messageSize] = 938
> [offlineMessageSize] = 1024
> [lineCount] = 28
> [date] = 0
> [dateInSeconds] = 0
> [StringProperty_pendingRemoval] =
> [flags] = 268435585
> [flag_Detail] = { FeedMsg = false, IMAPDeleted = false, MDNReportSent = false, Read = true, Replied = false, Marked = false, Expunged = false, HasRe = false, Elided = false, Offline = true, Watched = false, SenderAuthed = false, Partial = false, Queued = false, Forwarded = false, Priorities = false, New = false, Ignored = false, MDNReportNeeded = false, Template = false, Attachment = true, Labels = false, RuntimeOnly = false }
> [threadId] = 488
> [threadParent] = 4294967295
> [messageId] = 1KB-Mail.000491.000001
> [mime2DecodedSubject] = 1KB-Mail-000491
> [All_StringProperty] = { flags = 10000081, statusOfset = 0, sender = T-000491-M-000001@f.f.f, recipients = M-000001-T-000491@t.t.t, subject = 1KB-Mail-000491, message-id = 1KB-Mail.000491.000001, date = 0, dateReceived = 0, X-GM-MSGID = 1430885469959438933, X-GM-THRID = 1430885469959438933, X-GM-LABELS = Offline-Use-Off, sender_name = 85|T-000491-M-000001@f.f.f, priority = 1, size = 3aa, keywords = tag-1, threadParent = ffffffff, msgThreadId = 1e8, ProtoThreadFlags = 0, msgOffset = 0, storeToken = 1364617150945000, offlineMsgSize = 400, numLines = 1c, label = 0 }
Dump data of msgDBHdr of mail-456
> [messageKey] = 489
> [statusOffset] = 0
> [messageOffset] = 0
> [messageSize] = 938
> [offlineMessageSize] = 1024
> [lineCount] = 28
> [date] = 0
> [dateInSeconds] = 0
> [StringProperty_pendingRemoval] =
> [flags] = 268435585
> [flag_Detail] = { FeedMsg = false, IMAPDeleted = false, MDNReportSent = false, Read = true, Replied = false, Marked = false, Expunged = false, HasRe = false, Elided = false, Offline = true, Watched = false, SenderAuthed = false, Partial = false, Queued = false, Forwarded = false, Priorities = false, New = false, Ignored = false, MDNReportNeeded = false, Template = false, Attachment = true, Labels = false, RuntimeOnly = false }
> [threadId] = 489
> [threadParent] = 4294967295
> [messageId] = 1KB-Mail.000456.000001
> [mime2DecodedSubject] = 1KB-Mail-000456
> [All_StringProperty] = { flags = 10000081, statusOfset = 0, sender = T-000456-M-000001@f.f.f, recipients = M-000001-T-000456@t.t.t, subject = 1KB-Mail-000456, message-id = 1KB-Mail.000456.000001, date = 0, dateReceived = 0, X-GM-MSGID = 1430885437803636784, X-GM-THRID = 1430885437803636784, X-GM-LABELS = Offline-Use-Off, sender_name = 85|T-000456-M-000001@f.f.f, priority = 1, size = 3aa, keywords = tag-1, threadParent = ffffffff, msgThreadId = 1e9, ProtoThreadFlags = 0, msgOffset = 0, storeToken = 1364617150945000, offlineMsgSize = 400, numLines = 1c, label = 0 }
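For readers comparing the two dumps: the values inside All_StringProperty are hexadecimal, while the bracketed fields above them are decimal. A minimal sketch of the conversion follows. The three flag bit values are assumptions inferred from this dump (flags = 10000081 with Read, Offline and Attachment reported true), and the helper names are hypothetical.

```python
# Flag bits inferred from the dump above (0x10000081 with Read, Offline and
# Attachment reported as true). Treat these values as assumptions.
FLAG_BITS = {
    "Read": 0x00000001,
    "Offline": 0x00000080,
    "Attachment": 0x10000000,
}


def decode_flags(flags):
    """Return the names of the known flag bits that are set in `flags`."""
    return [name for name, bit in FLAG_BITS.items() if flags & bit]


def hex_props(props):
    """Convert hexadecimal string properties to integers."""
    return {name: int(value, 16) for name, value in props.items()}


if __name__ == "__main__":
    print(decode_flags(268435585))
    print(hex_props({"size": "3aa", "offlineMsgSize": "400",
                     "numLines": "1c", "msgThreadId": "1e8"}))
```

With this reading, size = 3aa matches messageSize = 938, offlineMsgSize = 400 matches 1024, and msgThreadId = 1e8 / 1e9 match threadId = 488 / 489.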
(3) If mails are sorted in a different order at MboxA before Copy,
the number of MboxB/cur/nnnnnnnn files is different,
and "duplicate storeToken" occurs on different mails.
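Given a list of (messageKey, storeToken) pairs exported from the message database, a duplicate storeToken can be found by grouping on the token. This is only a sketch: how the pairs are obtained from msgDBHdr is not shown, and the function name is hypothetical.

```python
from collections import defaultdict


def find_duplicate_tokens(headers):
    """Map each storeToken used by more than one messageKey to those keys.

    `headers` is an iterable of (message_key, store_token) pairs.
    """
    by_token = defaultdict(list)
    for message_key, store_token in headers:
        by_token[store_token].append(message_key)
    return {token: keys for token, keys in by_token.items() if len(keys) > 1}


if __name__ == "__main__":
    # The two headers dumped above share one storeToken.
    sample = [(488, "1364617150945000"), (489, "1364617150945000")]
    print(find_duplicate_tokens(sample))
```

Comparing the number of distinct tokens with the number of files in MboxB/cur gives a second, cheaper check (512 headers versus 484 files in this report).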
Note:
Tested using Gmail IMAP, with Offline-Use=Off for all Mboxes except some non-SpecialUse Mboxes (ordinary Gmail Labels) used for this test, with Auto-Expunging=On.
There are at least the following kinds of garbage produced by "Copy to maildir".
(A) Empty directory of Mboxname/cur/nnnnnnnn where nnnnnnnn==storeToken
(B) Null file of Mboxname/cur/nnnnnnnn where nnnnnnnn==storeToken
(C) Duplicate use of Mboxname/cur/nnnnnnnn by different messageKey
(D) Another mail's header data is used by a different UID, different MessageKey
(storeToken, Mboxname/cur/nnnnnnnn is unique.)
(observed with two mails of same Message-ID:/different Subject: )
(which are crafted mails for Gmail's duplicate mail detection test)
(E) Smaller messageSize, because the length of "From - ...", X-Mozilla-Status:, X-Mozilla-Status2: is not added even though these headers are added in a maildirstore/local mail folder. (X-Mozilla-Keys: is not added even though it's needed.)
Many of the "funny Target\cur/nnnnnnnn file (size=0) or empty directory" cases in "Move mails between maildirstore/local folder or maildirstore/IMAP/Offline-Use=On folder" look to be caused by "Garbage by fault in Copy to maildirstore".
Detecting (C) and (D) is perhaps difficult, but detecting (A) and (B) before and after Copy is easy.
"Size check for case (E)" is similar to the nstmp file size check in Compact of a berkeleystore/local mail folder.
Can someone add code to detect (A) and (B)?
Can someone add "error return code check" step after Copy?
Can someone add code to detect (E)?
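As an illustration of what detection of (A) and (B) could look like from outside Thunderbird, here is a rough sketch that scans a cur directory for empty subdirectories and zero-length files. It is a standalone script, not Thunderbird code, and the function name is hypothetical.

```python
import os


def scan_cur(cur_dir):
    """Return (empty_dirs, null_files) found directly under `cur_dir`.

    empty_dirs : names of subdirectories with no entries   -> garbage (A)
    null_files : names of regular files with size 0 bytes  -> garbage (B)
    """
    empty_dirs, null_files = [], []
    with os.scandir(cur_dir) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                if not os.listdir(entry.path):
                    empty_dirs.append(entry.name)
            elif entry.is_file(follow_symlinks=False):
                if entry.stat().st_size == 0:
                    null_files.append(entry.name)
    return sorted(empty_dirs), sorted(null_files)
```

Running it before and after a Copy and comparing the two results would show which garbage entries the Copy itself produced.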
Comment 1 (Reporter) • 12 years ago
Because Date: is Epoch time, the following occurs in the test mails.
> [date] = 0
> [dateInSeconds] = 0
This also occurs on mail with no Date: header or a malformed Date: header, and such mail is actually sent daily to many users.
So, even if the cause is date/dateInSeconds=0, please don't close as INVALID or WONTFIX.
Updated (Reporter) • 12 years ago
Summary: Same filename8storeToken) of Mbox/cur/nnnnnnnn is used for different UID/messageKey by "multiple mail copy from maildirstore/IMAP folder to maildirstore/IMAP/Offline-Use=On folder" → Same filename(storeToken) of Mbox/cur/nnnnnnnn is used for different UID/messageKey by "multiple mail copy from maildirstore/IMAP folder to maildirstore/IMAP/Offline-Use=On folder"
I think that the storeToken (and filename) is generated from the date received of the message. When more messages have the same date, some -XX suffix should be appended.
WADA, did this cause any dataloss, are those messages that do not have a file lost?
Comment 3 (Reporter) • 12 years ago
(In reply to :aceman from comment #2)
> WADA, did this cause any dataloss, are those messages that do not have a file lost?
Data loss surely occurs in the maildir message database.
- mail-1 and mail-2 point to the same cur/abcd file.
- data in cur/abcd is that of either mail-1 or mail-2,
so data of one of mail-1 or mail-2 is lost in cur.
- because ::Offline flag=true && storeToken is set and the file pointed to by
storeToken exists, neither fetch of message header nor fetch of body[] will
ever be invoked, even by Repair Folder, if maildirstore.
At the server, I think no data loss occurs.
I believe "uid xx copy Target" is issued even if it's maildirstore, although I'm not confident.
> I think that the storeToken (and filename) is generated from the date
> received of the message. When more messages have the same date,
> some -XX suffix should be appended.
If nnnn is an existing file (file is closed, directory entries are fully written to HDD), I think the existence check works.
In mail copy from maildirstore/local folder to maildirstore/local folder, if the same mail is copied to the same folder, a suffix is always added as you say.
But if the file is newly created and not yet closed, or has just been closed, the OS's file system may not return "already exists".
Tb may request multiple unique file names at once without file creation.
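The gap between "check that the name is free" and "create the file" can be closed by letting the file system do both in one step. The sketch below is not Thunderbird's implementation; it only illustrates the idea with exclusive creation (O_CREAT | O_EXCL) plus a numeric suffix on collision. The function name and the "-NN" suffix format are assumptions.

```python
import os


def create_unique(cur_dir, base_name, max_tries=1000):
    """Create a new empty file in `cur_dir` and return the name actually used.

    O_CREAT | O_EXCL asks the file system to fail if the name already exists,
    so two callers asking for the same base name cannot both get it. On a
    collision a "-NN" suffix is tried instead.
    """
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
    for attempt in range(max_tries):
        name = base_name if attempt == 0 else f"{base_name}-{attempt:02d}"
        try:
            fd = os.open(os.path.join(cur_dir, name), flags, 0o600)
        except FileExistsError:
            continue  # name is taken, try the next suffix
        os.close(fd)
        return name
    raise FileExistsError(f"no free name for {base_name!r} in {cur_dir!r}")
```

The important point is that the name is reserved by creating the file, so a name handed out earlier is always visible to the next request. Handing out several names first and creating the files later would reopen the window described above.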
Another guess.
As seen in msgDBHdr data, messageKey is continuous, and fetch body[] of multiple UIDs is requested by a single "uid x:y fetch body[]" command.
So, both mail-1 and mail-2 may have the same seed value for hashing.
mail-1 : messageKey=(N),
Date:==Epoch time,
Timestamp of data fetch = yyyy/mm/dd hh:mm:ss.ttt
mail-2 : messageKey=(N+1),
Date:==Epoch time <= same as mail-1 in my test mail
Timestamp of data fetch = yyyy/mm/dd hh:mm:ss.ttt
there is no guarantee that it's different from mail-1,
because Gmail IMAP is used (fast/efficient server),
and I use a DualCore chip (still not so slow for WinXP),
and I use a 70Mbps local link, 1Gbps external fiber link.
Updated (Reporter) • 12 years ago
Whiteboard: [maildir]
Updated (Reporter) • 12 years ago
No longer blocks: maildirblockers
Comment 4 • 10 years ago
I don't see anything new here that was not addressed in other bugs. The "same filename" has been largely addressed by using CreateUnique calls when moving to maildir directories. These issues were very apparent in esr31 and earlier, which is when this was filed.
So I'll leave it open for awhile, but after the current round of maildir bugs lands, we'll need someone to say this problem still exists or we'll resolve this as WFM.
Comment 6 (Reporter) • 10 years ago
In MailDir, the file name in the cur dir is a kind of "meta data of the mail", and any file name can be used as long as the file system permits.
How about "cur file name with mail meta data"?
If Gmail, "MSGID_" + MSGID + "@" + something(no #) + "#" + hashed_value_from_mail_meta_data + ".eml"
If fetched via IMAP, "UID_" + UID + "@" + something(no #) + "#" + hashed_value_from_mail_meta_data + ".eml"
If downloaded via POP, "UIDL_" + UIDL + "@" + something(no #) + "#" + hashed_value_from_mail_meta_data + ".eml"
If copied from other folder, "MessageKey_" + MessageKey + "@" + same something + "#" + same string as copy source + ".eml"
Uniqueness in a cur dir(a mail folder) is always guaranteed.
If the starting string of a mail file name is limited, "putting a meta data file in cur dir" is easy, although something like a "/meta dir" can be easily introduced with maildirstore.
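A rough sketch of the naming scheme proposed above, to make the pieces concrete. Everything here is hypothetical: the function name, the choice of SHA-256 truncated to 16 hex digits for hashed_value_from_mail_meta_data, and the string passed as "something". Identifiers such as UIDL may contain characters that are not valid in file names, which this sketch does not handle.

```python
import hashlib

# Prefix per message source, as listed in the proposal.
PREFIXES = {
    "gmail": "MSGID_",
    "imap": "UID_",
    "pop": "UIDL_",
    "copy": "MessageKey_",
}


def cur_file_name(source, ident, something, meta="", digest=None):
    """Build "<prefix><ident>@<something>#<digest>.eml".

    `digest` lets a copy reuse the hash string of its copy source; when it is
    not given, the hash is derived from the `meta` string.
    """
    if "#" in something:
        raise ValueError('"something" must not contain "#"')
    if digest is None:
        digest = hashlib.sha256(meta.encode("utf-8")).hexdigest()[:16]
    return f"{PREFIXES[source]}{ident}@{something}#{digest}.eml"


if __name__ == "__main__":
    print(cur_file_name("imap", "42", "MboxB", "message-id=1KB-Mail.000491.000001"))
```

Uniqueness within one folder then rests on the identifier part (UID, UIDL, MSGID or MessageKey) being unique per folder, not on the hash.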
Comment 7 • 10 years ago
WADA, see my comment 4. I believe these issues are already fixed in current builds.
Comment 8 (Reporter) • 10 years ago
Unable to reproduce the phenomenon in Tb trunk (3/15 build; not tested with Tb 31).
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated • 10 years ago
Resolution: FIXED → WORKSFORME