[meta] finish "maildir" message storage
Categories
(MailNews Core :: Database, enhancement)
Tracking
(Not tracked)
People
(Reporter: wsmwk, Assigned: benc)
References
(Depends on 25 open bugs, Blocks 2 open bugs)
Details
(Keywords: meta, user-doc-needed, Whiteboard: [maildir])
Comment 16•4 years ago
Guys, are you aware that in TB68 the maildir implementation caused IMAP folders to contain messages with unreadable attachments? For example, many users reported that they couldn't open PDF attachments. The EML files on disk were fine. Repairing the folder helps, as does deleting the msf file and letting the program recreate it. So the problem was in faulty msf files.
Another problem with maildir local folders was also connected to msf files. When moving messages from folder to folder within Local Folders (mostly more than 50 at a time), they were moved on disk but not in TB (they stayed in the msf). TB still showed them in the source folder, but clicking them reported the message as unavailable. This didn't happen just "sometimes"; it happened all the time. Generally, after moving messages (when manually sorting them year by year), we had to delete the msf to get a proper message list and see what had been moved and what had not.
My question is: are you aware of these problems, and did you fix them? Is msf handling thoroughly tested? The 78+ roadmap says that maildir is decent in 78, but I didn't find anything in the changelog about the bugs I stumbled upon. To be clear, it wasn't just on my machine. My coworker and I deployed many migrations from mbox to maildir and had these problems on almost every machine.
Currently we use mbox for IMAP and maildir for local folders, but we prefer to move emails around in mbox because it's more stable. We convert to maildir at the end, after all the work is done.
Last but not least, I would name maildir emails by date and then message ID, because that allows sorting them by year and greatly simplifies archiving. Working on one big maildir local folder containing mail from many years is almost impossible in TB68 without converting to mbox first.
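The "delete the msf and let the program recreate it" workaround described in this comment can be sketched as a small script. This is a minimal sketch, not a Thunderbird tool: the function names and the `mail_root` argument are hypothetical, it assumes summary files are simply the `*.msf` files below the chosen directory, and it should only be pointed at a backed-up profile while Thunderbird is closed. It defaults to a dry run that only lists what it would remove.

```python
from pathlib import Path


def find_summary_files(mail_root: Path) -> list[Path]:
    """Return every .msf summary file below mail_root, sorted for stable output."""
    return sorted(p for p in mail_root.rglob("*.msf") if p.is_file())


def remove_summary_files(mail_root: Path, dry_run: bool = True) -> list[Path]:
    """List the .msf files below mail_root and delete them unless dry_run is set."""
    found = find_summary_files(mail_root)
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

Calling `remove_summary_files(Path("C:/foo"))` only reports the files; passing `dry_run=False` deletes them so that the summaries are rebuilt on the next start, as described above.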
Comment 17•4 years ago
To explain the last sentence, "I would name maildir emails by date then msg id": I mean the files on disk. That would allow moving them to folders by hand and not in TB. Moving many emails in maildir local folders is very unstable, just as I said before. I did write a script that extracts the date from each email message and renames the files, but it would be super cool not to have to do that in the first place. This request is dictated by the fact that we archive mail mostly by year. People often say: delete/archive emails older than x years.
Another bug.
Reproduce: create an account, set its custom folder to C:\foo, close TB, and manually delete C:\foo. Open TB. This time not only is C:\foo created, but C:\foo.sbd is created as well, and TB starts syncing emails and writes the files into C:\foo.sbd instead of C:\foo.
To get around this, you have to close TB, delete C:\foo.sbd, and run TB again.
This time TB sees the existing C:\foo and starts to write files to it.
Comment 18•4 years ago (Assignee)
First off, thanks for taking the time to write all that up - very useful and much appreciated!
(In reply to Zbigniew Gralewski from comment #16)
> My question is: are you aware of these problems, and did you fix them? Is msf handling thoroughly tested? The 78+ roadmap says that maildir is decent in 78, but I didn't find anything in the changelog about the bugs I stumbled upon. To be clear, it wasn't just on my machine. My coworker and I deployed many migrations from mbox to maildir and had these problems on almost every machine.
This bug is the overview one to track all the maildir related issues - see the "Depends on" list at the top to see all the unresolved maildir bugs.
There are definitely still enough rough edges that I'd be wary of using maildir in production.
A big maildir push is high on my TODO list.
If there are maildir issues not linked to this meta bug, then I recommend creating a new bug and adding it to the "depends on" list.
I don't see any existing bugs that obviously cover the imap-folders-have-messages-with-unreadable-attachments issue you mention - want to write it up? No problem if not - I'll go through your comments in more detail and write whatever we don't already have.
> Last but not least, I would name maildir emails by date and then message ID, because that allows sorting them by year and greatly simplifies archiving. Working on one big maildir local folder containing mail from many years is almost impossible in TB68 without converting to mbox first.
I think that's an interesting point - definitely something to look into.
There's a bigger question here: Is there any benefit to adhering to the maildir spec (all emails as files in a single flat directory), rather than, say, automatically stashing emails into subfolders? For example, "<YYYY>-<MM>/<msgid>.eml" would probably be manageable and rather useful to the user. You could get a more even distribution by, say, using subdirs based on hashing the messageid. But at the expense of making it an arse for the user to find emails in the filesystem (seems like a bad tradeoff).
Probably a discussion to break out into another bug or on the mailing list.
In any case, plain maildir is a good first step. There's still a bunch of places in the code that kind-of-sort-of assume mbox. So getting vanilla maildir solid and reliable makes it waaaaay simpler to add other potential storage schemes (either minor variants on maildir or stuff that's completely different in approach).
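The two layouts weighed in this comment can be compared with a short sketch. Nothing here reflects how Thunderbird actually names files: `safe_name`, `by_month_path`, and `by_hash_path` are hypothetical helpers, the choice of SHA-256 and of two hex digits per bucket is an assumption made only for illustration, and the sample message ID is invented.

```python
import hashlib
import re
from datetime import datetime, timezone
from pathlib import PurePosixPath


def safe_name(message_id: str) -> str:
    """Strip angle brackets and replace characters that are awkward in filenames."""
    core = message_id.strip().strip("<>")
    return re.sub(r"[^A-Za-z0-9._@-]", "_", core)


def by_month_path(message_id: str, sent: datetime) -> PurePosixPath:
    """Date-based layout: <YYYY>-<MM>/<msgid>.eml (easy for a person to browse)."""
    bucket = f"{sent.year:04d}-{sent.month:02d}"
    return PurePosixPath(bucket) / f"{safe_name(message_id)}.eml"


def by_hash_path(message_id: str, hex_digits: int = 2) -> PurePosixPath:
    """Hash-based layout: evenly spread buckets, but opaque to a person browsing."""
    digest = hashlib.sha256(message_id.encode("utf-8")).hexdigest()
    return PurePosixPath(digest[:hex_digits]) / f"{safe_name(message_id)}.eml"


if __name__ == "__main__":
    sent = datetime(2020, 3, 14, 9, 30, tzinfo=timezone.utc)
    print(by_month_path("<abc123@example.com>", sent))  # 2020-03/abc123@example.com.eml
    print(by_hash_path("<abc123@example.com>"))
```

The date-based path tells the user where a message lives without opening it, while the hashed bucket name carries no meaning, which is the tradeoff described above.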
Comment 19•4 years ago
> To explain the last sentence, "I would name maildir emails by date then msg id": I mean the files on disk. That would allow moving them to folders by hand and not in TB. Moving many emails in maildir local folders is very unstable, just as I said before. I did write a script that extracts the date from each email message and renames the files, but it would be super cool not to have to do that in the first place.
I fully agree with this suggestion from Zbigniew Gralewski
@Zbigniew Gralewski :
You wrote about a script. I am interested.
Could you pass it to me?
thoste at email dot com
Thank you
Comment 20•4 years ago (Reporter)
(In reply to Ben Campbell from comment #18)
> ...
> > Last but not least, I would name maildir emails by date and then message ID, because that allows sorting them by year and greatly simplifies archiving. Working on one big maildir local folder containing mail from many years is almost impossible in TB68 without converting to mbox first.
> I think that's an interesting point - definitely something to look into.
> There's a bigger question here: Is there any benefit to adhering to the maildir spec (all emails as files in a single flat directory), rather than, say, automatically stashing emails into subfolders? For example, "<YYYY>-<MM>/<msgid>.eml" would probably be manageable and rather useful to the user. You could get a more even distribution by, say, using subdirs based on hashing the messageid. But at the expense of making it an arse for the user to find emails in the filesystem (seems like a bad tradeoff).
There is indeed a "breaking point" on folder size where it takes forever to enumerate folder contents in the MS Windows environment.
Comment 21•4 years ago
Script that renames EML files in bulk: https://github.com/VerisZG/ahk_eml_rename_by_date/blob/master/__eml-rename-by-year.ahk
Comment 22•4 years ago
(In reply to Wayne Mery (:wsmwk) from comment #20)
> > I think that's an interesting point - definitely something to look into.
> > There's a bigger question here: Is there any benefit to adhering to the maildir spec (all emails as files in a single flat directory), rather than, say, automatically stashing emails into subfolders? For example, "<YYYY>-<MM>/<msgid>.eml" would probably be manageable and rather useful to the user. You could get a more even distribution by, say, using subdirs based on hashing the messageid. But at the expense of making it an arse for the user to find emails in the filesystem (seems like a bad tradeoff).
> There is indeed a "breaking point" on folder size where it takes forever to enumerate folder contents in the MS Windows environment.
Wayne, internally I would leave them as they are, following the maildir spec as it is. The cur and tmp folders are fine: one dir in TB, two dirs on disk (cur and tmp). In other words, it is useful to keep the location of files on disk consistent with the folder structure in Thunderbird.
Look at it this way: we have GDPR, we teach people how to archive and delete emails, and TB can be configured to move them into yearly subfolders when archiving. The problem is a user who does nothing and just keeps thousands of emails in one big inbox. The admin has to be able to quickly move them into local folders, sort them by year into subfolders, and finally make the user's mailbox smaller so the user is forced to sort and archive in real time or once a week. The admin can put maildir local folders under sync with Google Drive, Synology Drive, Dropbox, etc. (excluding MSF files), and then you have real-time protection of local folders. I use that with success.
So I would only use the date extracted from the email as the filename, because it helps a lot with manual admin work. Renamed EML files reindex in TB just fine. Maybe use the message ID for emails that don't have a proper "Date:" header field, or use "date_messageid". Look at my AHK script linked in my recent post. We rename all files with it, sort them into subfolders by date manually, then delete the msf files and run TB, and the job of sorting thousands of files is done. The admin work is quick and business rules apply. Maybe a bit off-topic, but I wish TB were friendly to admins and to implementing business rules.
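The "date_messageid" naming with a message-ID fallback described in this comment can be sketched in Python. This is not a port of the linked AHK script and makes its own assumptions: the function names are hypothetical, the exact filename pattern (`YYYY-MM-DD_HHMMSS_<msgid>.eml`) is invented for illustration, the date is written in the header's own timezone rather than normalized, and files are expected to end in `.eml`. It defaults to a dry run that only returns the planned renames.

```python
import re
from email.parser import BytesParser
from email.utils import parsedate_to_datetime
from pathlib import Path


def target_name(eml_path: Path) -> str:
    """Build '<date>_<messageid>.eml', or '<messageid>.eml' if Date is unusable."""
    with eml_path.open("rb") as fh:
        headers = BytesParser().parse(fh, headersonly=True)
    raw_id = str(headers.get("Message-ID") or eml_path.stem)
    msgid = re.sub(r"[^A-Za-z0-9._@-]", "_", raw_id.strip().strip("<>"))
    date_header = headers.get("Date")
    if not date_header:
        return f"{msgid}.eml"
    try:
        sent = parsedate_to_datetime(str(date_header))
    except (TypeError, ValueError):  # malformed Date header
        return f"{msgid}.eml"
    # Wall-clock time exactly as stated in the header (no timezone conversion).
    return f"{sent:%Y-%m-%d_%H%M%S}_{msgid}.eml"


def rename_all(folder: Path, dry_run: bool = True) -> list[tuple[str, str]]:
    """Plan (and optionally perform) renames for every .eml file in folder."""
    plan = []
    claimed = {p.name for p in folder.glob("*.eml")}
    for path in sorted(folder.glob("*.eml")):
        new_name = target_name(path)
        if new_name in claimed:  # already named this way, or would collide
            continue
        claimed.add(new_name)
        plan.append((path.name, new_name))
        if not dry_run:
            path.rename(folder / new_name)
    return plan
```

Because the date comes first, a plain alphabetical listing of the folder is also a chronological one, which is what makes the by-hand sorting into yearly subfolders quick.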
Comment 23•4 years ago
(In reply to Ben Campbell from comment #18)
> There's a bigger question here: Is there any benefit to adhering to the maildir spec (all emails as files in a single flat directory), rather than, say, automatically stashing emails into subfolders? For example, "<YYYY>-<MM>/<msgid>.eml" would probably be manageable and rather useful to the user. You could get a more even distribution by, say, using subdirs based on hashing the messageid. But at the expense of making it an arse for the user to find emails in the filesystem (seems like a bad tradeoff).
> Probably a discussion to break out into another bug or on the mailing list.
One issue with the YYYY/MM folders: how would you determine which timezone decides when the month rolls over to the next one, local or UTC?
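The timezone question can be made concrete with a small example. This is only an illustration with an invented timestamp and a hypothetical `month_bucket` helper: a message sent late on the last day of a month in a zone behind UTC lands in a different month folder depending on which zone is used for bucketing.

```python
from datetime import datetime, timedelta, timezone


def month_bucket(sent: datetime, tz: timezone) -> str:
    """Return the YYYY-MM folder name for a timestamp as seen in the given zone."""
    return sent.astimezone(tz).strftime("%Y-%m")


if __name__ == "__main__":
    utc_minus_5 = timezone(timedelta(hours=-5))
    # 23:30 on 31 January in UTC-5 is 04:30 on 1 February in UTC.
    sent = datetime(2020, 1, 31, 23, 30, tzinfo=utc_minus_5)
    print(month_bucket(sent, utc_minus_5))   # 2020-01
    print(month_bucket(sent, timezone.utc))  # 2020-02
```

Bucketing by UTC gives the same folder on every machine, while bucketing by local time matches what the user sees in the message list; any date-based layout would have to pick one.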
Comment 24•4 years ago
All folders of my IMAP accounts are set for offline use, so I always have a backup of all my e-mails. However, with maildir, after compressing folders, thunderbird tends to re-download a lot of those mails, and while everything looks fine in the UI, the on-disk folders contain lots of duplicates. Thunderbird just adds an ever increasing number in front of the ".eml" extension and downloads all the same e-mails again and again. I already noticed this years ago, when maildir was still in beta. Now I set up a new profile and am very disappointed to see it still doing the same shit.
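A report like this one is easier to turn into a reproducible bug if the duplicates on disk can be listed. The following is a minimal sketch, assuming the copies carry the same Message-ID header and end in `.eml`; it groups by Message-ID rather than by file bytes because it is not known here whether re-downloaded copies are byte-identical. The function name is hypothetical, and messages without a Message-ID are ignored.

```python
from collections import defaultdict
from email.parser import BytesParser
from pathlib import Path


def find_duplicates(folder: Path) -> dict[str, list[Path]]:
    """Map each Message-ID that occurs more than once to the files carrying it."""
    groups: dict[str, list[Path]] = defaultdict(list)
    parser = BytesParser()
    for path in sorted(folder.rglob("*.eml")):
        with path.open("rb") as fh:
            headers = parser.parse(fh, headersonly=True)  # skip the body
        message_id = headers.get("Message-ID")
        if message_id:
            groups[str(message_id).strip()].append(path)
    return {mid: paths for mid, paths in groups.items() if len(paths) > 1}
```

The resulting mapping (Message-ID to file list) could be attached to a new bug as evidence of which messages were stored more than once.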
Comment 25•4 years ago
(In reply to Bachsau from comment #24)
> All folders of my IMAP accounts are set for offline use, so I always have a backup of all my e-mails.
No, you have a local cache copy for speed. Should you lose the emails on the server, or should Thunderbird lose the ability to connect to the server, you will see everything deleted. That is not a backup.
> However, with maildir, after compressing folders,
Compact has no function under Maildir lite and should be disabled in IMAP accounts.
> thunderbird tends to re-download a lot of those mails,
That is a response to the reindex that the compact process carries out, just as repairing a folder will see all the message headers downloaded again. However, I have no access to compacting IMAP accounts in Thunderbird 78, only to repairing folders.
> and while everything looks fine in the UI, the on-disk folders contain lots of duplicates.
> Thunderbird just adds an ever increasing number in front of the ".eml" extension and downloads all the same e-mails again and again. I already noticed this years ago, when maildir was still in beta. Now I set up a new profile and am very disappointed to see it still doing the same shit.
Really, maildir lite is still in beta, as it has never been enabled by default. See https://support.mozilla.org/en-US/kb/maildir-thunderbird
I have one account I use with maildir and I find it has its issues, but I do not see multiple email copies being downloaded from Gmail.
Perhaps if you have identified a bug that can be reproduced (you offer no steps) you might consider filing a bug for that issue.
This bug is a meta bug to monitor the outstanding bugs in the implementation of the maildir lite feature, so your comment is most unlikely to make anything happen with regard to the implementation. Filing a bug report for identified bugs is the appropriate approach. If you want to discuss the issue, I would suggest you use the Discourse forum for beta releases: https://discourse.mozilla.org/c/thunderbird/beta/257 While this is considered experimental, I would think the beta forum is the appropriate place for discussion and support, as the feature is unfortunately certainly not release quality.
Comment 26•3 years ago (Assignee)
Breadcrumbs! I just wrote a big screed of mailstore plans over at https://bugzilla.mozilla.org/show_bug.cgi?id=1308335#c9 (which would probably have been more useful here :- )
Comment 28•2 years ago
Possibly already mentioned:
IMAP account with maildir: if you use Shift+Del to bypass Trash, the email is not deleted from the server, so other IMAP clients accessing the account still see it.
You need to exit Thunderbird to force an expunge (as per the expunge settings in Account Settings), or run a full compact on all folders after deleting, to update the server. Not practical.
There is no way to compact a single folder from the context menu: no right-click compact option.
You can customise the toolbar to add a 'Compact' button, then select a folder and click Compact.
Request: auto expunge and sync to update the server when using Shift+Del.
Support Forum : https://support.mozilla.org/en-US/questions/1377121
Comment 29•2 years ago
I see that maildir storage doesn't seem to handle messages with multiple labels in Gmail as expected: it creates separate (but otherwise identical) EML files, one for each Gmail label / IMAP folder.
This is covered in Bugzilla bug #1554529 - Redundant copies of multi-labeled messages stored for GMail (maildir profile)
https://bugzilla.mozilla.org/show_bug.cgi?id=1554529
... but I see that this bug is not mentioned above in the "Depends on" references. Is it fixed in current Thunderbird versions?
Comment 30•2 years ago
Fair enough. Based on Bug 1554529 comment 12, this does appear to be a maildir-only issue.