Open Bug 1308335 Opened 8 years ago Updated 2 years ago

Replace sins of the past with modern day telemetry, was: Unexpected file position change detected - Please report your details here

Categories

(Thunderbird :: General, task)

Tracking

(Not tracked)

People

(Reporter: jorgk-bmo, Unassigned)

References

Details

User Story

- The steps you used to cause the error to be displayed in the Error Console.
- The folder name reported in the message.
- The operating system you are using and whether it is 32bit or 64bit.
- The file system you are using (NTFS, HFS+, ext4, etc.) to store this
  folder and whether it is perhaps stored on a network drive.

Attachments

(1 file)

Dear user, If you are visiting this page because you saw the follwing message in Thunderbird's Error Console "Unexpected file position change detected in folder _______. If you can reliably reproduce this, please report the steps you used to dev-apps-thunderbird@lists.mozilla.org or to bug (this bug) at bugzilla.mozilla.org. Resolving this problem will allow speeding up message downloads." then please add a bug comment with the information described above in https:#user-story. We might also ask you for additional information after you comment, and your feedback will be appreciated. Thank you for your cooperation.
Depends on: 1116055

No one has ever reported anything here, so it's about time to replace this code
https://searchfox.org/comm-central/rev/77a876e480a7d3367efc168652c91f5323feb32f/mailnews/local/src/nsPop3Sink.cpp#639-651
with proper telemetry, if it's still of interest. Any takers?

Check https://hg.mozilla.org/comm-central/rev/72efbbd0811ea56ea2fedf4c78a774c04cd7aa38 for all the console logging code that can be removed now. Check bug 1116055 comment #166 for all the follow-up bugs, particularly bug 1242030, which never went anywhere :-(

Type: defect → task
Flags: needinfo?(ishikawa)
Flags: needinfo?(benc)
Summary: Unexpected file position change detected - Please report your details here → Replace sins of the past with modern day telemetry, was: Unexpected file position change detected - Please report your details here

(In reply to Klaus B. from comment #1)

No one has ever reported anything here, so it's about time to replace this code
https://searchfox.org/comm-central/rev/77a876e480a7d3367efc168652c91f5323feb32f/mailnews/local/src/nsPop3Sink.cpp#639-651
with proper telemetry, if it's still of interest. Any takers?

First part:
Sorry, I don't know anything about telemetry. So if someone can change that code to telemetry and collect the information from the userbase, that would be great.

Check https://hg.mozilla.org/comm-central/rev/72efbbd0811ea56ea2fedf4c78a774c04cd7aa38 for all the console logging code that can be removed now. Check bug 1116055 comment #166 for all the follow-up bugs, particularly bug 1242030, which never went anywhere :-(

Second.
I have not forgotten about the patch set and the console code that is used there.

Now, I have been trying to check my patch set to make sure no serious issues are introduced by submitting
tryserver test jobs. I believe they are not.
However, I have not been able to finish bug 1242030 patch to completion yet.

Why? There are three reasons.

  1. local testing often reveals issues that are perplexing or outright problematic (read security-related) and
    I have taken up such issues sporadically so that my testing continues.

The change from |make mozmill| to mochitest framework was a big change that took me almost four months before any reasonable testing could continue. (I test more or less locally first.)

  1. There are smallish issues that pop up sporadically and I need to get it fixed first to continue testing.
    Most problematic is the compilation/linking issue which do happen from time to time and it takes almost a week or two for this
    occasional patch submitter to fix such toolchain problems. (I compile TB under linux using GCC mostly)

There are other issues that make the mochitest result not reliable because there are fatal errors that may mask
any errors/warnings coming from my patch set. So I want to get rid of them first.

The following is the local mercurial patch queue. (Yeah, I am still using MQUEUE locally, which I think is outdated, but for an occasional patch submitter, this works...)
It is long.

 0 A avoid-gcc-snprintf-check-bug.patch: avoid gcc snprintf format string check bug.
 1 A 1596036-GetOfflineFileStream-return-value-check-REV02.patch: Bug 1596036: Check GetOfflineFileStream() return value a...
 2 A 1677202-rev02.patch: Bug 1677202 - avoid accessing released heap, coping with timing issue.
 3 A contentChatNode.patch: bug NNNNNN - check if contentChatNode is undefined/null
 4 A container-undefined-check-rev01.patch: check container is defined
 5 A compactor-uninitialized-check-2.patch: Bug 1611708 statusOffset is uninitialized in one execution path.
 6 A selection-count-debug.patch: Bug 1158471 - ASSERTION: selection count is wrong.  imap unread message count dump when ...
 7 A CopyToNative-return-value-check.patch: CopyToNative return value check (one indirect)
 8 A MsgComposeCommands-debug.patch: Adding proper emacs modeline that fits the existing indentation to MsgComposeCommands.js
 9 A NEW-ldap-sign-compare.patch: Bug 1243121 - C-C ldap directory: fix -Werror=sign-compare: signed vs unsigned warnings ...
10 A ldap-format-issues.patch: Bug 1243121 (b) patch for printf format related issues in ldap source tree.
11 A new-1242030-DONT-USE-REUSABLE-new-part-1-rev04-with-the-original-uuid.patch: bug 1242030: DONT-USE-REUSABLE new part-...
12 A new-1242030-DONT-USE-REUSABLE-new-part-2-imap-JK-v1-rev04.patch: bug 1242030: DONT-USE-REUSABLE Improve error handlin...
13 A new-1242030-DONT-USE-REUSABLE-new-part-3-import-JK-v1-rev04.patch: bug 1242030: DONT-USE-REUSABLE Improve error handl...
14 A new-1242030-DONT-USE-REUSABLE-new-part-4-local-less-pop3-rev04.patch: bug 1242030: DONT-USE-REUSABLE new part-4 new s...
15 A new-1242030-DONT-USE-REUSABLE-new-part-5-pop3-rev04.patch: bug 1242030: DONT-USE-REUSABLE new part-5 new splitting of...
16 A enable-buffering-1st-step-rev03.patch: Bug 1242042: Enabling buffering for file stream to write message for C-C TB (E...
17 A removing-a-Seek-rev03.patch: Bug 1242046 - Removing unnecessary |Seek| that caused the C-C TB to operate slowly in te...
18 A verbose-test_logger.patch: Verbose test_logger failure case.
19 A 1595343-nsMsgAttachment-dump.patch: 1595343 print mUrl in DeleteAttachment.
20 A xpcshell-longer-timeout.patch: longer time out for xpcshell tests to run under valgrind.
21 A still-new-TRACE-STREAM-MACRO.patch: TRACE-STREAM-MACRO patch
22 A add-close-inputStream.patch: Adding Close before Remove (possible Windows problem)
23 A new-trace-entering-message-copy-functions-EARLY-SEQ.patch: Trace enter/leave of message copy functions
24 A 1285434-PR_FREEIF-mailboxName.patch: Bug 1285434 - frees mailboxName before return
25 A bad-format-print.patch: local debug - print message line when bad format is encountered.
26 A initalize-oldValue.patch: initialize oldValue patch.
27 A nsMsgSearchAdapter-check-the-return-value-of-GetAvailable.patch: Check the return value of GetAvailable as in other u...
28 A NEW-add-short-read-handling-base.patch: Bug 1170606 part-0: C-C T-B fixes to take care of short read: add short read ...
29 A NEW-add-short-read-handling-IMPORT-directory.patch: Bug 1170606 part-1: add short read handling under mailnews/import...
30 A NEW-add-short-read-handling-BASE-directory-part-A.patch: Bug 1170606 part-2: add short read handling to files under m...
31 A NEW-add-short-read-handling-BASE-directory-part-B.patch: Bug 1170606 part-2: add short read handling to files under m...
32 A NEW-add-short-read-handling-COMPOSE-directory-part-1.patch: Bug 1170606 part-3: add short read handling to some files...
33 A NEW-add-short-read-handling-IMAP-directory.patch: Bug 1170606 part-4: add short read handling to files under mailnews...
34 A add-short-read-handling-LOCAL-directory-REV03.patch: Bug 1170606 part-5: add short read handling to files under mailn...
35 A add-short-read-handling-MIME-directory.patch: Bug 1170606 part-6: add short read handling to files under mailnews/mim...
36 A add-short-read-handling-OTHER-directory.patch: Bug 1170606 part-7: add short read handling to files not covered by pr...
37 A check-InitWithFile-and-friends.patch: Bug 1307085-a - C-C TB: Tigher checking of InitWithFile and friends. A few chec...
38 A check_InitWithFile-import-files.patch: Bug 1307085-b - check the return value of InitWithFile() in files under 'impor...
39 A test_pop3FilterActions.patch: test_pop3FilterActions.js: dump preview1, preview2
40 A warn_updateFlag_shortcut_taken.patch: warning of short cut path taken during updateFolderFlag
41 A chasing-strange-win-error.patch: nsMessengerWinIntegration.cpp: Trying to trace strange windows error.
42 A cant-be-your-own-parent-dump.patch: Print a nsMsgKey when an assertion,  NS_ASSERTION(false, "can't be your own paren...
43 A GetSummaryValid-returns-without-setting-valid.patch: Bug 1246048 - Initialize a variable passed to GetSummaryValid()....
44 A trace-exists-etc.patch: Better tracing of errors of  Exists(), IsSummaryFileValid(), etc.
45 A AddrBookCard.patch: AddrBookCard insert console.trace()
46 A setbool-temp.patch: setbool temp patch.
47 U Ical-dump.patch: Ical.jsm dump

patch 0 is (GCC compilation issue, actually GCC bug), 1-8 are issues that I wanted to clean up to make the mochitest/xpcshell log cleaner for verifying that the patch in bug 1242030 is OK.
9-10 are for LDAP code.
They are the ones I want to clear up before the patch in bug 1242030, patch 11-19, can be finished.

However, I HAVE realized, of course, with the new bugs and old bugs and the scarcity of developer man-power for TB maintenance,
I am not going anywhere if I want to get rid of all the bugs.

So I decided to focus on the following issue, namely LDAP patches.
The last one, 9-10 is a big patch set for ldap code to compile cleanly to avoid signed vs. unsigned issue, which is an error
in my local compiler setting.: I have turned the warning of mixed signed and unsigned computation to an error to detect subtle issues.

  • My current plan for the last six months or so.

Right now, aside from those sporadic bugs (some actually very persistent, though), the only bug that nags me about my patch in bug 1242030.
is the subtle LDAP bug.

This has been the state of affairs for like six months.

So my idea was to resolve the patch set for LDAP code in bug 1243121 first.
Then if there IS a subtle issue in LDAP code, others would have noticed it and we can get the bug fixed there.

The codes touched by bug 1242030 and bug 1243121 are very independent, I think.
They should be testable independently.
But in my setting GCC WITH THE STRICT CHECKING cannot compile LDAP code without the patch . (And, for that matter CLANG would barf if strict checking of signed and unsigned is enabled.)
So LDAP code needs be patched first.
And so, without changing my patched source tree structure drastically, I can only test my patch in bug 1242030 only together with
the patch bug 1243121.

That was what I have been thinking about, but then again, I was ovewhelmed by warnings/bugs
that are probably caused by packaging mistakes in the latest mochitest/xpcshell tests and there have been so many of them, I was really distressed to report them in the first place.

That was when I noticed your post.

I wish people check the debug log of mochitest and xpchsell more carefully so that any
new warnings and errors are noticed quickly.
As some mentioned the debug log is a dumping ground for every thing as of now and not many people seem to take ta look at it.
I wonder who is tracking the number of runtime warnings and errors in total.

Splitting the log for no apparent reason on tryserver was not helpful at all IMHO. Maybe to reduce the log size and to narrow down the area of log files when a particular bug is analyzed?
But to check the status of overall health, having to download and check the log files separately is another psychological hurdle.
But I digress.

Since my patch in bug 1242030 is related to I/O operation (mostly read), I need to check if any bugs/warnings remotely related to I/O are caused by my patches. (So far, it seems any fatal errors are non-existent except for the mysterious failure of an LDAP-related test, which is probably caused by bug 1243121. But as I explained above, I need to clear the LDAP patch first so that I can test the patch in 1242030 independently if my local GCC test setting [which turned out to be very useful] is preserved.)

Oh well, one thing is clear.
The number of warnings/errors in debug log has steadily increased (not diminished unfortunately) since early 2020.
Even though I have been using a hand-crafted shell/awk script to look for pertinent warnings/bugs by wading through the myriad messages in log files for quite some time (7-8 years now?), the added number of messages from mochitest since early 2020, and the
increase of messages in the last couple of months has really make the assessment of a large patch set as in bug 1242030 very difficult.
Some may naively think that checking a failure/non-failure of test is good enough.
Unfortunately, NO. The test coverage is not complete and test framework failed to notice underlying failures sometimes and not report them.
Sigh.

If there is a silver bullet to speed up the patch testing and acceptance, I would love to hear it. Especially the testing part.

Flags: needinfo?(ishikawa)
Here is the list of sample from my shell script of analyzing the local log. ``` ```

Here is the list of 'errors' NOT warnings in the log files.
Some of them are legitimate errors reported, but mochitest and the mozilla test framework has not produced any whitelisting mechanism to
ignore such errors (IMHO, that is a grave sin.). With such mechanism, it would have been relatively easy to list up pertinent new bugs in one batch.

 NS_ERROR_ (except for NS_ERROR_FACTORY_NOT_REGISTERED, createKeyedServer duplicate server error, addons.xpi-utils)
 ========================================

First with NS_ENSURE_SUCCESS

  359: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070057 (NS_ERROR_ILLEGAL_VALUE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/netwerk/base/nsIOService.cpp:1730
  359: Thread] WARNING: NS_ENSURE_SUCCESS(rv, nullptr) failed with result 0x804B000A (NS_ERROR_MALFORMED_URI): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/extensions/permissions/Permission.cpp:46
  281: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/security/nsCSPService.cpp:191
  281: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/chrome/nsChromeRegistry.cpp:180
  100: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/base/nsContentUtils.cpp:3896
   63: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsMailboxService.cpp:627
   63: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsMailboxService.cpp:610
   58: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/intl/nsCharsetConverterManager.cpp:111
   39: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x8000FFFF (NS_ERROR_UNEXPECTED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgDBFolder.cpp:2959
   35: Thread] WARNING: NS_ENSURE_SUCCESS_VOID(rv) failed with result 0x80570021 (NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/system/IOUtils.cpp:695
   30: Thread] WARNING: NS_ENSURE_SUCCESS(rv, -1) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/mime/src/mimemoz2.cpp:769
   28: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805303F4 (NS_ERROR_DOM_BAD_URI): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgProtocol.cpp:542
   24: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805303F4 (NS_ERROR_DOM_BAD_URI): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/security/nsContentSecurityManager.cpp:1222
   24: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805303F4 (NS_ERROR_DOM_BAD_URI): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/security/nsContentSecurityManager.cpp:1079
   23: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805E0006 (NS_ERROR_CONTENT_BLOCKED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/security/nsContentSecurityManager.cpp:1083
   21: Thread] WARNING: NS_ENSURE_SUCCESS(rv, NS_ERROR_NOT_INITIALIZED) failed with result 0x8000FFFF (NS_ERROR_UNEXPECTED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgAccountManager.cpp:470
   19: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80780003 (NS_ERROR_BLOCKED_BY_POLICY): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/security/nsContentSecurityManager.cpp:1083
    9: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x8000FFFF (NS_ERROR_UNEXPECTED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgAccountManager.cpp:2015
    8: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805303F4 (NS_ERROR_DOM_BAD_URI): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/netwerk/protocol/http/nsCORSListenerProxy.cpp:882
    8: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805303F4 (NS_ERROR_DOM_BAD_URI): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/security/nsContentSecurityManager.cpp:388
    8: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805303F4 (NS_ERROR_DOM_BAD_URI): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/security/nsContentSecurityManager.cpp:1075
    8: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x8000FFFF (NS_ERROR_UNEXPECTED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgAccount.cpp:83
    8: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgAccountManager.cpp:2050
    7: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0xC1F30001 (NS_ERROR_NOT_INITIALIZED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgAccount.cpp:92
    6: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/intl/nsCharsetConverterManager.cpp:145
    6: Thread] WARNING: NS_ENSURE_SUCCESS(rv, nullptr) failed with result 0x8000FFFF (NS_ERROR_UNEXPECTED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgProtocol.cpp:1482
    5: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/netwerk/base/nsNetUtil.cpp:2279
    5: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/caps/ContentPrincipal.cpp:448
    5: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/caps/BasePrincipal.cpp:111
    5: Thread] WARNING: NS_ENSURE_SUCCESS(rv, nsMsgViewIndex_None) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgDBView.cpp:4499
    3: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/db/msgdb/src/nsMsgDatabase.cpp:135
    3: Thread] WARNING: NS_ENSURE_SUCCESS(mStatus, mStatus) failed with result 0x80070057 (NS_ERROR_ILLEGAL_VALUE): file /NEW-SSD/ASAN-OBJ-DIR/objdir-tb3/dist/include/nsIURIMutator.h:629
    2: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgI18N.cpp:90
    2: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x8000FFFF (NS_ERROR_UNEXPECTED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgProtocol.cpp:329
    2: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsMailboxService.cpp:630
    2: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgDBFolder.cpp:4702
    2: Thread] WARNING: NS_ENSURE_SUCCESS(rv, NS_ERROR_NOT_INITIALIZED) failed with result 0x8000FFFF (NS_ERROR_UNEXPECTED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/base/src/nsMsgAccountManager.cpp:456
    1: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x804E03EF (NS_ERROR_HTMLPARSER_INTERRUPTED): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/xml/nsXMLContentSink.cpp:1323
    1: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x804B000A (NS_ERROR_MALFORMED_URI): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/netwerk/url-classifier/UrlClassifierCommon.cpp:405
    1: Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/compose/src/nsMsgCopy.cpp:375

Sorry for posting the list as comment. I have no idea why but there was an error from S3 for trying to list that as an attachment.
Errors that do not have NS_ENSURE_SUCCESS.

without NS_ENSURE_SUCCESS

   64: Console message: QuotaManager failure: 'CollectEachFile( *directory, [originScope = [this] { OriginScope originScope = mOriginScope.Clone(); if (originScope.IsOrigin()) { originScope.SetOrigin( MakeSanitizedOriginCString(originScope.GetOrigin())); } else if (originScope.IsPrefix()) { originScope.SetOriginNoSuffix(MakeSanitizedOriginCString( originScope.GetOriginNoSuffix())); } return originScope; }(), aPersistenceType, &aQuotaManager, &directoriesForRemovalRetry, this](nsCOMPtr<nsIFile>&& file) -> mozilla::Result<Ok, nsresult> { auto tryResult438 = (::mozilla::ToResultInvoke<nsAutoString>( ::std::mem_fn( &::mozilla::detail::DerefedType<decltype(file)>::GetLeafName), (file))); if ((__builtin_expect(!!(tryResult438.isErr()), 0))) { mozilla::dom::quota::HandleError("::mozilla::ToResultInvoke<nsAutoString>( ::std::mem_fn( &::mozilla::detail::DerefedType<decltype(file)>::GetLeafName), (file))", tryResult438.inspectErr(), "/NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/quota/ActorsParent.cpp", 8888); return tryResult438.propagateErr(); } const auto& leafName = tryResult438.inspect();; auto tryResult439 = (GetDirEntryKind(*file)); if ((__builtin_expect(!!(tryResult439.isErr()), 0))) { mozilla::dom::quota::HandleError("GetDirEntryKind(*file)", tryResult439.inspectErr(), "/NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/quota/ActorsParent.cpp", 8890); return tryResult439.propagateErr(); } const auto& dirEntryKind = tryResult439.inspect();; switch (dirEntryKind) { case nsIFileKind::ExistsAsDirectory: { if (!originScope.Matches(OriginScope::FromOrigin( NS_ConvertUTF16toUTF8(leafName)))) { break; } const bool persistent = aPersistenceType == PERSISTENCE_TYPE_PERSISTENT; auto tryResult440 = (aQuotaManager .GetDirectoryMetadataWithOriginMetadata2WithRestore( file, persistent)); if ((__builtin_expect(!!(tryResult440.isErr()), 0))) { mozilla::dom::quota::HandleError("aQuotaManager .GetDirectoryMetadataWithOriginMetadata2WithRestore( file, persistent)", tryResult440.inspectErr(), "/NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/quota/ActorsParent.cpp", 8907); return tryResult440.propagateErr(); } const auto& metadata = tryResult440.inspect();; if (!mClientType.IsNull()) { nsAutoString clientDirectoryName; { auto tryResult441 = ::mozilla::ToResult(OkIf(Client::TypeToText(mClientType.Value(), clientDirectoryName, fallible))); static_assert(std::is_empty_v<typename decltype(tryResult441)::ok_type>); if ((__builtin_expect(!!(tryResult441.isErr()), 0))) { auto tryTempError __attribute__((__unused__)) = tryResult441.unwrapErr(); mozilla::dom::quota::HandleError("OkIf(Client::TypeToText(mClientType.Value(), clientDirectoryName, fallible))", tryTempError, "/NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/quota/ActorsParent.cpp", 8914); return Err(NS_ERROR_FAILURE); } }; { auto tryResult442 = ::mozilla::ToResult(file->Append(clientDirectoryName)); static_assert(std::is_empty_v<typename decltype(tryResult442)::ok_type>); if ((__builtin_expect(!!(tryResult442.isErr()), 0))) { mozilla::dom::quota::HandleError("file->Append(clientDirectoryName)", tryResult442.inspectErr(), "/NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/quota/ActorsParent.cpp", 8916); return tryResult442.propagateErr(); } }; auto tryResult443 = (::mozilla::ToResultInvoke( (file), &::mozilla::detail::DerefedType<decltype(file)>::Exists)); if ((__builtin_expect(!!(tryResult443.isErr()), 0))) { mozilla::dom::quota::HandleError("::mozilla::ToResultInvoke( (file), &::mozilla::detail::DerefedType<decltype(file)>::Exists)", tryResult443.inspectErr(), "/NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/quota/ActorsParent.cpp", 8919); return tryResult443.propagateErr(); } const bool& exists = tryResult443.inspect();; if (!exists) { break; } } if (((bool)(__builtin_expect(!!(NS_FAILED_impl((file->Remove(true)))), 0)))) { NS_DebugBreak(NS_DEBUG_WARNING, "Failed to remove directory, retrying later.", nullptr, "/NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/quota/ActorsParent.cpp", 8929); directoriesForRemovalRetry.AppendElement(std::move(file)); } const bool initialized = aPersistenceType == PERSISTENCE_TYPE_PERSISTENT ? aQuotaManager.IsOriginInitialized( metadata.mOriginMetadata.mOrigin) : aQuotaManager.IsTemporaryStorageInitialized(); if (!initialized) { break; } if (aPersistenceType != PERSISTENCE_TYPE_PERSISTENT) { if (mClientType.IsNull()) { aQuotaManager.RemoveQuotaForOrigin( aPersistenceType, metadata.mOriginMetadata); } else { aQuotaManager.ResetUsageForClient( aPersistenceType, metadata.mOriginMetadata, mClientType.Value()); } } aQuotaManager.OriginClearCompleted( aPersistenceType, metadata.mOriginMetadata.mOrigin, mClientType); break; } case nsIFileKind::ExistsAsFile: if (!IsOSMetadata(leafName)) { NS_DebugBreak(NS_DEBUG_WARNING, nsPrintfCString("Something (%s) in the directory that doesn't belong!", NS_ConvertUTF16toUTF8(leafName).get()) .get(), nullptr, "/NEW-SSD/NREF-COMM-CENTRAL/mozilla/dom/quota/ActorsParent.cpp", 8968); } break; case nsIFileKind::DoesNotExist: break; } return Ok{}; })', file ActorsParent.cpp, line 8980 failed with result 0x80520006 (NS_ERROR_FILE_TARGET_DOES_NOT_EXIST)
   64: Console message: QuotaManager failure: '::mozilla::ToResultInvoke<nsCOMPtr<nsIDirectoryEnumerator>>( ::std::mem_fn( &::mozilla::detail::DerefedType<decltype(aDirectory)>::GetDirectoryEntries), (aDirectory))', file QuotaCommon.h, line 1177 failed with result 0x80520006 (NS_ERROR_FILE_TARGET_DOES_NOT_EXIST)
   43: console.debug: "exception in getMsgHdr: [Exception... \"Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIMsgMessageService.messageURIToMsgHdr]\"  nsresult: \"0x80004005 (NS_ERROR_FAILURE)\"	location: \"JS frame :: chrome://openpgp/content/ui/enigmailMsgComposeOverlay.js :: getMsgHdr :: line 415\"  data: no]"
   28: 306993, QuotaManager IO] WARNING: QuotaManager failure: '::mozilla::ToResultInvoke<nsCOMPtr<nsIDirectoryEnumerator>>( ::std::mem_fn( &::mozilla::detail::DerefedType<decltype(aDirectory)>::GetDirectoryEntries), (aDirectory)) failed with result 0x80520006 (NS_ERROR_FILE_TARGET_DOES_NOT_EXIST)', file /NEW-SSD/ASAN-OBJ-DIR/objdir-tb3/dist/include/mozilla/dom/quota/QuotaCommon.h:1177
   28: 305490, QuotaManager IO] WARNING: QuotaManager failure: '::mozilla::ToResultInvoke<nsCOMPtr<nsIDirectoryEnumerator>>( ::std::mem_fn( &::mozilla::detail::DerefedType<decltype(aDirectory)>::GetDirectoryEntries), (aDirectory)) failed with result 0x80520006 (NS_ERROR_FILE_TARGET_DOES_NOT_EXIST)', file /NEW-SSD/ASAN-OBJ-DIR/objdir-tb3/dist/include/mozilla/dom/quota/QuotaCommon.h:1177
   16: + name (string) 'NS_ERROR_ILLEGAL_VALUE'
   16: + message (string) 'Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIObserverService.removeObserver]'
   16: "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIObserverService.removeObserver]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/MailInstrumentation.jsm :: uninit :: line 243"  data: no]
   14: Console message: [JavaScript Error: "[Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIObserverService.removeObserver]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/MailInstrumentation.jsm :: uninit :: line 243"  data: no]"]
   13: console.info: xpcshell: (void 0) "Error console says" ({type:"stackFrame", name:"[Exception... \"Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIObserverService.removeObserver]\"  nsresult: \"0x80070057 (NS_ERROR_ILLEGAL_VALUE)\"  location: \"JS frame :: resource:///modules/MailInstrumentation.jsm :: uninit :: line 243\"  data: no]", category:"XPConnect JavaScript", fileName:"", lineNumber:0, _jsonMe:true})
    8: 306221, QuotaManager IO] WARNING: QuotaManager failure: '::mozilla::ToResultInvoke<nsCOMPtr<nsIDirectoryEnumerator>>( ::std::mem_fn( &::mozilla::detail::DerefedType<decltype(aDirectory)>::GetDirectoryEntries), (aDirectory)) failed with result 0x80520006 (NS_ERROR_FILE_TARGET_DOES_NOT_EXIST)', file /NEW-SSD/ASAN-OBJ-DIR/objdir-tb3/dist/include/mozilla/dom/quota/QuotaCommon.h:1177
    6: JavaScript error: resource:///modules/activity/alertHook.jsm, line 49: NS_ERROR_UNEXPECTED: Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIMsgMailNewsUrl.server]
    6: Console message: [JavaScript Error: "NS_ERROR_UNEXPECTED: Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIMsgMailNewsUrl.server]" {file: "resource:///modules/activity/alertHook.jsm" line: 49}]
    4: JavaScript error: resource:///modules/calendar/calUtils.jsm, line 123: NS_ERROR_UNEXPECTED: Assert failed: [CalCalendarManager::registerCalendar] calendar already registered!
    4: Console message: [JavaScript Error: "NS_ERROR_UNEXPECTED: Assert failed: [CalCalendarManager::registerCalendar] calendar already registered!
    2: JavaScript error: resource://gre/modules/LoginManagerChild.jsm, line 190: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIAutoCompleteInput.popup]
    2: JavaScript error: chrome://messenger/content/folderDisplay.js, line 1958: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIMsgDBView.doCommand]
    2: Console message: generator testmode failed with exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIStringBundle.GetStringFromName]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"	 location: "JS frame :: chrome://global/content/elements/stringbundle.js :: getString :: line 44"  data: no]
    2: Console message: [JavaScript Error: "[Exception... "Connection failure"	nsresult: "0x804b001e (NS_ERROR_UNKNOWN_HOST)"	location: "JS frame :: resource:///modules/CardDAVDirectory.jsm :: onStreamComplete :: line 928"  data: no]"]
    2: Console message: [JavaScript Error: "[Exception... "Authorization failure"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource:///modules/CardDAVDirectory.jsm :: onStreamComplete :: line 934"	data: no]"]
    2: Console message: [JavaScript Error: "[Exception... "Address book discovery failed"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://messenger/content/addressbook/abCardDAVDialog.js :: check :: line 213"  data: no]"]
    2: Console message: [JavaScript Error: "NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIMsgDBView.doCommand]" {file: "chrome://messenger/content/folderDisplay.js" line: 1958}]
    2: Console message: [JavaScript Error: "NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIAutoCompleteInput.popup]" {file: "resource://gre/modules/LoginManagerChild.jsm" line: 190}]
    2: Console message: [JavaScript Error: "An error occurred executing the cmd_selectAll command: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIController.doCommand]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"	location: "JS frame :: chrome://global/content/globalOverlay.js :: goDoCommand :: line 101"  data: no]" {file: "chrome://global/content/globalOverlay.js" line: 104}]
    2: Console message: Discovering children for imap://alice@openpgp.example failed with exception: [Exception... "Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIMsgFolder.subFolders]"  nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)"	location: "JS frame :: chrome://messenger/content/folderPane.js :: get children :: line 3200"  data: no]
    1: JavaScript error: resource://gre/modules/LoginHelper.jsm, line 1559: NS_ERROR_FAILURE: This login already exists.
    1: JavaScript error: resource:///modules/OTRUI.jsm, line 236: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [imIAccountsService.getAccounts]
    1: JavaScript error: resource:///modules/DBViewWrapper.jsm, line 1113: NS_ERROR_ILLEGAL_VALUE: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBView.open]
    1: JavaScript error: resource:///modules/AddrBookCard.jsm, line 182: NS_ERROR_NOT_AVAILABLE: PreferDisplayName: undefined - not a boolean
    1: JavaScript error: chrome://mochitests/content/browser/comm/mail/test/browser/notification/browser_notification.js, line 60: NS_ERROR_FAILURE:
    1: JavaScript error: chrome://messenger/content/msgHdrView.js, line 796: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIMessenger.msgHdrFromURI]
    1: JavaScript error: chrome://calendar/content/calendar-alarm-dialog.js, line 422: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: [JavaScript Error: "can't access property "getProperty", cal.getCalendarManager().getCalendarById(...) is null" {file: "resource:///modules/CalAlarmService.jsm" line: 604}]'[JavaScript Error: "can't access property "getProperty", cal.getCalendarManager().getCalendarById(...) is null" {file: "resource:///modules/CalAlarmService.jsm" line: 604}]' when calling method: [calIAlarmService::isLoading]
    1: Console message: [JavaScript Error: "[Exception... "[JavaScript Error: "can't access property "getProperty", cal.getCalendarManager().getCalendarById(...) is null" {file: "resource:///modules/CalAlarmService.jsm" line: 604}]'[JavaScript Error: "can't access property "getProperty", cal.getCalendarManager().getCalendarById(...) is null" {file: "resource:///modules/CalAlarmService.jsm" line: 604}]' when calling method: [calIAlarmService::isLoading]"  nsresult: "0x80570021 (NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS)"	location: "JS frame :: chrome://calendar/content/calendar-alarm-dialog.js :: closeIfEmpty :: line 422"	data: yes]"]
    1: Console message: [JavaScript Error: "[Exception... "Component returned failure code: 0x804b0050 (NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS) [nsIEffectiveTLDService.getBaseDomain]"  nsresult: "0x804b0050 (NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS)"  location: "JS frame :: resource:///actors/LinkHandlerChild.jsm :: handleEvent :: line 106"	data: no]"]
    1: Console message: [JavaScript Error: "NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: [JavaScript Error: "can't access property "getProperty", cal.getCalendarManager().getCalendarById(...) is null" {file: "resource:///modules/CalAlarmService.jsm" line: 604}]'[JavaScript Error: "can't access property "getProperty", cal.getCalendarManager().getCalendarById(...) is null" {file: "resource:///modules/CalAlarmService.jsm" line: 604}]' when calling method: [calIAlarmService::isLoading]" {file: "chrome://calendar/content/calendar-alarm-dialog.js" line: 422}]
    1: Console message: [JavaScript Error: "NS_ERROR_NOT_AVAILABLE: PreferDisplayName: undefined - not a boolean" {file: "resource:///modules/AddrBookCard.jsm" line: 182}]
    1: Console message: [JavaScript Error: "NS_ERROR_ILLEGAL_VALUE: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBView.open]" {file: "resource:///modules/DBViewWrapper.jsm" line: 1113}]
    1: Console message: [JavaScript Error: "NS_ERROR_FAILURE: This login already exists." {file: "resource://gre/modules/LoginHelper.jsm" line: 1559}]
    1: Console message: [JavaScript Error: "NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIMessenger.msgHdrFromURI]" {file: "chrome://messenger/content/msgHdrView.js" line: 796}]
    1: Console message: [JavaScript Error: "NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [imIAccountsService.getAccounts]" {file: "resource:///modules/OTRUI.jsm" line: 236}]
    1: Console message: [JavaScript Error: "NS_ERROR_FAILURE: " {file: "chrome://mochitests/content/browser/comm/mail/test/browser/notification/browser_notification.js" line: 60}]
    1: Console message: [JavaScript Error: "An error occurred executing the cmd_shiftDelete command: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIMsgDBView.doCommand]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"	 location: "JS frame :: chrome://messenger/content/folderDisplay.js :: doCommand :: line 1958"	data: no]" {file: "chrome://global/content/globalOverlay.js" line: 104}]
    1: Console message: [JavaScript Error: "An error occurred executing the cmd_delete command: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIMsgDBView.doCommand]"	 nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://messenger/content/folderDisplay.js :: doCommand :: line 1958"  data: no]" {file: "chrome://global/content/globalOverlay.js" line: 104}]
    1:	  Bug 1693300 - Replace the NS_OBJC_BEGIN/END_TRY_ABORT_BLOCK_NSRESULT macros with the functionally identical NS_OBJC_BEGIN/END_TRY_BLOCK_RETURN(NS_ERROR_FAILURE). rs=bustage-fix

And please bear in mind that these are only errors.

As I explained in my previous comment, the warnings are often times are clue to serious problems, problems not reported as fatal errors of failed tests at all.

I really wish mozilla spent three intensive months of fixing these issues without introducing new features. But that is a pipe dream, I suppose.

Again, sorry for posting the list as comment. I have no idea why but there was an error from S3 for trying to list that as an attachment.

Anyway, testing the log files for any errors that may be introduced with my patch is not easy with the current log and testing framework. OT ONLY THE FAILED TEST, mind you, that is the easiest part, and I noticed this strange ldap test error)

Some of you may wonder why simply compare the log before an after my patch.
Oh well, easily said than done.
With all the changing timestamps and PIDs, etc., and timing issues,
comparing logs from different runs are almost impossible.
My crude approach of comparing the number of warnings/errors after classifying them (and even whitelisting some messages by separting them in a different section of the analysis output) was best I came up with.

I would look up the top couple of dozens offenders AND also look at the tail end of the list in descending order of appearances.
This is how I have verified my large patch set so far after I check the test failures.

If someone has a better idea or working tool to speed up the testing of large patch set at the individual error/warning level, I would love to know.

Of course, the other way is to simply declare that "There is no failed error in mochitest/xpcshell tests. Let's land the patch and see if someone reports problems in comm-central nightly build."
Basically, ignoring any added warning, etc. but use the eyeballs of beta testers.
I believe the patch in bug 1242030 in one ldap-related test failure away from that status to make that approach tenable.
That is why I want to re-submit the ldap patch sets step by step with the explanation of how the patch set evolved to avoid sign vs unsigned warning/error.

Ben, I'm sorry you're being dragged into this ongoing saga, but you're the backend guy now: Just a TL;DR summary:
Bug 1242030 and friends is basically about switching to buffered write (NS_NewBufferedOutputStream), see patch in bug 1242042. This bug here relates to bug 1116055 where the code referenced in comment #1 was introduced. As expected, no results were ever captured. As you can see, there are some side issues, like the seeking, and removing the aReusable from getMsgInputStream() and getNewMsgOutputStream(), see:
https://searchfox.org/comm-central/source/mailnews/base/public/nsIMsgPluggableStore.idl#129
The background is that for writing mbox files, the stream/file is kept open based on the assumption that this is more performant that closing the stream/file after each access and opening/seeking again. Chiaki did some performance measurements.

As you can see, a lot of work has gone into this area, so perhaps with a fresh set of eyes we can finally move this forward.

(In reply to Klaus B. from comment #8)

Ben, I'm sorry you're being dragged into this ongoing saga, but you're the backend guy now: Just a TL;DR summary:
Bug 1242030 and friends is basically about switching to buffered write (NS_NewBufferedOutputStream), see patch in bug 1242042. This bug here relates to bug 1116055 where the code referenced in comment #1 was introduced. As expected, no results were ever captured. As you can see, there are some side issues, like the seeking, and removing the aReusable from getMsgInputStream() and getNewMsgOutputStream(), see:
https://searchfox.org/comm-central/source/mailnews/base/public/nsIMsgPluggableStore.idl#129
The background is that for writing mbox files, the stream/file is kept open based on the assumption that this is more performant that closing the stream/file after each access and opening/seeking again. Chiaki did some performance measurements.

As you can see, a lot of work has gone into this area, so perhaps with a fresh set of eyes we can finally move this forward.

I've spent the last few weeks digging down into the mailstore code (ie maildir/mbox) and message parsing to try and figure out what it's doing and what it should be doing. I'm starting to get a handle on it now.

My current plans, as they relate to this issue:

  1. the streams returned by getNewMsgOutputStream() shouldn't be treated as seekable (Bug 1719996). Callers should never have to worry about seeking to the end, and should just be able to start writing to it immediately.
  2. the mbox version of getNewMsgOutputStream() should fail if any write operation is already outstanding. I don't think this condition requires telemetry just yet.
  3. Yes, aReusable should be removed. The mailStore should handle any caching of open streams behind the scenes, and not involve callers.
  4. buffered writing - yes! definitely. I was particularly inspired by the Ishikawa method in Bug 1116055 and will take a cue and use strace to check up on the changes I make here!

At the moment, I think all the getNewMsgOutputStream() callers are synchronous, so there should only ever be one message being written at a time... but the fact this position-checking code is in there kind of hints that this might not always be the case (or at least wasn't at some point in the past). There's no real recovery for when we detect someone else has moved the file position underneath us... there's going to be screwed-up data in any case. For now, point 2 above addresses this. In future, the mbox mailstore should queue up write requests by folder, and serialise them properly. But this probably also implies moving all the message reading/writing over to use asynchronous streams.
For now, restricting to a single active write per folder seems fine to me.

More mailstore stuff, while I'm thinking about it (feel free to stop reading here!):

  • getMsgInputStream() should return a stream which EOFs at the end of the message. Currently all the parsing code assumes it's looking at an mbox file. This means the mbox store inputstream needs to parse the message as it goes along, in order to determine the end of the message. But it needs to parse anyway in order to properly apply "From " unescaping to the message body (Bug 1719121).
  • conversely the mbox version of getNewMsgOutputStream() should return a stream which automagically applies "From " escaping to the message body.
  • I think the FinishNewMessage() and DiscardNewMessage() calls currently on nsIMsgPlugableMailStore should be moved out onto the stream object returned by getNewMsgOutputStream(). These streams already have some extra state to track (eg tmp filename for maildir, start position for mbox etc). Currently that state is either stored on the new nsIMsgDBHdr, or in the mailStore itself. Ugh to both.
  • I think implementing nsISafeOutputStream on the output stream might be a good replacement for FinishNewMessage()/DiscardNewMessage(). It provides a finish() call to commit the change. Calling close() (or just releasing the stream) says the message should be discarded. So if the writing code bails out, the message will automatically be discarded. Yay!
  • ultimately I suspect moving to fully async streams for reading/writing messages is the way to go - both for mailstore I/O and for IMAP/POP3/NNTP/etc. Let the UI get on with other stuff while big messages are handled. But I don't know what that looks like yet, exactly.
Flags: needinfo?(benc)

....

More mailstore stuff, while I'm thinking about it (feel free to stop reading here!):
...

  • I think implementing nsISafeOutputStream on the output stream might be a good replacement for FinishNewMessage()/DiscardNewMessage(). It provides a finish() call to commit the change. Calling close() (or just releasing the stream) says the message should be discarded. So if the writing code bails out, the message will automatically be discarded. Yay!

You are talking only about maildir case (and possibly the tmpfile writing case for viruschecker in pop3 case) right?
General mbox case does not meet the assumed preconditions.

I am NOT that keen on depending on nsISafeOutputStream, especially for buffered operation since
in the past, I recall (possibly incorrect use of ) nsISAfeOutputStream file operation did NOT catch ENOSPC (no free space of file system) condition and lost email messages when compactifying mbox file overflowed a file system on my 20GB note PC disk.
That was on Windows, but I recreated the condition under linux (thus ENOSPC error code from write operation) and that got me to start sending patches to TB. Almost a dozen years ago.

Almost a dozen years ago.
So probably it has been fixed. But I am not entirely sure, and another thing is, if we get an error due to remote file system access problem halfway through copying or file system problem (ENOSPC, for example), I would prefer to bail out explicitly and finish off under programmer's explicit control.

There ARE so many usage of write/read/close system calls WITHOUT bothering to check the return codes and I think that was caused by a mistaken notion of many past developers that the file system operation takes care of errors themselves a la nsISafeOutputStream when actually they are NOT safe operation at all.
It sucks.
It sucks not because the errors are not checked, but it sucks because there is no overall error handling strategy discernible in many parts of the code for such I/O errors. It is really difficult now to figure out what the best error recovery strategy and handling is.
Basically, the developers seemed to think no file system ever returns error code for these operations, and it seems to me that no serious test of error handling was attempted. THAT SUCKS
FinishNewMessage() and DiscardNewMessage() are a great attempt to handle errors for, say, pop3 download.
However, before I looked at it, it did not work at all due to coding errors. BAD.
I forgot whether it has been already fixed or not. That fix was part of the long series of patches for buffered writing. A first few were applied
and so the fix may already be in.

We certainly need a few set of fresh eyes. (I have not really checked IMAP error handling at all since I do not use IMAP often.)

Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: