Open Bug 1242042 Opened 9 years ago Updated 2 years ago

Enabling buffering for file stream to write message for C-C TB

Categories

(Thunderbird :: OS Integration, defect)

defect

Tracking

(Not tracked)

People

(Reporter: ishikawa, Assigned: ishikawa)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: perf)

Attachments

(1 file, 4 obsolete files)

Attached patch enable-buffering-1st-step.patch (obsolete) (deleted) — Splinter Review
This patch should be applied after the error checks of |Close|, |Flush|, etc. are in place (See bug 1242030 that consolidate the patches from bug 1122698, bug 1134527, bug 1134529, and bug 1174500). I will post the roadmap of the patches shortly. TIA
Assignee: nobody → ishikawa
A full explanation of the updated patches and road map is in bug 116055 comment 107 https://bugzilla.mozilla.org/show_bug.cgi?id=1116055#c107 I am excerpting a relevant part for this bugzilla entry and a couple of other very related patches. ======================================== PLANNED ORDER of landing patches ======================================== bug 1116055 -> bug 1242030 (consolidation of 1122698, 1134527, 1134529, 1174500) -> bug 1242042 -> bug 1176857 -> bug 1242046 -> bug 1242050 -> bug 1170606 ======================================== An overview of patches in bugzilla entries. ---------------------------------------- [In February 2016, I fixed the errors observed ONLY UNDER WINDOWS when we use Maildir format for storage. Thus I needed to tweak my patch set somewhat and added a patch to take care of this issue and so I updated this write-up.] ...... [7] (New in 2016 ) Bug 1242042 - Enabling buffering for file stream to write message for C-C TB A enable-buffering-1st-step.patch: Enabling buffering first step. I have extracted the code to enable buffering to be applied after the additional error checks in the preceding patches [1]-[6]. After a couple of months of testing of the final patches, I realize a certain buffering operations that were in the patch are actually unnecessary, and so commented them out. (We can enable them back again if my understanding of the code flow is incorrect. There are many combination of execution paths, and I may misunderstood the nature of certain paths. My gripe about slow downloading of messages and saving of large attachment *ARE* handled properly by buffering now.) I have created the separate patch to enable buffering since it will be necessary to apply this patch ONLY AFTER previous patches to check for primitive I/O errors have landed. (This is because enabling the buffering requires proper error checking of |Close()| and |Flush()|. |Close| was never checked properly in the existing code. These are only taken care of AFTER patches [1]-[6] above are applied. BTW, I have not had the time to fix this issue of missing error checks of |Close()| in files under mailnews/import directory completely yet. If you don't believe me, read the source! Fixing this issue would be a good student project. [8] Bug 1176857 1116055-acelist.patch: One liner patch posted in Bug 1116055 to return a buffered stream from mbox::getNewOutputStream by aceman The patch has been posted to the following bugzilla entry. Bug 1176857 - C-C TB Enabling buffering for pop3 download (and possibly part of imap file operation) After the latest review of my local patches and testing, I realized that some buffering operations in my patch may be unnecessary for some operations. But I leave them as such. This patch from aceman could fall under that category, too. We may want to review the operation of the program more carefully to see if it is indeed unnecessary for some operations, but it does not hurt to have this buffering function for now. Lack of internal documentation of low-level protocol handling, etc. hurts. [9] (NEW in 2016) Bug 1242046: Removing unnecessary |Seek| that caused the C-C TB to operate slowly in terms of I/O removing-a-Seek-rev02.patch: removing a Seek that is performance pig! This was extracted into a new patch from the patches above [5]-[8]. Even if buffering is properly enabled, one offending |Seek()| needs to be disabled so that the buffering is effective for writing messages to a local file. Otherwise, this |Seek()| interferes with buffering, and essentially we call system |write| for EACH LINE of message, thus lowering output performance significantly. Killing this |Seek| did the trick. We now call ONE |Seek| per message, NOT one |Seek| per ONE LINE of message! This patch requires the landing of [1]-[8] before it. ...
Attachment #8711180 - Attachment is obsolete: true
Blocks: 1176857
Depends on: 1242030
Attached patch enable-buffering-1st-step.patch (obsolete) (deleted) — Splinter Review
Bitrot, rebase, etc.
Attachment #8721173 - Attachment is obsolete: true
Comment on attachment 8798563 [details] [diff] [review] enable-buffering-1st-step.patch Review of attachment 8798563 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/local/src/nsLocalMailFolder.cpp @@ +2178,5 @@ > + // sanity check > + NS_ASSERTION(mCopyState->m_fileStream, "mCopyState->m_fileStream should not be nullptr."); > +// let us buffer the m_fileStream > + mCopyState->m_fileStream->Flush(); // just in case > + mCopyState->m_fileStream = NS_BufferOutputStream(mCopyState->m_fileStream, 64 * 1024 ); This is the same change that was attempted in bug 769346: https://hg.mozilla.org/comm-central/rev/a9f658106adb#l1.28 This was later backed out since it caused bug 815012. Have we ever investigated why this has caused the corruption described in bug 815012? I guess all the other bugs in this series are there to ensure proper error handling so we would see problems early, but somehow I still don't see the big picture.
(In reply to Jorg K (GMT+2) from comment #3) > Comment on attachment 8798563 [details] [diff] [review] > enable-buffering-1st-step.patch > > Review of attachment 8798563 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: mailnews/local/src/nsLocalMailFolder.cpp > @@ +2178,5 @@ > > + // sanity check > > + NS_ASSERTION(mCopyState->m_fileStream, "mCopyState->m_fileStream should not be nullptr."); > > +// let us buffer the m_fileStream > > + mCopyState->m_fileStream->Flush(); // just in case > > + mCopyState->m_fileStream = NS_BufferOutputStream(mCopyState->m_fileStream, 64 * 1024 ); > > This is the same change that was attempted in bug 769346: > https://hg.mozilla.org/comm-central/rev/a9f658106adb#l1.28 > > This was later backed out since it caused bug 815012. Have we ever > investigated why this has caused the corruption described in bug 815012? I think the rewinding of file pointer due to the function(s) that enables buffered outupt and not taking care of that effect was to blame. At least, after the proposed changes including the |Seek()| being necessary per message copy, the intense testing in the 1Q of 2015, I did not see any corruption. (Filtering, downloading/copying/moving between folders of intra and inter accounts copies [pop3 and imap accounts]. > I guess all the other bugs in this series are there to ensure proper error > handling so we would see problems early, but somehow I still don't see the > big picture. To be honest, nether do I, especially the error handling [or the lack of it] in imap subdirectory remains mostly a mystery to me: this is because I have not used imap server extensively and I don't have the gut feeling of its operation and I have not read up on the latest RFC regarding IMAP protocol. So I can't say for sure how imap server and client interaction should behave. (In the early 2000's, the imap server implementations available then was so buggy, I stayed away from imap server for a long time when I use TB. I think people today have started to use gmail in IMAP mode fairly casually but GMAIL seems to have its own quirks regarding the extesion of IMAP protocol or lack of some features, but I digress. Until I am FORCED to use an IMAP server, I don't think I will read every lines of imap subdirectory.)
Attached patch enable-buffering-1st-step.patch (obsolete) (deleted) — Splinter Review
Fixed bitrot and modified some comments so that they are more "appropriate" in C-C source tree.
Attachment #8798563 - Attachment is obsolete: true
So your patch(es) addresses the concerns of bug 815012 comment 45 and bug 815012 comment 49?
Flags: needinfo?(ishikawa)
I asked the same in comment #3 and the answer was in comment #4. My understanding is that Chiaki reckons that a seek was missing from the initial attempt in bug 769346. This seek was now added in bug 1116055. Also, missing error handling didn't detect that things went wrong in bug 769346. After Chiaki's other patches have landed, we will be in a better position to assess the side-effects of what is re-proposed here. Maybe we'll have it for Christmas ;-)
(In reply to Jorg K (GMT+1) from comment #7) > I asked the same in comment #3 and the answer was in comment #4. My > understanding is that Chiaki reckons that a seek was missing from the > initial attempt in bug 769346. This seek was now added in bug 1116055. Also, > missing error handling didn't detect that things went wrong in bug 769346. > After Chiaki's other patches have landed, we will be in a better position to > assess the side-effects of what is re-proposed here. Maybe we'll have it for > Christmas ;-) Yes, I still think my analysis is correct at least for mbox (Berkeley mailbox) case bacin in 2012 (and I suspect Maildir was not an issue back then yet). bug 815012 comment 51 singles out the following patch as the culprit for the corruption. https://hg.mozilla.org/comm-central/rev/a9f658106adb In it the change was this: rv = mCopyState->m_msgStore-> GetNewMsgOutputStream(this, getter_AddRefs(mCopyState->m_newHdr), &reusable, getter_AddRefs(mCopyState->m_fileStream)); NS_ENSURE_SUCCESS(rv, rv); + + /* Buffer the output if possible */ + /* BIG performance improvement on Windows with some network file systems */ + mCopyState->m_fileStream = NS_BufferOutputStream(mCopyState->m_fileStream, EIGHT_K); + if (mCopyState->m_parseMsgState) mCopyState->m_parseMsgState->SetNewMsgHdr(mCopyState->m_newHdr); Basically, a newly created filestream mCopyState->m_fileStream is turned into a buffered one. Note: GetNewMsgOutputStream() for mbox positioned the file seek pointer to the end of the file. https://dxr.mozilla.org/comm-central/source/mailnews/local/src/nsMsgBrkMBoxStore.cpp#610 see line 646 and line 662. However, it does not seem to me that NS_BufferOutputStream() [or other variants] honors this movement of file seek pointer to the end all the time. No, I think I should rephrase. Well, for a newly created file. seek position is at 0. But for a file that already contains data, seek position is not at zero. I am not sure if the buffered output routine can cope with the situation when the buffer is empty and the position of file seek pointer is not at zero after Init() very well. https://dxr.mozilla.org/comm-central/source/mozilla/netwerk/base/nsBufferedStreams.cpp?q=%2Bfunction%3A%22nsBufferedStream%3A%3AInit%28nsISupports+*%2C+uint32_t%29%22&redirect_type=single#68 I suspect it can't and get confused, and that is the reason for file data corruption. I think we better start with file position 0 OR before writing anything move the file seek position to the desired position by explicit seek to set the internal data structure straightened out, and my patch is taking the latter fix. Call seek once before each message copy/download begins. And it seems to work well. Come to think of it, we may want to make it robust in the future after this comment. When I think about it, IN A PERFECT WORLD where no disk error or network error occurs, we can simply add Seek() after the call to NS_BufferOutputStream() in the backed-out patch. Unfortunately, in the REAL WORLD where such errors are rampant, we have to check the errors that are now deferred until Close() time because of the buffering. Unfortunately, TB code in many places never bothered to check the error for Close(). That is why I had to add error checks in many places simply to make TB more robust when we enable buffering. > we will be in a better position Definitely.
Flags: needinfo?(ishikawa)
Keywords: perf

In bug 1242030,
the following patches will be uploaded: it contains all the changes that have taken place for the last couple of years including the modification to match the clang-formatted source tree.

1242030-DONT-USE-REUSABLE-new-part-1.patch: bug 1242030: DONT-USE-REUSABLE new part-1 new splitting of (consolidation of 1122698, 1134527, 1134529, 1174500)
1242030-DONT-USE-REUSABLE-new-part-2-imap-JK-v1-rev02.patch: bug 1242030: DONT-USE-REUSABLE Improve error handling in file/stream I/O. Part 2. r=jorgk.
1242030-DONT-USE-REUSABLE-new-part-3-import-JK-v1-rev03.patch: bug 1242030: DONT-USE-REUSABLE Improve error handling in file/stream I/O. Part 3 (was 2). r=jorgk. revision 2
1242030-DONT-USE-REUSABLE-new-part-4-local-less-pop3.patch: bug 1242030: DONT-USE-REUSABLE new part-4 new splitting of (consolidation of 1122698, 1134527, 1134529, 1174500)
1242030-DONT-USE-REUSABLE-new-part-5-pop3.patch: bug 1242030: DONT-USE-REUSABLE new part-5 new splitting of (consolidation of 1122698, 1134527, 1134529, 1174500)

The following two patches, including one here, will complete the buffered output finally (!).

The current one:
enable-buffering-1st-step.patch: Bug 1242042: Enabling buffering for file stream to write message for C-C TB (Enabling buffering first step.)

Another one is this.
removing-a-Seek-rev02.patch: Bug 1242046 - Removing unnecessary |Seek| that caused the C-C TB to operate slowly in terms of I/O

Updated patch.

Attachment #8807624 - Attachment is obsolete: true
Blocks: 693659
Flags: needinfo?(ishikawa)

Not sure if the proposed improvement in this bugzilla makes the issue in Bug 693659 speedier.
The reason I think so is as follows.

  • The improvement here is concerned with physical data copying for messages into external storage, not deletion.
  • Deletion in pop3 is handled (for mbox) by marking the message as deleted in DB and the storage is reclaimed later when
    the folder is compacted.
    I am not familiar with deletion in pop3 with maildir, but I think it should not be such a big deal or rather it should be straightforward:
    you will simply remove each such message. (Now, if the system call to remove the individual message file is expensive, then
    certainly the O(N) where N is the number of message can be expensive. We may want to run an asynchronous thread to call the remove system call and let it run in the background. But I am afraid that the code gets too complex.

Now, when it comes to IMAP, again, this bugzilla probably won't affect the issue in bug 693659 at all either.
I am afraid that with IMAP, the server setting and the setting of syncing the local operation with the status of messages on the server affect the performance very much and there is not much TB can do. For example, i.e., if we remove messages locally then should the server also SYNCHRONOUSLY delete the message?
If SYNCHRONOUS operation on the server is requested, THEN if one deletes 10K+ messages locally, obviously there will be much server activity which slows down the UI.

I am busily trying to fix the bitrot of the patches that must be applied before the patch here to accommodate last several months worth of M-C/C-C changes. Close to finishing it, but not cigar yet.

Flags: needinfo?(ishikawa)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: