1121842 - [META] RFC: C-C Thunderbird - Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends.

Description

10 years ago

This is a meta bug entry for the issues described below.

I depend on TB for my workflow at the office and for my personal correspondence on several PCs (under Linux and Windows), so its correct (and hopefully smooth) operation as a cross-platform mailer is very important to me. That is why I am filing this bugzilla entry.

Problem:

There is confusion as to which routine is responsible for closing the file stream held in the variable m_outFileStream in nsPop3Sink.cpp. Because of this confusion (?), there are multiple instances of unnecessary, bogus extra Close calls across a few files.

Also, the Pop3 code is full of unchecked |Close|, |Flush|, and |Write| calls. (The Imap code, too. That will need to be taken care of eventually. Sorry, I am not using imap right now and do not feel motivated to tackle that group of files until nsPop3Sink.cpp is fixed. But someone can take the lead after seeing this.)

If the underlying file system experiences a glitch, it is quite likely that Thunderbird treats a failed download as a success and happily deletes the message on the POP3 server. Totally unacceptable behavior.

[ Examples of file system glitches: an almost full file system, transient network problems with a remotely mounted file system, incorrect file system permissions caused by an administrative error on an NFS server, a USB memory stick on which the file is to be stored falling off the PC (!), etc. I have personally experienced and confirmed the first three error scenarios with TB before, in a different context, and suffered data loss. Judging from bugzilla entries, some people seem to have suffered from the last one. ]

Background of discovery:

I tried to enable output buffering in nsPop3Sink.cpp for performance reasons. [ Bug 1116055 - Performance issue: Failure to use buffered write (comm-central thunderbird) ]

But when I enabled output buffering in nsPop3Sink.cpp in addition to the patch in bug 1116055, it caused failures to incorporate downloaded messages.
It looks timing dependent. In any case, I always found a stream closed prematurely, before writing had finished. That prompted me to debug and study the code carefully.

Then I found that the code contains bogus Close() calls on already closed file streams. These bogus |Close()| calls became apparent as soon as I added checks of the return value of |Close()| in several places. Such extra bogus |Close| calls are RAMPANT during execution, and static code analysis confirmed it.

But when I started to think about how to fix the situation, I noticed that there seemed to be confusion about which function should call |Close()| on the buffer stream variable m_outFileStream in nsPop3Sink.cpp. So I traced the history of m_outFileStream: where it is set, where it is passed as a parameter to external functions (which in turn may Close it), when the file associated with it is opened and closed, etc.

After the analysis, I came up with a plan to clean up the current incorrect code that invokes bogus |Close| on already closed streams.

(Such bogus calls interfere with smooth gdb debugging: there are simply too many calls that return NS_BASE_STREAM_CLOSED during execution. Come to think of it, the buffered output routine does not report error situations completely enough for my taste, either. See Bug 1120046 - RFC mozilla/netwerk/base/src/nsBufferedStreams.cpp: better error reporting and maybe adding thread-race lock. The source code is also not quite right statically, even if calling Close on an already closed stream should be a NOOP in principle. It is too confusing, while reading the code, to see an already closed stream closed again only several lines further down.)

We should also check the return values of Close, Flush and Write properly.

========================================
Plan for Improvement
========================================

My plan to clean up the code is as follows. There are five steps.

Step-1. Remove extra ->Close() calls (and Flush() before Close()).
Let us remove the extra / unnecessary / bogus |Close| calls. The removal is based on the two proposals below.

PROPOSAL-1: DiscardNewMessage should not close the file stream passed as the first argument. It is the caller's responsibility to close it.

PROPOSAL-2: FinishNewMessage should not close the file stream passed as the first argument. It is the caller's responsibility to close it.

The reasoning is given in a crude write-up I created from checking the code. (This will be posted as the next comment.)

I checked the usage of the two functions above, and with a few patches the proposals ought to work. The patch will be filed as a separate bugzilla entry on which this meta bug depends.

After the removal of the unnecessary |Close| calls, we go to Step-2.

Step-2. Add error value checking of Close() and Flush().

As a first step, simply add NS_ERROR(). Better than nothing. At least we will see the error printed when testing a DEBUG version of TB. (For better error handling, we will wait for Step-4.) The patch for this will be posted as a separate bugzilla entry.

After Close() and Flush() are taken care of, we check the error return of Write() and any mismatch between the number of bytes written and the number of bytes requested.

Step-3. Add error value checking of Write().

Check the return value of Write, and check whether the requested number of bytes matches the number of bytes actually written. As a first step, simply add NS_ERROR(). Better than nothing. At least we will see the error printed when testing a DEBUG version of TB. (For better error handling, we will wait for Step-4.) The patch for this will be posted as a separate bugzilla entry.

Step-4. More error processing.

I understand that some places may need more elaborate error handling than NS_ERROR(), including an error return path. Take care of them in Step-4. The patch for this will be posted as a separate bugzilla entry. I think I will need a few different bugzilla entries, since in a few places the error handling seems to be necessarily complex.

Step-5. Enable buffered output.

(This is not a correctness fix. It is a performance improvement.) Additionally, introduce output buffering for the file streams created in these files, where appropriate.

I will post the crude memo/write-up in the next comment.
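
To make the intent of Step-1 through Step-3 concrete, here is a minimal standalone C++ sketch. It is not written against the real comm-central code and it is not the actual patch. FakeOutputStream, Result, CheckedWrite, FinishMessageSketch and StoreMessage are all hypothetical names made up for illustration; they only imitate the Write/Flush/Close calls and the NS_BASE_STREAM_CLOSED result discussed above. The assumption is that the helper never closes the stream, the caller closes it exactly once, and a short write counts as a failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Stand-in for nsresult codes (hypothetical; values are arbitrary).
enum class Result { Ok, Failure, StreamClosed };

// Hypothetical in-memory stream, NOT the real nsIOutputStream. It accepts
// at most `capacity` bytes in total, to imitate an almost full file system.
class FakeOutputStream {
 public:
  explicit FakeOutputStream(std::size_t capacity) : mCapacity(capacity) {}

  Result Write(const char* buf, std::size_t count, std::size_t* written) {
    *written = 0;
    if (mClosed) return Result::StreamClosed;
    const std::size_t room = mCapacity - mData.size();
    const std::size_t n = count < room ? count : room;
    mData.append(buf, n);
    *written = n;
    return Result::Ok;  // a short write still returns Ok in this mock
  }
  Result Flush() const { return mClosed ? Result::StreamClosed : Result::Ok; }
  Result Close() {
    ++mCloseCalls;
    if (mClosed) return Result::StreamClosed;  // the "bogus Close" case
    mClosed = true;
    return Result::Ok;
  }
  bool IsClosed() const { return mClosed; }
  int CloseCalls() const { return mCloseCalls; }

 private:
  std::size_t mCapacity;
  std::string mData;
  bool mClosed = false;
  int mCloseCalls = 0;
};

// Step-3: an error code OR a byte-count mismatch is a failure.
Result CheckedWrite(FakeOutputStream& out, const std::string& data) {
  std::size_t written = 0;
  const Result rv = out.Write(data.data(), data.size(), &written);
  if (rv != Result::Ok) return rv;
  assert(written <= data.size());
  if (written != data.size()) {
    std::fprintf(stderr, "short write: %zu of %zu bytes\n", written,
                 data.size());
    return Result::Failure;
  }
  return Result::Ok;
}

// PROPOSAL-1/2 in miniature: the helper flushes but does NOT close the
// stream it was handed. Closing stays with the caller.
Result FinishMessageSketch(FakeOutputStream& out) { return out.Flush(); }

// The caller owns the stream: it closes exactly once, on every path, and
// reports success only if Write, Flush and Close (Step-2) all succeeded.
bool StoreMessage(FakeOutputStream& out, const std::string& message) {
  Result rv = CheckedWrite(out, message);
  if (rv == Result::Ok) rv = FinishMessageSketch(out);
  const Result closeRv = out.Close();
  return rv == Result::Ok && closeRv == Result::Ok;
}
```

With this shape, a stream that runs out of room makes StoreMessage return false, so a caller would have the information it needs to keep the message on the POP3 server instead of deleting it. A second Close on the same stream shows up as an error rather than passing silently, which is the kind of bogus call Step-1 is meant to remove.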