Closed Bug 368325 Opened 18 years ago Closed 17 years ago

password prompt for encrypted mail crashes ThunderBird & SeaMonkey

Categories

(MailNews Core :: Security: S/MIME, defect, P1)

x86
All

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ChefChaudart, Assigned: KaiE)

References

(Blocks 1 open bug)

Details

(Keywords: dogfood, regression)

Attachments

(2 files, 2 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a2pre) Gecko/20070125 Minefield/3.0a2pre Build Identifier: version 3 alpha 1 (20070126) Since one week, I cannot open anymore my encrypted (and signed) mails. When I click on the mail, it ask me the code, after ok, TB crashes ; see TB28730338H Reproducible: Always Steps to Reproduce: 1. receive an encrypted mail 2. try to open it 3. TB ass your code, then crash Actual Results: crashes
NSS_CMSRecipientInfo_UnwrapBulkKey [mozilla\security\nss\lib\smime\cmsrecinfo.c, line 565] 0x0365f9f8 Cc'ing Nelson. It sounds like something is corrupt, but we shouldn't crash.
Status: UNCONFIRMED → NEW
Ever confirmed: true
> TB ass your code, then crash What does that mean? I send and receive lots of encrypted emails, and have never seen the problem reported here, AFAIK. That's not to say that I don't believe this is a real problem, but only that it seems rare. Unless I can reproduce it, I doubt we'll be able to solve this. More on that below. I'd guess that there is something quite unusual about the email Chef received. One possibility is that the email was encrypted using the cert(s) of one or more intended email recipients, and Chef does not have any of those certs (that is, the certs that Chef has, for which he has a private key, are not among the certs of the intended recipients). I agree that, whatever it is, we shouldn't crash when we encounter it. It also seems likely to me that the cause of the crash is in NSS, so I am making this into an NSS bug. Although the TB stack "trace" was empty, the crashing function is only called from one place, so it is possible to re-create some of the call stack (but not the call arguments). It looks like this: NSS_CMSRecipientInfo_UnwrapBulkKey NSS_CMSEnvelopedData_Decode_BeforeData nss_cms_before_data nss_cms_decoder_notify sec_asn1d_notify_before or sec_asn1d_notify_after ... There are several possibilities about reproducing this: a) maybe I could reproduce it with a copy of only the email message itself, or b) maybe I will also need a copy of Chef's cert8.db file, or c) maybe it will not be possible to reproduce without a copy of Chef's private key DB, key3.db, which I would not ask for, and would not want Chef to send me. (Please don't send me your key3.db file, Chef). SO, Chef, please save the entire encrypted message to a file. Then create a zip file containing the message and your cert8.db file (but not your key3.db file) and attach that zip file to this bug, or email it to me. I won't be able to read your encrypted email without your private key, but hope to be able to reproduce the problem without that.
Assignee: mscott → nobody
Component: Mail Window Front End → Libraries
Product: Thunderbird → NSS
QA Contact: front-end → libraries
Version: unspecified → 3.11
But first time I get the mail, I was able to read it. It's after an update of last week that suddenly I get the crash. see http://chefchaudart.trollprod.org/TBimapTK.png If I enter the code, or click Cancel, I get the crash
Chef sent me his cert DB and the message. I tested them with NSS's CMS decoder test tool, cmsutil, and it worked flawlessly. In short, I cannot reproduce the crash that Chef saw. I am no longer convinced that this is purely (or primarily) an NSS bug. In order to have failed where the TalkBack says it did, the code in NSS_CMSEnvelopedData_Decode_BeforeData would have to have gotten into an "impossible" state. It would have been necessary for that function to have been called with a list of non-NULL recipientInfo pointers, and then during the execution of that function, one of those pointers would have to change to become NULL or invalid. The function itself does not have that effect on the array of pointers, so some other code outside of the executing thread must have done that. The screen shot cited in comment 3 also shows something VERY strange. It shows an error message that is the result of having called NSS_CMSEnvelopedData_Decode_BeforeData and it returned error SEC_ERROR_NOT_A_RECIPIENT. But at the same time, it is showing a master password dialog. It appears to me that the SMIME code got to a place where it needed the master password, and called the password callback function, which put up tht dialog. But then apparently the password callback function returned before the password had been obtained. While the user was still seeing the password dialog, apparently the functino returned and NSS went ahead and tried to do the job with the missing/wrong master password, resulting in the SEC_ERROR_NOT_A_RECIPIENT failure. This would have destroyed various contexts, etc. Then, according to Chef, when he finally dismissed the password dialog, either by cancelling or by entering the password, apparently the code somehow tried to re-enter the NSS CMS functions, trying (again) to process the parsed (and now destroyed?) message structures. I think that's how/why this code crashed. I don't know enough about the thunderbird SMIME code to know how this could have occurred. I seem to recall that there was a change that makes password dialogs occur on a different thread than previously. I wonder if that caused this. At this point, I think the best I can offer is to add a bunch more of "sanity checks" inside of the SMIME code, checking that pointers that were previously seen to be non-NULL are still non-NULL, and not using them if so. That will avoid the crash, but it won't make Thunderbird work right. I think some mail guru needs to try this. Looks like all they need to do is to read an encrypted email message when they have not yet entered the master password.
Attached patch patch - detect "impossible" invalid pointers (obsolete) (deleted) — Splinter Review
This patch will prevent the crash, but won't make Thundebird work right.
Assignee: nobody → nelson
Status: NEW → ASSIGNED
Attachment #253071 - Attachment description: patch - detech "impossible" invalid pointers → patch - detect "impossible" invalid pointers
Chef, I have a question for you, and a request. Q: What version of TBird were you using before the "update of last week" ? Please answer that in this bug. In an effort to further determine if the problem is NSS or outside of NSS, I'd like you to download and try a pure-NSS program to read your email. If you can read your email with that program, and not with TBird, then I think this bug needs to changed back into a TBird bug. You can download the NSS and NSPR libraries from these URLs ftp://ftp.mozilla.org/pub/mozilla.org/security/nss/releases/NSS_3_11_4_RTM/msvc6.0/WIN954.0_OPT.OBJ/nss-3.11.4.zip ftp://ftp.mozilla.org/pub/mozilla.org/nspr/releases/v4.6.4/msvc6.0/WIN954.0_OPT.OBJ/nspr-4.6.4.zip Total download size is about 3.5 MB. Please email me directly for directions if you're willing to do this test. Thanks.
Summary: Encrypted mails are crashed TB → Encrypted mail crashes ThunderBird
I'm not sure, .. around 22 or 23 of January.
I asked another encrypted mail from the same user. I got it, I can open it, and I can open the old mail without any crash.
Resolving WORKS FOR ME, since Chef reports it works for him. If this problem reoccurs, please reopen this bug.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → WORKSFORME
(In reply to comment #9) > Resolving WORKS FOR ME, since Chef reports it works for him. > If this problem reoccurs, please reopen this bug. > I get another signed/encrypted message from someone else, I could read it the first time, but now, I cannot read anymore any message of that kind. I got the crash.
On Chefchaurd request, I have reproduced the problem with version 3 alpha 1 (20070203) see TB28988028Y. Steps to reproduce: 1. create a new user (to have a clean environment) 2. configure a new account (imap) 3. import certs 4. read an encrypted email (the mail is correctly decrypted) 5. close TB 6. start TB 7. open the same email 8. the password dialog window pops up 9. TB crashes if I cancel the dialog box or enter the right or wrong password
The problem does not occurred if I provide the "master password for Software Security Device" either by sending an encrypted mail or exporting the certificate before I read an encrypted mail.
Olivier, Thanks for that additional information. That sounds very familiar. SeaMonkey trunk had a similar problem for a while, but it got fixed. If you "logged in" to the "software security device" with the "master password" BEFORE doing something in mail/news that required the master password, then it was OK. But if you tried to read an encrypted email or send an encrypted email and didn't login to the "software security device" before going into mail/news, it would crash. As I recall, the problem had something to do with the handling of the cleanup of the password dialog window itself. All I really know for sure is that the problem came AND WENT in Seamonkey trunk without any NSS fixes being made.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Looking at Olivier's talkback, I see that (a) it's Linux, (b) it's a full stack trace, and (c) it died in a different place than Chef's. The stack trace has line number information for the parts of TBird other than NSS, but not for NSS. This makes me suspsect that TBird is not using the NSS shared libs that come with TBird, but perhaps is using some other NSS shared libs on the system, perhaps from another product? Olivier, please check and see if you have NSS shared libraries in some system directory, such as /usr/lib. Olivier, Chef, do either of you build your own Thunderbird from sources? If so are you willing to build and test the patch attached to this bug?
OS: Windows XP → All
I have replaced nss/nspr libs with a debug version from ftp://ftp.mozilla.org/pub/mozilla.org/security/nss/releases/NSS_3_11_4_RTM (see comment 6) I don't know if it's useful but it produces a different trace TB28991605Y NSS shared libraries come from thunderbird local directory (lsof) thunderbi 10787 test mem REG 8,19 1070487 1855097 /usr/local/trunk/thunderbird/libsoftokn3.so thunderbi 10787 test mem REG 8,19 2029965 1855160 /usr/local/trunk/thunderbird/libnss3.so thunderbi 10787 test mem REG 8,19 800855 1855163 /usr/local/trunk/thunderbird/libssl3.so thunderbi 10787 test mem REG 8,19 999658 1855162 /usr/local/trunk/thunderbird/libsmime3.so
Bug 353558 is another report of a crash with a very similar (not identical) call stack to Chef's call stack. It's another "impossible" invalid pointer crash, where a pointer is valid when the function starts, and invalid later. One of these bugs should probably be marked as a duplicate of the other. Olivier, a question about your master password. Is it empty? When it asks you for your master password, do you just click the OK button? or do you have to type in a non-empty password first?
The problem seen in Olivier's new talkback TB28991605Y seems to me to be the same UI problem I described above in comment 13. Bug 353558 comment 3 says that that crash can no longer be reproduced with the fix for bug 353597 present. That fix was committed on the trunk in September though. :-/
My master password is not empty (11 characters)
Olivier, it seems that talkback is utterly confused by the presence of shared libraries other than the ones with which you TBird was built. The symbol names for the NSS functions given in your talkback TB28991605Y are all incorrect. My idea of substituting debug libraries for the ones shipped with TBird on Windows is not a good idea on Linux. So let me ask you to revert to a clean trunk build of TB on Linux and try again. Maybe we'll get a better stack trace after that.
I replaced NSS libs with Debug version for Linux (i386). (ftp://ftp.mozilla.org/pub/mozilla.org/security/nss/releases/NSS_3_11_4_RTM/Linux2.6_x86_glibc_PTH_DBG.OBJ/nss-3.11.4.tar.gz) I'm now back to a clean version 3 alpha 1 (20070203) TB28992825G
I have build a debug version on a x86_64 system. My gdb knowledge is limited to type "bt" when the app crashes. version 3 alpha 1 (20070204) Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 46949675506528 (LWP 4898)] 0x00002aaab386503d in NSS_CMSEnvelopedData_Decode_BeforeData (envd=0x15f3fb8) at cmsenvdata.c:362 362 ri = envd->recipientInfos[recipient->riIndex]; Current language: auto; currently c (gdb) bt #0 0x00002aaab386503d in NSS_CMSEnvelopedData_Decode_BeforeData (envd=0x15f3fb8) at cmsenvdata.c:362 #1 0x00002aaab3862475 in nss_cms_before_data (p7dcx=0xb66600) at cmsdecode.c:264 #2 0x00002aaab3862389 in nss_cms_decoder_notify (arg=0xb66600, before=1, dest=0x15f4038, depth=4) at cmsdecode.c:210 #3 0x00002aaab3d47c95 in sec_asn1d_notify_before (cx=0x15f74b0, dest=0x15f4038, depth=4) at secasn1d.c:451 #4 0x00002aaab3d4a665 in sec_asn1d_next_in_sequence (state=0x15f7820) at secasn1d.c:2016 #5 0x00002aaab3d4b6d2 in SEC_ASN1DecoderUpdate (cx=0x15f74b0, buf=0x15f4830 "<", len=22) at secasn1d.c:2672 #6 0x00002aaab3862d45 in NSS_CMSDecoder_Update (p7dcx=0xb66600, buf=0x15f4810 "X�_\001", len=54) at cmsdecode.c:670 #7 0x00002aaab3607485 in nsCMSDecoder::Update (this=0xb65f20, buf=0x15f4810 "X�_\001", len=7274598) at /old/tmp/mozilla/security/manager/ssl/src/nsCMS.cpp:813 #8 0x00002ab353b5b7d4 in MimeCMS_write (buf=0x15f4810 "X�_\001", buf_size=54, closure=<value optimized out>) at /old/tmp/mozilla/mailnews/mime/src/mimecms.cpp:510 #9 0x00002ab353b5acf5 in MimeEncrypted_parse_decoded_buffer (buffer=0x13ac9f8 "\004", size=1, obj=<value optimized out>) at /old/tmp/mozilla/mailnews/mime/src/mimecryp.cpp:193 #10 0x00002ab353b36054 in mime_decode_base64_buffer (data=0xb66a40, buffer=0x15f4810 "X�_\001", length=0) at /old/tmp/mozilla/mailnews/mime/src/mimeenc.cpp:325 #11 0x00002ab353b36d49 in MimeDecoderWrite (data=0x13ac9f8, buffer=0x1 <Address 0x1 out of bounds>, size=7274598) at /old/tmp/mozilla/mailnews/mime/src/mimeenc.cpp:837 #12 0x00002ab353b5b51f in MimeEncrypted_parse_buffer (buffer=0x0, size=7274598, obj=<value optimized out>) at /old/tmp/mozilla/mailnews/mime/src/mimecryp.cpp:171 #13 0x00002ab353b3faf2 in MimeMessage_parse_line (aLine=0x15f4810 "X�_\001", aLength=73, obj=0x13e0bb0) at /old/tmp/mozilla/mailnews/mime/src/mimemsg.cpp:240 #14 0x00002ab353b49523 in convert_and_send_buffer (buf=0x15f4810 "X�_\001", length=73, convert_newlines_p=1, per_line_fn=0x2ab353b3f8c6 <MimeMessage_parse_line>, closure=0x13e0bb0) at /old/tmp/mozilla/mailnews/mime/src/mimebuf.cpp:185 #15 0x00002ab353b497af in mime_LineBuffer ( net_buffer=0x15fd8dd "CSqGSIb3DQEHATAUBggqhkiG9w0DBwQI2/vE7XKgitOggASCFIj4rdOQbCzA0QaV/eiGGszY\n0kROnVZA+LvW89uihYGXG4Y9VPdD46XZtaQbjYiAtULG467n2iJaUanON7zk7ENHD98zUApd\nkjlocP7TXgp+j8sglGKJRZkmmX1HFQzURa6Ix8GBll4zISEXtBDots"..., net_buffer_size=<value optimized out>, bufferP=0x13e0bf0, buffer_sizeP=0x13e0c00, buffer_fpP=0x13e0c08, convert_newlines_p=1, per_line_fn=0x2ab353b3f8c6 <MimeMessage_parse_line>, closure=0x13e0bb0) at /old/tmp/mozilla/mailnews/mime/src/mimebuf.cpp:273 #16 0x00002ab353b427db in MimeObject_parse_buffer ( buffer=0x15fcc90 "Received: from mailsrv.trollprod.net ([192.168.0.5])\n by webmail.trollprod.com (Postfix) with ESMTP id 1WH22349\n for <olivn@trollprod.org>; Sat, 03 Feb 2007 17:23:12 +0100\nReceived: from"..., size=10336, obj=0x13e0bb0) at /old/tmp/mozilla/mailnews/mime/src/mimeobj.cpp:284 #17 0x00002ab353b4a03a in mime_display_stream_write (stream=<value optimized out>, buf=0x15fcc90 "Received: from mailsrv.trollprod.net ([192.168.0.5])\n by webmail.trollprod.com (Postfix) with ESMTP id 1WH22349\n for <olivn@trollprod.org>; Sat, 03 Feb 2007 17:23:12 +0100\nReceived: from"..., size=10336) at /old/tmp/mozilla/mailnews/mime/src/mimemoz2.cpp:953 #18 0x00002ab353b53533 in nsStreamConverter::OnDataAvailable (this=0x15e6d10, request=<value optimized out>, ctxt=<value optimized out>, aIStream=0x15f96a0, sourceOffset=<value optimized out>, aLength=10336) at /old/tmp/mozilla/mailnews/mime/src/nsStreamConverter.cpp:911 #19 0x00002aaaab71ab37 in nsDocumentOpenInfo::OnDataAvailable (this=<value optimized out>, request=0x15b47e0, aCtxt=0x0, inStr=0x15f96a0, sourceOffset=0, count=10336) at /old/tmp/mozilla/uriloader/base/nsURILoader.cpp:360 #20 0x00002ab354a045da in nsStreamListenerTee::OnDataAvailable (this=<value optimized out>, request=0x15b47e0, context=0x0, input=0x15e2378, offset=0, count=10336) at /old/tmp/mozilla/netwerk/base/src/nsStreamListenerTee.cpp:97 #21 0x00002ab34ca4b138 in NS_InvokeByIndex (that=0x2aaab4119aa0, methodIndex=5, paramCount=5, params=0x2aaab4145a20) at /old/tmp/mozilla/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_x86_64_linux.cpp:208 #22 0x00002ab34ca3e839 in nsProxyObjectCallInfo::Run (this=0x2aaab41451e0) at /old/tmp/mozilla/xpcom/proxy/src/nsProxyEvent.cpp:181 #23 0x00002ab34ca37b21 in nsThread::ProcessNextEvent (this=0x640a00, mayWait=1, result=0x7fff5ec51ecc) at /old/tmp/mozilla/xpcom/threads/nsThread.cpp:482 #24 0x00002ab34c9e4fd7 in NS_ProcessNextEvent_P (thread=0x13ac9f8, mayWait=1) at nsThreadUtils.cpp:225 #25 0x00002aaaae815000 in nsBaseAppShell::Run (this=0x669e30) at /old/tmp/mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp:153 #26 0x00002aaaaf768633 in nsAppStartup::Run (this=0x7977a0) at /old/tmp/mozilla/toolkit/components/startup/src/nsAppStartup.cpp:171 #27 0x00002ab34c575a13 in XRE_main (argc=<value optimized out>, argv=<value optimized out>, aAppData=<value optimized out>) at /old/tmp/mozilla/toolkit/xre/nsAppRunner.cpp:2749 #28 0x00000000004007e8 in main (argc=20630008, argv=0x1) at /old/tmp/mozilla/mail/app/nsMailApp.cpp:62
http://www.mozilla.org/unix/debugging-faq.html is probably worth reading (google: unix debugging faq) p envd p envd->recipientInfos p recipient p recipient->riIndex
(gdb) p envd $2 = (NSSCMSEnvelopedData *) 0x14b1898 (gdb) p envd->recipientInfos $3 = (NSSCMSRecipientInfo **) 0x650072006f0066 (gdb) p recipient $4 = (NSSCMSRecipient *) 0xafbcf0 (gdb) p recipient->riIndex $5 = 0 http://olivn.trollprod.org/TB/cmsenvdata_ddd.png
Perhaps bug 353558 is a duplicate of this one, or vice versa? What these two bugs seem to have in commmon is that in both cases, the decoder calls NSS_CMSEnvelopedData_Decode_BeforeData which finds that the recipientInfos pointers are valid, then it calls the actual decryption function, which prompts for the password, and when that function returns, the relevant receipientInfos pointer is corrupted. Question: is this problem only seen on 64-bit builds? Has this been seen on any 32-bit builds?
I have produced a 64bit build because it's easier for me but I initially reproduce the problme with a 32bit build and it was initialy reported on win32.
Comment on attachment 253071 [details] [diff] [review] patch - detect "impossible" invalid pointers This patch is intended to detect corrupted pointers and avoid crasahing. It also cleans up the function code a little.
Attachment #253071 - Flags: superreview?(wtchang)
Attachment #253071 - Flags: review?(alexei.volkov.bugs)
I am now experiencing this problem 100% reproducibly with SeaMonkey 1.5a (trunk builds). When we return from the call to get the password, the heap is corrupted. This happens whether the password dialog had a password entered, or the user pressed cancel. This result shows that the problem in not in the decryption code, because the decryption code doesn't get called when the user cancels the password request. The challenge now is to find the code that corrupts the heap when the password dialog is presented to the user. We don't see this in tests of the NSS CMS (a.k.a. S/MIME) code that don't use PSM. We see this only with PSM. So I am rather stronly inclined to make this into a PSM bug, as soon as the above patch is reviewed.
Priority: -- → P2
Target Milestone: --- → 3.11.7
Comment on attachment 253071 [details] [diff] [review] patch - detect "impossible" invalid pointers Asking Kaie to review, in place of Wan-Teh.
Attachment #253071 - Flags: superreview?(wtchang) → superreview?(kengert)
I started to look at this bug. I am able to reproduce the crash on Linux trunk. The patch does not prevent the crash. I believe this bug might be caused by a new "order of events" when the application is decoding the message. (I'm making a very general statement here, without being able to provide details yet). I compared the behaviour of MOZILLA_1_8_BRANCH (Seamonkey 1.1.x) with the latest trunk code (Seamonkey 1.5a). I used these steps: - start seamonkey - login to mail account - (not yet entered master pw) - click encrypted message - prompt for master pw comes up - move the prompt window around, so that you can look at the mail message display area, in particular the header area At this point I inspected the state of the application and found an interesting difference. With the trunk, I get the "broken encryption" icon, and I believe this points us into the right direction. When you do the above with the stable branch, there is no such icon yet displayed. When a message is rendered for display, it gets passed into different decoders. One of them is the decryption decoder. Simply speaking, the decoder is supposed to block any other events until it is done. Only after we are done with the decoding, a decision will be made about the display and the crypto state icons. In other words: The code that tries to make the decision about the crypto state gets executed before we are done with decoding. This is wrong. We must find out why. The crypto decoder is given "UI context" pointers. Because the UI update has already been executed, it probably carries a dangling pointer. I agree this is NOT a NSS bug. Nelson, if you still believe we should drive that NSS patch in, let me know.
Assignee: nelson → kengert
Status: REOPENED → NEW
Component: Libraries → Security: S/MIME
Product: NSS → Core
QA Contact: libraries
Target Milestone: 3.11.7 → ---
Version: 3.11 → Trunk
(In reply to comment #30) > I started to look at this bug. > I am able to reproduce the crash on Linux trunk. > > The patch does not prevent the crash. Please tell me how/why it crashed. On what line did it crash? What variable contained an invalid value? Was that value NULL, or some other non-NULL but invalid value? With the patch applied, was the crash actually an assertion failure? My patch adds assertions (which will definitely crash in debug builds) and code to return without crashing in non-DEBUG builds. > Nelson, if you still believe we should drive that NSS patch in, let me know. In the stacks and core files I've previously seen, this crash was due to an invalid (NULL) value being passed to NSS_CMSRecipientInfo_UnwrapBulkKey or by an invalid array index being used to index the envd->recipientInfos or recipient_list arrays. My patch detects all those errors, and (in the non-debug builds) avoids crashing because of them. So, if the patch doesn't eliminate the crash in non-debug builds, then there must be yet-another cause of a crash that I have not yet seen. I'd appreciate any info on that new cause that you can supply.
QA Contact: s.mime
Kai, Using the MSVC6 Debug build you made, I was able to debug this problem with MSVC8. I found that the ASN.1 decoder context was freed while the ASN.1 decoder was waiting for the master password to be entered. I knew there was only one function that destroys that context, so i set a breakpoint in it, and found out how it happens. The attached stack trace tells the tale. While waiting for something to happen with the modal dialog, on that very same stack, the code processes the next event, which apparently is one signalling the end of the stream for the MIME message being parsed. So, we terminate and tear it all down, while this very thread is still in the middle of processing it. Looks like a consequence of bug 326273. See bug 338549 comment 1 and following comments for more info. I'm thinking maybe we want to add some code to detect and avoid recursive calls to the ASN.1 and CMS decoder functions.
Attachment #253071 - Attachment is obsolete: true
Attachment #253071 - Flags: superreview?(kengert)
Attachment #253071 - Flags: review?(alexei.volkov.bugs)
Yeah, this bug and bug 338549 have more or less the same cause, and the problem is bug 326273.
Flags: blocking1.9?
Summary: Encrypted mail crashes ThunderBird → password prompt for encrypted mail crashes ThunderBird & SeaMonkey
Flags: blocking1.9? → blocking1.9+
This bug is blocking 1.9. What is the plan to get it fixed for 1.9? The only workaround is to configure master passwords to be asked only once per process lifetime, and never time out, then do something to force a master password before reading the first encrypted email. Because of that, I have been more-or-less forced to set my password timeout to infinite, which I regard as very insecure. I would surely hope we won't ship FF or TB with this unfixed.
Nelson, thanks a lot for the stack, it's very helpful. I'm now trying to find a solution for this bug. As we can see in the stack, we arrive in nsStreamListenerTee::OnDataAvailable We go down the chain of data processing, and discover that we need to unlock the private key. We bring up a prompt. Now that network / data streams may proceed while blocking UI is shown, more data is being sent to us. So we arrive in nsStreamListenerTee::OnStopRequest (which tries to delete an object we are still using). Actually, it's worse! During the call to nsStreamListenerTee::OnDataAvailable that resulted in us showing the prompt, more data was fed to us using the same function call! It just happens we only crash once we arrive in OnStopRequest. Why is all this data being sent to us? Because the actual streams are being processed on a different thread (probably the socket transport thread). The requests to process the data get "proxied" to our UI thread. These requests are sent as events of the global event queue, so we can't filter them out on the receiving side (the thread showing the prompt). I see two different approaches to solve this bug. (1) Find a way to tell the caller: pause until we are done! I think the data is currently being set to us asynchronously. Maybe we can find the processing code and switch to a synchronous data delivery? (2) Avoid bringing up the prompt in the middle of stream processing. When a stream arrives and we notice that we must log in: - abort the stream - display the message as "can't decrypt" (temporarily) - post some UI event to ourself that triggers the login prompt (now decoupled from any streams, as if the user had clicked login) - after the login is done, we discover that our login came from that special event we posted, so we'll trigger a re-display of the current message - now we're logged in and can process without prompting Solution 2 means, we can no longer make use of the global login setting "ask every time". (Obviously, this would trigger a login-request each time we try to process the stream, and would never be able to)
Chef, Nelson, are you both using IMAP ?
Attached patch Patch v1 (obsolete) (deleted) — Splinter Review
This patch fixes it for me. With this change, the mail data is sent synchronously into the decoder. This way the data flow is blocked as long as our password prompt is shown. David, do you think this is ok to do?
Attachment #283655 - Flags: review?
Attachment #283655 - Flags: review? → review?(bienvenu)
Attached patch Patch v1, more context (deleted) — Splinter Review
same patch as before, but cvs diff -u10 -p
Attachment #283655 - Attachment is obsolete: true
Attachment #283657 - Flags: review?
Attachment #283655 - Flags: review?(bienvenu)
Attachment #283657 - Flags: review? → review?(bienvenu)
In answer to comment 36, I use both IMAP and POP3 all the time, in the same SM profile. Do you want/need me to determine if the problem happens only with one or only with the other?
I can reproduce this just by reading an old encrypted email in a (local) POP3 email folder. No need to use IMAP to reproduce it. I confirm an observation Kai made. When you try to read an encrypted email and you get the master password prompt, the encryption icon in the message header pane may already be showing the broken key icon, indicating unsuccessful decryption. If you see that icon when you get the master password prompt, you're doomed. The damage is already done. Whether you enter a password or just click cancel, it will crash. Knowing this, you may be able to close some windows to minimize the dataloss on crash.
Comment on attachment 283657 [details] [diff] [review] Patch v1, more context at best, I think this could slow us down a bit; at worst, it could introduce some deadlocks. Unlikely, though. Is there a way to signal that you can't accept any more data? Maybe bisi has some thoughts...
(In reply to comment #41) > (From update of attachment 283657 [details] [diff] [review]) > at best, I think this could slow us down a bit; at worst, it could introduce > some deadlocks. Unlikely, though. If we consider this patch and have the risk of deadlocks, I will start to use ThunderBird trunk with this patch as my primary email client (currently using TB2), so we get more testing. > Is there a way to signal that you can't accept any more data? Maybe bisi has > some thoughts... You mean, switch the stream into a paused state, so the socket/stream tranport (or whoever is ansychronously pumping the data over) will pause until the prompt is ready? I had initially considered this idea, but I don't know how to do it. It would require to: - use a context object that is carried all the way down to the password-prompt-callback, so we have a pointer to the stream object that we switch to "paused" - decide which object will implement the paused state. Is there a feature available in the stream objects? (I couldn't fine one). If we implement it, should it be done in the stream implementation? Or should we switch the IMAP object to paused (which is producing the additional data)?
(In reply to comment #40) > I can reproduce this just by reading an old encrypted email in a (local) > POP3 email folder. No need to use IMAP to reproduce it. I removed the patch, then: - I tried "encrypted mail in POP3 folder". - I tried "encrypted mail in Local Folders" No crash. I crash when using IMAP only.
I have a "search folder" that shows me all the encrypted emails sitting in my POP3 mail folders. Clicking on any email in that search folder causes a crash with high probability. The filter rule for this folder is simply: Content-Description contains crypted Until this bug became common, I formerly had my master password timeout configured to ask me for my master password if 5 minutes had elapsed since the last time I had used it. But when this bug arose, I found that reading encrypted email caused many crashes per day. So I was forced by necessity to switch to having no timeout on my master password. Then I developed the habit of doing something that asks me for mey master password as soon as I start the program, so that I won't be asked for it again when I read encrypted email.
I had a IM conversation with Nelson. He pointed me to the "search folder" feature, which I had not seen before. Nelson claims the bug is not limited to IAMP. He said, with a 90% probability this happens with POP3 or "search folders", too. He said, his search folder are based on POP3 accounts, only. I tested, but I only crash with IMAP, never with POP3, never with search folders based on POP3. Within the mozilla/mail and mozilla/mailnews source directories, only the IMAP code uses the PROXY_ASYNC flag. (well, ldap and address book, too, but they don't cause us to display am email message) Could we land my patch into trunk and see if the crash goes away?
Kai, I agree that if you have fix for IMAP, we should try to land it. A fix for IMAP only is better than no fix at all. Using the latest trunk build, I see the problem on IMAP every time I try it (try to read an encrypted email without logging into token first). Am having difficulties reproducing with POP3 on this box. But I used to have this problem a LOT on my other box, so I want to try it there before I completely agree this is just IMAP.
Comment on attachment 283657 [details] [diff] [review] Patch v1, more context Who else could review this patch?
Attachment #283657 - Flags: superreview?(mscott)
Attachment #283657 - Flags: review?(neil)
Comment on attachment 283657 [details] [diff] [review] Patch v1, more context I really don't care for this, for the reasons I outlined earlier, but let's find out if it fixes the crash.
Attachment #283657 - Flags: review?(bienvenu) → review+
Comment on attachment 283657 [details] [diff] [review] Patch v1, more context While I don't have any encrypted mail I can see how this would help so I would be prepared to sr this if mscott can't.
Attachment #283657 - Flags: review?(neil)
Comment on attachment 283657 [details] [diff] [review] Patch v1, more context I'll defer to David on this.
Attachment #283657 - Flags: superreview?(mscott) → superreview+
Comment on attachment 283657 [details] [diff] [review] Patch v1, more context Requesting approval for 1.9. (As this is mail-only, I'm not requesting firefox m9 endgame approval)
Attachment #283657 - Flags: approval1.9?
Comment on attachment 283657 [details] [diff] [review] Patch v1, more context a=drivers for after the M9 freeze
Attachment #283657 - Flags: approval1.9? → approval1.9+
P1, because fix is ready and already queued for check in.
Priority: P2 → P1
checked in
Status: NEW → RESOLVED
Closed: 18 years ago17 years ago
Resolution: --- → FIXED
Should this patch be backed out, now that bug 389931 (which fixed bug 326273 regression) is fixed ?
Serge, is there something about this patch that is extremely undesirable? What is the incentive to back this out?? This was a NASTY bug for almost a year before that fix was made. It was most welcome relief. Unless there's some REALLY GOOD reason to tempt the fates, I'd suggest we leave well enough alone.
(No extreme hurry.) Maybe I had simply misread your comment 32: I though this bug was caused by a regression which bug 389931 would fix; but it seems that the cause was an improvement from bug 326273... Is there any plan to go back to an async behavior ? Is the dependence on bug 265780 still needed ? Does this bug somehow depend on bug 338549 ?
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: