Port bug 1791633: separate TLS socket control from transport security info - comm/mailnews/imap/test/unit/test_starttlsFailure.js failing on debug
Categories
(Thunderbird :: General, defect)
Tracking
(thunderbird_esr102 unaffected, thunderbird109 wontfix, thunderbird111 affected)
Tracking | Status | |
---|---|---|
thunderbird_esr102 | --- | unaffected |
thunderbird109 | --- | wontfix |
thunderbird111 | --- | affected |
People
(Reporter: benc, Assigned: KaiE)
References
Details
(Keywords: intermittent-failure, Whiteboard: [TM:111.0b3])
Attachments
(7 files, 9 obsolete files)
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
wsmwk
:
approval-comm-beta-
|
Details |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
text/x-phabricator-request
|
wsmwk
:
approval-comm-beta-
|
Details |
Reporter | ||
Comment 1•2 years ago
|
||
Updated•2 years ago
|
Reporter | ||
Comment 2•2 years ago
|
||
Late here and I won't be awake to see if it builds with this patch, so I kicked off a try run:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=0a7527c83e45e96c57d5b80b077c3f3c533d9d4e
Pushed by martin@humanoids.be:
https://hg.mozilla.org/comm-central/rev/618f8ad3da6f
Handle changes from Bug 1791633, renameing nsISSLSocketControl to nsITLSSocketControl. r=freaktechnik
Comment 4•2 years ago
|
||
Comment 5•2 years ago
|
||
Even with my patch there's still one debug test failure left from this: mailnews/imap/test/unit/test_starttlsFailure.js. See for example https://treeherder.mozilla.org/logviewer?job_id=396973492&repo=try-comm-central&lineNumber=4952
This failure happens inside the native IMAP protocol implementation where it tries to call IsAlive
on the socket from the wrong thread from what I can tell. However I couldn't quite follow what thread that call happens in vs. what thread it should be in. As explained by :keeler in phabricator:
[...] debug builds [...] should abort with an assertion failure if the current thread isn't the same thread the socket control was created on.
I also wouldn't assert that all our calls are correct with my patch, there might still be untested code paths that don't behave correctly under the new restrictions.
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/59629a2d1298
followup - Pass security info from JS protocol clients. r=mkmelin
Comment 7•2 years ago
|
||
Right, mailnews/imap/test/unit/test_starttlsFailure.js is failing on debug.
https://treeherder.mozilla.org/logviewer?job_id=396973492&repo=try-comm-central&lineNumber=4966
I'm assuming m_transport was created on the main thread, and now it's being checked in the imap thread?
Could we do without the isAlive check? https://searchfox.org/comm-central/rev/5901d59728b25f72b55397ce192ae1adfb5d4057/mailnews/imap/src/nsImapProtocol.cpp#1330
Updated•2 years ago
|
Reporter | ||
Comment 8•2 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #7)
Right, mailnews/imap/test/unit/test_starttlsFailure.js is failing on debug.
https://treeherder.mozilla.org/logviewer?job_id=396973492&repo=try-comm-central&lineNumber=4966I'm assuming m_transport was created on the main thread, and now it's being checked in the imap thread?
Certainly looks like m_transport is being created in the main thread, via SetupWithUrlCallback()
, which called by SetupWithUrl()
, either directly, or as the result of a proxy lookup callback.
And yes, it's stupid that the sockettransport for the IMAP thread is being created on the main thread (the transport lifetime is more-or-less synced to the IMAP thread lifetime, right?)... but it seems to be done like this so it can get the proxy info from a mockchannel. Or something.
It basic boils down to the crazy way we deal with channels/requests and have stateful urls, and also the confusion of our nsImapProtocol
(which is really the IMAP connection interface, but also used for requests) and nsImapMockChannel
.
BUT, nsISocketTransport claims that it can be shared across threads:
* NOTE: This is a free-threaded interface, meaning that the methods on
* this interface may be called from any thread.
So there's a M-C bug there? Either the same-thread assertion we're hitting, or the comment is wrong.
Could we do without the isAlive check? https://searchfox.org/comm-central/rev/5901d59728b25f72b55397ce192ae1adfb5d4057/mailnews/imap/src/nsImapProtocol.cpp#1330
I don't know... there are a bunch of other isAlive checks called from the IMAP thread, which do seem important. I.e. "is the connection still up?"
Although these don't seem to cause problems? Maybe it's only an issue with a socket which has a failed TLS attempt?
Assuming it's not an M-C bug, I think a more "proper" solution would be to get the the transport created on the IMAP thread in the first place, instead of on-demand at SetupWithUrl()
time. But there's the issue of the proxy setup. A mock channel is being used to set up the proxy (the MsgExamineForProxyAsync() call in SetupWithUrl()). And I've not idea what's going on there. (I'd have to do a lot of digging)
Alternatively, we could try dispatching the SetupWithUrlCallback()
calls synchonously (ugh) to the IMAP thread... but I don't know how to do that. We've got a PR_Thread for the IMAP thread, but I don't know how (or if) to dispatch calls to that....
Comment 10•2 years ago
|
||
At least the try run was happy with this.
Comment 11•2 years ago
|
||
FWIW, current debug build triggers https://searchfox.org/mozilla-central/rev/513d9c0f27c0c3fdc9ed6c5fb222a3f90ae69d70/security/manager/ssl/CommonSocketControl.cpp#455.
Comment 12•2 years ago
|
||
... and the patch from comment #10 doesn't fix that. Please compile a debug version and run it with a few IMAP accounts.
Crash is:
Assertion failure: mOwningThread == PR_GetCurrentThread(), at C:/mozilla-source/mozilla-central/security/manager/ssl/CommonSocketControl.cpp:455
#01: CommonSocketControl::GetSecurityInfo (C:\mozilla-source\mozilla-central\security\manager\ssl\CommonSocketControl.cpp:455)
#02: nsImapProtocol::LoadImapUrlInternal (C:\mozilla-source\mozilla-central\comm\mailnews\imap\src\nsImapProtocol.cpp:2353)
#03: nsImapProtocol::SetupWithUrl (C:\mozilla-source\mozilla-central\comm\mailnews\imap\src\nsImapProtocol.cpp:958)
#04: nsImapProtocol::LoadImapUrl (C:\mozilla-source\mozilla-central\comm\mailnews\imap\src\nsImapProtocol.cpp:2308)
#05: nsImapIncomingServer::LoadNextQueuedUrl (C:\mozilla-source\mozilla-central\comm\mailnews\imap\src\nsImapIncomingServer.cpp:541)
#06: `anonymous namespace'::SyncRunnable2<nsIImapServerSink,nsIImapProtocol *,bool *>::Run (C:\mozilla-source\mozilla-central\comm\mailnews\imap\src\nsSyncRunnableHelpers.cpp:121)
Comment hidden (Intermittent Failures Robot) |
(In reply to Ben Campbell from comment #8)
BUT, nsISocketTransport claims that it can be shared across threads:
* NOTE: This is a free-threaded interface, meaning that the methods on * this interface may be called from any thread.
So there's a M-C bug there? Either the same-thread assertion we're hitting, or the comment is wrong.
Yeah, that might be a bug. nsSocketTransport::IsAlive
is calling PR_Recv
, so that should really be on the socket thread, if my understanding is correct.
Assuming it's not an M-C bug, I think a more "proper" solution would be to get the the transport created on the IMAP thread in the first place, instead of on-demand at
SetupWithUrl()
time. But there's the issue of the proxy setup. A mock channel is being used to set up the proxy (the MsgExamineForProxyAsync() call in SetupWithUrl()). And I've not idea what's going on there. (I'd have to do a lot of digging)
Can the IMAP thread be the same as the socket thread? That might solve all these issues at once (or it might introduce dealocks, if that's not a sound thing to do).
Reporter | ||
Comment 15•2 years ago
|
||
(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #14)
Yeah, that might be a bug.
nsSocketTransport::IsAlive
is callingPR_Recv
, so that should really be on the socket thread, if my understanding is correct.
Can the IMAP thread be the same as the socket thread? That might solve all these issues at once (or it might introduce dealocks, if that's not a sound thing to do).
Sorry, my shakey understanding of the threading and IO restrictions is showing here :-(
What do you mean by "socket" thread?
- the thread the socket was opened on?
- the background thread pool used for dispatching blocking IO via NS_DispatchBackgroundTask()?
- something else?
If the first one, then yes - it seems that it'd be prudent to open the socket on the thread that's using it (it's currently opened on the main thread, then passed over to the IMAP thread).
Is there any particular documentation on this stuff that I should be reading?
The IMAP thread is a bit annoying, TBH - there's a new thread opened up for each IMAP connection, and it just loops, running a (messy) state machine and doing blocking IO in a loop. It should be using NS_DispatchBackgroundTask()
instead of dedicated IMAP threads, I think (Bug 1618934). But that'd be a big job - the IMAP code is pretty hairy and brittle, and has loads of hard-won real-world-server hacks in there which nobody really has a good grasp on :-) A rewrite in JS is in the works, but isn't ready for prime time yet.
Reporter | ||
Comment 16•2 years ago
|
||
(In reply to Ben Campbell from comment #15)
Sorry, my shakey understanding of the threading and IO restrictions is showing here :-(
What do you mean by "socket" thread?
- the thread the socket was opened on?
- the background thread pool used for dispatching blocking IO via NS_DispatchBackgroundTask()?
- something else?
Or The Socket Transport Service (STS) thread? (That's the one I was trying to think of!)
Yes, the socket transport service thread. My understanding is any and all networking should happen on that thread.
Comment 18•2 years ago
|
||
I'm out of ideas. I think we should disable on debug until we can fix it, if we ever fix it.
Comment 19•2 years ago
|
||
Please note comment #12. Any debug build with an IMAP account is unusable since it constantly runs into this MOZ_ASSERT:
https://searchfox.org/mozilla-central/rev/ce78234f5e653a5d3916813ff990f053510227bc/security/manager/ssl/CommonSocketControl.h#21
Developers wishing to work with a debug build will forever need to remove this MOZ_ASSERT.
Comment 20•2 years ago
|
||
Sure, but with no fix in sight, having the test on is just polluting the tree making it harder to notice other new problems that may come up.
Updated•2 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment 22•2 years ago
|
||
Comment 23•2 years ago
|
||
Maybe M-C would accept a patch similar to this one to avoid MOZ_ASSERT() in a debug build. However, the patch creates very noisy output.
Comment 24•2 years ago
|
||
(In reply to Ben Campbell from comment #8)
Alternatively, we could try dispatching the
SetupWithUrlCallback()
calls synchonously (ugh)
to the IMAP thread... but I don't know how to do that. We've got a PR_Thread for the IMAP thread,
but I don't know how (or if) to dispatch calls to that....
We didn't find a way to dispatch to a PRThread, but one can dispatch to an nsIThread. The attached patch compiles but doesn't work. It's unclear show to spin the event loop on the IMAP thread, the dispatch call with NS_DISPATCH_SYNC never returns.
Ben, do you know whether this is totally wrong or whether this approach can be made to work?
Comment 26•2 years ago
|
||
Comment on attachment 9306938 [details] [diff] [review]
1801067-proxy-transport-creation.patch
Thanks for the comment. Apologies, we didn't read the bug carefully enough, in particular comment #17. We're also not familiar with the socket thread. Looks like this search gives some pointers to code using the sts thread. Which "networking" needs to be proxied there? Opening input/output streams and reading/writing to them?
The protocols (POP3, SMTP, NNTP) recently re-implemented in JS all use TCPSocket. Looks like the code is here which must be using the correct thread already. Maybe that's a good example to study. Overall, making things happen on the right thread in the legacy C++ code may be a big task, so depending on the timing of the new IMAP implementation in JS it might not be worth the effort. Maybe in the meantime just take a patch like in attachment 9306930 [details] [diff] [review].
Comment 28•2 years ago
|
||
The test works in imap-js because ImapClient checks capability first and doesn't send STARTTLS if no capability found. But ImapProtocol sends STARTTLS directly.
It crashes when creating a new account with imap-js or pop3-js at https://searchfox.org/mozilla-central/rev/abcee8d2c97a5c8a1fbeaf84607ea427be72497a/security/manager/ssl/NSSSocketControl.cpp#277
Attachment 9310585 [details] is an example of something that could be done as a temporary workaround while IMAP gets rewritten in JS.
Comment 31•2 years ago
|
||
(In reply to Ping Chen (:rnons) from comment #28)
The test works in imap-js because ImapClient checks capability first and doesn't send STARTTLS if no capability found. But ImapProtocol sends STARTTLS directly.
It crashes when creating a new account with imap-js or pop3-js at https://searchfox.org/mozilla-central/rev/abcee8d2c97a5c8a1fbeaf84607ea427be72497a/security/manager/ssl/NSSSocketControl.cpp#277
So it's still a problem for the protocols rewritten in js, that use TCPSocket? In that case, a bug in TCPSocket?
Comment 32•2 years ago
|
||
Doesn't work yet, please help.
Comment 33•2 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #31)
So it's still a problem for the protocols rewritten in js, that use TCPSocket? In that case, a bug in TCPSocket?
TCPSocket uses the same SocketTransportService under the hood.
(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #30)
Attachment 9310585 [details] is an example of something that could be done as a temporary workaround while IMAP gets rewritten in JS.
Thanks, tlsSocketControl->StartTLS()
needs to be updated as well https://searchfox.org/comm-central/rev/0fefd9ea7a9c8e805c833ce935cb54df597453df/mailnews/imap/src/nsImapProtocol.cpp#1820
smtp-js/nttp-js/pop3-js/ldap-js also use TCPSocket, so I think it's better to update TCPSocket.cpp. In attachment 9310605 [details] I tried to dispatch to the socket thread, but it seems to make no difference, do you know why? Thanks
Yeah, TCPSocket should probably be fixed. Looking at https://searchfox.org/mozilla-central/source/dom/webidl/TCPSocket.webidl, many (all?) of those methods/attributes should probably actually return promises instead of directly doing whatever operation on the main thread (as opposed to the socket thread).
Comment 36•2 years ago
|
||
Thanks, I filed bug 1809755 to the best of my understanding of the problem.
Comment hidden (offtopic) |
Comment hidden (offtopic) |
Assignee | ||
Comment 39•2 years ago
|
||
Dana, thank you very much for your patch (comment 29), it works well for me. No more assertions when using a debug build.
Ping, regarding comment 32, I guess you were testing a StartTLS configuration (I haven't yet).
I'll attach an additional patch for the place you've pointed out, copying Dana's code.
Assignee | ||
Comment 40•2 years ago
|
||
Assignee | ||
Comment 41•2 years ago
|
||
Comment on attachment 9306930 [details] [diff] [review]
1801067-no-moz-assert-for-TB.patch
I'm marking this patch comment 24 as obsolete, I don't think we want to disable the integrity/correctness assertions for Thunderbird. We need a real fix.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 42•2 years ago
|
||
Assignee | ||
Comment 43•2 years ago
|
||
With D165915 and D168111 applied, test_starttlsFailure.js works again in a debug build, so we should undo commit from comment 22, I've attached revision D168114 to do so.
I think we should get these three revisions r+'ed and landed, as a partial fix.
Assignee | ||
Comment 44•2 years ago
|
||
(In reply to Ping Chen (:rnons) from comment #32)
Doesn't work yet, please help.
Ping, indeed your revision D165926 doesn't work yet.
That code apparently creates some temporary variables, and attempts to destroy them on the wrong thread, with the following output:
Hit MOZ_CRASH(DOMEventTargetHelper not thread-safe) at /home/user/moz/comm-beta/mozilla/xpcom/base/nsISupportsImpl.cpp:43
I made another patch which seems to work, I'll attach it to bug 1809755.
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment 45•2 years ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #41)
I'm marking this patch comment 24 as obsolete, I don't think we want to disable the integrity/correctness assertions for Thunderbird. We need a real fix.
Kai, with the three patches applied, can yon run a debug TB with an IMAP/SSL account or does it still crash? Note that attachment 9306930 [details] [diff] [review] was intended to allow developers to work (crashes made any work impossible, see for bug 1805111 comment #2).
Assignee | ||
Comment 46•2 years ago
|
||
(In reply to BB from comment #45)
Kai, with the three patches applied, can yon run a debug TB with an IMAP/SSL account or does it still crash?
Yes it runs fine. No crash.
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 48•2 years ago
|
||
The three accepted patches are ready for landing together - remaining work will be done in bug 1809755.
Comment 49•2 years ago
|
||
Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/5904645b6d72
run GetSecurityInfo and IsAlive synchronously on the socket thread in imap. r=kaie
https://hg.mozilla.org/comm-central/rev/2276f3710694
run StartTLS synchronously on the socket thread in imap. r=keeler
https://hg.mozilla.org/comm-central/rev/93e86bff6c50
Re-enable imap test test_starttlsFailure.js on cpp debug. r=mkmelin
Updated•2 years ago
|
Comment 50•2 years ago
|
||
Comment 51•2 years ago
|
||
I can confirm, the landed patches fix the startup crash observed in Bug 1814054.
Comment 52•2 years ago
|
||
Updated•2 years ago
|
Comment 53•2 years ago
|
||
I think there is one other place where the getSecurityInfo() changes was not fixed. I haven't seen a crash or lockup since I changed this but not 100% sure this is the only problem.
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 54•2 years ago
|
||
Given the risk of deadlocks, I think we need a more fundamental change.
I think we should stop running IMAP code on the main thread.
nsSyncRunnableHelpers has code to dispatch many functions to the main thread.
I think that should be changed to dispatch to the socket thread instead.
I'll make that change locally and test what happens.
Assignee | ||
Comment 55•2 years ago
|
||
Well, the ideal solution from comment 54 seems very difficult to do. In my experiment I immediately run into problems.
Some code must run on the main thread, and we're reaching assertions when running it from either the IMAP thread or the socket thread.
The IMAP code has a lot of assumptions baked in about the thread it is running on...
So, let's try harder with the pragmatic approach (minimal changes).
I've made more experiments.
Both Emilio and Magnus had suggested to skip the call to IsAlive.
I agree we could remove that from nsImapProtocol::TellThreadToDie()
In nsImapProtocol::ImapThreadMainLoop() the result of the call is used to decide about retrying. I don't have a great idea how to change that logic. However, that code runs on the IMAP thread, and the call to IsAlive should be simple and quick. I'm hoping it's safe to dispatch IsAlive to the socket thread in that place.
Then we have the calls to GetSecurityInfo. When Emilio hit the deadlock in bug 1816633 comment 1, it was inside nsImapProtocol::LoadImapUrlInternal(), running on the main thread, dispatching GetSecurityInfo() to the socket thread.
As a wild experiment I simply disabled that code, which skips adding that info to the mock channel.
In my manual testing so far I didn't find a negative side effect yet.
Let me allow to cross fingers and hope that we indeed could disable it, and maybe follow up later with a solution, should it be necessary.
Gene pointed out another place in comment 53, which we weren't yet handling. That's true, I tested a connection that requires a certificate exception, and Gene's addition is necessary in that place. Luckily, that place runs on the IMAP thread, so again (hopefully) it's safe to proxy to the socket thread from there. In my manual testing, adding the exception worked.
Let's see if that combination of changes survives a try build.
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=41769cab39516687f216bd8a537ffa6aede17b24
Assignee | ||
Comment 56•2 years ago
|
||
Try build looks good (no test failures related to imap).
I've been running a local m-c build with the patches (from that try build) for the last 12 hours, and it didn't hang yet.
The code that obtains the security from the socket (and stores it into the m_mockChannel),
which had caused the deadlock in (bug 1816633),
and which these patches disable,
seems to be unnecessary.
In an experiment, I've reenabled the above code (risking the deadlock, which is rare),
and I've set a breakpoint on nsImapMockChannel::GetSecurityInfo().
Whenever that function was reached, mSecurityInfo was still null !
The callers are the document, which attempt to obtain securityInfo early, and keep a copy "for potentially using it later", and those always get a nullptr.
So apparently it isn't a problem that this returns null,
and setting the mockChannel's securityInfo later on seemed useless.
I'm not surprised, because our mail window doesn't have any place to report the TLS status of the server connection, which means, we probably don't have a reason to query/use the securityInfo for good connections anywhere.
I cannot tell if there are other scenarios, which I haven't covered in my interactive testing, in which it would be necessary.
We could either remove it now completely, and hope there won't be any regression reports, or we could use a different strategy for obtaining the securityInfo.
If we want to do a latter (try to be carefuly, and obtain and set it anyway), here is a potential approach:
The securityInfo is related to the server connection.
I don't know if one IMAP thread is used for connections to more than one server.
@Gene: Do you know that?
If an IMAP thread is limited to one specific server, then we could cache the security info at an earlier time, and reuse it for every URL that is processed on that thread.
Assignee | ||
Comment 57•2 years ago
|
||
Gene, do you know the answer to the question from the previous comment?
Assignee | ||
Comment 58•2 years ago
|
||
Updating attached patches/phab:
Test patch: Remains unchanged.
Call start TLS: Applied linting, and added an assertion for not being on the main thread, carrying forward r+
Old patch for GetSecurityInfo/IsAlive: Dropping
New patch for GetSecurityInfo/IsAlive: As explained in previous comments, will ask Gene for review.
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 59•2 years ago
|
||
Depends on D168111
Assignee | ||
Comment 60•2 years ago
|
||
Emilio, because you ran into the deadlock, would you like to apply revisions D168111 and D170206 and test if it works for you?
Comment 61•2 years ago
|
||
The securityInfo is related to the server connection.
I don't know if one IMAP thread is used for connections to more than one server.
@Gene: Do you know that?
If an IMAP thread is limited to one specific server, then we could cache the security info at an earlier time, and reuse it for every URL that is processed on that thread.
Each imap thread handles a connection to only one server. There may be up to 5 connections (threads) per server with each thread imap selected to a single mailbox (a.k.a., folder).
Comment 62•2 years ago
|
||
So far, mostly looks ok with Kai's latest changes applied.
Don't know is this is relevant but I often see this forced crash:
Hit MOZ_CRASH(nsWeakReference not thread-safe) at /home/gene/mozilla/xpcom/base/nsISupportsImpl.cpp:43
Attached is the backtrace from gdb. Doesn't look like it's caused by any comm code.
Since I run with a debug build I usually have this MOZ_CRASH commented out. I've seen this (maybe since Nov 2022) with and without the recent changes and churn to imap code. However, not sure if I saw it before the fundamental changes were made to moz code, maybe here: bug 1793841 ??
Edit: FYI, on console output I see this when this crash occurs:
Hit MOZ_CRASH(nsWeakReference not thread-safe) at /home/gene/mozilla/xpcom/base/nsISupportsImpl.cpp:43
Comment 63•2 years ago
|
||
I also see this quite a bit. I change it from MOZ_ASSERT to NS_ASSERTION so it doesn't crash. No idea what a "mutated DOM" is.
[Parent 120379, Main Thread] ###!!! ASSERTION: gds: Don't mutate the DOM while using a ShadowIncludingTreeIterator: '!mMutationGuard.Mutated(0)', file /home/gene/mozilla/obj-x86_64-pc-linux-gnu/dist/include/mozilla/dom/ShadowIncludingTreeIterator.h:45
Anyhow, without this and the item in comment 62 ignored I'm unable to run a debug build without constant crashes. Don't know if this is related to the the recent moz network changes.
Assignee | ||
Comment 64•2 years ago
|
||
Yes, I have also seen DOM crash with a debug build when loading HTML email. I switched to plain text (menu / view /message body as / plain text).
Comment 65•2 years ago
|
||
Hmm those are both problematic assertions... The stack from comment 62 is not of the right thread, but those seem worth at least looking into
Comment 66•2 years ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #64)
Yes, I have also seen DOM crash with a debug build when loading HTML email. I switched to plain text (menu / view /message body as / plain text).
Ok, haven't tried that yet. I'll check that I only see the DOM assert on emails opened with HTML.
(In reply to Emilio Cobos Álvarez (:emilio) from comment #65)
Hmm those are both problematic assertions... The stack from comment 62 is not of the right thread, but those seem worth at least looking into
I didn't think the stack looked right either. The crash output from tb just said to run the command shown to attach gdb to the thread. How do you tell which thread the crash is for and then bt it's stack?
Comment 67•2 years ago
|
||
(In reply to gene smith from comment #66)
I didn't think the stack looked right either. The crash output from tb just said to run the command shown to attach gdb to the thread. How do you tell which thread the crash is for and then bt it's stack?
You can do thread apply all bt
to get a backtrace of all threads, then see which thread is asserting
Updated•2 years ago
|
Comment 68•2 years ago
|
||
Emilio, thanks for thread apply all bt
hint. Here's the crashing thread "DOM Worker" due to "nsWeakReference not thread-safe". When this happened I was opening a huge PDF into a tab. It appears the crash happened during the imap URL destructor. Before opening the PDF, the message had been successfully imap fetched and stored in cache.
To me, this doesn't seem to have anything to do with the threading issues for the current bug.
Edit: I see this assert for a pdf of any size opened to a tab. I didn't try to open the pdf into an external app but opening other attachment types into external apps doesn't cause the assert/crash.
Comment 69•2 years ago
|
||
Changes at https://phabricator.services.mozilla.com/D170206 look OK to me and work when I tested them. After testing this diff some I made a few changes and tested again to see if it still works without any lock-ups or crashes.
I made each of the separate in-line dispatching code chunks a separate private nsImapProtocol member function. I think it makes the changes easier to understand.
I also put back in the "is alive" determinations in the nsImapProtocol::TellThreadToDie() since it runs on imap thread.
Also, FWIW, I see non-null m_securityInfo placed into mock channel.
I tested extensively with this diff and didn't seen any crashes or deadlocks.
Assignee | ||
Comment 70•2 years ago
|
||
(In reply to gene smith from comment #69)
Also, FWIW, I see non-null m_securityInfo placed into mock channel.
Yes, I also saw that. But it happened after the other code had already queried it (when it was still null). (At least in the scenarios I have tested.)
Assignee | ||
Comment 71•2 years ago
|
||
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment 72•2 years ago
|
||
Pushed by kaie@kuix.de:
https://hg.mozilla.org/comm-central/rev/0cbcf924b4a1
Re-enable imap test test_starttlsFailure.js on cpp debug. r=mkmelin
https://hg.mozilla.org/comm-central/rev/9856e8bc702c
Use safe calling to GetSecurityInfo and IsAlive in IMAP code. r=gds
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 73•2 years ago
|
||
Wayne, because imap-js is being enabled by default on Daily, it means we won't get test coverage for this on Daily.
From now on, I think we should uplift all changes to the IMAP/C++ code immediately to beta.
Assignee | ||
Comment 74•2 years ago
|
||
Comment on attachment 9314560 [details]
Bug 1801067 - Re-enable imap test test_starttlsFailure.js on cpp debug. r=mkmelin
need beta uplift to test imap/c++ patches
Assignee | ||
Comment 75•2 years ago
|
||
Comment on attachment 9318773 [details]
Bug 1801067 - Use safe calling to GetSecurityInfo and IsAlive in IMAP code. r=gds
need beta uplift to test imap/c++ patches
Comment 76•2 years ago
|
||
IMAP-JS will not be permanently enabled on Daily, FYI.
Comment 77•2 years ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #73)
Wayne, because imap-js is being enabled by default on Daily, it means we won't get test coverage for this on Daily.
Daily has test coverage for both CPP and JS IMAP.
test_starttlsFailure.js
is in xpcshell-shared.ini
. That file gets included by xpcshell-cpp.ini
which runs the tests with mailnews.imap.jsmodule=false
as well as xpcshell.ini
where mailnews.imap.jsmodule=true
.
Assignee | ||
Comment 78•2 years ago
|
||
Well, I'm sorry if I used "test coverage" in the wrong way.
My point is, the code won't be tested by users of Daily (for an unknown amount of time, see comment 76).
Updated•2 years ago
|
Comment 79•2 years ago
|
||
Comment on attachment 9314560 [details]
Bug 1801067 - Re-enable imap test test_starttlsFailure.js on cpp debug. r=mkmelin
[Triage Comment]
Let's revisit this in a few days.
(In reply to Kai Engert (:KaiE:) from comment #78)
My point is, the code won't be tested by users of Daily (for an unknown amount of time, see comment 76).
Quite correct. However this is not a zero risk change, so perhaps instead we should consider again disabling imap-js (just as easy as having enabled it)
Other factors:
- 111.0b1 when it does release (it hasn't yet), it is going to be a very low rate.
- beta is already overloading users with supernova changes
- with Rob's comment there is less urgency to test via beta
Comment 80•2 years ago
|
||
Comment on attachment 9318773 [details]
Bug 1801067 - Use safe calling to GetSecurityInfo and IsAlive in IMAP code. r=gds
[Triage Comment]
We can revisit this in a few days.
Comment 81•2 years ago
|
||
FWIW, I do think the patch is working. I had very frequent lockups earlier, but nothing while running dailies with the latest with the newest patch.
Comment 82•2 years ago
|
||
imap-js was disabled on daily as of two days ago.
I'd still feel better letting this ride to next Monday's merge, unless someone has objections.
Description
•