Avoid shutdown hangs caused by Prompter.jsm
Categories
(Toolkit :: General, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox88 | --- | fixed |
People
(Reporter: jstutte, Assigned: jstutte)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
It seems that modal dialogs can block the shutdown progress.
And indeed openPromptSync spins the event loop without further condition.
We should investigate different options here.
Comment 1•4 years ago
|
||
It looks like this promise isn't being resolved or rejected during the quit-application
shutdown phase, meaning that the application hangs when trying to quit: https://searchfox.org/mozilla-central/rev/002023eb262be9db3479142355e1675645d52d52/toolkit/components/prompts/src/Prompter.jsm#1054
I'm not entirely certain which promise step in the openPrompt
code is the one which is hanging, as it's a bit difficult to tell right now, but I don't see any code which makes a conscious effort to ensure that if quit-application
is fired, but I'm guessing we're hanging here (https://searchfox.org/mozilla-central/rev/002023eb262be9db3479142355e1675645d52d52/toolkit/components/prompts/src/Prompter.jsm#1174-1177) due to being in the parent process, but we could theoretically also be here: https://searchfox.org/mozilla-central/rev/002023eb262be9db3479142355e1675645d52d52/browser/base/content/browser.js#9379, waiting for the window modal dialog to be closed.
An "easy" option here might be to leave the nested event loop early with an error when we enter the quit-application
phase, and perhaps put the entire Prompter.jsm
into a new state where it can't open new prompts anymore due to being late enough we don't want to be showing the user more prompts, so we don't open a new prompt after this point.
ni? :pbz because I remember you recently touching this code.
Comment 2•4 years ago
|
||
Thanks!
If openPrompt
is hanging and doesn't resolve for some reason, I'd still expect spinEventLoopUntilOrShutdown
to handle the shutdown and return?
Even for the case where a prompt is spawned in the shutdown phase it should exit the loop: https://searchfox.org/mozilla-central/rev/002023eb262be9db3479142355e1675645d52d52/xpcom/threads/nsThreadManager.cpp#762-767
Perhaps I'm misunderstanding spinEventLoopUntilOrShutdown
here?
EDIT: Ok I've just seen https://bugzilla.mozilla.org/show_bug.cgi?id=1678330#c12 , so we can't rely on that method to always return on shutdown? In that case we should either try to fix that or add a comment to warn about this unexpected behavior.
In Nightly I can reproduce the shutdown hang if I close the browser window via the window close button, exit out of the resulting tab close warning dialog via esc and hit the close button again in a very short timeframe.
Comment 3•4 years ago
|
||
(In reply to Paul Zühlcke [:pbz] from comment #2)
EDIT: Ok I've just seen https://bugzilla.mozilla.org/show_bug.cgi?id=1678330#c12 , so we can't rely on that method to always return on shutdown? In that case we should either try to fix that or add a comment to warn about this unexpected behavior.
Yeah, the method is unfortunately not reliable for that kind of thing because it won't cause the loop to exit when quit-application
has been notified, spinEventLoopUntilOrShutdown
only stops spinning at xpcom-shutdown
, which is much later.
It looks like the method is only used in 3 places:
- https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/toolkit/components/prompts/src/Prompter.jsm#1065
- https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/toolkit/components/passwordmgr/LoginManagerAuthPrompter.jsm#836
- https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/toolkit/content/widgets/browser-custom-element.js#1709
All of these places (perhaps with the exception of the browser-custom-element.js
?) seem like they would benefit from having spinEventLoopOrShutdown
aborting the event loop in quit-application
, so perhaps we should fix this bug by changing the behaviour of that method? If we want to do that we'll just need to add more topics to the ShutdownObserveHelper
used by that method here: https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/xpcom/threads/nsThreadManager.cpp#269-304
In Nightly I can reproduce the shutdown hang if I close the browser window via the window close button, exit out of the resulting tab close warning dialog via esc and hit the close button again in a very short timeframe.
Good to know we have a reliable STR here!
Assignee | ||
Comment 4•4 years ago
|
||
Sounds good! In order to avoid any problems with the order of notifications during quit-application we might want to react on quit-application-granted here.
Updated•4 years ago
|
Assignee | ||
Comment 5•4 years ago
|
||
Assignee | ||
Comment 6•4 years ago
|
||
Hi, I just gave it a try to make some more cleanup (see WIP patch), in the hope to harmonize things a bit more with AppShutdown
and ShutdownPhases
.
Ideally this would give us the possibility to remove the ShutdownObserveHelper
entirely. But unfortunately there is also the "content-child-will-shutdown"
notification case which still needs it? I know too little about a content process's shutdown sequence and if AppShutdown
is involved there at all.
Thoughts?
BTW: There is also KillClearOnShutdown
which has its own static status variable for the current shutdown phase, this could be further harmonized.
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 7•4 years ago
|
||
I spent some more time on code reading, my current understanding is:
- "content-child-will-shutdown" is roughly comparable to a "quit-application-granted" but happening only in the child process
- "content-child-shutdown" is roughly comparable to a "quit-application" but again in the child process
- There are only very few observers for both, but there is an interesting case in nsJSEnvironment that kind of seconds the equivalence
- There is already an indirect link between
ContentChild
andAppShutdown
throughProcessChild::QuickExit
ContentChild
has its ownmShuttingDown
flag set as a consequence ofContentChild::RecvShutdown
which fires also "content-child-will-shutdown".
But besides the conceptual similarities it still seems we have process-type-specific sets of shutdown phases and very different code paths to reach them. Thus it would be probably not really helping the ease of understanding if we just add some more phases to ShutdownPhase
(breaking also the steadily increasing order of phases).
My conclusion would be then to choose between the following two options:
- something on the lines of the WIP patch already attached here, without trying to further align the child process handling, too
- just add topics to the original handling as suggested by Nika
Updated•4 years ago
|
Assignee | ||
Comment 8•4 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #7)
My conclusion would be then to choose between the following two options:
- something on the lines of the WIP patch already attached here, without trying to further align the child process handling, too
- just add topics to the original handling as suggested by Nika
So there is always a third way. I propose to remove the ShutdownObserveHelper
entirely from nsThreadManager
in favor of faking a ShutdownPhase::AppShutdownConfirmed
from within child process shutdown. This feels less scary to me than having child process specific notification messages as a hard dependency of SpinEventLoopUntilOrQuit
and maybe paves the way for more thorough harmonization work here.
Patch is up for review, thus clearing the ni's (but feel free to dissent on the patch, of course).
I also launched a try push FWIW.
:pbz, maybe you want to try it out with your STR (I was not able to reproduce here)?
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 9•4 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #8)
:pbz, maybe you want to try it out with your STR (I was not able to reproduce here)?
Thanks for working on this! With the patch I still managed to end up in some weird modal state, but it no longer hangs on shutdown.
Assignee | ||
Comment 10•4 years ago
|
||
(In reply to Paul Zühlcke [:pbz] from comment #9)
With the patch I still managed to end up in some weird modal state, but it no longer hangs on shutdown.
Hmm, we might want to return from SpinEventLoopUntilOrQuit with an error in that case? Otherwise the calling code might continue to try something weird?
Assignee | ||
Comment 11•4 years ago
|
||
:pbz, I made builds on try that include the return of an NS_ERROR_ILLEGAL_DURING_SHUTDOWN
(not in the patch on phabricator yet) in case we abort the loop. Maybe you want to check if this makes behave the calling code better?
Comment 12•4 years ago
|
||
Thanks! I've tested with the latest try build you linked. I still managed to produce a shutdown hang:
https://crash-stats.mozilla.org/report/index/06b06173-0fc1-4301-884a-111f00210315
Looks like it's still hanging in spinEventloopUntil
.
Not sure if related, but testing with the previous try build I ran into another crash (can't reproduce it anymore):
https://crash-stats.mozilla.org/report/index/7d10935c-4e37-442c-b882-de15e0210315
Comment 13•4 years ago
|
||
Sorry, I might have switched up the builds. I've tested again with the latest build from try and the patch seems to fix the shutdown hang issue.
I can still get into a state where the browser window is partially disabled, we can track that separately though. There is Bug 1696793 which seems related. Maybe there are still issues that we haven't fixed yet.
Updated•4 years ago
|
Assignee | ||
Comment 14•4 years ago
|
||
Thanks for checking! So to remain consistent with the old behavior of the former SpinEventLoopUntilOrShutdown
the patch now does not return an NS_ERROR_ILLEGAL_DURING_SHUTDOWN
error in case of abort, but just NS_OK. However, I left a comment there to indicate that we might want to prefer a more explicit handling here.
Assignee | ||
Comment 15•4 years ago
|
||
Summarizing the status of this bug:
-
The patch D107619 mitigates the shutdown hang as such, it seems. There remains only a question whether we should report an explicit error here (which would semantically fit with
SpinEventLoopUntilOrQuit
, as a caller probably wants to have a way to know, what caused the exit). There might be a wish to convert as much javascriptSpinEventLoopUntil
calls toSpinEventLoopUntilOrQuit
as possible, too, but I would consider this out of scope here (and rather look out for otherXPCOMSpinEventLoopStack
annotations on shutdown hangs first). -
There still should be some action on the lines of
put the entire Prompter.jsm into a new state where it can't open new prompts anymore due to being late enough we don't want to be showing the user more prompts, so we don't open a new prompt after this point."
as of nika from comment 1. This might maybe just be an explicit handling of the aforementioned error code? :pbz, I'd propose, that this is handled by a separate patch, as patch D107619 can land as such and already improves things (and I am not capable of doing the needed javascript magic, too).
Comment 16•4 years ago
|
||
If we don't throw an error the prompt return value will fall back to the default. For example if a browser component shows a confirmation prompt, upon shutdown it would be closed and the result would be "cancel" (not accept). I think that's a reasonable approach and I don't expect any behavior change switching from spinEventLoopUntilOrShutdown
to SpinEventLoopUntilOrQuit
.
If we throw an error it would still up to the prompt callers (all consumers of nsIPromptService) to catch it. So ensuring compatibility would be ensuring all consumers can handle this specific exception. Prompter.jsm may already throw for "Prompt aborted by user" (for example upon navigation), so I wouldn't expect that much breakage.
Landing patch D107619 sounds good to me and I think it can close this bug, since we're concerned with shutdown hangs here.
Assignee | ||
Comment 17•4 years ago
|
||
(In reply to Paul Zühlcke [:pbz] from comment #16)
If we don't throw an error the prompt return value will fall back to the default. For example if a browser component shows a confirmation prompt, upon shutdown it would be closed and the result would be "cancel" (not accept). I think that's a reasonable approach and I don't expect any behavior change switching from
spinEventLoopUntilOrShutdown
toSpinEventLoopUntilOrQuit
.
That sounds like a reasonable flow also to me, and probably we should not raise an error then and I can remove the XXX comments.
Landing patch D107619 sounds good to me and I think it can close this bug, since we're concerned with shutdown hangs here.
OK, obviously first I need an r+.
Comment 18•4 years ago
|
||
Comment 19•4 years ago
|
||
bugherder |
Assignee | ||
Updated•3 years ago
|
Description
•