Intermittent Assertion failure: false (ClearOnShutdown for phase that already was cleared), at /builds/worker/checkouts/gecko/xpcom/base/ClearOnShutdown.cpp:20
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
People
(Reporter: intermittent-bug-filer, Assigned: jstutte)
References
Details
(Keywords: assertion, intermittent-failure)
Attachments
(2 files, 1 obsolete file)
Filed by: malexandru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=338567772&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/KP3aTPQOSQqESyjzLKEDKA/runs/0/artifacts/public/logs/live_backing.log
[task 2021-05-03T19:00:29.410Z] 19:00:29 INFO - TEST-OK | docshell/test/mochitest/test_bug529119-1.html | took 1897ms
[task 2021-05-03T19:00:29.439Z] 19:00:29 INFO - GECKO(4429) | [Child 4433, Main Thread] WARNING: early callback, or time went backwards: '!aAllowIdleDispatch', file /builds/worker/checkouts/gecko/xpcom/threads/IdleTaskRunner.cpp:179
[task 2021-05-03T19:00:29.448Z] 19:00:29 INFO - TEST-START | docshell/test/mochitest/test_bug529119-2.html
[task 2021-05-03T19:00:29.457Z] 19:00:29 INFO - GECKO(4429) | [Child 4433, Main Thread] WARNING: early callback, or time went backwards: '!aAllowIdleDispatch', file /builds/worker/checkouts/gecko/xpcom/threads/IdleTaskRunner.cpp:179
[task 2021-05-03T19:00:29.469Z] 19:00:29 INFO - GECKO(4429) | [Child 4433, Main Thread] WARNING: early callback, or time went backwards: '!aAllowIdleDispatch', file /builds/worker/checkouts/gecko/xpcom/threads/IdleTaskRunner.cpp:179
[task 2021-05-03T19:00:29.479Z] 19:00:29 INFO - GECKO(4429) | [Child 4433, Main Thread] WARNING: early callback, or time went backwards: '!aAllowIdleDispatch', file /builds/worker/checkouts/gecko/xpcom/threads/IdleTaskRunner.cpp:179
[task 2021-05-03T19:00:29.489Z] 19:00:29 INFO - GECKO(4429) | [Child 4433, Main Thread] WARNING: early callback, or time went backwards: '!aAllowIdleDispatch', file /builds/worker/checkouts/gecko/xpcom/threads/IdleTaskRunner.cpp:179
[task 2021-05-03T19:00:29.605Z] 19:00:29 INFO - GECKO(4429) | [Child 4478, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /builds/worker/checkouts/gecko/caps/BasePrincipal.cpp:111
[task 2021-05-03T19:00:29.612Z] 19:00:29 INFO - GECKO(4429) | [Child 4478, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /builds/worker/checkouts/gecko/caps/BasePrincipal.cpp:111
[task 2021-05-03T19:00:29.637Z] 19:00:29 INFO - GECKO(4429) | [Child 4478, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3357
[task 2021-05-03T19:00:29.637Z] 19:00:29 INFO - GECKO(4429) | [Child 4478, Main Thread] WARNING: '!obs', file /builds/worker/checkouts/gecko/toolkit/components/sessionstore/RestoreTabContentObserver.cpp:58
[task 2021-05-03T19:00:29.649Z] 19:00:29 INFO - GECKO(4429) | [Child 4478, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4442
[task 2021-05-03T19:00:29.833Z] 19:00:29 INFO - GECKO(4429) | [Child 4481, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /builds/worker/checkouts/gecko/caps/BasePrincipal.cpp:1349
[task 2021-05-03T19:00:29.845Z] 19:00:29 INFO - GECKO(4429) | [Child 4481, Main Thread] WARNING: nsAppShell::Exit() called redundantly: file /builds/worker/checkouts/gecko/widget/cocoa/nsAppShell.mm:751
[task 2021-05-03T19:00:29.846Z] 19:00:29 INFO - GECKO(4429) | ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2021-05-03T19:00:29.848Z] 19:00:29 INFO - GECKO(4429) | [Child 4481, Main Thread] ###!!! ASSERTION: Using observer service after XPCOM shutdown!: 'Error', file /builds/worker/checkouts/gecko/xpcom/ds/nsObserverService.cpp:176
[task 2021-05-03T19:00:29.853Z] 19:00:29 INFO - Initializing stack-fixing for the first stack frame, this may take a while...
[task 2021-05-03T19:00:38.134Z] 19:00:38 INFO - GECKO(4429) | #01: NS_DebugBreak [xpcom/base/nsDebugImpl.cpp:431]
[task 2021-05-03T19:00:38.135Z] 19:00:38 INFO - GECKO(4429) | #02: nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) [xpcom/ds/nsObserverService.cpp:274]
[task 2021-05-03T19:00:38.135Z] 19:00:38 INFO - GECKO(4429) | #03: mozilla::dom::Document::DispatchContentLoadedEvents() [dom/base/Document.cpp:7657]
[task 2021-05-03T19:00:38.136Z] 19:00:38 INFO - GECKO(4429) | #04: mozilla::detail::RunnableMethodImpl<mozilla::dom::Document*, void (mozilla::dom::Document::*)(), true, (mozilla::RunnableKind)0, >::Run() [xpcom/threads/nsThreadUtils.h:1204]
[task 2021-05-03T19:00:38.136Z] 19:00:38 INFO - GECKO(4429) | #05: mozilla::SchedulerGroup::Runnable::Run() [xpcom/threads/SchedulerGroup.cpp:143]
[task 2021-05-03T19:00:38.136Z] 19:00:38 INFO - GECKO(4429) | #06: mozilla::RunnableTask::Run() [xpcom/threads/TaskController.cpp:483]
[task 2021-05-03T19:00:38.137Z] 19:00:38 INFO - GECKO(4429) | #07: mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) [xpcom/threads/TaskController.cpp:766]
[task 2021-05-03T19:00:38.137Z] 19:00:38 INFO - GECKO(4429) | #08: mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) [xpcom/threads/TaskController.cpp:621]
[task 2021-05-03T19:00:38.138Z] 19:00:38 INFO - GECKO(4429) | #09: mozilla::TaskController::ProcessPendingMTTask(bool) [xpcom/threads/TaskController.cpp:405]
[task 2021-05-03T19:00:38.138Z] 19:00:38 INFO - GECKO(4429) | #10: mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() [xpcom/threads/nsThreadUtils.h:535]
[task 2021-05-03T19:00:38.139Z] 19:00:38 INFO - GECKO(4429) | #11: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1163]
[task 2021-05-03T19:00:38.139Z] 19:00:38 INFO - GECKO(4429) | #12: NS_ProcessPendingEvents(nsIThread*, unsigned int) [xpcom/threads/nsThreadUtils.cpp:496]
[task 2021-05-03T19:00:38.139Z] 19:00:38 INFO - GECKO(4429) | #13: nsThreadManager::Shutdown() [xpcom/threads/nsThreadManager.cpp:372]
[task 2021-05-03T19:00:38.140Z] 19:00:38 INFO - GECKO(4429) | #14: mozilla::ShutdownXPCOM(nsIServiceManager*) [xpcom/build/XPCOMInit.cpp:657]
[task 2021-05-03T19:00:38.140Z] 19:00:38 INFO - GECKO(4429) | #15: XRE_TermEmbedding() [toolkit/xre/nsEmbedFunctions.cpp:216]
[task 2021-05-03T19:00:38.140Z] 19:00:38 INFO - GECKO(4429) | #16: mozilla::ipc::ScopedXREEmbed::Stop() [ipc/glue/ScopedXREEmbed.cpp:91]
[task 2021-05-03T19:00:38.141Z] 19:00:38 INFO - GECKO(4429) | #17: XRE_InitChildProcess(int, char**, XREChildData const*) [toolkit/xre/nsEmbedFunctions.cpp:748]
[task 2021-05-03T19:00:38.141Z] 19:00:38 INFO - GECKO(4429) | #18: main [ipc/app/MozillaRuntimeMain.cpp:72]
[task 2021-05-03T19:00:38.142Z] 19:00:38 INFO - GECKO(4429) | Assertion failure: false (ClearOnShutdown for phase that already was cleared), at /builds/worker/checkouts/gecko/xpcom/base/ClearOnShutdown.cpp:20
[task 2021-05-03T19:00:38.142Z] 19:00:38 INFO - GECKO(4429) | #01: mozilla::ClearOnShutdown_Internal::InsertIntoShutdownList(mozilla::ClearOnShutdown_Internal::ShutdownObserver*, mozilla::ShutdownPhase) [xpcom/base/ClearOnShutdown.cpp:20]
[task 2021-05-03T19:00:38.142Z] 19:00:38 INFO - GECKO(4429) | #02: mozilla::dom::ServiceWorkerManager::GetInstance() [dom/serviceworkers/ServiceWorkerManager.cpp:1623]
[task 2021-05-03T19:00:38.143Z] 19:00:38 INFO - GECKO(4429) | #03: mozilla::dom::Document::DispatchContentLoadedEvents() [dom/base/Document.cpp:7760]
[task 2021-05-03T19:00:38.143Z] 19:00:38 INFO - GECKO(4429) | #04: mozilla::detail::RunnableMethodImpl<mozilla::dom::Document*, void (mozilla::dom::Document::*)(), true, (mozilla::RunnableKind)0, >::Run() [xpcom/threads/nsThreadUtils.h:1204]
[task 2021-05-03T19:00:38.144Z] 19:00:38 INFO - GECKO(4429) | #05: mozilla::SchedulerGroup::Runnable::Run() [xpcom/threads/SchedulerGroup.cpp:143]
[task 2021-05-03T19:00:38.144Z] 19:00:38 INFO - GECKO(4429) | #06: mozilla::RunnableTask::Run() [xpcom/threads/TaskController.cpp:483]
[task 2021-05-03T19:00:38.144Z] 19:00:38 INFO - GECKO(4429) | #07: mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) [xpcom/threads/TaskController.cpp:766]
[task 2021-05-03T19:00:38.145Z] 19:00:38 INFO - GECKO(4429) | #08: mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) [xpcom/threads/TaskController.cpp:621]
[task 2021-05-03T19:00:38.145Z] 19:00:38 INFO - GECKO(4429) | #09: mozilla::TaskController::ProcessPendingMTTask(bool) [xpcom/threads/TaskController.cpp:405]
[task 2021-05-03T19:00:38.146Z] 19:00:38 INFO - GECKO(4429) | #10: mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() [xpcom/threads/nsThreadUtils.h:535]
[task 2021-05-03T19:00:38.146Z] 19:00:38 INFO - GECKO(4429) | #11: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1163]
[task 2021-05-03T19:00:38.146Z] 19:00:38 INFO - GECKO(4429) | #12: NS_ProcessPendingEvents(nsIThread*, unsigned int) [xpcom/threads/nsThreadUtils.cpp:496]
[task 2021-05-03T19:00:38.147Z] 19:00:38 INFO - GECKO(4429) | #13: nsThreadManager::Shutdown() [xpcom/threads/nsThreadManager.cpp:372]
[task 2021-05-03T19:00:38.147Z] 19:00:38 INFO - GECKO(4429) | #14: mozilla::ShutdownXPCOM(nsIServiceManager*) [xpcom/build/XPCOMInit.cpp:657]
[task 2021-05-03T19:00:38.148Z] 19:00:38 INFO - GECKO(4429) | #15: XRE_TermEmbedding() [toolkit/xre/nsEmbedFunctions.cpp:216]
[task 2021-05-03T19:00:38.148Z] 19:00:38 INFO - GECKO(4429) | #16: mozilla::ipc::ScopedXREEmbed::Stop() [ipc/glue/ScopedXREEmbed.cpp:91]
[task 2021-05-03T19:00:38.149Z] 19:00:38 INFO - GECKO(4429) | #17: XRE_InitChildProcess(int, char**, XREChildData const*) [toolkit/xre/nsEmbedFunctions.cpp:748]
[task 2021-05-03T19:00:38.149Z] 19:00:38 INFO - GECKO(4429) | #18: main [ipc/app/MozillaRuntimeMain.cpp:72]
[task 2021-05-03T19:00:38.149Z] 19:00:38 INFO - GECKO(4429) | [Parent 4429, Unnamed thread 125018ed0] WARNING: Resource acquired is being released in non-LIFO order; why?
[task 2021-05-03T19:00:38.150Z] 19:00:38 INFO - GECKO(4429) | : file /builds/worker/checkouts/gecko/xpcom/threads/BlockingResourceBase.cpp:290
[task 2021-05-03T19:00:38.150Z] 19:00:38 INFO - GECKO(4429) | --- Mutex : dumpSafetyLock (currently acquired)
[task 2021-05-03T19:00:38.150Z] 19:00:38 INFO - GECKO(4429) | calling context
[task 2021-05-03T19:00:38.151Z] 19:00:38 INFO - GECKO(4429) | [stack trace unavailable]
[task 2021-05-03T19:00:38.151Z] 19:00:38 INFO - GECKO(4429) | [4488, MainThread] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/checkouts/gecko/xpcom/base/nsTraceRefcnt.cpp:204
[task 2021-05-03T19:00:38.152Z] 19:00:38 INFO - GECKO(4429) | [4488, MainThread] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/checkouts/gecko/xpcom/base/nsTraceRefcnt.cpp:204
[task 2021-05-03T19:00:38.152Z] 19:00:38 INFO - GECKO(4429) | ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to /var/folders/ds/9_p3r7s95p7_1rx9lt0_5wg0000014/T/tmppfg2c9j5.mozrunner/runtests_leaks_tab_pid4488.log
[task 2021-05-03T19:00:38.152Z] 19:00:38 INFO - GECKO(4429) | [4488, MainThread] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/checkouts/gecko/xpcom/base/nsTraceRefcnt.cpp:204
[task 2021-05-03T19:00:38.153Z] 19:00:38 INFO - GECKO(4429) | [4488, MainThread] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/checkouts/gecko/xpcom/base/nsTraceRefcnt.cpp:204
[task 2021-05-03T19:00:38.153Z] 19:00:38 INFO - GECKO(4429) | [4488, Main Thread] WARNING: XPCOM_MEM_BLOAT_LOG is set, disabling native allocations.: file /builds/worker/checkouts/gecko/tools/profiler/core/platform.cpp:254
[task 2021-05-03T19:00:38.154Z] 19:00:38 INFO - GECKO(4429) | MEMORY STAT | vsize 6745MB | residentFast 105MB | heapAllocated 14MB
[task 2021-05-03T19:00:38.154Z] 19:00:38 INFO - Buffered messages logged at 19:00:29
[task 2021-05-03T19:00:38.155Z] 19:00:38 INFO - TEST-PASS | docshell/test/mochitest/test_bug529119-2.html | A valid string reason is expected
[task 2021-05-03T19:00:38.155Z] 19:00:38 INFO - TEST-PASS | docshell/test/mochitest/test_bug529119-2.html | Reason cannot be empty
[task 2021-05-03T19:00:38.155Z] 19:00:38 INFO - TEST-FAIL | docshell/test/mochitest/test_bug529119-2.html | The author of the test has indicated that flaky timeouts are expected. Reason: untriaged
[task 2021-05-03T19:00:38.156Z] 19:00:38 INFO - TEST-PASS | docshell/test/mochitest/test_bug529119-2.html | Waiting for error page succeeded
[task 2021-05-03T19:00:38.156Z] 19:00:38 INFO - TEST-FAIL | docshell/test/mochitest/test_bug529119-2.html | The author of the test has indicated that flaky timeouts are expected. Reason: untriaged
[task 2021-05-03T19:00:38.157Z] 19:00:38 INFO - Buffered messages logged at 19:00:30
[task 2021-05-03T19:00:38.157Z] 19:00:38 INFO - TEST-PASS | docshell/test/mochitest/test_bug529119-2.html | Waiting for original page succeeded
[task 2021-05-03T19:00:38.157Z] 19:00:38 INFO - TEST-PASS | docshell/test/mochitest/test_bug529119-2.html | Is on the previous page
[task 2021-05-03T19:00:38.158Z] 19:00:38 INFO - Buffered messages finished
[task 2021-05-03T19:00:38.158Z] 19:00:38 INFO - TEST-UNEXPECTED-ERROR | docshell/test/mochitest/test_bug529119-2.html | unexpected-crash-dump-found - This test left crash dumps behind, but we weren't expecting it to!
[task 2021-05-03T19:00:38.159Z] 19:00:38 INFO - Found unexpected crash dump file /var/folders/ds/9_p3r7s95p7_1rx9lt0_5wg0000014/T/tmppfg2c9j5.mozrunner/minidumps/9E446040-33D9-44D5-AE13-CFC962C5C4AC.dmp.
[task 2021-05-03T19:00:38.159Z] 19:00:38 INFO - TEST-UNEXPECTED-CRASH | docshell/test/mochitest/test_bug529119-2.html | Finished in 1553ms
[task 2021-05-03T19:00:38.159Z] 19:00:38 INFO - {'runtime': 1553}
[task 2021-05-03T19:00:38.159Z] 19:00:38 INFO - TEST-INFO took 1553ms
[task 2021-05-03T19:00:38.160Z] 19:00:38 INFO - TEST-START | docshell/test/mochitest/test_bug530396.html```
Comment 1•4 years ago
|
||
It looks like Document::DispatchContentLoadedEvents() is running very late in shutdown.
Assignee | ||
Comment 2•4 years ago
|
||
Yeah, and deeper inside in the event processing we try to create a new service worker registrar which then fails during resurrection, despite the comment in the beginning saying, we want to avoid resurrection.
To my feeling, we should improve and use AppShutdown::IsShuttingDown()
more often. I see two ToDos here:
-
The current implementation of
AppShutdown::IsShuttingDown()
relies onAppShutdown::OnShutdownConfirmed();
being called. Instead it should probably just check, ifAppShutdown::GetCurrentShutdownPhase() >= ShutdownPhase::AppShutdownConfirmed
. -
Introduce more checks for shutdown in the following places:
- generation of new global objects (like the service worker registrar here)
- dispatching of particular events, throwing
NS_ERROR_ILLEGAL_DURING_SHUTDOWN
A more advanced architectural change could be, to associate to each event type a latest possible shutdown phase for this event and check that directly during dispatch (with an assert in debug and an if in release). The default would be again ShutdownPhase::AppShutdownConfirmed
.
Comment 3•4 years ago
|
||
I think this is https://bugzilla.mozilla.org/show_bug.cgi?id=1697972#c13 where I found that I think this set of symptoms is happening because a hidden window is being accidentally created that races xpcshell shutdown on windows and this starts up the ServiceWorkerManager. Fixing that fell off my radar but is likely the right fix to fix a whole class of this problem.
Updated•4 years ago
|
Comment 4•4 years ago
|
||
That said, I do agree that ServiceWorkerManager and other singleton getters should check the shutdown phase and return an error when in shutdown. And potentially we should make sure xpcshell's shutdown routine makes sure that the shutdown checks correctly indicate this.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 9•3 years ago
|
||
We should trace a path forward here (at least in terms of todos/bugs).
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 14•3 years ago
|
||
The high frequency failure that started a few hours ago with the same assertion but a different stack was handled in bug 1753598.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 22•2 years ago
|
||
(In reply to Intermittent Failures Robot from comment #21)
4 failures in 3497 pushes (0.001 failures/push) were associated with this bug in the last 7 days.
Repository breakdown:
- autoland: 2
- mozilla-central: 2
Platform and build breakdown:
- windows10-32-2004-qr: 4
- debug: 4
For more details, see:
https://treeherder.mozilla.org/intermittent-failures/bugdetails?bug=1709184&startday=2022-07-18&endday=2022-07-24&tree=all
At least one instance here seems to be caused by a late initialization of MCSInfo
. This singleton has been introduced by bug 1766307.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 27•2 years ago
|
||
Remove XPCOM shutdown listener from MCSInfo to remove possibility of the listener starting after the shutdown process has already begun.
Updated•2 years ago
|
Comment 28•2 years ago
|
||
Comment on attachment 9287599 [details]
Bug 1709184 - Intermittent Assertion failure: false (ClearOnShutdown for phase that already was cleared), at /builds/worker/checkouts/gecko/xpcom/base/ClearOnShutdown.cpp:20
Beta/Release Uplift Approval Request
- User impact if declined: None
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Small patch that simplifies deallocation logic + adds several additional asserts. Treeherder link showing failure reduction: https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2022-07-17&endday=2022-07-28&tree=all&bug=1709184
- String changes made/needed:
- Is Android affected?: Unknown
Comment 29•2 years ago
|
||
Moved the patch over from Bug 1781259 as it fixes this issue.
Comment 30•2 years ago
|
||
Would like to update the patch's commit message after moving it over, but Phabricator isn't cooperating and I have to call it a night. I'll check back tomorrow + plan to update it to be:
Remove XPCOM shutdown listener from MCSInfo to remove possibility of the listener starting after the shutdown process has already begun.
Thanks!
Assignee | ||
Comment 31•2 years ago
|
||
(In reply to az [:az] from comment #29)
Moved the patch over from Bug 1781259 as it fixes this issue.
As shown by the past, there might be different causes for this assertion, not just one (though this is probably the most frequent currently). I'd prefer in these cases to create new bugs for specific fixes and make this bug depend on them.
Comment 32•2 years ago
|
||
Reopening as this will be fixed in Bug 1782313
Comment hidden (Intermittent Failures Robot) |
Comment 35•2 years ago
|
||
Comment on attachment 9287599 [details]
Bug 1709184 - Intermittent Assertion failure: false (ClearOnShutdown for phase that already was cleared), at /builds/worker/checkouts/gecko/xpcom/base/ClearOnShutdown.cpp:20
Revision D152954 was moved to bug 1782313. Setting attachment 9287599 [details] to obsolete.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 41•2 years ago
|
||
These stopped failing beginning with this merge https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=cbd753d186199d816e1d097631573f601932b96e
TH link.
Comment hidden (Intermittent Failures Robot) |
Updated•2 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 45•2 years ago
|
||
There seem to be some cases where we dispatch a RunOnShutdown
to the main thread well before we have entered shutdown but end up executing that runnable only after we reached XPCOMShutdown
.
We can either check inside the lambda of the "MCSInfo::MCSInfo" runnable if we are in shutdown (and just do the deletion in case) or we might want to check if this assert is actually needed or if the release behavior isn't always the right behavior. Or we want to have a thread-safe InsertIntoShutdownList
such that we can avoid the thread hopping just for inserting?
Comment 46•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #45)
There seem to be some cases where we dispatch a
RunOnShutdown
to the main thread well before we have entered shutdown but end up executing that runnable only after we reachedXPCOMShutdown
.We can either check inside the lambda of the "MCSInfo::MCSInfo" runnable if we are in shutdown (and just do the deletion in case) or we might want to check if this assert is actually needed or if the release behavior isn't always the right behavior. Or we want to have a thread-safe
InsertIntoShutdownList
such that we can avoid the thread hopping just for inserting?
The assertion is needed, unfortunately. Without it, we could run into all sorts of undefined behavior when things run too late in shutdown. The most common manifestation, though, is that someone calls a getter that calls ClearOnShutdown
after the target shutdown phase, ClearOnShutdown
immediately clears the target, the getter returns null, and we then null pointer crash instead of asserting. The assertion is mainly there to make it clearer where the failure really is, rather than trying to work backward from a null pointer crash.
For this particular case, I think the solution may be to call NS_ProcessPendingEvents
before advancing to the XPCOMShutdown
phase like we do before progressing to later shutdown phases. That way any runnable dispatched before XPCOMShutdown
will be guaranteed to run in time to safely register its shutdown observers.
That said... Are you sure that runnable is actually guaranteed to be dispatched before shutdown has started? Because, without a shutdown blocker, you generally can't guarantee that for background thread code. And I don't see a shutdown blocker anywhere in that code.
Assignee | ||
Comment 47•2 years ago
|
||
(In reply to Kris Maglione [:kmag] from comment #46)
The assertion is needed, unfortunately. Without it, we could run into all sorts of undefined behavior when things run too late in shutdown. The most common manifestation, though, is that someone calls a getter that calls
ClearOnShutdown
after the target shutdown phase,ClearOnShutdown
immediately clears the target, the getter returns null, and we then null pointer crash instead of asserting. The assertion is mainly there to make it clearer where the failure really is, rather than trying to work backward from a null pointer crash.
It seems to me that in release builds InsertIntoShutdownList
(used by RunOnShutdown
) handles this by just calling aObserver->Shutdown();
immediately. Obviously this can still have any sorts of unexpected side-effects, so I think the intention of that assert is generally right.
For this particular case, I think the solution may be to call
NS_ProcessPendingEvents
before advancing to theXPCOMShutdown
phase like we do before progressing to later shutdown phases. That way any runnable dispatched beforeXPCOMShutdown
will be guaranteed to run in time to safely register its shutdown observers.
Thanks, that seems like something we should do, yes. Probably also before XPCOMWillShutdown
.
That said... Are you sure that runnable is actually guaranteed to be dispatched before shutdown has started? Because, without a shutdown blocker, you generally can't guarantee that for background thread code. And I don't see a shutdown blocker anywhere in that code.
At least MCSInfo::GetInstance
does check AppShutdown::IsInOrBeyond(ShutdownPhase::AppShutdownConfirmed)
at its beginning, but as we seem to be off-main-thread here that can race right afterwards with whatever is happening on the main thread? This would be probably a candidate singleton for something like bug 1796566 (and gives me some input for the thread safety questions I have there). Instantiating singletons and thus adding a blocker for them should be protected by a mutex that prevents us from advancing shutdown phases in parallel in such situations, I think. But in the meantime let's try with processing the pending events before each AdvanceShutdownPhase
to improve the situation here (and potentially for other off-main-thread instantiated singletons we already have and that follow the same pattern).
Assignee | ||
Comment 48•2 years ago
|
||
Updated•2 years ago
|
Assignee | ||
Comment 49•2 years ago
|
||
Depends on D160176
Comment 50•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #47)
It seems to me that in release builds
InsertIntoShutdownList
(used byRunOnShutdown
) handles this by just callingaObserver->Shutdown();
immediately. Obviously this can still have any sorts of unexpected side-effects, so I think the intention of that assert is generally right.
Yes, that's what I mean. The typical pattern is:
static StaticRefPtr<Thing> sThing;
if (!sThing) {
sThing = new Thing();
ClearOnShutdown(&sThing);
}
return sThing;
Then the shutdown callback runs immediately, the function returns null, and the caller crashes on the null deref. But there are often worse problems from just calling the constructor during shutdown, since it often winds up reinitializing things which have already been shutdown, or calling into code which isn't designed to work during shutdown.
That said... Are you sure that runnable is actually guaranteed to be dispatched before shutdown has started? Because, without a shutdown blocker, you generally can't guarantee that for background thread code. And I don't see a shutdown blocker anywhere in that code.
At least
MCSInfo::GetInstance
does checkAppShutdown::IsInOrBeyond(ShutdownPhase::AppShutdownConfirmed)
at its beginning, but as we seem to be off-main-thread here that can race right afterwards with whatever is happening on the main thread?
Yeah, technically that could race, but I guess it's unlikely enough that it probably isn't especially relevant to the assertion. Processing the pending events is likely enough to at least make the intermittent go away.
Assignee | ||
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Updated•2 years ago
|
Comment 52•2 years ago
|
||
Comment 53•2 years ago
|
||
Backed out for xpcshell failures.
Assignee | ||
Comment 54•2 years ago
|
||
(In reply to Natalia Csoregi [:nataliaCs] from comment #53)
* [SUMMARY: AddressSanitizer: SEGV /builds/worker/checkouts/gecko/xpcom/base/nsISupportsImpl.cpp:53:12 in nsAutoOwningEventTarget](https://treeherder.mozilla.org/logviewer?job_id=396846889&repo=autoland&lineNumber=3527)
This one is interesting. IIUC we want to shut down the component manager and in doing so we want to create an AutoSafeJSContext
that ends up trying to assign something to mUnprivilegedJunkScope
.
This wants to create a WeakReference
and the constructor does a MOZ_WEAKPTR_INIT_THREAD_SAFETY_CHECK
which wants to create a nsAutoOwningEventTarget
that will find GetCurrentSerialEventTarget()
being nullptr
.
In terms what the patch might have changed here: Indeed we removed this NS_ProcessPendingEvents(thread);
which inside AdvanceShutdownPhase
is only executed, if we had to notify our observers - which is unlikely here, as we just killed the observer service a few lines above.
This would be repaired by D160628, I think, so landing the entire stack might be the better call here.
Comment 55•2 years ago
|
||
Comment 56•2 years ago
|
||
Backed out 3 changesets (Bug 1768581, Bug 1709184) for causing multiple failures on LateWriteChecks.cpp.
Backout link
Push with failures <--> gpu
Failure Log
Also R1 Failure Log, Also R2 Failure Log, ...
Comment 58•2 years ago
|
||
Comment 59•2 years ago
|
||
Backed out for causing xpcshell failures
- Backout link
- Push with failures
- Failure Log
- Failure line: TEST-UNEXPECTED-FAIL | extensions/pref/autoconfig/test/unit/test_autoconfig.js | xpcshell return code: -6
SUMMARY: ThreadSanitizer: SEGV /builds/worker/checkouts/gecko/xpcom/base/nsISupportsImpl.cpp:53:12 in nsAutoOwningEventTarget
TEST-UNEXPECTED-FAIL | extensions/pref/autoconfig/test/unit/test_autoconfig_nonascii.js | xpcshell return code: -6
SUMMARY: ThreadSanitizer: SEGV /builds/worker/checkouts/gecko/xpcom/base/nsISupportsImpl.cpp:53:12 in nsAutoOwningEventTarget
TEST-UNEXPECTED-FAIL | extensions/pref/autoconfig/test/unit/test_autoconfig_no_sandbox.js | xpcshell return code: -6
SUMMARY: ThreadSanitizer: SEGV /builds/worker/checkouts/gecko/xpcom/base/nsISupportsImpl.cpp:53:12 in nsAutoOwningEventTarget
Can you also please take a look at this? https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&searchStr=android%2Cbrowsertime&revision=bd3bc1478107b9a72ce5dded92b7821e87a0e052&selectedTaskRun=TQNO9jTxSQ2KK0ouVXURHA.0
Comment 60•2 years ago
|
||
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Comment 61•2 years ago
|
||
Comment 62•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/9dcaa7753fc6
https://hg.mozilla.org/mozilla-central/rev/dffbc74bbaa9
Updated•2 years ago
|
Description
•