Closed Bug 1379081 Opened 7 years ago Closed 5 years ago

Intermittent test_tabbar_session_restore_button.py TestTabbarSessionRestoreButton.test_click_restore | application crashed [@ mozilla::(anonymous namespace)::RunWatchdog(void*)]

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED INCOMPLETE
Tracking Status
firefox56 --- affected
firefox57 --- affected

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(4 keywords)

This is actually a hang during shutdown of the application. I see a couple of warnings/errors and a final assertion: 19:02:39 INFO - 1499392959713 Marionette TRACE 2 -> [0,174,"quit",{"flags":["eAttemptQuit"]}] 19:02:39 INFO - 1499392959716 Marionette INFO New connections will no longer be accepted 19:02:39 INFO - [Parent 1612] WARNING: NS_ENSURE_TRUE(weak) failed: file /home/worker/workspace/build/src/dom/base/nsFrameMessageManager.cpp, line 343 19:02:40 INFO - --DOCSHELL 0x12527f000 == 8 [pid = 1612] [id = {51337ebd-ca6a-104e-b30e-499164194a11}] 19:02:40 INFO - 1499392960284 Marionette TRACE 2 <- [1,174,null,{"cause":"shutdown"}] 19:02:40 INFO - [Parent 1612] WARNING: cannot post event if not initialized: file /home/worker/workspace/build/src/netwerk/protocol/http/nsHttpConnectionMgr.cpp, line 290 19:02:40 INFO - [Parent 1612] WARNING: NS_ENSURE_TRUE(weak) failed: file /home/worker/workspace/build/src/dom/base/nsFrameMessageManager.cpp, line 343 19:02:40 INFO - 1499392960355 Marionette DEBUG Closed connection 2 19:02:40 INFO - [Child 1613] WARNING: nsAppShell::Exit() called redundantly: file /home/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 690 19:02:40 INFO - ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost 19:02:40 INFO - [Child 1613] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp, line 1040 19:02:40 INFO - [Child 1614] WARNING: nsAppShell::Exit() called redundantly: file /home/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 690 19:02:40 INFO - ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost 19:02:40 INFO - [Child 1614] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp, line 1040 19:02:40 INFO - [Child 1615] WARNING: nsAppShell::Exit() called redundantly: file /home/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 690 19:02:40 INFO - [Child 1615] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp, line 1040 19:02:40 INFO - [Child 1613] WARNING: '!mMainThread', file /home/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 330 [..] 19:02:40 INFO - Assertion failure: 0 == rv, at /home/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:870 19:02:40 INFO - Redirecting call to abort() to mozalloc_abort 19:02:40 INFO - Hit MOZ_CRASH() at /home/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33 19:02:40 INFO - nsStringStats 19:02:40 INFO - => mAllocCount: 18517 19:02:40 INFO - => mReallocCount: 344 19:02:40 INFO - => mFreeCount: 18517 19:02:40 INFO - => mShareCount: 15093 19:02:40 INFO - => mAdoptCount: 1967 19:02:40 INFO - Assertion failure: 0 == rv, at /home/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:292 19:02:40 INFO - Redirecting call to abort() to mozalloc_abort 19:02:40 INFO - Hit MOZ_CRASH() at /home/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33 [..] 19:02:41 INFO - *** UTM:SVC TimerManager:registerTimer called after profile-before-change notification. Ignoring timer registration for id: telemetry_modules_ping 19:02:43 INFO - --DOCSHELL 0x118aaa800 == 7 [pid = 1612] [id = {4aff02b3-457b-d14f-8ed3-9f932689ed2e}] 19:02:43 INFO - --DOCSHELL 0x126b71800 == 6 [pid = 1612] [id = {73ba26ce-d4f9-fb45-8d34-ee8c815103e3}] 19:02:43 INFO - 1499392963204 Marionette DEBUG Received observer notification "xpcom-shutdown" 19:02:43 INFO - 1499392963205 Marionette DEBUG Received observer notification "xpcom-shutdown" 19:02:43 INFO - [Parent 1612] WARNING: nsAppShell::Exit() called redundantly: file /home/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 690 19:02:43 INFO - [Parent 1612] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp, line 1040 19:02:44 INFO - [Parent 1612] WARNING: '!mMainThread', file /home/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 330 19:02:44 INFO - [Parent 1612] WARNING: '!mMainThread', file /home/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 330 Benjamin, is that information helpful to see what could have been caused this hang?
Component: Marionette → Startup and Profile System
Flags: needinfo?(benjamin)
Keywords: assertion, hang
Product: Testing → Toolkit
Version: Version 3 → unspecified
Without stacks this is mostly meaningless. Automation should be providing stacks from assertions (typically because they save minidumps). For the hang itself, if the harness is killing the process after a hang it should take a minidump before it kills it off and dump the stack.
Flags: needinfo?(benjamin)
Hm, we have a 65s timeout for the process to be killed. Isn't Firefox killed after 60s if a thread hangs? Has this timeout been increased? Or should we wait a bit longer similar to application update which has 70s set?
Flags: needinfo?(benjamin)
Don't count on the internal shutdown hang monitor to be reliable. Depending on the context, it might itself deadlock or otherwise fail. The harness should do a dump+kill procedure after some reasonable timeout.
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #4) > Don't count on the internal shutdown hang monitor to be reliable. Depending > on the context, it might itself deadlock or otherwise fail. The harness > should do a dump+kill procedure after some reasonable timeout. Which timeout would you find reasonable? I don't think it would hurt to increase it to somewhat 90 or even 120s.
Oh, I don't care about the actual value. It could be 30s for all I care. What I care about is the harness collecting a stack.
Ok, lets see if I can get to bug 1348961 soon.
Depends on: 1348961
Component: Startup and Profile System → Untriaged
Product: Toolkit → Core
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
The problem here is actually a too long shutdown which causes a crash: 01:18:42 INFO - Hit MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) at /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:160 01:18:42 INFO - #01: PR_GetThreadName[/Users/cltbld/tasks/task_1505808525/build/application/NightlyDebug.app/Contents/MacOS/libnss3.dylib +0x144389] 01:18:42 INFO - #02: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x405a] 01:18:42 INFO - #03: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x3fd7]
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Summary: Intermittent test_tabbar_session_restore_button.py TestTabbarSessionRestoreButton.test_click_restore | IOError: Process killed because a requested application quit did not happen within 65s. Check gecko.log for errors. → Intermittent test_tabbar_session_restore_button.py TestTabbarSessionRestoreButton.test_click_restore | application crashed [@ mozilla::(anonymous namespace)::RunWatchdog(void*)]
Moving from Core::Untriaged to Core::General https://bugzilla.mozilla.org/show_bug.cgi?id=1407598
Component: Untriaged → General
Status: REOPENED → RESOLVED
Closed: 7 years ago5 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.