Closed
Bug 1396803
Opened 7 years ago
Closed 6 years ago
Intermittent test_tabbar_session_restore_button.py TestNoTabbarSessionRestoreButton.test_pref_off_button_does_not_show | application crashed [@ mozilla::(anonymous namespace)::RunWatchdog(void*)]
Categories
(Testing :: Firefox UI Tests, defect, P5)
Tracking
(firefox57 disabled, firefox58 disabled)
RESOLVED
INVALID
People
(Reporter: intermittent-bug-filer, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: crash, intermittent-failure, Whiteboard: [stockwell disabled])
Crash Data
Comment 1•7 years ago
|
||
So there is a MOZCRASH assertion first which triggers a shutdown of Firefox:
19:12:41 INFO - Hit MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) at /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:160
19:12:41 INFO - #01: PR_GetThreadName[/Users/cltbld/tasks/task_1504400945/build/application/NightlyDebug.app/Contents/MacOS/libnss3.dylib +0x142359]
19:12:41 INFO - #02: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x405a]
19:12:41 INFO - #03: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x3fd7]
But then the shutdown takes that long that Firefox gets killed:
19:14:01 INFO - PROCESS-CRASH | test_tabbar_session_restore_button.py TestNoTabbarSessionRestoreButton.test_pref_off_button_does_not_show | application crashed [@ mozilla::(anonymous namespace)::RunWatchdog(void*)]
19:14:01 INFO - Crash dump filename: /Users/cltbld/tasks/task_1504400945/build/tmppoVHnH.mozrunner-1504404463/minidumps/FB125D99-01D5-488B-ABD7-F971BE297EF7.dmp
19:14:01 INFO - Operating system: Mac OS X
19:14:01 INFO - 10.10.5 14F27
19:14:01 INFO - CPU: amd64
19:14:01 INFO - family 6 model 69 stepping 1
19:14:01 INFO - 4 CPUs
19:14:01 INFO -
19:14:01 INFO - GPU: UNKNOWN
19:14:01 INFO -
19:14:01 INFO - Crash reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
19:14:01 INFO - Crash address: 0x0
19:14:01 INFO - Process uptime: 298 seconds
19:14:01 INFO -
19:14:01 INFO - Thread 6 (crashed)
19:14:01 INFO - 0 XUL!mozilla::(anonymous namespace)::RunWatchdog(void*) [nsTerminator.cpp:4147186b278a : 160 + 0x0]
19:14:01 INFO - rax = 0x0000000000000000 rdx = 0x00007fff7435f1f8
19:14:01 INFO - rcx = 0x0000000000000000 rbx = 0x000000000000003f
19:14:01 INFO - rsi = 0x00049b0000049b00 rdi = 0x00049a0000049b03
19:14:01 INFO - rbp = 0x000000012ed80ec0 rsp = 0x000000012ed80eb0
19:14:01 INFO - r8 = 0x000000012ed80e60 r9 = 0x000000012ed81000
19:14:01 INFO - r10 = 0x00007fff849953ef r11 = 0x00007fff849953c0
19:14:01 INFO - r12 = 0x0000000000022a27 r13 = 0x00000000000008ff
19:14:01 INFO - r14 = 0x000000011b744ff0 r15 = 0x0000000000000001
19:14:01 INFO - rip = 0x000000010fc62130
19:14:01 INFO - Found by: given as instruction pointer in context
19:14:01 INFO - 1 libnss3.dylib!_pt_root [ptthread.c:4147186b278a : 216 + 0x3]
19:14:01 INFO - rbx = 0x000000012925f4a0 rbp = 0x000000012ed80ef0
19:14:01 INFO - rsp = 0x000000012ed80ed0 r12 = 0x0000000000022a27
19:14:01 INFO - r13 = 0x00000000000008ff r14 = 0x000000012ed81000
19:14:01 INFO - r15 = 0x0000000000000001 rip = 0x000000010b442359
19:14:01 INFO - Found by: call frame info
19:14:01 INFO - 2 libsystem_pthread.dylib!_pthread_body + 0x83
19:14:01 INFO - rbx = 0x000000012ed81000 rbp = 0x000000012ed80f10
19:14:01 INFO - rsp = 0x000000012ed80f00 r12 = 0x0000000000022a27
19:14:01 INFO - r13 = 0x00000000000008ff r14 = 0x000000012925f4a0
19:14:01 INFO - r15 = 0x000000010b442240 rip = 0x00007fff8499805a
19:14:01 INFO - Found by: call frame info
19:14:01 INFO - 3 libsystem_pthread.dylib!_pthread_start + 0xb0
19:14:01 INFO - rbp = 0x000000012ed80f50 rsp = 0x000000012ed80f20
19:14:01 INFO - rip = 0x00007fff84997fd7
19:14:01 INFO - Found by: previous frame's frame pointer
19:14:01 INFO - 4 libsystem_pthread.dylib!thread_start + 0xd
19:14:01 INFO - rbp = 0x000000012ed80f78 rsp = 0x000000012ed80f60
19:14:01 INFO - rip = 0x00007fff849953ed
19:14:01 INFO - Found by: previous frame's frame pointer
19:14:01 INFO - 5 libnss3.dylib + 0x142240
19:14:01 INFO - rsp = 0x000000012ed81030 rip = 0x000000010b442240
19:14:01 INFO - Found by: stack scanning
Isn't the watchdog thread used to terminate hanging threads after ~60s? So is it hanging itself here?
I will leave this bug in fx-ui tests for now given the low number of crashes. I will take another look if the crash is spiking up.
Comment 2•7 years ago
|
||
Maybe this is related to bug 1395504, which doesn't let Firefox successfully shutdown.
Depends on: 1395504
Updated•7 years ago
|
Blocks: RunWatchdogShutdownhang
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 6•7 years ago
|
||
this started on september 26 as osx/debug failures for en-US local tests.
here is a recent log file:
https://public-artifacts.taskcluster.net/QkNSqMfwT8Sm446EvbEp6w/0/public/logs/live_backing.log
and data from the log:
14:58:54 INFO - Hit MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) at /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:160
14:58:55 INFO - #01: PR_GetThreadName[/Users/cltbld/tasks/task_1506462723/build/application/NightlyDebug.app/Contents/MacOS/libnss3.dylib +0x1443a9]
14:58:55 INFO - #02: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x405a]
14:58:55 INFO - #03: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x3fd7]
14:59:50 INFO - TEST-UNEXPECTED-ERROR | test_tabbar_session_restore_button.py TestNoTabbarSessionRestoreButton.test_pref_off_button_does_not_show | IOError: Process killed because a requested application quit did not happen within 120s. Check gecko.log for errors.
14:59:50 INFO - Traceback (most recent call last):
14:59:50 INFO - File "/Users/cltbld/tasks/task_1506462723/build/venv/lib/python2.7/site-packages/marionette_harness/marionette_test/testcases.py", line 137, in run
14:59:50 INFO - self.setUp()
14:59:50 INFO - File "/Users/cltbld/tasks/task_1506462723/build/tests/firefox-ui/tests/functional/sessionstore/test_tabbar_session_restore_button.py", line 99, in setUp
14:59:50 INFO - super(TestNoTabbarSessionRestoreButton, self).setUp(restore_button_pref=0)
14:59:50 INFO - File "/Users/cltbld/tasks/task_1506462723/build/tests/firefox-ui/tests/functional/sessionstore/test_tabbar_session_restore_button.py", line 24, in setUp
14:59:50 INFO - self.marionette.quit(in_app=True)
14:59:50 INFO - File "/Users/cltbld/tasks/task_1506462723/build/venv/lib/python2.7/site-packages/marionette_driver/decorators.py", line 23, in _
14:59:50 INFO - return func(*args, **kwargs)
14:59:50 INFO - File "/Users/cltbld/tasks/task_1506462723/
I am not familiar with the crash/error, possibly this is known already.
I did some backfilling/retriggers:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=osx%20debug%20en-us%20local&tochange=44c358447666fd00d38fe9b72724d194b874e682&fromchange=eab929170ab0898bc23d88509efef9f8b3817152
Ideally when those results come in we will have an idea of when this started and how frequent it is.
:whimboo, can you help prioritize this and ensure the root cause is documented.
Flags: needinfo?(hskupin)
Updated•7 years ago
|
Whiteboard: [stockwell needswork]
Comment 7•7 years ago
|
||
This crash happens because a shutdown happened longer than 120s! Usually this should not happen because the background monitor of Firefox should kill any thread itself after ~60s during shutdown. But that didn't happen here.
Maybe that is related to bug 1395504 and there was added on condition which also triggers it.
But checking recent logs I also see such kind of assertions:
11:42:52 INFO - Assertion failure: tree.empty(), at /builds/worker/workspace/build/src/js/src/ds/MemoryProtectionExceptionHandler.cpp:67
11:42:52 INFO - Assertion failure: 0 == rv, at /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:292
This brings me to bug 1404258, which was also filed recently due to the exact assertion error. Bobby removed the diagnostic code yesterday via bug 1403397. So in that case we should see a drop of failures today for that overlaying issue.
Depends on: 1404258
Flags: needinfo?(hskupin)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 11•7 years ago
|
||
Not sure whom exactly to ask about this crash during a long shutdown. But maybe Mike can give us a hint.
Flags: needinfo?(mconley)
Comment hidden (Intermittent Failures Robot) |
Comment 13•7 years ago
|
||
The test here is covering session restore. It's component might not be the right, but better than Testing/Marionette.
Comment 14•7 years ago
|
||
:whimboo any idea why this would be osx only?
Comment 15•7 years ago
|
||
It might be bug 1395504 here too which is blocking Firefox from shutting down correctly. It's already in the dependency list.
Comment hidden (Intermittent Failures Robot) |
Comment 17•7 years ago
|
||
If it helps, the spike started on Beta with this push:
https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?changeset=6e3b901aec51
Comment hidden (Intermittent Failures Robot) |
Comment 19•7 years ago
|
||
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a27d06c28634
Disable test_tabbar_session_restore_button.py TestNoTabbarSessionRestoreButton.test_pref_off_button_does_not_show on osx for frequent failures. r=me, a=test-only
Comment 20•7 years ago
|
||
please remember to enable this test when fixing the failure!
Keywords: leave-open
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Comment 21•7 years ago
|
||
status-firefox57:
--- → disabled
status-firefox58:
--- → disabled
Comment 22•7 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 25•7 years ago
|
||
no traction in bug 1395504, I pinged again there- we will be at danger of disable-recommended by the end of the month.
Comment 26•7 years ago
|
||
in looking at this again, it looks fixed on oct 12th when I disabled the test- my mistake for overlooking the high number of failures.
Flags: needinfo?(mconley)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 29•6 years ago
|
||
This test doesn't exist anymore. Bug 1451709 removed it.
Status: NEW → RESOLVED
Closed: 6 years ago
Depends on: 1451709
Keywords: leave-open
Resolution: --- → INVALID
You need to log in
before you can comment on or make changes to this bug.
Description
•