Hit MOZ_CRASH(Shutdown hanging after all known phases and workers finished.) at src/toolkit/components/terminator/nsTerminator.cpp:233 | application crashed [@ PR_NativeRunThread(void*)]
Categories
(Core :: DOM: Workers, defect, P2)
People
(Reporter: tsmith, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])
We have seen this a few times recently while fuzzing. It just randomly happened while trying to get an rr trace for another issue.
A Pernosco session is available here: https://pernos.co/debug/pB_A9fWne6ZRjl2hOCO05Q/index.html
Hit MOZ_CRASH(Shutdown hanging after all known phases and workers finished.) at src/toolkit/components/terminator/nsTerminator.cpp:233
#0 0xe4356b707c4 in mozilla::(anonymous namespace)::RunWatchdog(void*) /home/twsmith/code/mozilla-central/toolkit/components/terminator/nsTerminator.cpp:233:5
#1 0x7f188f590444 in _pt_root /home/twsmith/code/mozilla-central/nsprpub/pr/src/pthreads/ptthread.c:201:5
#2 0x535c1cb5a6da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
#3 0x535c1d0a4a3e in clone /build/glibc-2ORdQG/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Comment 1•4 years ago
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Comment 10•3 years ago
Update:
There have been 30 failures within the last 7 days:
- 3 failures on Windows 10 x86 WebRender debug
- 4 failures on windows10-64-2004-qr debug
- 23 failures on Windows 10 x64 WebRender debug
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=347032414&repo=mozilla-central&lineNumber=44423
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - PROCESS-CRASH | Last test finished | application crashed [@ PR_NativeRunThread(void*)]
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - Mozilla crash reason: MOZ_CRASH(Shutdown hanging after all known phases and workers finished.)
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - Crash dump filename: C:\Users\task_1627813005\AppData\Local\Temp\tmput6xzc5c.mozrunner\minidumps\483a0348-c7bb-4eb2-b5d2-a34ba6ae7dc5.dmp
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - Operating system: Windows NT
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - 10.0.17134
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - CPU: amd64
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - family 6 model 85 stepping 7
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - 8 CPUs
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO -
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO - GPU: UNKNOWN
[task 2021-08-01T10:47:19.563Z] 10:47:19 INFO -
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - Crash reason: EXCEPTION_BREAKPOINT
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - Crash address: 0xc086e128
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - Process uptime: 207 seconds
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO -
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - Thread 20 (crashed) - Shutdown Hang Terminator 0 xul.dll!mozilla::`anonymous namespace'::RunWatchdog(void*) [nsTerminator.cpp:c59236b26192d1299ae1353fefbdb9c147e01aa8 : 246 + 0x0]
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - rax = 0x00007ffdc4395e41 rdx = 0x0000000000000000
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - rcx = 0x00007ffdf0498880 rbx = 0x00007ffde60790f7
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - rsi = 0x00007ffdfb1e3ca0 rdi = 0x0000000000000276
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - rbp = 0x00007ffde60d1d08 rsp = 0x0000009d8393fbd0
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - r8 = 0x0000009d8393fd80 r9 = 0x0000000000000021
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - r10 = 0x0000009d8393fd30 r11 = 0x00007ffdfdae0000
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - r12 = 0x000002564c849138 r13 = 0x000002564c849148
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - r14 = 0x0000000000000000 r15 = 0x00007ffde6079118
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - rip = 0x00007ffdc086e128
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - Found by: given as instruction pointer in context
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - 1 nss3.dll!PR_NativeRunThread(void*) [pruthr.c:c59236b26192d1299ae1353fefbdb9c147e01aa8 : 399 + 0xe]
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - rbx = 0x00007ffde60790f7 rbp = 0x00007ffde60d1d08
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - rsp = 0x0000009d8393fc20 r12 = 0x000002564c849138
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - r13 = 0x000002564c849148 r14 = 0x0000000000000000
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - r15 = 0x00007ffde6079118 rip = 0x00007ffde5f29462
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - Found by: call frame info
[task 2021-08-01T10:47:19.564Z] 10:47:19 INFO - 2 nss3.dll!pr_root(void*) [w95thred.c:c59236b26192d1299ae1353fefbdb9c147e01aa8 : 139 + 0xd]
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - rbx = 0x00007ffde60790f7 rbp = 0x00007ffde60d1d08
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - rsp = 0x0000009d8393fca0 r12 = 0x000002564c849138
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - r13 = 0x000002564c849148 r14 = 0x0000000000000000
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - r15 = 0x00007ffde6079118 rip = 0x00007ffde5f19e41
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - Found by: call frame info
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - 3 ucrtbase.dll!RtlpHpSegPageRangeShrink + 0xda
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - rbx = 0x00007ffde60790f7 rbp = 0x00007ffde60d1d08
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - rsp = 0x0000009d8393fcd0 r12 = 0x000002564c849138
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - r13 = 0x000002564c849148 r14 = 0x0000000000000000
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - r15 = 0x00007ffde6079118 rip = 0x00007ffdfae6c4be
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO - Found by: call frame info
[task 2021-08-01T10:47:19.565Z] 10:47:19 INFO -
Jens, could you help us assign this to someone?
Thank you.
Comment 14•3 years ago
So this one looks interesting.
If I read the stacks right, the main thread is waiting for the IO thread to terminate, while the IO thread is waiting for process_ to terminate while processing a ChildReaper runnable. I assume that the process we are waiting for is a child process (can we see this somewhere?). This wait was introduced in bug 1268559, which is fairly recent (5 years ago) relative to the age of the IPC code around the join (9 years).
If I understand the situation correctly, it feels wrong to me that we wait endlessly for a child process to terminate (that is, until the shutdown terminator triggers). I would expect it to be irrelevant to the parent process (and to its final task of cleanly saving the session state) whether any child process is still alive after a certain shutdown stage, and we are in a very late stage here. Should we instead force-kill child processes that do not react to the quit message in time?
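To make the suggestion concrete, here is a minimal sketch of a bounded wait followed by a forced kill. It is written in Python rather than Gecko C++ and is not the actual ChildReaper code; reap_child, spawn_sleeper and grace_seconds are hypothetical names chosen for illustration only.

```python
import subprocess
import sys


def reap_child(proc, grace_seconds):
    """Wait up to grace_seconds for proc to exit; force-kill it otherwise.

    Returns True if the child exited on its own, False if it had to be killed.
    """
    try:
        proc.wait(timeout=grace_seconds)
        return True
    except subprocess.TimeoutExpired:
        # The child ignored the quit request: stop waiting and kill it.
        proc.kill()  # SIGKILL on POSIX, TerminateProcess on Windows
        proc.wait()  # collect the exit status so no zombie is left behind
        return False


def spawn_sleeper(seconds):
    # Hypothetical stand-in for a child process: a Python child that sleeps.
    code = f"import time; time.sleep({seconds})"
    return subprocess.Popen([sys.executable, "-c", code])
```

With a bound like this, a parent in a late shutdown stage would wait at most grace_seconds per child instead of blocking until the shutdown terminator fires. What a suitable grace period would be, and whether a forced kill is acceptable at that stage, are open questions not settled here.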
Comment 15•3 years ago
I believe this may be a similar issue to bug 1719481, as :mccr8 noted when linking the bugs together, so I think my bug 1719481 comment 10 also applies here. I'm guessing that the main difference is just that that bug is running under ccov, and this one is under debug, so the crashes probably look a bit different.
I think I'm going to dupe the bugs for now, and we can re-open this one if it turns out to be different.
Comment 16•3 years ago
Thanks for clarifying, I had not read the other bug carefully.