Open Bug 1805613 Opened 2 years ago Updated 1 year ago

Intermittent Main app process exited normally application crashed [@ __pthread_cond_wait] after [Parent 1500, IPC I/O Parent] WARNING: Process X hanging at shutdown; attempting crash report (fatal error)

Categories

(Core :: DOM: Workers, defect, P2)

Unspecified
Linux
defect

Tracking

()

ASSIGNED
Tracking Status
firefox-esr102 --- unaffected
firefox108 --- unaffected
firefox109 --- unaffected
firefox110 --- affected

People

(Reporter: intermittent-bug-filer, Assigned: edenchuang)

References

(Depends on 1 open bug, Regression)

Details

(4 keywords, Whiteboard: [retriggered][stockwell disable-recommended])

Crash Data

Attachments

(1 file)

Filed by: nfay [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=399589863&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ASQC1iyGRsycGczAI9cpiQ/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ASQC1iyGRsycGczAI9cpiQ/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1


[task 2022-12-14T04:41:46.474Z] 04:41:46     INFO - [Child 1770, Main Thread] WARNING: '!ipcActor->SendRecordDiscardedData(discardedData)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:297
[task 2022-12-14T04:41:55.926Z] 04:41:55     INFO - [Parent 1500, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:207
[task 2022-12-14T04:41:58.968Z] 04:41:58     INFO - [Parent 1500, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3387
[task 2022-12-14T04:41:59.014Z] 04:41:59     INFO - [Parent 1500, Main Thread] WARNING: NS_ENSURE_TRUE(InitStaticMembers()) failed: file /builds/worker/workspace/obj-build/dist/include/mozilla/Preferences.h:129
[task 2022-12-14T04:41:59.044Z] 04:41:59     INFO - [Parent 1500, IPC I/O Parent] WARNING: Process 1770 may be hanging at shutdown; will wait for up to 8000ms: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:175
[task 2022-12-14T04:42:02.924Z] 04:42:02     INFO - [Child 1770, Main Thread] WARNING: IPC message 'PContent::Msg_AccumulateChildHistograms' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:504
[task 2022-12-14T04:42:02.925Z] 04:42:02     INFO - [Child 1770, Main Thread] WARNING: '!ipcActor->SendAccumulateChildHistograms(histogramsToSend)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:278
[task 2022-12-14T04:42:02.926Z] 04:42:02     INFO - [Child 1770, Main Thread] WARNING: IPC message 'PContent::Msg_RecordDiscardedData' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:504
[task 2022-12-14T04:42:02.928Z] 04:42:02     INFO - [Child 1770, Main Thread] WARNING: '!ipcActor->SendRecordDiscardedData(discardedData)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:297
[task 2022-12-14T04:42:07.048Z] 04:42:07     INFO - [Parent 1500, IPC I/O Parent] WARNING: Process 1770 hanging at shutdown; attempting crash report (fatal error).: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:198
[task 2022-12-14T04:42:07.055Z] 04:42:07     INFO - [Parent 1500, IPC I/O Parent] WARNING: ignoring non-exit event for process 1770: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_posix.cc:251
[task 2022-12-14T04:42:08.055Z] 04:42:08     INFO - [Parent 1500, IPC I/O Parent] WARNING: process 1770 exited on signal 6: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_posix.cc:256
[task 2022-12-14T04:42:08.059Z] 04:42:08     INFO - [Parent 1500, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4660
[task 2022-12-14T04:42:08.094Z] 04:42:08     INFO - REFTEST INFO | Copy/paste: /builds/worker/fetches/minidump-stackwalk/minidump-stackwalk --symbols-url=https://symbols.mozilla.org/ --human /tmp/tmpao_o041x.mozrunner/minidumps/368bdbc0-c8bd-d1dc-925b-bf2eef2d1f69.dmp /builds/worker/workspace/build/symbols
[task 2022-12-14T04:42:13.119Z] 04:42:13     INFO - REFTEST INFO | Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/368bdbc0-c8bd-d1dc-925b-bf2eef2d1f69.dmp
[task 2022-12-14T04:42:13.237Z] 04:42:13     INFO - REFTEST PROCESS-CRASH | xpcom/string/crashtests/1113005.html (finished) | application crashed [@ __pthread_cond_wait]
[task 2022-12-14T04:42:13.237Z] 04:42:13     INFO - Crash dump filename: /tmp/tmpao_o041x.mozrunner/minidumps/368bdbc0-c8bd-d1dc-925b-bf2eef2d1f69.dmp
[task 2022-12-14T04:42:13.237Z] 04:42:13     INFO - Operating system: Linux
[task 2022-12-14T04:42:13.238Z] 04:42:13     INFO -                   4.4.0-1014-aws #14taskcluster1-Ubuntu SMP Tue Apr 3 10:27:00 UTC 2018
[task 2022-12-14T04:42:13.238Z] 04:42:13     INFO - CPU: amd64
[task 2022-12-14T04:42:13.238Z] 04:42:13     INFO -      family 6 model 85 stepping 7
[task 2022-12-14T04:42:13.238Z] 04:42:13     INFO -      2 CPUs
[task 2022-12-14T04:42:13.239Z] 04:42:13     INFO - Linux Ubuntu 18.04 - bionic (Ubuntu 18.04.6 LTS)
[task 2022-12-14T04:42:13.239Z] 04:42:13     INFO - 
[task 2022-12-14T04:42:13.239Z] 04:42:13     INFO - Crash reason:  SIGABRT
[task 2022-12-14T04:42:13.240Z] 04:42:13     INFO - Crash address: 0x3e8000005dc
[task 2022-12-14T04:42:13.240Z] 04:42:13     INFO - Crashing instruction: `cmp rax, -0x1000`
[task 2022-12-14T04:42:13.240Z] 04:42:13     INFO - No memory accessed by instruction
[task 2022-12-14T04:42:13.240Z] 04:42:13     INFO - Process uptime: not available
[task 2022-12-14T04:42:13.240Z] 04:42:13     INFO - 
[task 2022-12-14T04:42:13.241Z] 04:42:13     INFO - Thread 0 file:// Content (crashed)
[task 2022-12-14T04:42:13.241Z] 04:42:13     INFO -  0  libpthread.so.0!__pthread_cond_wait [pthread_cond_wait.c : 655 + 0xfb]
[task 2022-12-14T04:42:13.241Z] 04:42:13     INFO -      rax = 0xfffffffffffffffc    rdx = 0x0000000000000000
[task 2022-12-14T04:42:13.241Z] 04:42:13     INFO -      rcx = 0x00007ff78ccda9f3    rbx = 0x00007ff78b937400
[task 2022-12-14T04:42:13.241Z] 04:42:13     INFO -      rsi = 0x0000000000000080    rdi = 0x00007ff78b937428
[task 2022-12-14T04:42:13.241Z] 04:42:13     INFO -      rbp = 0x00007ff78b937424    rsp = 0x00007ffc32a8ec00
[task 2022-12-14T04:42:13.242Z] 04:42:13     INFO -       r8 = 0x0000000000000000     r9 = 0x0000000000000000
[task 2022-12-14T04:42:13.242Z] 04:42:13     INFO -      r10 = 0x0000000000000000    r11 = 0x0000000000000246
[task 2022-12-14T04:42:13.242Z] 04:42:13     INFO -      r12 = 0x00007ff78b937428    r13 = 0x0000000000000000
[task 2022-12-14T04:42:13.242Z] 04:42:13     INFO -      r14 = 0x00007ff78b937300    r15 = 0x0000000000000040
[task 2022-12-14T04:42:13.242Z] 04:42:13     INFO -      rip = 0x00007ff78ccda9f3
[task 2022-12-14T04:42:13.242Z] 04:42:13     INFO -     Found by: given as instruction pointer in context
[task 2022-12-14T04:42:13.243Z] 04:42:13     INFO -  1  firefox-bin!mozilla::detail::ConditionVariableImpl::wait(mozilla::detail::MutexImpl&) [ConditionVariable_posix.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 106 + 0xa]
[task 2022-12-14T04:42:13.243Z] 04:42:13     INFO -      rbx = 0x00007ff78b937400    rbp = 0x00007ffc32a8ece0
[task 2022-12-14T04:42:13.243Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8ecd0    r12 = 0x7fffffffffffffff
[task 2022-12-14T04:42:13.243Z] 04:42:13     INFO -      r13 = 0x00007ff78b9894c0    r14 = 0x00007ffc32a8ed68
[task 2022-12-14T04:42:13.243Z] 04:42:13     INFO -      r15 = 0x00007ff78b937400    rip = 0x0000560cfeb23879
[task 2022-12-14T04:42:13.244Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.244Z] 04:42:13     INFO -  2  firefox-bin!mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) [ConditionVariable_posix.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 113 + 0x7]
[task 2022-12-14T04:42:13.244Z] 04:42:13     INFO -      rbx = 0x00007ff78b9373e0    rbp = 0x00007ffc32a8ed50
[task 2022-12-14T04:42:13.244Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8ecf0    r12 = 0x7fffffffffffffff
[task 2022-12-14T04:42:13.244Z] 04:42:13     INFO -      r13 = 0x00007ff78b9894c0    r14 = 0x00007ffc32a8ed68
[task 2022-12-14T04:42:13.244Z] 04:42:13     INFO -      r15 = 0x00007ff78b937400    rip = 0x0000560cfeb238f6
[task 2022-12-14T04:42:13.245Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.245Z] 04:42:13     INFO -  3  libxul.so!mozilla::OffTheBooksCondVar::Wait(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>) [BlockingResourceBase.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 534 + 0xf]
[task 2022-12-14T04:42:13.245Z] 04:42:13     INFO -      rbx = 0x00007ff78b9373e0    rbp = 0x00007ffc32a8eda0
[task 2022-12-14T04:42:13.245Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8ed60    r12 = 0x0000000000000000
[task 2022-12-14T04:42:13.245Z] 04:42:13     INFO -      r13 = 0x00007ff78b9894c0    r14 = 0x00007ffc32a8ed68
[task 2022-12-14T04:42:13.246Z] 04:42:13     INFO -      r15 = 0x00007ff78b937301    rip = 0x00007ff77aa2520a
[task 2022-12-14T04:42:13.246Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.246Z] 04:42:13     INFO -  4  libxul.so!mozilla::OffTheBooksCondVar::Wait() [BlockingResourceBase.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 514 + 0x4]
[task 2022-12-14T04:42:13.246Z] 04:42:13     INFO -      rbx = 0x00007ff78b937300    rbp = 0x00007ffc32a8edb0
[task 2022-12-14T04:42:13.247Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8edb0    r12 = 0x00007ff777296111
[task 2022-12-14T04:42:13.247Z] 04:42:13     INFO -      r13 = 0x00007ffc32a8edc8    r14 = 0xaaaaaaaaaaaaaaaa
[task 2022-12-14T04:42:13.247Z] 04:42:13     INFO -      r15 = 0x00007ff78b9373e0    rip = 0x00007ff77aa2515f
[task 2022-12-14T04:42:13.247Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.247Z] 04:42:13     INFO -  5  libxul.so!mozilla::TaskController::GetRunnableForMTTask(bool) [TaskController.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 587 + 0x7]
[task 2022-12-14T04:42:13.247Z] 04:42:13     INFO -      rbx = 0x00007ff78b937300    rbp = 0x00007ffc32a8ee00
[task 2022-12-14T04:42:13.248Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8edc0    r12 = 0x00007ff777296111
[task 2022-12-14T04:42:13.248Z] 04:42:13     INFO -      r13 = 0x00007ffc32a8edc8    r14 = 0xaaaaaaaaaaaaaaaa
[task 2022-12-14T04:42:13.248Z] 04:42:13     INFO -      r15 = 0x00007ff78b9373e0    rip = 0x00007ff77aa2e799
[task 2022-12-14T04:42:13.248Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.249Z] 04:42:13     INFO -  6  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 1138 + 0x10]
[task 2022-12-14T04:42:13.249Z] 04:42:13     INFO -      rbx = 0x00007ffc32a8eec0    rbp = 0x00007ffc32a8ef50
[task 2022-12-14T04:42:13.249Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8ee10    r12 = 0x00007ff78b990260
[task 2022-12-14T04:42:13.249Z] 04:42:13     INFO -      r13 = 0x00007ff78b9902a8    r14 = 0x0000000000000001
[task 2022-12-14T04:42:13.249Z] 04:42:13     INFO -      r15 = 0xaaaaaaaaaaaaaaaa    rip = 0x00007ff77aa4533c
[task 2022-12-14T04:42:13.250Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.251Z] 04:42:13     INFO -  7  libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 474 + 0xe]
[task 2022-12-14T04:42:13.251Z] 04:42:13     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffc32a8ef80
[task 2022-12-14T04:42:13.251Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8ef60    r12 = 0x00007ff777c57c18
[task 2022-12-14T04:42:13.251Z] 04:42:13     INFO -      r13 = 0x00007ff7775cffa7    r14 = 0x00007ffc32a8ef67
[task 2022-12-14T04:42:13.251Z] 04:42:13     INFO -      r15 = 0x00007ff78b990260    rip = 0x00007ff77aa49d4b
[task 2022-12-14T04:42:13.251Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.251Z] 04:42:13     INFO -  8  libxul.so!mozilla::dom::workerinternals::RuntimeService::Cleanup() [RuntimeService.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 1633 + 0xc]
[task 2022-12-14T04:42:13.251Z] 04:42:13     INFO -      rbx = 0x00007ff76e240ee8    rbp = 0x00007ffc32a8f2f0
[task 2022-12-14T04:42:13.252Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8ef90    r12 = 0x00007ff777c57c18
[task 2022-12-14T04:42:13.252Z] 04:42:13     INFO -      r13 = 0x00007ff7775cffa7    r14 = 0x00007ff76e240ee0
[task 2022-12-14T04:42:13.252Z] 04:42:13     INFO -      r15 = 0x00007ff78b990260    rip = 0x00007ff77d939ffe
[task 2022-12-14T04:42:13.252Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.252Z] 04:42:13     INFO -  9  libxul.so!mozilla::dom::workerinternals::RuntimeService::Observe(nsISupports*, char const*, char16_t const*) [RuntimeService.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 1925 + 0x7]
[task 2022-12-14T04:42:13.252Z] 04:42:13     INFO -      rbx = 0x00007ff7775cffa7    rbp = 0x00007ffc32a8f330
[task 2022-12-14T04:42:13.253Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f300    r12 = 0x0000000000000000
[task 2022-12-14T04:42:13.253Z] 04:42:13     INFO -      r13 = 0x00007ff7775cffa7    r14 = 0x00007ff76e240ee0
[task 2022-12-14T04:42:13.253Z] 04:42:13     INFO -      r15 = 0x0000000000000000    rip = 0x00007ff77d93dbc9
[task 2022-12-14T04:42:13.253Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.254Z] 04:42:13     INFO - 10  libxul.so!nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverList.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 70 + 0xe]
[task 2022-12-14T04:42:13.254Z] 04:42:13     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffc32a8f370
[task 2022-12-14T04:42:13.254Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f340    r12 = 0x0000000000000000
[task 2022-12-14T04:42:13.254Z] 04:42:13     INFO -      r13 = 0x00007ff7775cffa7    r14 = 0x0000000000000000
[task 2022-12-14T04:42:13.254Z] 04:42:13     INFO -      r15 = 0x00007ff7775cffa7    rip = 0x00007ff77a9bdc29
[task 2022-12-14T04:42:13.254Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.254Z] 04:42:13     INFO - 11  libxul.so!nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverService.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 291 + 0x18]
[task 2022-12-14T04:42:13.254Z] 04:42:13     INFO -      rbx = 0x0000000000000000    rbp = 0x00007ffc32a8f4a0
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f380    r12 = 0x0000000000000000
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r13 = 0x00007ff7775cffa7    r14 = 0x00007ff78b9a6cb8
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r15 = 0x00007ffc32a8f3d0    rip = 0x00007ff77a9beefd
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO - 12  libxul.so!mozilla::AppShutdown::AdvanceShutdownPhaseInternal(mozilla::ShutdownPhase, bool, char16_t const*, nsCOMPtr<nsISupports> const&) [AppShutdown.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 409 + 0x10]
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rbx = 0xaaaaaaaaaaaaaaaa    rbp = 0x00007ffc32a8f510
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f4b0    r12 = 0x00007ff7775cffa7
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r13 = 0x00007ffc32a8f4c8    r14 = 0x00007ffc32a8f4d8
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r15 = 0x0000000000000009    rip = 0x00007ff77a95819b
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO - 13  libxul.so!mozilla::ShutdownXPCOM(nsIServiceManager*) [XPCOMInit.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 590 + 0xe]
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rbx = 0x00007ffc32a8f538    rbp = 0x00007ffc32a8f5c0
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f520    r12 = 0x0000000000000000
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r13 = 0x00007ffc32a90a98    r14 = 0x0000000000000000
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r15 = 0x00007ffc32a8f530    rip = 0x00007ff77aa7eb3a
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO - 14  libxul.so!XRE_TermEmbedding() [nsEmbedFunctions.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 224 + 0x6]
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rbx = 0x00007ff78b9c37a0    rbp = 0x00007ffc32a8f5e0
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f5d0    r12 = 0x00000000000005dc
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r13 = 0x00007ffc32a90a98    r14 = 0x00007ffc32a8f901
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r15 = 0x00007ffc32a8f658    rip = 0x00007ff77f24114d
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO - 15  libxul.so!mozilla::ipc::ScopedXREEmbed::Stop() [ScopedXREEmbed.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 90 + 0x4]
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rbx = 0x00007ff78b9c37a0    rbp = 0x00007ffc32a8f600
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f5f0    r12 = 0x00000000000005dc
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r13 = 0x00007ffc32a90a98    r14 = 0x00007ffc32a8f901
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r15 = 0x00007ffc32a8f658    rip = 0x00007ff77b1c11be
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO - 16  libxul.so!XRE_InitChildProcess(int, char**, XREChildData const*) [nsEmbedFunctions.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 747 + 0x10]
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rbx = 0x00007ffc32a8f628    rbp = 0x00007ffc32a8f900
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f610    r12 = 0x00000000000005dc
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r13 = 0x00007ffc32a90a98    r14 = 0x00007ffc32a8f901
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r15 = 0x00007ffc32a8f658    rip = 0x00007ff77f2416b5
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO - 17  firefox-bin!content_process_main(mozilla::Bootstrap*, int, char**) [plugin-container.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 57 + 0x10]
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rbx = 0x0000000000000012    rbp = 0x00007ffc32a8f940
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f910    r12 = 0x00007ffc32a8f910
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x00007ffc32a90a98
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      r15 = 0x00007ff78b9036d0    rip = 0x0000560cfeab514a
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO - 18  firefox-bin!main [nsBrowserApp.cpp:9456d51620516727a0141b266d55db32b0792ea4 : 359 + 0x11]
[task 2022-12-14T04:42:13.255Z] 04:42:13     INFO -      rbx = 0x0000000000000012    rbp = 0x00007ffc32a909b0
[task 2022-12-14T04:42:13.257Z] 04:42:13     INFO -      rsp = 0x00007ffc32a8f950    r12 = 0x00007ffc32a90b38
[task 2022-12-14T04:42:13.258Z] 04:42:13     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x0000560cfeb5ea98
[task 2022-12-14T04:42:13.258Z] 04:42:13     INFO -      r15 = 0x00007ffc32a90a98    rip = 0x0000560cfeab5520
[task 2022-12-14T04:42:13.258Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.259Z] 04:42:13     INFO - 19  libc.so.6!__libc_start_main [libc-start.c : 310 + 0x19]
[task 2022-12-14T04:42:13.259Z] 04:42:13     INFO -      rbx = 0x0000000000000000    rbp = 0x0000560cfeb5a790
[task 2022-12-14T04:42:13.259Z] 04:42:13     INFO -      rsp = 0x00007ffc32a909c0    r12 = 0x0000560cfeab4ff0
[task 2022-12-14T04:42:13.260Z] 04:42:13     INFO -      r13 = 0x00007ffc32a90a90    r14 = 0x0000000000000000
[task 2022-12-14T04:42:13.260Z] 04:42:13     INFO -      r15 = 0x0000000000000000    rip = 0x00007ff78bdbab97
[task 2022-12-14T04:42:13.260Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.260Z] 04:42:13     INFO - 20  firefox-bin!_start + 0x28
[task 2022-12-14T04:42:13.260Z] 04:42:13     INFO -      rbx = 0x0000000000000000    rbp = 0x0000000000000000
[task 2022-12-14T04:42:13.261Z] 04:42:13     INFO -      rsp = 0x00007ffc32a90a80    r12 = 0x0000560cfeab4ff0
[task 2022-12-14T04:42:13.261Z] 04:42:13     INFO -      r13 = 0x00007ffc32a90a90    r14 = 0x0000000000000000
[task 2022-12-14T04:42:13.261Z] 04:42:13     INFO -      r15 = 0x0000000000000000    rip = 0x0000560cfeab5019
[task 2022-12-14T04:42:13.261Z] 04:42:13     INFO -     Found by: call frame info
[task 2022-12-14T04:42:13.262Z] 04:42:13     INFO - 
[task 2022-12-14T04:42:13.262Z] 04:42:13     INFO - Thread 1 IPC I/O Child

Hi Jed! Can you please take a look at this? Could this crash application crashed [@ __pthread_cond_wait] be related to this push?
Thank you!

Flags: needinfo?(jld)

Hi Jed! Can you please take a look at this? Could this crash application crashed [@ __pthread_cond_wait] be related to this push?

It is, in the sense that that push makes real existing bugs that were timing the testsuite observable as crashes :-)

Flags: needinfo?(jld)

The main thread is blocked in dom::workerinternals::RuntimeService::Cleanup, which is presumably blocked waiting for thread 17, a DOM Worker, to exit; it, in turn, is waiting in its usual event loop. It's not clear to me what's going on beyond that (and if this is a true hang or just slowness).

Flags: needinfo?(bugmail)

The crash shows the worker is idle, so this is probably a hang along the lines of some others on file where we have crash annotations that ask the worker why it's refusing to shutdown. I am struggling to find that bug right now, but I think the general idea is that a workerdebuggeerunnable somehow causes asymmetry in the busy count.

At a meta level, it's possible that this may indeed unleash a massive new swathe of intermittents and it might be premature to turn it on.

Flags: needinfo?(bugmail) → needinfo?(jstutte)

There are also these crashes on linux debug mochitest with Main app process exited normally | application crashed [@ __pthread_cond_wait] after WARNING: Process 3927 hanging at shutdown; attempting crash report (fatal error).: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:198
Failure log: https://treeherder.mozilla.org/logviewer?job_id=399791282&repo=autoland&lineNumber=9086
Same dom::workerinternals::RuntimeService::Cleanup can be observed - https://treeherder.mozilla.org/logviewer?job_id=399791282&repo=autoland&lineNumber=9155

OS: Unspecified → Linux
Summary: Intermittent xpcom/string/crashtests/1113005.html (finished) | application crashed [@ __pthread_cond_wait] after [Parent 1500, IPC I/O Parent] WARNING: Process 1770 hanging at shutdown; attempting crash report (fatal error).: file /builds/worker/checkouts → Intermittent xpcom/string/crashtests/1113005.html (finished) | Main app process exited normally application crashed [@ __pthread_cond_wait] after [Parent 1500, IPC I/O Parent] WARNING: Process X hanging at shutdown; attempting crash report (fatal error)
Duplicate of this bug: 1806274

The following field has been copied from a duplicate bug:

Field Value Source
Regressed by bug 1793525 bug 1806274

For more information, please visit auto_nag documentation.

Keywords: regression
Regressed by: 1793525

Set release status flags based on info from the regressing bug 1793525

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #5)

The crash shows the worker is idle, so this is probably a hang along the lines of some others on file where we have crash annotations that ask the worker why it's refusing to shutdown. I am struggling to find that bug right now, but I think the general idea is that a workerdebuggeerunnable somehow causes asymmetry in the busy count.

I think you refer to bug 1435343 here? However there we look at parent process hangs where the terminator asks the RuntimeService::CrashIfHanging. But yes, the underlying root cause might be the same.

At a meta level, it's possible that this may indeed unleash a massive new swathe of intermittents and it might be premature to turn it on.

Hmm, I think :gcp is right in saying:

(In reply to Gian-Carlo Pascutto [:gcp] from comment #3)

Hi Jed! Can you please take a look at this? Could this crash application crashed [@ __pthread_cond_wait] be related to this push?

It is, in the sense that that push makes real existing bugs that were timing the testsuite observable as crashes :-)

IIUC :jld is trying to make failures like those on bug 1769821 more actionable, I would not expect this to increases the number of failures, but just to change their nature?

Flags: needinfo?(jstutte) → needinfo?(jld)

Looking at RuntimeService::UnregisterWorker, I wonder if those two conditions are really mutually exclusive? Shouldn't we always decrease mChildWorkerCount if we were a child?

Flags: needinfo?(echuang)

Update:
There have been 42 failures within the last 7 days, all of them on Linux 18.04 x64 WebRender debug.

Recent failure: https://treeherder.mozilla.org/logviewer?job_id=400785472&repo=autoland&lineNumber=233904

[task 2022-12-27T00:00:13.994Z] 00:00:13     INFO - REFTEST PROCESS-CRASH | xpcom/string/crashtests/1113005.html (finished) | application crashed [@ __pthread_cond_wait]
[task 2022-12-27T00:00:13.994Z] 00:00:13     INFO - Crash dump filename: /tmp/tmp97w_t4_z.mozrunner/minidumps/6eed7da1-6c56-2744-8898-16678351d32a.dmp
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO - Operating system: Linux
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO -                   4.4.0-1014-aws #14taskcluster1-Ubuntu SMP Tue Apr 3 10:27:00 UTC 2018
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO - CPU: amd64
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO -      family 6 model 85 stepping 7
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO -      2 CPUs
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO - Linux Ubuntu 18.04 - bionic (Ubuntu 18.04.6 LTS)
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO - 
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO - Crash reason:  SIGABRT
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO - Crash address: 0x3e8000005e4
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO - Crashing instruction: `cmp rax, -0x1000`
[task 2022-12-27T00:00:13.996Z] 00:00:13     INFO - No memory accessed by instruction
[task 2022-12-27T00:00:13.997Z] 00:00:13     INFO - Process uptime: not available
[task 2022-12-27T00:00:13.997Z] 00:00:13     INFO - 
[task 2022-12-27T00:00:13.997Z] 00:00:13     INFO - Thread 0 file:// Content (crashed)
[task 2022-12-27T00:00:13.997Z] 00:00:13     INFO -  0  libpthread.so.0!__pthread_cond_wait [pthread_cond_wait.c : 655 + 0xfb]
[task 2022-12-27T00:00:13.998Z] 00:00:13     INFO -      rax = 0xfffffffffffffffc    rdx = 0x0000000000000000
[task 2022-12-27T00:00:13.998Z] 00:00:13     INFO -      rcx = 0x00007faf0ef749f3    rbx = 0x00007faf0dc37400
[task 2022-12-27T00:00:13.998Z] 00:00:13     INFO -      rsi = 0x0000000000000080    rdi = 0x00007faf0dc3742c
[task 2022-12-27T00:00:13.998Z] 00:00:13     INFO -      rbp = 0x00007faf0dc37424    rsp = 0x00007ffe95b070f0
[task 2022-12-27T00:00:13.998Z] 00:00:13     INFO -       r8 = 0x0000000000000001     r9 = 0x0000000000000000
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r10 = 0x0000000000000000    r11 = 0x0000000000000246
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r12 = 0x00007faf0dc3742c    r13 = 0x0000000000000000
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r14 = 0x00007faf0dc37300    r15 = 0x0000000000000033
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rip = 0x00007faf0ef749f3
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -     Found by: given as instruction pointer in context
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -  1  firefox-bin!mozilla::detail::ConditionVariableImpl::wait(mozilla::detail::MutexImpl&) [ConditionVariable_posix.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 106 + 0xa]
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rbx = 0x00007faf0dc37400    rbp = 0x00007ffe95b071d0
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rsp = 0x00007ffe95b071c0    r12 = 0x7fffffffffffffff
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r13 = 0x00007faf0dc894c0    r14 = 0x00007ffe95b07258
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r15 = 0x00007faf0dc37400    rip = 0x000056332329fc99
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -  2  firefox-bin!mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) [ConditionVariable_posix.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 113 + 0x7]
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rbx = 0x00007faf0dc373e0    rbp = 0x00007ffe95b07240
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rsp = 0x00007ffe95b071e0    r12 = 0x7fffffffffffffff
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r13 = 0x00007faf0dc894c0    r14 = 0x00007ffe95b07258
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r15 = 0x00007faf0dc37400    rip = 0x000056332329fd16
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -  3  libxul.so!mozilla::OffTheBooksCondVar::Wait(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>) [BlockingResourceBase.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 534 + 0xf]
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rbx = 0x00007faf0dc373e0    rbp = 0x00007ffe95b07290
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07250    r12 = 0x0000000000000000
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r13 = 0x00007faf0dc894c0    r14 = 0x00007ffe95b07258
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r15 = 0x00007faf0dc37301    rip = 0x00007faefcd9721e
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -  4  libxul.so!mozilla::OffTheBooksCondVar::Wait() [BlockingResourceBase.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 514 + 0x4]
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rbx = 0x00007faf0dc37300    rbp = 0x00007ffe95b072a0
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      rsp = 0x00007ffe95b072a0    r12 = 0x00007faef961da86
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r13 = 0x00007ffe95b072b8    r14 = 0xaaaaaaaaaaaaaaaa
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -      r15 = 0x00007faf0dc373e0    rip = 0x00007faefcd97173
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:13.999Z] 00:00:13     INFO -  5  libxul.so!mozilla::TaskController::GetRunnableForMTTask(bool) [TaskController.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 587 + 0x7]
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rbx = 0x00007faf0dc37300    rbp = 0x00007ffe95b072f0
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rsp = 0x00007ffe95b072b0    r12 = 0x00007faef961da86
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r13 = 0x00007ffe95b072b8    r14 = 0xaaaaaaaaaaaaaaaa
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r15 = 0x00007faf0dc373e0    rip = 0x00007faefcda07ad
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -  6  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 1131 + 0xe]
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffe95b07440
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07300    r12 = 0x00007faf0dc78248
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r13 = 0x00007faf0dc78200    r14 = 0x00007ffe95b07457
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r15 = 0x00007ffe95b07301    rip = 0x00007faefcdb730e
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -  7  libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 476 + 0xe]
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffe95b07470
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07450    r12 = 0x00007faef9fe3d40
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r13 = 0x00007faef995a41c    r14 = 0x00007ffe95b07457
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r15 = 0x00007faf0dc78200    rip = 0x00007faefcdbbaa1
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -  8  libxul.so!mozilla::dom::workerinternals::RuntimeService::Cleanup() [RuntimeService.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 1634 + 0xc]
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rbx = 0x00007faeedaa3dc8    rbp = 0x00007ffe95b077e0
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07480    r12 = 0x00007faef9fe3d40
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r13 = 0x00007faef995a41c    r14 = 0x00007faeedaa3dc0
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r15 = 0x00007faf0dc78200    rip = 0x00007faeffcc9ce4
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -  9  libxul.so!mozilla::dom::workerinternals::RuntimeService::Observe(nsISupports*, char const*, char16_t const*) [RuntimeService.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 1926 + 0x7]
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rbx = 0x00007faef995a41c    rbp = 0x00007ffe95b07820
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rsp = 0x00007ffe95b077f0    r12 = 0x0000000000000000
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r13 = 0x00007faef995a41c    r14 = 0x00007faeedaa3dc0
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      r15 = 0x0000000000000000    rip = 0x00007faeffccd8af
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO - 10  libxul.so!nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverList.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 70 + 0xe]
[task 2022-12-27T00:00:14.000Z] 00:00:13     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffe95b07860
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07830    r12 = 0x0000000000000000
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      r13 = 0x00007faef995a41c    r14 = 0x0000000000000000
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      r15 = 0x00007faef995a41c    rip = 0x00007faefcd2fc7b
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO - 11  libxul.so!nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverService.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 291 + 0x18]
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      rbx = 0x0000000000000000    rbp = 0x00007ffe95b07990
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07870    r12 = 0x0000000000000000
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      r13 = 0x00007faef995a41c    r14 = 0x00007faf0dca68f8
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      r15 = 0x00007ffe95b078c0    rip = 0x00007faefcd30f4f
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO - 12  libxul.so!mozilla::AppShutdown::AdvanceShutdownPhaseInternal(mozilla::ShutdownPhase, bool, char16_t const*, nsCOMPtr<nsISupports> const&) [AppShutdown.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 409 + 0x10]
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      rbx = 0xaaaaaaaaaaaaaaaa    rbp = 0x00007ffe95b07a00
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      rsp = 0x00007ffe95b079a0    r12 = 0x00007faef995a41c
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      r13 = 0x00007ffe95b079b8    r14 = 0x00007ffe95b079c8
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      r15 = 0x0000000000000009    rip = 0x00007faefccca1db
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO - 13  libxul.so!mozilla::ShutdownXPCOM(nsIServiceManager*) [XPCOMInit.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 590 + 0xe]
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      rbx = 0x00007ffe95b07a28    rbp = 0x00007ffe95b07ab0
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07a10    r12 = 0x0000000000000000
[task 2022-12-27T00:00:14.001Z] 00:00:13     INFO -      r13 = 0x00007ffe95b08f48    r14 = 0x0000000000000000
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      r15 = 0x00007ffe95b07a20    rip = 0x00007faefcdf089a
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO - 14  libxul.so!XRE_InitChildProcess(int, char**, XREChildData const*) [nsEmbedFunctions.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 680 + 0x10]
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      rbx = 0x00007ffe95b07ad8    rbp = 0x00007ffe95b07db0
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07ac0    r12 = 0x00000000000005e4
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      r13 = 0x00007ffe95b08f48    r14 = 0x000000000000000f
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      r15 = 0x00007ffe95b07d01    rip = 0x00007faf015da81a
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO - 15  firefox-bin!content_process_main(mozilla::Bootstrap*, int, char**) [plugin-container.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 57 + 0x10]
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      rbx = 0x0000000000000012    rbp = 0x00007ffe95b07df0
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07dc0    r12 = 0x00007ffe95b07dc0
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x00007ffe95b08f48
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      r15 = 0x00007faf0dc036d0    rip = 0x000056332323114a
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO - 16  firefox-bin!main [nsBrowserApp.cpp:186f845b2cac36b2ec389b0d1e454b283977fe60 : 359 + 0x11]
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      rbx = 0x0000000000000012    rbp = 0x00007ffe95b08e60
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      rsp = 0x00007ffe95b07e00    r12 = 0x00007ffe95b08fe8
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x00005633232dab18
[task 2022-12-27T00:00:14.002Z] 00:00:13     INFO -      r15 = 0x00007ffe95b08f48    rip = 0x0000563323231520
[task 2022-12-27T00:00:14.003Z] 00:00:13     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.003Z] 00:00:13     INFO - 17  libc.so.6!__libc_start_main [libc-start.c : 310 + 0x19]
[task 2022-12-27T00:00:14.003Z] 00:00:13     INFO -      rbx = 0x0000000000000000    rbp = 0x00005633232d6810
[task 2022-12-27T00:00:14.003Z] 00:00:13     INFO -      rsp = 0x00007ffe95b08e70    r12 = 0x0000563323230ff0
[task 2022-12-27T00:00:14.003Z] 00:00:13     INFO -      r13 = 0x00007ffe95b08f40    r14 = 0x0000000000000000
[task 2022-12-27T00:00:14.003Z] 00:00:14     INFO -      r15 = 0x0000000000000000    rip = 0x00007faf0e054b97
[task 2022-12-27T00:00:14.004Z] 00:00:14     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.004Z] 00:00:14     INFO - 18  firefox-bin!_start + 0x28
[task 2022-12-27T00:00:14.004Z] 00:00:14     INFO -      rbx = 0x0000000000000000    rbp = 0x0000000000000000
[task 2022-12-27T00:00:14.005Z] 00:00:14     INFO -      rsp = 0x00007ffe95b08f30    r12 = 0x0000563323230ff0
[task 2022-12-27T00:00:14.005Z] 00:00:14     INFO -      r13 = 0x00007ffe95b08f40    r14 = 0x0000000000000000
[task 2022-12-27T00:00:14.006Z] 00:00:14     INFO -      r15 = 0x0000000000000000    rip = 0x0000563323231019
[task 2022-12-27T00:00:14.006Z] 00:00:14     INFO -     Found by: call frame info
[task 2022-12-27T00:00:14.006Z] 00:00:14     INFO - 
Whiteboard: [retriggered] → [retriggered][stockwell needswork:owner]

There have been 46 total failures in the last 7 days, recent failure log.
Affected platforms are:

  • linux1804-64-qr

Hi Jens, is there something actionable to be done here? This bug has now 189 total failures in the last 30 days and it is on disable recommended bugs list: https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2022-12-10&endday=2023-01-09&tree=trunk&failurehash=all&bug=1805613
It looks to be split between 70% crashtest-nofis failures and 30% mochitest-browser-chrome, all on linux1804-64-qr debug.

Flags: needinfo?(jstutte)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]

(In reply to Cosmin Sabou [:CosminS] from comment #18)

Hi Jens, is there something actionable to be done here? This bug has now 189 total failures in the last 30 days and it is on disable recommended bugs list: https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2022-12-10&endday=2023-01-09&tree=trunk&failurehash=all&bug=1805613

I assume that bug 1793525 did shuffle some things here and that we see the same intermittents as before but differently. For example bug 1769821 and bug 1791335 seem to have lower frequency now. The great thing about bug 1793525 is that now we have the child process' stack in the log.

The most actionable question is probably comment 11 for now. Looking at this instance I see two active worker threads that are idling and the main thread is stuck inside RuntimeService::UnregisterWorker.

It looks to be split between 70% crashtest-nofis failures and 30% mochitest-browser-chrome, all on linux1804-64-qr debug.

If I look for this test I get quite some intermittent bugs, most of them are moot now (and can get closed) but some get sporadic reports for Windows, too. Not sure if they have the same reason, though.

Flags: needinfo?(jstutte)

(In reply to Jens Stutte [:jstutte] from comment #19)

If I look for this test I get quite some intermittent bugs, most of them are moot now (and can get closed) but some get sporadic reports for Windows, too. Not sure if they have the same reason, though.

I closed as incomplete the bugs from that list that didn't have a failure for a long time.
Maybe the same should be done also with bug 1769821 and bug 1791335 that haven't had a failure on trunk since bug 1793525 landed.

:asuth, would I be too optimistic in hoping that the cleanup from bug 1775784 might help also here ?

Flags: needinfo?(echuang) → needinfo?(bugmail)

this test is failing 32% of the time on linux64/debug-crashtest-nofis and nofis-swr. It doesn't appear to be failing elsewhere.

Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered]

There have been 60 total failures in the last 7 days, recent failure log.
There is an 
affected platform: linux1804-64-qr.

There have been 49 total failures in the last 7 days.
There is an 
affected platform: linux1804-64-qr.

Recent failure log.

[task 2023-02-06T06:01:02.064Z] 06:01:02     INFO - [Child 1752, Main Thread] WARNING: IPC message 'PContent::Msg_AccumulateChildHistograms' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:504
[task 2023-02-06T06:01:02.065Z] 06:01:02     INFO - [Child 1752, Main Thread] WARNING: '!ipcActor->SendAccumulateChildHistograms(histogramsToSend)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:278
[task 2023-02-06T06:01:02.065Z] 06:01:02     INFO - [Child 1752, Main Thread] WARNING: IPC message 'PContent::Msg_RecordDiscardedData' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:504
[task 2023-02-06T06:01:02.066Z] 06:01:02     INFO - [Child 1752, Main Thread] WARNING: '!ipcActor->SendRecordDiscardedData(discardedData)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:297
[task 2023-02-06T06:01:06.044Z] 06:01:06     INFO - [Parent 1499, IPC I/O Parent] WARNING: Process 1752 hanging at shutdown; attempting crash report (fatal error).: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:203
[task 2023-02-06T06:01:06.262Z] 06:01:06     INFO - [Parent 1499, IPC I/O Parent] WARNING: process 1752 exited on signal 6: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_posix.cc:256
[task 2023-02-06T06:01:06.265Z] 06:01:06     INFO - [Parent 1499, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4663
[task 2023-02-06T06:01:06.287Z] 06:01:06     INFO - REFTEST INFO | Copy/paste: /builds/worker/fetches/minidump-stackwalk/minidump-stackwalk --symbols-url=https://symbols.mozilla.org/ --cyborg=/tmp/tmp3nmvsxjm/08cf51db-0b78-9346-a3e5-947a62333e12.trace /tmp/tmpfe8s26eb.mozrunner/minidumps/08cf51db-0b78-9346-a3e5-947a62333e12.dmp /builds/worker/workspace/build/symbols
[task 2023-02-06T06:01:11.139Z] 06:01:11     INFO - REFTEST INFO | Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/08cf51db-0b78-9346-a3e5-947a62333e12.dmp
[task 2023-02-06T06:01:11.253Z] 06:01:11     INFO - REFTEST PROCESS-CRASH | xpcom/string/crashtests/1113005.html (finished) | application crashed [@ __pthread_cond_wait]
[task 2023-02-06T06:01:11.253Z] 06:01:11     INFO - Crash dump filename: /tmp/tmpfe8s26eb.mozrunner/minidumps/08cf51db-0b78-9346-a3e5-947a62333e12.dmp
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - Operating system: Linux
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -                   4.4.0-1014-aws #14taskcluster1-Ubuntu SMP Tue Apr 3 10:27:00 UTC 2018
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - CPU: amd64
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      family 6 model 85 stepping 7
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      2 CPUs
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - Linux Ubuntu 18.04 - bionic (Ubuntu 18.04.6 LTS)
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - 
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - Crash reason:  SIGABRT
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - Crash address: 0x0
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - Crashing instruction: `cmp rax, -0x1000`
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - No memory accessed by instruction
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - Process uptime: not available
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - 
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO - Thread 0 file:// Content (crashed)
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -  0  libpthread.so.0!__pthread_cond_wait [pthread_cond_wait.c : 655 + 0xfb]
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rax = 0xfffffffffffffffc    rdx = 0x0000000000000000
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rcx = 0x00007ff70873d9f3    rbx = 0x00007ff707337400
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rsi = 0x0000000000000080    rdi = 0x00007ff707337428
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rbp = 0x00007ff707337424    rsp = 0x00007fff27cddc90
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -       r8 = 0x0000000000000000     r9 = 0x0000000000000000
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r10 = 0x0000000000000000    r11 = 0x0000000000000246
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r12 = 0x00007ff707337428    r13 = 0x0000000000000000
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r14 = 0x00007ff707337300    r15 = 0x0000000000000014
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rip = 0x00007ff70873d9f3
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -     Found by: given as instruction pointer in context
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -  1  firefox-bin!mozilla::detail::ConditionVariableImpl::wait(mozilla::detail::MutexImpl&) [ConditionVariable_posix.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 106 + 0xa]
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rbx = 0x00007ff707337400    rbp = 0x00007fff27cddd70
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rsp = 0x00007fff27cddd60    r12 = 0x7fffffffffffffff
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r13 = 0x00007ff7073894c0    r14 = 0x00007fff27cdddf8
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r15 = 0x00007ff707337400    rip = 0x000056448fd0c979
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -  2  firefox-bin!mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) [ConditionVariable_posix.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 113 + 0x7]
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rbx = 0x00007ff7073373e0    rbp = 0x00007fff27cddde0
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rsp = 0x00007fff27cddd80    r12 = 0x7fffffffffffffff
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r13 = 0x00007ff7073894c0    r14 = 0x00007fff27cdddf8
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r15 = 0x00007ff707337400    rip = 0x000056448fd0c9f6
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -  3  libxul.so!mozilla::OffTheBooksCondVar::Wait(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>) [BlockingResourceBase.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 534 + 0xf]
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rbx = 0x00007ff7073373e0    rbp = 0x00007fff27cdde30
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rsp = 0x00007fff27cdddf0    r12 = 0x0000000000000000
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r13 = 0x00007ff7073894c0    r14 = 0x00007fff27cdddf8
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r15 = 0x00007ff707337301    rip = 0x00007ff6f62e8af0
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -  4  libxul.so!mozilla::OffTheBooksCondVar::Wait() [BlockingResourceBase.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 514 + 0x4]
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rbx = 0x00007ff707337300    rbp = 0x00007fff27cdde40
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rsp = 0x00007fff27cdde40    r12 = 0x00007ff6f2a6b851
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r13 = 0x00007fff27cdde58    r14 = 0xaaaaaaaaaaaaaaaa
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r15 = 0x00007ff7073373e0    rip = 0x00007ff6f62e8a45
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -  5  libxul.so!mozilla::TaskController::GetRunnableForMTTask(bool) [TaskController.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 587 + 0x7]
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rbx = 0x00007ff707337300    rbp = 0x00007fff27cdde90
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      rsp = 0x00007fff27cdde50    r12 = 0x00007ff6f2a6b851
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r13 = 0x00007fff27cdde58    r14 = 0xaaaaaaaaaaaaaaaa
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -      r15 = 0x00007ff7073373e0    rip = 0x00007ff6f62f1d35
[task 2023-02-06T06:01:11.254Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.255Z] 06:01:11     INFO -  6  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 1159 + 0xe]
[task 2023-02-06T06:01:11.255Z] 06:01:11     INFO -      rbx = 0x0000000000000001    rbp = 0x00007fff27cddff0
[task 2023-02-06T06:01:11.255Z] 06:01:11     INFO -      rsp = 0x00007fff27cddea0    r12 = 0x00007ff707378248
[task 2023-02-06T06:01:11.256Z] 06:01:11     INFO -      r13 = 0x00007ff707378200    r14 = 0x00007fff27cde007
[task 2023-02-06T06:01:11.256Z] 06:01:11     INFO -      r15 = 0x00007fff27cddf01    rip = 0x00007ff6f6308db9
[task 2023-02-06T06:01:11.256Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.256Z] 06:01:11     INFO -  7  libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 477 + 0xe]
[task 2023-02-06T06:01:11.257Z] 06:01:11     INFO -      rbx = 0x0000000000000001    rbp = 0x00007fff27cde020
[task 2023-02-06T06:01:11.257Z] 06:01:11     INFO -      rsp = 0x00007fff27cde000    r12 = 0x00007ff6f3438aa0
[task 2023-02-06T06:01:11.257Z] 06:01:11     INFO -      r13 = 0x00007ff6f2da7eab    r14 = 0x00007fff27cde007
[task 2023-02-06T06:01:11.257Z] 06:01:11     INFO -      r15 = 0x00007ff707378200    rip = 0x00007ff6f630d5a9
[task 2023-02-06T06:01:11.258Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.258Z] 06:01:11     INFO -  8  libxul.so!mozilla::dom::workerinternals::RuntimeService::Cleanup() [RuntimeService.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 1634 + 0xc]
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -      rbx = 0x00007ff6e8c4fb88    rbp = 0x00007fff27cde390
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -      rsp = 0x00007fff27cde030    r12 = 0x00007ff6f3438aa0
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -      r13 = 0x00007ff6f2da7eab    r14 = 0x00007ff6e8c4fb80
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -      r15 = 0x00007ff707378200    rip = 0x00007ff6f93cb30c
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -  9  libxul.so!mozilla::dom::workerinternals::RuntimeService::Observe(nsISupports*, char const*, char16_t const*) [RuntimeService.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 1926 + 0x7]
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -      rbx = 0x00007ff6f2da7eab    rbp = 0x00007fff27cde3d0
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -      rsp = 0x00007fff27cde3a0    r12 = 0x0000000000000000
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -      r13 = 0x00007ff6f2da7eab    r14 = 0x00007ff6e8c4fb80
[task 2023-02-06T06:01:11.262Z] 06:01:11     INFO -      r15 = 0x0000000000000000    rip = 0x00007ff6f93ceed7
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO - 10  libxul.so!nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverList.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 70 + 0xe]
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -      rbx = 0x0000000000000001    rbp = 0x00007fff27cde410
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -      rsp = 0x00007fff27cde3e0    r12 = 0x0000000000000000
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -      r13 = 0x00007ff6f2da7eab    r14 = 0x0000000000000000
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -      r15 = 0x00007ff6f2da7eab    rip = 0x00007ff6f62814e9
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO - 11  libxul.so!nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverService.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 291 + 0x18]
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -      rbx = 0x0000000000000000    rbp = 0x00007fff27cde540
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -      rsp = 0x00007fff27cde420    r12 = 0x0000000000000000
[task 2023-02-06T06:01:11.263Z] 06:01:11     INFO -      r13 = 0x00007ff6f2da7eab    r14 = 0x00007ff7073a6b78
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r15 = 0x00007fff27cde470    rip = 0x00007ff6f62827bd
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO - 12  libxul.so!mozilla::AppShutdown::AdvanceShutdownPhaseInternal(mozilla::ShutdownPhase, bool, char16_t const*, nsCOMPtr<nsISupports> const&) [AppShutdown.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 400 + 0x10]
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rbx = 0xaaaaaaaaaaaaaaaa    rbp = 0x00007fff27cde5b0
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rsp = 0x00007fff27cde550    r12 = 0x00007ff6f2da7eab
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r13 = 0x00007fff27cde568    r14 = 0x00007fff27cde578
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r15 = 0x0000000000000009    rip = 0x00007ff6f621b569
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO - 13  libxul.so!mozilla::ShutdownXPCOM(nsIServiceManager*) [XPCOMInit.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 590 + 0xe]
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rbx = 0x00007fff27cde5d8    rbp = 0x00007fff27cde660
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rsp = 0x00007fff27cde5c0    r12 = 0x0000000000000000
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r13 = 0x00007fff27cdfaf8    r14 = 0x0000000000000000
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r15 = 0x00007fff27cde5d0    rip = 0x00007ff6f6342126
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO - 14  libxul.so!XRE_InitChildProcess(int, char**, XREChildData const*) [nsEmbedFunctions.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 679 + 0x10]
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rbx = 0x00007fff27cde688    rbp = 0x00007fff27cde960
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rsp = 0x00007fff27cde670    r12 = 0x00000000000005db
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r13 = 0x00007fff27cdfaf8    r14 = 0x000000000000000f
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r15 = 0x00007fff27cde901    rip = 0x00007ff6fad38566
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO - 15  firefox-bin!content_process_main(mozilla::Bootstrap*, int, char**) [plugin-container.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 57 + 0x10]
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rbx = 0x0000000000000012    rbp = 0x00007fff27cde9a0
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rsp = 0x00007fff27cde970    r12 = 0x00007fff27cde970
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x00007fff27cdfaf8
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r15 = 0x00007ff7073036d0    rip = 0x000056448fc9e1ca
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO - 16  firefox-bin!main [nsBrowserApp.cpp:5a43cfc8da40906e7374653c06a3fa687665591d : 353 + 0x11]
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rbx = 0x0000000000000012    rbp = 0x00007fff27cdfa10
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      rsp = 0x00007fff27cde9b0    r12 = 0x00007fff27cdfb98
[task 2023-02-06T06:01:11.264Z] 06:01:11     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x000056448fd47bb8
[task 2023-02-06T06:01:11.265Z] 06:01:11     INFO -      r15 = 0x00007fff27cdfaf8    rip = 0x000056448fc9e5a0
[task 2023-02-06T06:01:11.265Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.265Z] 06:01:11     INFO - 17  libc.so.6!__libc_start_main [libc-start.c : 310 + 0x19]
[task 2023-02-06T06:01:11.265Z] 06:01:11     INFO -      rbx = 0x0000000000000000    rbp = 0x000056448fd438b0
[task 2023-02-06T06:01:11.265Z] 06:01:11     INFO -      rsp = 0x00007fff27cdfa20    r12 = 0x000056448fc9e070
[task 2023-02-06T06:01:11.265Z] 06:01:11     INFO -      r13 = 0x00007fff27cdfaf0    r14 = 0x0000000000000000
[task 2023-02-06T06:01:11.265Z] 06:01:11     INFO -      r15 = 0x0000000000000000    rip = 0x00007ff70781db97
[task 2023-02-06T06:01:11.265Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.266Z] 06:01:11     INFO - 18  firefox-bin!_start + 0x28
[task 2023-02-06T06:01:11.266Z] 06:01:11     INFO -      rbx = 0x0000000000000000    rbp = 0x0000000000000000
[task 2023-02-06T06:01:11.266Z] 06:01:11     INFO -      rsp = 0x00007fff27cdfae0    r12 = 0x000056448fc9e070
[task 2023-02-06T06:01:11.266Z] 06:01:11     INFO -      r13 = 0x00007fff27cdfaf0    r14 = 0x0000000000000000
[task 2023-02-06T06:01:11.266Z] 06:01:11     INFO -      r15 = 0x0000000000000000    rip = 0x000056448fc9e099
[task 2023-02-06T06:01:11.266Z] 06:01:11     INFO -     Found by: call frame info
[task 2023-02-06T06:01:11.266Z] 06:01:11     INFO - 
[task 2023-02-06T06:01:11.266Z] 06:01:11     INFO - Thread 1 IPC I/O Child

(In reply to Jens Stutte [:jstutte] from comment #22)

:asuth, would I be too optimistic in hoping that the cleanup from bug 1775784 might help also here ?

Andrew, can you take a look at this please?
Thank you.

Flags: needinfo?(bugmail)
Flags: needinfo?(bugmail)

There have been 70 total failures in the last 7 days.

All failures are on Linux 18.04 x64 WebRender debug.

Recent failure log.

[task 2023-02-14T03:47:54.077Z] 03:47:54     INFO - [Parent 1501, IPC I/O Parent] WARNING: Process 1765 hanging at shutdown; attempting crash report (fatal error).: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:203
[task 2023-02-14T03:47:54.083Z] 03:47:54     INFO - [Parent 1501, IPC I/O Parent] WARNING: ignoring non-exit event for process 1765: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_posix.cc:261
[task 2023-02-14T03:47:55.083Z] 03:47:55     INFO - [Parent 1501, IPC I/O Parent] WARNING: process 1765 exited on signal 6: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_posix.cc:266
[task 2023-02-14T03:47:55.085Z] 03:47:55     INFO - [Parent 1501, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4691
[task 2023-02-14T03:47:55.107Z] 03:47:55     INFO - REFTEST INFO | Copy/paste: /builds/worker/fetches/minidump-stackwalk/minidump-stackwalk --symbols-url=https://symbols.mozilla.org/ --cyborg=/tmp/tmpthjm30nf/2957d1e5-b151-7acf-99b4-be376d1e9873.trace /tmp/tmpwoqrilia.mozrunner/minidumps/2957d1e5-b151-7acf-99b4-be376d1e9873.dmp /builds/worker/workspace/build/symbols
[task 2023-02-14T03:48:00.810Z] 03:48:00     INFO - REFTEST INFO | Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/2957d1e5-b151-7acf-99b4-be376d1e9873.dmp
[task 2023-02-14T03:48:00.923Z] 03:48:00     INFO - REFTEST PROCESS-CRASH | xpcom/string/crashtests/1113005.html (finished) | application crashed [@ __pthread_cond_wait]
[task 2023-02-14T03:48:00.923Z] 03:48:00     INFO - Crash dump filename: /tmp/tmpwoqrilia.mozrunner/minidumps/2957d1e5-b151-7acf-99b4-be376d1e9873.dmp
[task 2023-02-14T03:48:00.923Z] 03:48:00     INFO - Operating system: Linux
[task 2023-02-14T03:48:00.924Z] 03:48:00     INFO -                   4.4.0-1014-aws #14taskcluster1-Ubuntu SMP Tue Apr 3 10:27:00 UTC 2018
[task 2023-02-14T03:48:00.924Z] 03:48:00     INFO - CPU: amd64
[task 2023-02-14T03:48:00.924Z] 03:48:00     INFO -      family 6 model 85 stepping 7
[task 2023-02-14T03:48:00.925Z] 03:48:00     INFO -      2 CPUs
[task 2023-02-14T03:48:00.925Z] 03:48:00     INFO - Linux Ubuntu 18.04 - bionic (Ubuntu 18.04.6 LTS)
[task 2023-02-14T03:48:00.925Z] 03:48:00     INFO - 
[task 2023-02-14T03:48:00.926Z] 03:48:00     INFO - Crash reason:  SIGABRT
[task 2023-02-14T03:48:00.926Z] 03:48:00     INFO - Crash address: 0x0
[task 2023-02-14T03:48:00.926Z] 03:48:00     INFO - Crashing instruction: `cmp rax, -0x1000`
[task 2023-02-14T03:48:00.927Z] 03:48:00     INFO - No memory accessed by instruction
[task 2023-02-14T03:48:00.927Z] 03:48:00     INFO - Process uptime: not available
[task 2023-02-14T03:48:00.927Z] 03:48:00     INFO - 
[task 2023-02-14T03:48:00.927Z] 03:48:00     INFO - Thread 0 file:// Content (crashed)
[task 2023-02-14T03:48:00.927Z] 03:48:00     INFO -  0  libpthread.so.0!__pthread_cond_wait [pthread_cond_wait.c : 655 + 0xfb]
[task 2023-02-14T03:48:00.927Z] 03:48:00     INFO -      rax = 0xfffffffffffffffc    rdx = 0x0000000000000000
[task 2023-02-14T03:48:00.928Z] 03:48:00     INFO -      rcx = 0x00007f20907c69f3    rbx = 0x00007f208f437400
[task 2023-02-14T03:48:00.928Z] 03:48:00     INFO -      rsi = 0x0000000000000080    rdi = 0x00007f208f437428
[task 2023-02-14T03:48:00.928Z] 03:48:00     INFO -      rbp = 0x00007f208f437424    rsp = 0x00007ffe9675f020
[task 2023-02-14T03:48:00.928Z] 03:48:00     INFO -       r8 = 0x0000000000000000     r9 = 0x0000000000000000
[task 2023-02-14T03:48:00.928Z] 03:48:00     INFO -      r10 = 0x0000000000000000    r11 = 0x0000000000000246
[task 2023-02-14T03:48:00.928Z] 03:48:00     INFO -      r12 = 0x00007f208f437428    r13 = 0x0000000000000000
[task 2023-02-14T03:48:00.929Z] 03:48:00     INFO -      r14 = 0x00007f208f437300    r15 = 0x000000000000001c
[task 2023-02-14T03:48:00.929Z] 03:48:00     INFO -      rip = 0x00007f20907c69f3
[task 2023-02-14T03:48:00.929Z] 03:48:00     INFO -     Found by: given as instruction pointer in context
[task 2023-02-14T03:48:00.930Z] 03:48:00     INFO -  1  firefox-bin!mozilla::detail::ConditionVariableImpl::wait(mozilla::detail::MutexImpl&) [ConditionVariable_posix.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 106 + 0xa]
[task 2023-02-14T03:48:00.930Z] 03:48:00     INFO -      rbx = 0x00007f208f437400    rbp = 0x00007ffe9675f100
[task 2023-02-14T03:48:00.930Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f0f0    r12 = 0x7fffffffffffffff
[task 2023-02-14T03:48:00.930Z] 03:48:00     INFO -      r13 = 0x00007f208f4894c0    r14 = 0x00007ffe9675f188
[task 2023-02-14T03:48:00.931Z] 03:48:00     INFO -      r15 = 0x00007f208f437400    rip = 0x000055ad102ff8f1
[task 2023-02-14T03:48:00.931Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.931Z] 03:48:00     INFO -  2  firefox-bin!mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) [ConditionVariable_posix.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 113 + 0x7]
[task 2023-02-14T03:48:00.932Z] 03:48:00     INFO -      rbx = 0x00007f208f4373e0    rbp = 0x00007ffe9675f170
[task 2023-02-14T03:48:00.932Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f110    r12 = 0x7fffffffffffffff
[task 2023-02-14T03:48:00.933Z] 03:48:00     INFO -      r13 = 0x00007f208f4894c0    r14 = 0x00007ffe9675f188
[task 2023-02-14T03:48:00.933Z] 03:48:00     INFO -      r15 = 0x00007f208f437400    rip = 0x000055ad102ff96e
[task 2023-02-14T03:48:00.933Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.934Z] 03:48:00     INFO -  3  libxul.so!mozilla::OffTheBooksCondVar::Wait(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>) [BlockingResourceBase.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 534 + 0xf]
[task 2023-02-14T03:48:00.934Z] 03:48:00     INFO -      rbx = 0x00007f208f4373e0    rbp = 0x00007ffe9675f1c0
[task 2023-02-14T03:48:00.934Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f180    r12 = 0x0000000000000000
[task 2023-02-14T03:48:00.935Z] 03:48:00     INFO -      r13 = 0x00007f208f4894c0    r14 = 0x00007ffe9675f188
[task 2023-02-14T03:48:00.935Z] 03:48:00     INFO -      r15 = 0x00007f208f437301    rip = 0x00007f207e351f46
[task 2023-02-14T03:48:00.935Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.935Z] 03:48:00     INFO -  4  libxul.so!mozilla::OffTheBooksCondVar::Wait() [BlockingResourceBase.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 514 + 0x4]
[task 2023-02-14T03:48:00.935Z] 03:48:00     INFO -      rbx = 0x00007f208f437300    rbp = 0x00007ffe9675f1d0
[task 2023-02-14T03:48:00.935Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f1d0    r12 = 0x00007f207aacca43
[task 2023-02-14T03:48:00.935Z] 03:48:00     INFO -      r13 = 0x00007ffe9675f1e8    r14 = 0xaaaaaaaaaaaaaaaa
[task 2023-02-14T03:48:00.936Z] 03:48:00     INFO -      r15 = 0x00007f208f4373e0    rip = 0x00007f207e351e9b
[task 2023-02-14T03:48:00.936Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.936Z] 03:48:00     INFO -  5  libxul.so!mozilla::TaskController::GetRunnableForMTTask(bool) [TaskController.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 587 + 0x7]
[task 2023-02-14T03:48:00.937Z] 03:48:00     INFO -      rbx = 0x00007f208f437300    rbp = 0x00007ffe9675f220
[task 2023-02-14T03:48:00.937Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f1e0    r12 = 0x00007f207aacca43
[task 2023-02-14T03:48:00.938Z] 03:48:00     INFO -      r13 = 0x00007ffe9675f1e8    r14 = 0xaaaaaaaaaaaaaaaa
[task 2023-02-14T03:48:00.938Z] 03:48:00     INFO -      r15 = 0x00007f208f4373e0    rip = 0x00007f207e35b18b
[task 2023-02-14T03:48:00.938Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.939Z] 03:48:00     INFO -  6  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 1159 + 0xe]
[task 2023-02-14T03:48:00.939Z] 03:48:00     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffe9675f380
[task 2023-02-14T03:48:00.939Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f230    r12 = 0x00007f208f478248
[task 2023-02-14T03:48:00.939Z] 03:48:00     INFO -      r13 = 0x00007f208f478200    r14 = 0x00007ffe9675f397
[task 2023-02-14T03:48:00.939Z] 03:48:00     INFO -      r15 = 0x00007ffe9675f201    rip = 0x00007f207e37232f
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -  7  libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 477 + 0xe]
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffe9675f3b0
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f390    r12 = 0x00007f207b49a8c0
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -      r13 = 0x00007f207ae099e5    r14 = 0x00007ffe9675f397
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -      r15 = 0x00007f208f478200    rip = 0x00007f207e376b1f
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -  8  libxul.so!mozilla::dom::workerinternals::RuntimeService::Cleanup() [RuntimeService.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 1634 + 0xc]
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -      rbx = 0x00007f2070d22dc8    rbp = 0x00007ffe9675f720
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f3c0    r12 = 0x00007f207b49a8c0
[task 2023-02-14T03:48:00.940Z] 03:48:00     INFO -      r13 = 0x00007f207ae099e5    r14 = 0x00007f2070d22dc0
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -      r15 = 0x00007f208f478200    rip = 0x00007f208143f1f4
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -  9  libxul.so!mozilla::dom::workerinternals::RuntimeService::Observe(nsISupports*, char const*, char16_t const*) [RuntimeService.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 1926 + 0x7]
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -      rbx = 0x00007f207ae099e5    rbp = 0x00007ffe9675f760
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f730    r12 = 0x0000000000000000
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -      r13 = 0x00007f207ae099e5    r14 = 0x00007f2070d22dc0
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -      r15 = 0x0000000000000000    rip = 0x00007f2081442dbd
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO - 10  libxul.so!nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverList.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 70 + 0xe]
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffe9675f7a0
[task 2023-02-14T03:48:00.941Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f770    r12 = 0x0000000000000000
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -      r13 = 0x00007f207ae099e5    r14 = 0x0000000000000000
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -      r15 = 0x00007f207ae099e5    rip = 0x00007f207e2ea8fb
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO - 11  libxul.so!nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverService.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 291 + 0x18]
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -      rbx = 0x0000000000000000    rbp = 0x00007ffe9675f8d0
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f7b0    r12 = 0x0000000000000000
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -      r13 = 0x00007f207ae099e5    r14 = 0x00007f208f4a6e48
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -      r15 = 0x00007ffe9675f800    rip = 0x00007f207e2ebbcf
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO - 12  libxul.so!mozilla::AppShutdown::AdvanceShutdownPhaseInternal(mozilla::ShutdownPhase, bool, char16_t const*, nsCOMPtr<nsISupports> const&) [AppShutdown.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 400 + 0x10]
[task 2023-02-14T03:48:00.942Z] 03:48:00     INFO -      rbx = 0xaaaaaaaaaaaaaaaa    rbp = 0x00007ffe9675f940
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f8e0    r12 = 0x00007f207ae099e5
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -      r13 = 0x00007ffe9675f8f8    r14 = 0x00007ffe9675f908
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -      r15 = 0x0000000000000009    rip = 0x00007f207e284945
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO - 13  libxul.so!mozilla::ShutdownXPCOM(nsIServiceManager*) [XPCOMInit.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 590 + 0xe]
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -      rbx = 0x00007ffe9675f968    rbp = 0x00007ffe9675f9f0
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -      rsp = 0x00007ffe9675f950    r12 = 0x0000000000000000
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -      r13 = 0x00007ffe96760e88    r14 = 0x0000000000000000
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -      r15 = 0x00007ffe9675f960    rip = 0x00007f207e3ab69e
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.943Z] 03:48:00     INFO - 14  libxul.so!XRE_InitChildProcess(int, char**, XREChildData const*) [nsEmbedFunctions.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 679 + 0x10]
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -      rbx = 0x00007ffe9675fa18    rbp = 0x00007ffe9675fcf0
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -      rsp = 0x00007ffe9675fa00    r12 = 0x00000000000005dd
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -      r13 = 0x00007ffe96760e88    r14 = 0x000000000000000f
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -      r15 = 0x00007ffe9675fd01    rip = 0x00007f2082daf840
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO - 15  firefox-bin!content_process_main(mozilla::Bootstrap*, int, char**) [plugin-container.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 57 + 0x10]
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -      rbx = 0x0000000000000012    rbp = 0x00007ffe9675fd30
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -      rsp = 0x00007ffe9675fd00    r12 = 0x00007ffe9675fd00
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x00007ffe96760e88
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -      r15 = 0x00007f208f4036d0    rip = 0x000055ad102911ca
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.944Z] 03:48:00     INFO - 16  firefox-bin!main [nsBrowserApp.cpp:b57b0356d5ddc81da52ec7c81343fb3516c09eaa : 353 + 0x11]
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -      rbx = 0x0000000000000012    rbp = 0x00007ffe96760da0
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -      rsp = 0x00007ffe9675fd40    r12 = 0x00007ffe96760f28
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x000055ad1033ab28
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -      r15 = 0x00007ffe96760e88    rip = 0x000055ad102915a0
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO - 17  libc.so.6!__libc_start_main [libc-start.c : 310 + 0x19]
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -      rbx = 0x0000000000000000    rbp = 0x000055ad10336820
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -      rsp = 0x00007ffe96760db0    r12 = 0x000055ad10291070
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -      r13 = 0x00007ffe96760e80    r14 = 0x0000000000000000
[task 2023-02-14T03:48:00.945Z] 03:48:00     INFO -      r15 = 0x0000000000000000    rip = 0x00007f208f8a6b97
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO - 18  firefox-bin!_start + 0x28
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO -      rbx = 0x0000000000000000    rbp = 0x0000000000000000
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO -      rsp = 0x00007ffe96760e70    r12 = 0x000055ad10291070
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO -      r13 = 0x00007ffe96760e80    r14 = 0x0000000000000000
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO -      r15 = 0x0000000000000000    rip = 0x000055ad10291099
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO -     Found by: call frame info
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO - 
[task 2023-02-14T03:48:00.946Z] 03:48:00     INFO - Thread 1 IPC I/O Child

Hello Nika! Could you please take a look at this? This was in the disable-recommended list. Maybe you can help us assign this to someone.
Thank you!

Flags: needinfo?(nika)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]

The linked log is showing that the content process is hanging during workers shutdown. ni? :jstutte for insight into why we'd be hanging during worker runtimeservice shutdown.

Flags: needinfo?(nika) → needinfo?(jstutte)

Worker shutdown happens in two steps:

and indeed we see an idling DOM Worker thread in the same, hanging content process:

[task 2023-02-15T23:14:08.735Z] 23:14:08     INFO - Thread 15 DOM Worker
[task 2023-02-15T23:14:08.735Z] 23:14:08     INFO -  0  libpthread.so.0!__pthread_cond_wait [pthread_cond_wait.c : 655 + 0xfb]
[task 2023-02-15T23:14:08.740Z] 23:14:08     INFO -  1  firefox-bin!mozilla::detail::ConditionVariableImpl::wait(mozilla::detail::MutexImpl&) [ConditionVariable_posix.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 106 + 0xa]
[task 2023-02-15T23:14:08.743Z] 23:14:08     INFO -  2  firefox-bin!mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) [ConditionVariable_posix.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 113 + 0x7]
[task 2023-02-15T23:14:08.746Z] 23:14:08     INFO -  3  libxul.so!mozilla::OffTheBooksCondVar::Wait(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>) [BlockingResourceBase.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 534 + 0xf]
[task 2023-02-15T23:14:08.749Z] 23:14:08     INFO -  4  libxul.so!mozilla::OffTheBooksCondVar::Wait() [BlockingResourceBase.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 514 + 0x4]
[task 2023-02-15T23:14:08.752Z] 23:14:08     INFO -  5  libxul.so!mozilla::dom::WorkerPrivate::WaitForWorkerEvents() [WorkerPrivate.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 3866 + 0xb]
[task 2023-02-15T23:14:08.755Z] 23:14:08     INFO -  6  libxul.so!mozilla::dom::WorkerPrivate::RunCurrentSyncLoop() [WorkerPrivate.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 4324 + 0x7]
[task 2023-02-15T23:14:08.757Z] 23:14:08     INFO -  7  libxul.so!mozilla::dom::AutoSyncLoopHolder::Run() [WorkerPrivate.h:99d62e2f0002221e0bff4b09e69b879baeb69532 : 1531 + 0x4]
[task 2023-02-15T23:14:08.760Z] 23:14:08     INFO -  8  libxul.so!mozilla::dom::WorkerMainThreadRunnable::Dispatch(mozilla::dom::WorkerStatus, mozilla::ErrorResult&) [WorkerRunnable.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 570 + 0x4]
[task 2023-02-15T23:14:08.763Z] 23:14:08     INFO -  9  libxul.so!mozilla::dom::WorkerThreadProxySyncRunnable::Dispatch(mozilla::dom::WorkerStatus, mozilla::ErrorResult&) [XMLHttpRequestWorker.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 217 + 0x4]
[task 2023-02-15T23:14:08.765Z] 23:14:08     INFO - 10  libxul.so!mozilla::dom::XMLHttpRequestWorker::ReleaseProxy(mozilla::dom::XMLHttpRequestWorker::ReleaseType) [XMLHttpRequestWorker.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 1462 + 0xf]
[task 2023-02-15T23:14:08.768Z] 23:14:08     INFO - 11  libxul.so!fu2::abi_400::detail::type_erasure::tables::vtable<fu2::abi_400::detail::property<false, false, void ()> >::invoke<(unsigned long)0, fu2::abi_400::detail::type_erasure::data_accessor*, unsigned long const&>(fu2::abi_400::detail::type_erasure::data_accessor*, unsigned long const&) const [function2.hpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 1036]
[task 2023-02-15T23:14:08.769Z] 23:14:08     INFO - 12  libxul.so!fu2::abi_400::detail::type_erasure::erasure<true, fu2::abi_400::detail::config<true, false, fu2::capacity_fixed<(unsigned long)16, (unsigned long)8> >, fu2::abi_400::detail::property<false, false, void ()> >::invoke<(unsigned long)0, fu2::abi_400::detail::type_erasure::erasure<true, fu2::abi_400::detail::config<true, false, fu2::capacity_fixed<(unsigned long)16, (unsigned long)8> >, fu2::abi_400::detail::property<false, false, void ()> >&, >(fu2::abi_400::detail::type_erasure::erasure<true, fu2::abi_400::detail::config<true, false, fu2::capacity_fixed<(unsigned long)16, (unsigned long)8> >, fu2::abi_400::detail::property<false, false, void ()> >&, &&) [function2.hpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 1258]
[task 2023-02-15T23:14:08.770Z] 23:14:08     INFO - 13  libxul.so!fu2::abi_400::detail::type_erasure::invocation_table::operator_impl<(unsigned long)0, fu2::abi_400::detail::function<fu2::abi_400::detail::config<true, false, fu2::capacity_fixed<(unsigned long)16, (unsigned long)8> >, fu2::abi_400::detail::property<false, false, void ()> >, void ()>::operator()() [function2.hpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 816]
[task 2023-02-15T23:14:08.770Z] 23:14:08     INFO - 14  libxul.so!mozilla::dom::WorkerRef::Notify() [WorkerRef.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 97 + 0xb]
[task 2023-02-15T23:14:08.773Z] 23:14:08     INFO - 15  libxul.so!mozilla::dom::WorkerPrivate::NotifyWorkerRefs(mozilla::dom::WorkerStatus) [WorkerPrivate.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 4135 + 0x5]
[task 2023-02-15T23:14:08.776Z] 23:14:08     INFO - 16  libxul.so!mozilla::dom::WorkerPrivate::NotifyInternal(mozilla::dom::WorkerStatus) [WorkerPrivate.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 4815 + 0xa]
[task 2023-02-15T23:14:08.779Z] 23:14:08     INFO - 17  libxul.so!mozilla::dom::WorkerRunnable::Run() [WorkerRunnable.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 377 + 0x13]
[task 2023-02-15T23:14:08.781Z] 23:14:08     INFO - 18  libxul.so!mozilla::dom::WorkerPrivate::ProcessAllControlRunnablesLocked() [WorkerPrivate.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 3887 + 0x5]
[task 2023-02-15T23:14:08.784Z] 23:14:08     INFO - 19  libxul.so!mozilla::dom::WorkerPrivate::DoRunLoop(JSContext*) [WorkerPrivate.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 3174 + 0x7]
[task 2023-02-15T23:14:08.786Z] 23:14:08     INFO - 20  libxul.so!mozilla::dom::workerinternals::(anonymous namespace)::WorkerThreadPrimaryRunnable::Run() [RuntimeService.cpp:99d62e2f0002221e0bff4b09e69b879baeb69532 : 2044 + 0xb]

Apparently we are receiving the cancel notification and while notifying the worker refs we find one that makes us create a sync loop where we get stuck.

It might look as if all of this started with the landing of bug 1803062, where we touched some sensible part of these code paths? The first instances here are from Dec 13 where we first landed it (got backed out and relanded on Dec 15). But interestingly, the patch from bug 1793525 which makes us recognize these problems landed around the same time.

Flags: needinfo?(jvarga)
Flags: needinfo?(jstutte)
Flags: needinfo?(jld)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered]
Severity: -- → S3
Priority: -- → P2

(In reply to Jens Stutte [:jstutte] from comment #22)

:asuth, would I be too optimistic in hoping that the cleanup from bug 1775784 might help also here ?

I don't think it will help.

Maybe we could have the logic that notices a content process is hanging and says it will wait a bit (posix win) send an IPC message to the content process that could call a variant of RuntimeService::CrashIfHanging that just printfs the message it builds up instead of calling MOZ_CRASH_UNSAFE? (Like we can refactor all the code into nsCString InspectRemainingWorkers() and just have CrashIfHanging call that, and if the string is not void then it will crash. (But in the IPC case we always would print out like "WorkerInspection: ..." so it's easy to find in the log.

(In reply to Jens Stutte [:jstutte] from comment #38)

and indeed we see an idling DOM Worker thread in the same, hanging content process:

I'm having trouble identifying which log this is from; do you know? This seems like it may be a different problem since a failure as shown in this stack (in XMLHttpRequestWorker::ReleaseProxy) is inherently distinctive because it must be on the worker stack. Also, there may be something interesting happening on other threads because these crashtests use the file scheme and so the sync XHR usage in the dom/fs crash tests (1 2) could potentially be showing something like file I/O still happening on the STS.

It might look as if all of this started with the landing of bug 1803062, where we touched some sensible part of these code paths? The first instances here are from Dec 13 where we first landed it (got backed out and relanded on Dec 15). But interestingly, the patch from bug 1793525 which makes us recognize these problems landed around the same time.

I dug into the dom/fs logic that could be related to this but nothing jumped out at me. I think having the CrashIfHanging data would probably be most promising.

Flags: needinfo?(bugmail)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered]

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #41)

(In reply to Jens Stutte [:jstutte] from comment #38)

and indeed we see an idling DOM Worker thread in the same, hanging content process:

I'm having trouble identifying which log this is from; do you know? This seems like it may be a different problem since a failure as shown in this stack (in XMLHttpRequestWorker::ReleaseProxy) is inherently distinctive because it must be on the worker stack. Also, there may be something interesting happening on other threads because these crashtests use the file scheme and so the sync XHR usage in the dom/fs crash tests (1 2) could potentially be showing something like file I/O still happening on the STS.

For example here.

I touched base with :jstutte and Joshua; here's some more context/details for my proposal:

Problem

  • This bug (and potentially related bugs) are a family of shutdown hangs that:
    • Only happen in content processes; we'd see a different signature in the parent process where nsTerminator would call RuntimeService::CrashIfHanging.
      • I think the terminator is only used in the parent process and maybe some other processes? :jstutte, please feel free to correct me if I am wrong. I asked in #codecoverage about bug 1685505 which could provide coverage information on a per-process basis, but there weren't any runs immediately available to download and check.
    • Only happen in debug-ish builds, as in opt builds we would never let the content process get to xpcom-shutdown-threads where we're hanging.
    • Are indicative of a logic bug that is failing to remove a WorkerRef in response to a worker transitioning to cancellation.
      • It is quite plausible that the problem here will be fixed by bug 1800659 since we may be seeing a situation where a runnable is being cleaned up by ClearMainEventQueue which is somewhat of a foot-gun.
    • Probably will show up under a wide variety of different test signatures because we only detect the problem when an effort is made to shut down the content process, which is likely to only happen once all the tests are done running (if the worker was running under a common origin), or will only happen multiple seconds after the test has completed thanks to the grace periods (for a different origin, or if a dedicated ServiceWorker process is involved.)

Prior Proposal from comment 41

  • The information provided by RuntimeService::CrashIfHanging is pretty handy, but not available in this situation because the content process is being terminated by the parent process in ChildReaper/ChildLaxReaper::CrashProcessIfHanging (on posix, on windows).
  • Old plan from comment 41: It's possible we could do something more involved to change the nature of the crash so that in this content process case we are actually crashing by calling the workers RuntimeService::CrashIfHanging, but I think it's potentially more useful and less risky of a change if we:
    • Introduce a new method RuntimeService::DumpRunningWorkers which could be triggered by ChildReaper::CrashProcessIfHanging and which could also be used interactively from gdb, similar to how nsIContent::Dump can be used to dump the contents of DOM elements/nodes or DumpJSStack can dump the current JS stack.
    • Add a new #ifdef DEBUG IPC method like SlowShutdownDumpRelevantInfo to PContent (maybe adjacent to the existing shutdown-related methods which could be sent by the ChildReaper::CrashProcessIfHanging methods immediately prior to beginning their sleep. The main rationale is to avoid having this message and its processing race the termination of the child process; sending the message immediately before calling kill() is obviously likely to not accomplish anything

Revised plan without IPC

While researching the IPC implications, I think I realized my proposal is impossible[1] because of when CrashProcessIfHanging is invoked. I asked in https://chat.mozilla.org/#/room/#ipc:mozilla.org about that, but in the process I think I realized it's maybe better if we just:

  • In our event loop spinning in RuntimeService::Cleanup, we should add a short timer in content processes that causes us to call DumpRunningWorkers (which we would add) after a second or something like that.
  • We could cause a crash if we wanted, but we would need to include the string "fatal error" to make treeherder recognize what's happening. Alternately, we could not cause a crash and let the current situation continue to happen where the parent process causes the crash, but we will have the diagnostic information in the log already.

1: Edit: Nika indicated on https://chat.mozilla.org/#/room/#ipc:mozilla.org that when that call is made we are very deep in shutdown, as at that point the single IO thread has been shutdown, probably here and the main thread will also no longer be processing runnables, etc.

Flags: needinfo?(jvarga)

Additional discussion in https://chat.mozilla.org/#/room/#ipc:mozilla.org suggests maybe nsTerminator can just address this situation? It sounds like maybe I was wrong about it getting used in content processes. :jstutte, I assume you might know offhand?

Flags: needinfo?(jstutte)

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #44)

Additional discussion in https://chat.mozilla.org/#/room/#ipc:mozilla.org suggests maybe nsTerminator can just address this situation? It sounds like maybe I was wrong about it getting used in content processes. :jstutte, I assume you might know offhand?

The nsTerminator singleton is initialized only via nsAppStartup::Quit or nsWindow::ProcessMessageInternal both of which are not supposed to run in content processes. So, while there is no explicit check or assert inside AppShutdown::Init that ensures anything, the terminator is currently only used in the parent process, AFAICS.

(In particular RuntimeService::CrashIfHanging is only ever called in the parent process (via the terminator), which feels sound to me as only the parent process can be supposed to have the complete oversight over all existing service workers.)

Content children are governed by two (!) other shutdown timers, ContentParent::StartForceKillTimer and ContentChild::StartForceKillTimer, racing in the parent and the content process (with the same dom_ipc_tabs_shutdownTimeoutSecs). The philosophy here is that

  • if the content process is hanging, the parent will kill it
  • if the parent process is too busy to acknowledge a child's shutdown, the child will kill itself. Note that this timer is only started after "content-child-shutdown" has been already notified.

See bug 1279293 comment 208 for situations where either can happen. Here we see a situation where the parent's timer fired and kills us, apparently, as you already noted in comment 43. We are stuck in the loop inside RuntimeService::Cleanup which is called late during "xpcom-shutdown-threads" in the content process.

Bug 1746591 already proposes to move RuntimeService::Cleanup to an async shutdown blocker, but that alone would not necessarily heal the flaws of our book-keeping here. Your proposal to do something on the lines of RuntimeService::CrashIfHanging to dump out our book-keeping if we are stuck in that loop for a while seems reasonable to me.

Flags: needinfo?(jstutte)

(In reply to Jens Stutte [:jstutte] from comment #45)

Bug 1746591 already proposes to move RuntimeService::Cleanup to an async shutdown blocker, but that alone would not necessarily heal the flaws of our book-keeping here. Your proposal to do something on the lines of RuntimeService::CrashIfHanging to dump out our book-keeping if we are stuck in that loop for a while seems reasonable to me.

Note that if we really want to be able to react on shutdown in a content process, we should observe "content-child-shutdown" or "quit-application-granted" as these are the only notifications that are always fired during shutdown. In particular the entire XPCOM shutdown is not happening in release at all. But I assume we can just omit the cleanup also for workers? Or should we tell the parent that the workers using our process are going away?

Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered]
Assignee: nobody → jmarshall
Attachment #9324663 - Attachment description: WIP: Bug 1805613 - Dump worker info on shutdown hang → Bug 1805613 - Dump worker info on shutdown hang r=#dom-worker-reviewers!
Status: NEW → ASSIGNED
Pushed by jmarshall@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4f9330b81814 Dump worker info on shutdown hang r=asuth
Keywords: leave-open

The log for https://treeherder.mozilla.org/logviewer?job_id=410819921&repo=autoland shows:

[task 2023-03-30T15:40:39.044Z] 15:40:39 INFO - GECKO(2497) | [Child 4032: Main Thread]: D/WorkerShutdownDump Found ActiveWorker (dedicated): resource://gre/modules/translation/cld-worker.js
[task 2023-03-30T15:40:39.045Z] 15:40:39 INFO - GECKO(2497) | [Child 4032: Main Thread]: D/WorkerShutdownDump Principal: [System Principal]
[task 2023-03-30T15:40:39.045Z] 15:40:39 INFO - GECKO(2497) | [Child 4032: Main Thread]: D/WorkerShutdownDump LoadingPrincipal: [System Principal]
[task 2023-03-30T15:40:39.047Z] 15:40:39 INFO - GECKO(2497) | [Child 4032: Main Thread]: D/WorkerShutdownDump BusyCount: 3
[task 2023-03-30T15:40:39.048Z] 15:40:39 INFO - GECKO(2497) | [Child 4032: Main Thread]: D/WorkerShutdownDump CrashInfo: IsChromeWorker(false)|ScriptLoader|XMLHttpRequestWorker

Woo! Hooray debug logging!

So cld-worker.js has 2 uses of XHR:

  • sync:
c.read=function(a){var b=new XMLHttpRequest;b.open("GET",a,!1)
  • async:
c.readAsync=function(a,b,d){var e=new XMLHttpRequest;e.open("GET",a,!0)

And it seems like the worker is created from LanguageDetector.sys.mjs which seems to lack any shutdown awareness. Gonna quickly file a bug there.

I filed Bug 1826222 on the LanguageDetector being shutdown-aware, but I should note that the shutdown hang in question very much seems like something that is strictly due to workers-related bugs, so there's no reason to block on that bug.

It seems very possible that the problem here may be something involving either cancellation, which we should ideally fix soon in bug 1800659, and/or scriptloader complications, and there are various recently landed fixes and imminently landing fixes for that, so I'm not sure there's more to do here on this bug other than scrape the various intermittents that get reported to see if they differ from comment 73 or trends change.

edit: Note that the ScriptLoader WorkerRef may not actually be causing problems, it could just be a side effect of the XHR spinning a syncloop on top of the script loader's syncloop.

Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered]

Clicking on some recent instances I saw:

[task 2023-04-15T23:36:17.175Z] 23:36:17     INFO - [Child 1806: Main Thread]: D/WorkerShutdownDump Found ActiveWorker (dedicated): blob:null/ecd76e1a-8290-4598-8cb2-e5d8e5cb235a
[task 2023-04-15T23:36:17.176Z] 23:36:17     INFO - [Child 1806: Main Thread]: D/WorkerShutdownDump Principal: file://UNIVERSAL_FILE_URI_ORIGIN
[task 2023-04-15T23:36:17.177Z] 23:36:17     INFO - [Child 1806: Main Thread]: D/WorkerShutdownDump LoadingPrincipal: file://UNIVERSAL_FILE_URI_ORIGIN
[task 2023-04-15T23:36:17.177Z] 23:36:17     INFO - [Child 1806: Main Thread]: D/WorkerShutdownDump BusyCount: 1
[task 2023-04-15T23:36:17.178Z] 23:36:17     INFO - [Child 1806: Main Thread]: D/WorkerShutdownDump CrashInfo: IsChromeWorker(false)|WorkerDebuggeeRunnable::mSender

Not sure what that actually means, though. Do blobs keep a worker reference until they live? From a short search in searchfox I think "not in general", but there is CreateImageBitmapFromBlob::WorkerShuttingDown that looks a bit suspicious, fiddling with a mutex. But I have no real evidence that this ref is even involved here, of course. FWIW, the DOM Worker thread seems to be alive and just happily waiting for new events inside WorkerPrivate::DoRunLoop (which makes sense, as we'll stay there when HasActiveWorkerRefs), so I assume we are not stuck on that mutex, at least.

Flags: needinfo?(bugmail)

(In reply to Jens Stutte [:jstutte] from comment #83)

Not sure what that actually means, though. Do blobs keep a worker reference until they live?

I'm not sure I'm parsing you correctly here, but if this was related to loading the top-level script via Blob URL, we would expect to see a "ScriptLoader" StrongWorkerRef.

In general, Blobs themselves don't cause lifetime issues, but attempts to read their contents are potentially async and those will involve StrongWorkerRef instances.

From a short search in searchfox I think "not in general", but there is CreateImageBitmapFromBlob::WorkerShuttingDown that looks a bit suspicious, fiddling with a mutex. But I have no real evidence that this ref is even involved here, of course.

I don't think it's involved in this case based on a search for createImageBitmap and searching in the test log. (Caveat: as discussed in https://chat.mozilla.org/#/room/#test:mozilla.org it's a bit difficult to figure out what's going on test-wise from that job. Ideally bug 1608836 would be fixed so we can see the "Test groups" there and then maybe even something like searchfox could make it possible to do a search filtered down to the tests that would have been run for the job.)

FWIW, the DOM Worker thread seems to be alive and just happily waiting for new events inside WorkerPrivate::DoRunLoop (which makes sense, as we'll stay there when HasActiveWorkerRefs), so I assume we are not stuck on that mutex, at least.

I think the Busycount being 1 suggests that something like the ParentStatus check we were going to address in another patch by chaning to >= Killing or similar[1] is failing a dispatch to the worker and the runnable is getting leaked and so the workerref is never released. Bug 1800659 will potentially help with that, but may not end up addressing the ParentStatus check.

1: I'm struggling to find the patch for that. Our phabricator emails continue to no longer send me a copy of my own comments (https://bugzilla.mozilla.org/show_bug.cgi?id=1666365) and the phabricator search times out after 30 seconds for most interesting queries.

Flags: needinfo?(bugmail)

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #84)

(In reply to Jens Stutte [:jstutte] from comment #83)
I'm not sure I'm parsing you correctly here, but if this was related to loading the top-level script via Blob URL, we would expect to see a "ScriptLoader" StrongWorkerRef.

IIUC then the blob:null/ecd76e1a-8290-4598-8cb2-e5d8e5cb235a is just the "fake" URL of the blob that contains our worker JS. A stronger hint might be the last line of the log pointing to the WorkerDebuggeeRunnable::mSender ref.

I think the Busycount being 1 suggests that something like the ParentStatus check we were going to address in another patch by chaning to >= Killing or similar[1] is failing a dispatch to the worker and the runnable is getting leaked and so the workerref is never released. Bug 1800659 will potentially help with that, but may not end up addressing the ParentStatus check.

This would also apply to any WorkerDebuggeeRunnable, and in particular the MessageEventRunnable, see also bug 1769913 for an earlier suspect (already linked to bug 1800659, too).

Update

There have been 32 failures, all of them on Linux 18.04 x64 WebRender debug.

[task 2023-04-30T22:06:06.826Z] 22:06:06     INFO - PROCESS-CRASH | Main app process exited normally | application crashed [@ __pthread_cond_wait]
[task 2023-04-30T22:06:06.826Z] 22:06:06     INFO - Crash dump filename: /tmp/tmp_h_e8akm.mozrunner/minidumps/4bd1e343-f5b6-f915-01c1-c94c0ea6bd08.dmp
[task 2023-04-30T22:06:06.826Z] 22:06:06     INFO - Operating system: Linux
[task 2023-04-30T22:06:06.826Z] 22:06:06     INFO -                   4.4.0-1014-aws #14taskcluster1-Ubuntu SMP Tue Apr 3 10:27:00 UTC 2018
[task 2023-04-30T22:06:06.826Z] 22:06:06     INFO - CPU: amd64
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO -      family 6 model 85 stepping 7
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO -      2 CPUs
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - Linux Ubuntu 18.04 - bionic (Ubuntu 18.04.6 LTS)
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - 
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - Crash reason:  SIGABRT
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - Crash address: 0x0
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - Crashing instruction: `cmp rax, -0x1000`
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - No memory accessed by instruction
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - Process uptime: not available
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - 
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO - Thread 0 Isolated Web Co (crashed)
[task 2023-04-30T22:06:06.827Z] 22:06:06     INFO -  0  libpthread.so.0!__pthread_cond_wait [pthread_cond_wait.c : 655 + 0xfb]
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -      rax = 0xfffffffffffffffc    rdx = 0x0000000000000000
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -      rcx = 0x00007f6a7082a9f3    rbx = 0x00007f6a6f431418
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -      rsi = 0x0000000000000080    rdi = 0x00007f6a6f431440
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -      rbp = 0x00007f6a6f43143c    rsp = 0x00007ffe037bf950
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -       r8 = 0x0000000000000000     r9 = 0x0000000000000000
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -      r10 = 0x0000000000000000    r11 = 0x0000000000000246
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -      r12 = 0x00007f6a6f431440    r13 = 0x0000000000000000
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -      r14 = 0x00007f6a6f431300    r15 = 0x0000000000000060
[task 2023-04-30T22:06:06.828Z] 22:06:06     INFO -      rip = 0x00007f6a7082a9f3
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -     Found by: given as instruction pointer in context
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -  1  firefox-bin!mozilla::detail::ConditionVariableImpl::wait(mozilla::detail::MutexImpl&) [ConditionVariable_posix.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 106 + 0xa]
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -      rbx = 0x00007f6a6f431418    rbp = 0x00007ffe037bfa30
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -      rsp = 0x00007ffe037bfa20    r12 = 0x7fffffffffffffff
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -      r13 = 0x00007f6a6f4834c0    r14 = 0x00007ffe037bfab8
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -      r15 = 0x00007f6a6f431301    rip = 0x0000561cce76518e
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -  2  firefox-bin!mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) [ConditionVariable_posix.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 113 + 0x7]
[task 2023-04-30T22:06:06.829Z] 22:06:06     INFO -      rbx = 0x00007f6a6f431418    rbp = 0x00007ffe037bfaa0
[task 2023-04-30T22:06:06.830Z] 22:06:06     INFO -      rsp = 0x00007ffe037bfa40    r12 = 0x7fffffffffffffff
[task 2023-04-30T22:06:06.830Z] 22:06:06     INFO -      r13 = 0x00007f6a6f4834c0    r14 = 0x00007ffe037bfab8
[task 2023-04-30T22:06:06.830Z] 22:06:06     INFO -      r15 = 0x00007f6a6f431301    rip = 0x0000561cce76520a
[task 2023-04-30T22:06:06.830Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.830Z] 22:06:06     INFO -  3  libxul.so!mozilla::OffTheBooksCondVar::Wait(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>) [BlockingResourceBase.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 534 + 0xf]
[task 2023-04-30T22:06:06.830Z] 22:06:06     INFO -      rbx = 0x00007f6a6f4313f8    rbp = 0x00007ffe037bfaf0
[task 2023-04-30T22:06:06.830Z] 22:06:06     INFO -      rsp = 0x00007ffe037bfab0    r12 = 0x0000000000000000
[task 2023-04-30T22:06:06.830Z] 22:06:06     INFO -      r13 = 0x00007f6a6f4834c0    r14 = 0x00007ffe037bfab8
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -      r15 = 0x00007f6a6f431301    rip = 0x00007f6a5c183d20
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -  4  libxul.so!mozilla::OffTheBooksCondVar::Wait() [BlockingResourceBase.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 514 + 0x4]
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -      rbx = 0x00007f6a6f431300    rbp = 0x00007ffe037bfb00
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -      rsp = 0x00007ffe037bfb00    r12 = 0x00007f6a589ba58a
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -      r13 = 0x00007ffe037bfb18    r14 = 0xaaaaaaaaaaaaaaaa
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -      r15 = 0x00007f6a6f4313f8    rip = 0x00007f6a5c183c75
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.831Z] 22:06:06     INFO -  5  libxul.so!mozilla::TaskController::GetRunnableForMTTask(bool) [TaskController.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 603 + 0x7]
[task 2023-04-30T22:06:06.832Z] 22:06:06     INFO -      rbx = 0x00007f6a6f431300    rbp = 0x00007ffe037bfb50
[task 2023-04-30T22:06:06.832Z] 22:06:06     INFO -      rsp = 0x00007ffe037bfb10    r12 = 0x00007f6a589ba58a
[task 2023-04-30T22:06:06.832Z] 22:06:06     INFO -      r13 = 0x00007ffe037bfb18    r14 = 0xaaaaaaaaaaaaaaaa
[task 2023-04-30T22:06:06.832Z] 22:06:06     INFO -      r15 = 0x00007f6a6f4313f8    rip = 0x00007f6a5c18b71b
[task 2023-04-30T22:06:06.832Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.832Z] 22:06:06     INFO -  6  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 1173 + 0xf]
[task 2023-04-30T22:06:06.832Z] 22:06:06     INFO -      rbx = 0x00007f6a6f472200    rbp = 0x00007ffe037bfcb0
[task 2023-04-30T22:06:06.833Z] 22:06:06     INFO -      rsp = 0x00007ffe037bfb60    r12 = 0x00007f6a6f472201
[task 2023-04-30T22:06:06.833Z] 22:06:06     INFO -      r13 = 0x0000000000000001    r14 = 0x00007ffe037bfcc7
[task 2023-04-30T22:06:06.833Z] 22:06:06     INFO -      r15 = 0x00007ffe037bfc10    rip = 0x00007f6a5c1a2f0c
[task 2023-04-30T22:06:06.833Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.833Z] 22:06:06     INFO -  7  libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 479 + 0xe]
[task 2023-04-30T22:06:06.833Z] 22:06:06     INFO -      rbx = 0x0000000000000001    rbp = 0x00007ffe037bfce0
[task 2023-04-30T22:06:06.833Z] 22:06:06     INFO -      rsp = 0x00007ffe037bfcc0    r12 = 0x0000000000000000
[task 2023-04-30T22:06:06.833Z] 22:06:06     INFO -      r13 = 0x00007ffe037bfcf8    r14 = 0x00007ffe037bfcc7
[task 2023-04-30T22:06:06.834Z] 22:06:06     INFO -      r15 = 0x00007f6a6f472200    rip = 0x00007f6a5c1a780a
[task 2023-04-30T22:06:06.834Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.834Z] 22:06:06     INFO -  8  libxul.so!mozilla::dom::workerinternals::RuntimeService::Cleanup() [RuntimeService.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 1654 + 0xc]
[task 2023-04-30T22:06:06.834Z] 22:06:06     INFO -      rbx = 0x00007f6a512cc940    rbp = 0x00007ffe037c0090
[task 2023-04-30T22:06:06.834Z] 22:06:06     INFO -      rsp = 0x00007ffe037bfcf0    r12 = 0x0000000000000000
[task 2023-04-30T22:06:06.834Z] 22:06:06     INFO -      r13 = 0x00007ffe037bfcf8    r14 = 0x00007f6a512cc948
[task 2023-04-30T22:06:06.834Z] 22:06:06     INFO -      r15 = 0x00007f6a6f472200    rip = 0x00007f6a5f2149d3
[task 2023-04-30T22:06:06.834Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -  9  libxul.so!mozilla::dom::workerinternals::RuntimeService::Observe(nsISupports*, char const*, char16_t const*) [RuntimeService.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 1950 + 0x7]
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -      rbx = 0x00007f6a512cc940    rbp = 0x00007ffe037c00d0
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -      rsp = 0x00007ffe037c00a0    r12 = 0x0000000000000000
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -      r13 = 0x0000000000000000    r14 = 0x00007f6a58cef33d
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -      r15 = 0x0000000000000000    rip = 0x00007f6a5f2185dd
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO - 10  libxul.so!nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverList.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 70 + 0xe]
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -      rbx = 0x0000000000000000    rbp = 0x00007ffe037c0110
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -      rsp = 0x00007ffe037c00e0    r12 = 0x0000000000000000
[task 2023-04-30T22:06:06.835Z] 22:06:06     INFO -      r13 = 0x0000000000000000    r14 = 0x00007f6a58cef33d
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO -      r15 = 0x0000000000000000    rip = 0x00007f6a5c11c3e8
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO - 11  libxul.so!nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverService.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 291 + 0x18]
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO -      rbx = 0x00007f6a58cef33d    rbp = 0x00007ffe037c0240
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO -      rsp = 0x00007ffe037c0120    r12 = 0x0000000000000000
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO -      r13 = 0x0000000000000000    r14 = 0x00007f6a6f4a0c18
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO -      r15 = 0x00007ffe037c0170    rip = 0x00007f6a5c11d6b0
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.836Z] 22:06:06     INFO - 12  libxul.so!mozilla::AppShutdown::AdvanceShutdownPhaseInternal(mozilla::ShutdownPhase, bool, char16_t const*, nsCOMPtr<nsISupports> const&) [AppShutdown.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 433 + 0x10]
[task 2023-04-30T22:06:06.837Z] 22:06:06     INFO -      rbx = 0x0000000000000009    rbp = 0x00007ffe037c02a0
[task 2023-04-30T22:06:06.837Z] 22:06:06     INFO -      rsp = 0x00007ffe037c0250    r12 = 0x00007f6a58cef33d
[task 2023-04-30T22:06:06.837Z] 22:06:06     INFO -      r13 = 0x00007ffe037c0258    r14 = 0x00007ffe037c0268
[task 2023-04-30T22:06:06.837Z] 22:06:06     INFO -      r15 = 0x00007ffe037c02c8    rip = 0x00007f6a5c0b3bd5
[task 2023-04-30T22:06:06.837Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.837Z] 22:06:06     INFO - 13  libxul.so!mozilla::ShutdownXPCOM(nsIServiceManager*) [XPCOMInit.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 590 + 0xe]
[task 2023-04-30T22:06:06.837Z] 22:06:06     INFO -      rbx = 0x0000000000000000    rbp = 0x00007ffe037c0350
[task 2023-04-30T22:06:06.837Z] 22:06:06     INFO -      rsp = 0x00007ffe037c02b0    r12 = 0x0000000000000000
[task 2023-04-30T22:06:06.838Z] 22:06:06     INFO -      r13 = 0x000000000000451c    r14 = 0x00007ffe037c02c8
[task 2023-04-30T22:06:06.838Z] 22:06:06     INFO -      r15 = 0x00007ffe037c02c8    rip = 0x00007f6a5c1de58e
[task 2023-04-30T22:06:06.838Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.838Z] 22:06:06     INFO - 14  libxul.so!XRE_InitChildProcess(int, char**, XREChildData const*) [nsEmbedFunctions.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 677 + 0x10]
[task 2023-04-30T22:06:06.838Z] 22:06:06     INFO -      rbx = 0x00007ffe037c0378    rbp = 0x00007ffe037c0650
[task 2023-04-30T22:06:06.839Z] 22:06:06     INFO -      rsp = 0x00007ffe037c0360    r12 = 0x00007ffe037c03a8
[task 2023-04-30T22:06:06.839Z] 22:06:06     INFO -      r13 = 0x000000000000451c    r14 = 0x000000000000000f
[task 2023-04-30T22:06:06.839Z] 22:06:06     INFO -      r15 = 0x0000000080004005    rip = 0x00007f6a60ba8a47
[task 2023-04-30T22:06:06.839Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.839Z] 22:06:06     INFO - 15  firefox-bin!content_process_main(mozilla::Bootstrap*, int, char**) [plugin-container.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 57 + 0x11]
[task 2023-04-30T22:06:06.839Z] 22:06:06     INFO -      rbx = 0x00007ffe037c17e8    rbp = 0x00007ffe037c0690
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -      rsp = 0x00007ffe037c0660    r12 = 0x00007ffe037c0660
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x0000000000000012
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -      r15 = 0x00007f6a6f4036e0    rip = 0x0000561cce6f6e5d
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO - 16  firefox-bin!main [nsBrowserApp.cpp:f8e0fef28d2036d43fe3192a08e94b168c84627b : 375 + 0x11]
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -      rbx = 0x00007ffe037c1888    rbp = 0x00007ffe037c1700
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -      rsp = 0x00007ffe037c06a0    r12 = 0x00007ffe037c17e8
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -      r13 = 0xaaaaaaaaaaaaaaaa    r14 = 0x0000561cce7a0318
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -      r15 = 0x0000000000000012    rip = 0x0000561cce6f7267
[task 2023-04-30T22:06:06.840Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO - 17  libc.so.6!__libc_start_main [libc-start.c : 310 + 0x19]
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO -      rbx = 0x0000000000000000    rbp = 0x0000561cce79bfb0
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO -      rsp = 0x00007ffe037c1710    r12 = 0x0000561cce6f6d00
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO -      r13 = 0x00007ffe037c17e0    r14 = 0x0000000000000000
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO -      r15 = 0x0000000000000000    rip = 0x00007f6a6f90ab97
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO - 18  firefox-bin!_start + 0x28
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO -      rbx = 0x0000000000000000    rbp = 0x0000000000000000
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO -      rsp = 0x00007ffe037c17d0    r12 = 0x0000561cce6f6d00
[task 2023-04-30T22:06:06.841Z] 22:06:06     INFO -      r13 = 0x00007ffe037c17e0    r14 = 0x0000000000000000
[task 2023-04-30T22:06:06.842Z] 22:06:06     INFO -      r15 = 0x0000000000000000    rip = 0x0000561cce6f6d29
[task 2023-04-30T22:06:06.842Z] 22:06:06     INFO -     Found by: call frame info
[task 2023-04-30T22:06:06.842Z] 22:06:06     INFO - 
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell disable-recommended][stockwell needswork:owner]

Update

There have been 39 failures within the last 7 days, all of them on Linux 18.04 x64 WebRender debug.

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=415687214&repo=mozilla-central&lineNumber=52498

Whiteboard: [retriggered][stockwell disable-recommended][stockwell needswork:owner] → [retriggered][stockwell needswork:owner]
Component: String → DOM: Workers

Update
In the last 7 days, there have been 32 total failures, all of them on Linux 18.04 x64 WebRender debug.

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=417961754&repo=autoland&lineNumber=40901

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #84)

I think the Busycount being 1 suggests that something like the ParentStatus check we were going to address in another patch by chaning to >= Killing or similar[1] is failing a dispatch to the worker and the runnable is getting leaked and so the workerref is never released. Bug 1800659 will potentially help with that, but may not end up addressing the ParentStatus check.

From a more recent failure it seems that bug 1800659 did not help.

[task 2023-06-18T21:49:05.246Z] 21:49:05     INFO - GECKO(20314) | [Child 21862: Main Thread]: D/WorkerShutdownDump Found ActiveWorker (dedicated): resource://gre/modules/translation/cld-worker.js
[task 2023-06-18T21:49:05.248Z] 21:49:05     INFO - GECKO(20314) | [Child 21862: Main Thread]: D/WorkerShutdownDump Principal: [System Principal]
[task 2023-06-18T21:49:05.250Z] 21:49:05     INFO - GECKO(20314) | [Child 21862: Main Thread]: D/WorkerShutdownDump LoadingPrincipal: [System Principal]
[task 2023-06-18T21:49:05.250Z] 21:49:05     INFO - GECKO(20314) | [Child 21862: Main Thread]: D/WorkerShutdownDump BusyCount: 3
[task 2023-06-18T21:49:05.250Z] 21:49:05     INFO - GECKO(20314) | [Child 21862: Main Thread]: D/WorkerShutdownDump CrashInfo: IsChromeWorker(false)|XMLHttpRequestWorker

We might need bug 1836700 ?

Summary: Intermittent xpcom/string/crashtests/1113005.html (finished) | Main app process exited normally application crashed [@ __pthread_cond_wait] after [Parent 1500, IPC I/O Parent] WARNING: Process X hanging at shutdown; attempting crash report (fatal error) → Intermittent Main app process exited normally application crashed [@ __pthread_cond_wait] after [Parent 1500, IPC I/O Parent] WARNING: Process X hanging at shutdown; attempting crash report (fatal error)

Update

There have been 53 failures within the last 7 days, all of them on Linux 18.04 x64 WebRender debug.

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=422904304&repo=autoland

Based on Comment 98, Eden, do you have any input here?

Thank you.

Flags: needinfo?(echuang)

Update

There have been 80 failures within the last 7 days, all of them on Linux 18.04 x64 WebRender debug.

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=424268734&repo=autoland&lineNumber=37907

Joshua, is there any chance you have the time to look into this since you are the assignee?
Thank you.

Flags: needinfo?(jmarshall)

Yes, I can prioritize this.

Flags: needinfo?(jmarshall)

this is perma fail on wdspec tests as a result of bug 1836204

(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #113)

this is perma fail on wdspec tests as a result of bug 1836204

Yes, and see also bug 1845424 which is covering a shutdown hang.

Maybe the recent spike is also caused by bug 1826222?

Flags: needinfo?(gtatum)

It seems that the fix on bug 1845518 helped here to lower the amount of failures in automation.

Depends on: 1845518
Depends on: 1836700
Flags: needinfo?(echuang)
Assignee: jmarshall → echuang

I think this could be improved by bug 1836700. Removing the WorkerPrivate's BusyCount could be helpful on this bug.

Maybe the recent spike is also caused by bug 1826222?

Most likely, which I have fixes in place for the issues in automation, and Bug 1826222 is filed for the further fix.

Flags: needinfo?(gtatum)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: