ReleaseWorkerRunnable control runnable does not call superclass WorkerRunnable::Cancel. [Assertion failure: IsCanceled() (Subclass Cancel() didn't set IsCanceled()!), at /dom/workers/WorkerRunnable.cpp:253]
Categories
(Core :: DOM: Workers, defect, P3)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox96 | --- | wontfix |
firefox97 | --- | wontfix |
firefox98 | --- | verified |
People
(Reporter: jkratzer, Assigned: jstutte)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression, testcase, Whiteboard: [bugmon:bisected,confirmed])
Attachments
(2 files)
Testcase found while fuzzing mozilla-central rev 0ea31fd939c8 (built with: --enable-debug --enable-fuzzing).
Testcase can be reproduced using the following commands:
$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch --build 0ea31fd939c8 --debug --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.zip --repeat 2
Assertion failure: IsCanceled() (Subclass Cancel() didn't set IsCanceled()!), at /dom/workers/WorkerRunnable.cpp:253
==913634==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f31f375f096 bp 0x7f31e4dcf340 sp 0x7f31e4dcf1b0 T913757)
==913634==The signal is caused by a WRITE memory access.
==913634==Hint: address points to the zero page.
#0 0x7f31f375f096 in mozilla::dom::WorkerRunnable::Run() /dom/workers/WorkerRunnable.cpp:253:5
#1 0x7f31f374ec91 in mozilla::dom::WorkerPrivate::ProcessAllControlRunnablesLocked() /dom/workers/WorkerPrivate.cpp:3677:9
#2 0x7f31f374fc13 in ProcessAllControlRunnables /builds/worker/workspace/obj-build/dist/include/mozilla/dom/WorkerPrivate.h:1050:12
#3 0x7f31f374fc13 in mozilla::dom::WorkerPrivate::OnProcessNextEvent() /dom/workers/WorkerPrivate.cpp:3175:15
#4 0x7f31f3769f6b in mozilla::dom::WorkerThread::Observer::OnProcessNextEvent(nsIThreadInternal*, bool) /dom/workers/WorkerThread.cpp:364:19
#5 0x7f31ef2123e0 in nsThread::ProcessNextEvent(bool, bool*) /xpcom/threads/nsThread.cpp:1094:3
#6 0x7f31ef20f134 in NS_ProcessPendingEvents(nsIThread*, unsigned int) /xpcom/threads/nsThreadUtils.cpp:432:19
#7 0x7f31f3751858 in mozilla::dom::WorkerPrivate::ClearMainEventQueue(mozilla::dom::WorkerPrivate::WorkerRanOrNot) /dom/workers/WorkerPrivate.cpp:3720:5
#8 0x7f31f374f04b in mozilla::dom::WorkerPrivate::NotifyInternal(mozilla::dom::WorkerStatus) /dom/workers/WorkerPrivate.cpp:4533:7
#9 0x7f31f375ecac in mozilla::dom::WorkerRunnable::Run() /dom/workers/WorkerRunnable.cpp:378:12
#10 0x7f31f374ec91 in mozilla::dom::WorkerPrivate::ProcessAllControlRunnablesLocked() /dom/workers/WorkerPrivate.cpp:3677:9
#11 0x7f31f374df07 in mozilla::dom::WorkerPrivate::DoRunLoop(JSContext*) /dom/workers/WorkerPrivate.cpp:3004:21
#12 0x7f31f372e407 in mozilla::dom::workerinternals::(anonymous namespace)::WorkerThreadPrimaryRunnable::Run() /dom/workers/RuntimeService.cpp:2244:42
#13 0x7f31ef212879 in nsThread::ProcessNextEvent(bool, bool*) /xpcom/threads/nsThread.cpp:1169:16
#14 0x7f31ef21999a in NS_ProcessNextEvent(nsIThread*, bool) /xpcom/threads/nsThreadUtils.cpp:467:10
#15 0x7f31efca7e8b in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /ipc/glue/MessagePump.cpp:300:20
#16 0x7f31efbc6307 in MessageLoop::RunInternal() /ipc/chromium/src/base/message_loop.cc:331:10
#17 0x7f31efbc6212 in RunHandler /ipc/chromium/src/base/message_loop.cc:324:3
#18 0x7f31efbc6212 in MessageLoop::Run() /ipc/chromium/src/base/message_loop.cc:306:3
#19 0x7f31ef20e4eb in nsThread::ThreadFunc(void*) /xpcom/threads/nsThread.cpp:391:10
#20 0x7f32043c6a07 in _pt_root /nsprpub/pr/src/pthreads/ptthread.c:201:5
#21 0x7f320513a608 in start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:477:8
#22 0x7f3204d02292 in __clone /build/glibc-eX1tMB/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV /dom/workers/WorkerRunnable.cpp:253:5 in mozilla::dom::WorkerRunnable::Run()
==913634==ABORTING
Reporter | ||
Comment 1•3 years ago
|
||
Comment 2•3 years ago
|
||
Bugmon Analysis
Verified bug as reproducible on mozilla-central 20211115093917-0ea31fd939c8.
The bug appears to have been introduced in the following build range:
Start: 9b2e412995e62775bbc37a013354a6c964e25e69 (20210929123904)
End: 68940497078c0bd6d8101a180f98a686bf9a78c3 (20210929130821)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=9b2e412995e62775bbc37a013354a6c964e25e69&tochange=68940497078c0bd6d8101a180f98a686bf9a78c3
Assignee | ||
Comment 3•3 years ago
|
||
That pushlog points to bug 1722576.
Comment 4•3 years ago
|
||
So far I can't reproduce this assertion. Is it possible that this test case doesn't actually depend on structuredClone?
Reporter | ||
Comment 5•3 years ago
|
||
Tom, the call to structuredClone() is required for me to trigger the bug. You may have better luck reproducing the issue by increasing the repeat count like so:
python -m grizzly.replay ./firefox/firefox testcase.zip --repeat 100 --relaunch 2
Comment 6•3 years ago
|
||
I can't reproduce this. Maybe someone else should try to investigate.
[2021-11-23 20:21:33] Running test (100/100)...
[2021-11-23 20:21:36] Failed to reproduce results
Assignee | ||
Comment 7•3 years ago
|
||
(In reply to Jason Kratzer [:jkratzer] from comment #5)
Tom, the call to structuredClone() is required for me to trigger the bug. You may have better luck reproducing the issue by increasing the repeat count like so:
python -m grizzly.replay ./firefox/firefox testcase.zip --repeat 100 --relaunch 2
Jason, does this still reproduce for you? And I assume you are already trying to reproduce this with pernosco? Thanks!
Reporter | ||
Comment 8•3 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #7)
(In reply to Jason Kratzer [:jkratzer] from comment #5)
Tom, the call to structuredClone() is required for me to trigger the bug. You may have better luck reproducing the issue by increasing the repeat count like so:
python -m grizzly.replay ./firefox/firefox testcase.zip --repeat 100 --relaunch 2
Jason, does this still reproduce for you? And I assume you are already trying to reproduce this with pernosco? Thanks!
Jens, yes - this still reproduces for me on m-c 20211129-d03f87555639. I'm trying to get a pernosco session but so far no luck.
Comment 9•3 years ago
|
||
A Pernosco session is available here: https://pernos.co/debug/l9iyK8CJNLMvgsHQ4-ct5Q/index.html
Comment 10•3 years ago
|
||
You successfully logged in, but either you are not authorized to view this trace OR the debugging database for this trace has expired (typically 7 days after the trace was collected) and needs to be rebuilt.
Comment 11•3 years ago
|
||
(In reply to Tom Schuster [:evilpie] from comment #10)
You successfully logged in, but either you are not authorized to view this trace OR the debugging database for this trace has expired (typically 7 days after the trace was collected) and needs to be rebuilt.
Please try this: https://github.com/Pernosco/pernosco/wiki/Login-Troubleshooting
Comment 12•3 years ago
|
||
I don't have a @mozilla account obviously, so I was a bit hesitant to do this. But now that we are also getting bug 1749002 on try we should do something. Can someone who actually knows worker code look at this and see if bug 1749002 is related as well?
Assignee | ||
Comment 13•3 years ago
|
||
Sorry, I did not think about the access to pernosco.
Eden, can you help to take a look?
Comment 14•3 years ago
|
||
The pernosco trace shows that the WeakWorkerRef ReleaseWorkerRunnable::Cancel is failing to call the WorkerRunnable::Cancel like PerformanceEntryAdded does, for example.
Note that it continues to be nonsensical that WorkerControlRunnables can be canceled, but that's not something we're going to fix in this bug (and I think there's an existing bug).
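The invariant described in this comment can be modelled in a few lines. This is a minimal sketch, not Gecko's code: `ModelRunnable`, `RunWhileCanceling`, `BuggyReleaseRunnable`, and `FixedReleaseRunnable` are hypothetical names, and the `bool` return value stands in for whatever result type the real `Cancel()` uses. It only illustrates why an override that skips the base class leaves `IsCanceled()` false, which is the condition the assertion checks.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a worker runnable base class.
// It models only the "Cancel() must set IsCanceled()" invariant.
class ModelRunnable {
 public:
  virtual ~ModelRunnable() = default;

  // Base Cancel(): records the cancellation.
  // Returns false if the runnable was already canceled.
  virtual bool Cancel() {
    if (mCanceled) {
      return false;
    }
    mCanceled = true;
    return true;
  }

  bool IsCanceled() const { return mCanceled; }

  // Models the cancel path of Run(): call the (possibly overridden)
  // Cancel() and report whether the canceled flag ended up set.
  // In a debug build of the real code this check is an assertion.
  bool RunWhileCanceling() {
    Cancel();
    return IsCanceled();
  }

 private:
  bool mCanceled = false;
};

// Buggy override: does its own cleanup but never calls the base class,
// so the canceled flag is never set.
class BuggyReleaseRunnable final : public ModelRunnable {
 public:
  bool Cancel() override {
    mReleased = true;
    return true;
  }
  bool mReleased = false;
};

// Fixed override: forwards to the base class so the flag is set.
class FixedReleaseRunnable final : public ModelRunnable {
 public:
  bool Cancel() override {
    bool ok = ModelRunnable::Cancel();
    mReleased = true;
    return ok;
  }
  bool mReleased = false;
};
```

In this model, `BuggyReleaseRunnable().RunWhileCanceling()` returns false (the situation the assertion catches), while the fixed variant returns true.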
Assignee | ||
Comment 15•3 years ago
|
||
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #14)
The pernosco trace shows that the WeakWorkerRef ReleaseWorkerRunnable::Cancel is failing to call the WorkerRunnable::Cancel like PerformanceEntryAdded does, for example.
Would this apply also to the CrashIfHangingRunnable ?
Comment 16•3 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #15)
Would this apply also to the CrashIfHangingRunnable ?
Yes.
Assignee | ||
Comment 17•3 years ago
|
||
Assignee | ||
Comment 18•3 years ago
|
||
This seemed straightforward enough to just do it, but while doing so I noticed that we need to be more careful about the order inside Cancel, also in the places where we already called the base class' function: we need to ensure that the base class' function is called first and bail out if it fails. I hope I understood this right (see patch).
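The ordering concern can be illustrated with a small model. This is a sketch under assumptions, not the actual patch: `BaseRunnable` and `CleanupRunnable` are hypothetical names, and "failure" is modelled as the base `Cancel()` returning false on a repeated call. The point is that calling the base class first and bailing out on failure makes the subclass cleanup run at most once.

```cpp
#include <cassert>

// Hypothetical base class: Cancel() succeeds only on the first call.
class BaseRunnable {
 public:
  virtual ~BaseRunnable() = default;

  virtual bool Cancel() {
    if (mCanceled) {
      return false;  // Already canceled: report failure to the caller.
    }
    mCanceled = true;
    return true;
  }

  bool IsCanceled() const { return mCanceled; }

 private:
  bool mCanceled = false;
};

// Hypothetical subclass with cleanup that must not run twice.
class CleanupRunnable final : public BaseRunnable {
 public:
  bool Cancel() override {
    // 1. Call the base class first so the canceled state is recorded.
    // 2. Bail out if the base class reports failure (e.g. a second call).
    if (!BaseRunnable::Cancel()) {
      return false;
    }
    // 3. Only now perform the subclass-specific cleanup.
    ++mCleanupCount;
    return true;
  }

  int mCleanupCount = 0;
};
```

Calling `Cancel()` twice on a `CleanupRunnable` leaves `mCleanupCount` at 1. If the cleanup ran before the base call, it would run again on the second call.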
Comment 19•3 years ago
|
||
:jstutte, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 20•3 years ago
|
||
(In reply to Release mgmt bot [:marco/ :calixte] from comment #19)
:jstutte, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.
Actually it is a bit unfair to say that bug 1722576 regressed this. It just happened to implement a piece of API that the fuzzer then used, but the underlying issue was always there and could have been triggered differently, I assume...
Comment 21•3 years ago
|
||
Comment 22•3 years ago
|
||
bugherder |
Comment 23•3 years ago
|
||
Bugmon Analysis
Verified bug as fixed on rev mozilla-central 20220112213002-38711fbec2b1.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.
Comment 24•3 years ago
|
||
Set release status flags based on info from the regressing bug 1722576