Closed Bug 1808820 Opened 2 years ago Closed 2 years ago

Assertion failure: !globalScopeAlive, at /builds/worker/checkouts/gecko/dom/workers/RuntimeService.cpp:2106

Categories

(Core :: Graphics: WebGPU, defect)

Tracking

()

VERIFIED FIXED
112 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox108 --- disabled
firefox109 --- disabled
firefox110 --- disabled
firefox111 --- disabled
firefox112 --- fixed

People

(Reporter: tsmith, Assigned: jimb)

References

(Blocks 2 open bugs, Regression)

Details

(4 keywords, Whiteboard: [bugmon:bisected,confirmed][fuzzblocker])

Attachments

(3 files)

Attached file testcase.html (deleted) —

Found while fuzzing m-c 20221217-3ccb0b86ab11 (--enable-debug --enable-fuzzing)

To reproduce via Grizzly Replay:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch -d --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.html

Assertion failure: !globalScopeAlive, at /builds/worker/checkouts/gecko/dom/workers/RuntimeService.cpp:2106

#0 0x7fee8f8c1c2a in mozilla::dom::workerinternals::(anonymous namespace)::WorkerThreadPrimaryRunnable::Run() /builds/worker/checkouts/gecko/dom/workers/RuntimeService.cpp:2106:7
#1 0x7fee8ad85598 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1191:16
#2 0x7fee8ad8ba1d in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:476:10
#3 0x7fee8b97d4ea in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp:300:20
#4 0x7fee8b89fe28 in MessageLoop::RunInternal() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:381:10
#5 0x7fee8b89fd31 in RunHandler /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:374:3
#6 0x7fee8b89fd31 in MessageLoop::Run() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:356:3
#7 0x7fee8ad80a97 in nsThread::ThreadFunc(void*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:383:10
#8 0x7fee9db23c86 in _pt_root /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5
#9 0x7fee9e3ccb42 in start_thread nptl/pthread_create.c:442:8
#10 0x7fee9e45e9ff  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Flags: in-testsuite?

Jens, what does this assertion mean? Thanks.

Flags: needinfo?(jstutte)

That assertion means that something living outside the worker kept a reference to the worker's global scope, so the scope had not been freed by the time the worker ended. If we could get a Pernosco session for this, we would hopefully discover who. Thanks Tyson!
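
As a rough illustration of the condition this assertion guards against, here is a standalone C++ sketch. It is not Gecko code: `GlobalScope`, `ScopeAliveAfterShutdown`, and `RunDemo` are hypothetical names, and `std::shared_ptr`/`std::weak_ptr` stand in for Gecko's own reference counting. It only shows how a strong reference held outside the worker keeps the scope alive after the worker drops its own.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a worker's global scope object.
struct GlobalScope {};

// Simulates a worker creating its scope and then shutting down.
// If aOutsideHolder is non-null, something outside the worker takes a
// strong reference before shutdown. Returns true if the scope is still
// alive after the worker has released its own reference.
bool ScopeAliveAfterShutdown(std::shared_ptr<GlobalScope>* aOutsideHolder) {
  auto scope = std::make_shared<GlobalScope>();
  std::weak_ptr<GlobalScope> observer = scope;  // observes without owning

  if (aOutsideHolder) {
    *aOutsideHolder = scope;  // the external reference that outlives the worker
  }

  scope.reset();  // the worker ends and drops its reference
  return !observer.expired();
}

// Exercises both cases: no outside holder, and one outside holder.
void RunDemo() {
  // Nobody else holds the scope: it is freed when the worker ends.
  assert(!ScopeAliveAfterShutdown(nullptr));

  // An outside holder keeps it alive, which is the situation the
  // !globalScopeAlive assertion is described as catching.
  std::shared_ptr<GlobalScope> leaked;
  assert(ScopeAliveAfterShutdown(&leaked));
}
```

In this model, finding "who" amounts to finding the owner of `leaked`.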

Flags: needinfo?(jstutte) → needinfo?(twsmith)

Verified bug as reproducible on mozilla-central 20230106214742-7968ae37c117.
The bug appears to have been introduced in the following build range:

Start: e6e2286d2ac25001127a1cf54a87a95fb435c734 (20220708093332)
End: 807e95cd9956aa4967ddddc80f8ccab4ad370e8d (20220708081410)
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e6e2286d2ac25001127a1cf54a87a95fb435c734&tochange=807e95cd9956aa4967ddddc80f8ccab4ad370e8d

Keywords: regression
Whiteboard: [bugmon:bisected,confirmed]

(In reply to Jens Stutte [:jstutte] from comment #2)

If we could get a Pernosco session for this, we would hopefully discover who. Thanks Tyson!

You got it :)

A Pernosco session is available here: https://pernos.co/debug/lWL9CW6z65G5DwmTN5-pdQ/index.html

Flags: needinfo?(twsmith)

:nical, since you are the author of the regressor, bug 1750576, could you take a look? Also, could you set the severity field?

For more information, please visit auto_nag documentation.

Flags: needinfo?(nical.bugzilla)

I've only dug into this a little bit, but it seems like the IPC flow with the promise in https://phabricator.services.mozilla.com/D146817 may be okay, such that https://searchfox.org/mozilla-central/source/dom/base/DOMMozPromiseRequestHolder.h isn't necessary. The problem seems to be that the PCanvasManager top-level protocol isn't being torn down, so the IPC promise is not being automatically rejected. This may therefore be dependent on :janv's work on bug 1809044. (Unless a WorkerRef is being used for the protocol lifecycle maintenance?)

It looks like past the assert we do try to safely handle the case if globalScopeAlive exists:
https://searchfox.org/mozilla-central/rev/fb9a504ca73529fa550efe488db2a012a4bf5169/dom/workers/RuntimeService.cpp#2111-2116

Is that enough to make us safe, or does the fact that it was alive here mean it might be causing problems elsewhere?

Flags: needinfo?(bugmail)

I think the failure mode will be one of a safe error/no-op/early return, a safe null de-ref crash, a safe explicit MOZ_CRASH/MOZ_RELEASE_ASSERT, or a safe leak, yes. In particular, because it seems like the top-level protocol is not getting torn down but the event target WorkerThread will be gone, I mainly expect this to result in leaks.

Flags: needinfo?(bugmail)
Keywords: sec-other

Was there any specific opt-in step taken to run WebGPU on a worker? For the first Nightly milestone, we aim to allow WebGPU only on the main thread.

I won't have time to look into this soon, so I'm forwarding the needinfo to Jim to make sure someone on the team has a reminder to keep an eye on it.

Flags: needinfo?(nical.bugzilla) → needinfo?(jimb)
Attached file prefs.js (deleted) —

(In reply to Nicolas Silva [:nical] from comment #9)

Was there any specific opt-in step taken to run WebGPU on a worker?

This is the prefs.js file that was in use when the issue was found by fuzzers.

I'm not familiar with WebGPU at all, but just looking at the WebIDL for GPUProvider and the code for Instance::RequestAdapter() I can't at a glance see anything that would prevent this from being exposed on workers when dom.webgpu.enabled is set to true.

There has been an increase in reports recently (maybe another fix unblocked this?). Marking as fuzzblocker.

Whiteboard: [bugmon:bisected,confirmed] → [bugmon:bisected,confirmed][fuzzblocker]

(In reply to Andrew McCreight [:mccr8] from comment #11)

I'm not familiar with WebGPU at all, but just looking at the WebIDL for GPUProvider and the code for Instance::RequestAdapter() I can't at a glance see anything that would prevent this from being exposed on workers when dom.webgpu.enabled is set to true.

Flags: needinfo?(nical.bugzilla)

It seems we get this now also as an intermittent, see bug 1809668. Can we just unhide this to allow for easier duping and such?

WebGPU shouldn't be accessible in workers at all, for the time being.
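
A minimal sketch of that policy, assuming a simplified model in which exposure depends only on the kind of global scope and the `dom.webgpu.enabled` pref. `ScopeKind` and `IsWebGPUOffered` are hypothetical names for illustration. This is not the landed patch, and the thread does not show how the patch implements the restriction.

```cpp
// Hypothetical model of the exposure policy; not Gecko's binding code.
enum class ScopeKind { Window, DedicatedWorker };

// Returns true only when the pref is on AND the scope is a window.
// Worker scopes are never offered WebGPU, regardless of the pref.
constexpr bool IsWebGPUOffered(ScopeKind aKind, bool aPrefEnabled) {
  return aPrefEnabled && aKind == ScopeKind::Window;
}

// Compile-time checks of the intended behaviour.
static_assert(IsWebGPUOffered(ScopeKind::Window, true),
              "windows get WebGPU when the pref is enabled");
static_assert(!IsWebGPUOffered(ScopeKind::Window, false),
              "the pref still gates windows");
static_assert(!IsWebGPUOffered(ScopeKind::DedicatedWorker, true),
              "workers are excluded even with the pref enabled");
```

Under this model, a worker testcase like the one in this bug would never reach the WebGPU code path, even with the pref enabled.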

Assignee: nobody → jimb
Flags: needinfo?(jimb)
Attachment #9318848 - Attachment description: WIP: Bug 1808820: Don't offer WebGPU in worker scopes. → Bug 1808820: Don't offer WebGPU in worker scopes.
Flags: needinfo?(nical.bugzilla)

Unhiding to make tracking dupes easier, and Jim has just queued the patch for landing.

Group: gfx-core-security
Pushed by jblandy@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/445044d219ad Don't offer WebGPU in worker scopes. r=webgpu-reviewers,webidl,nical,saschanaz
Duplicate of this bug: 1809668
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 112 Branch

Verified bug as fixed on rev mozilla-central 20230222094403-3408467a0885.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Status: RESOLVED → VERIFIED
Keywords: bugmon
Regressions: 1818379
Regressions: 1818918
Duplicate of this bug: 1817557