Closed Bug 1799201 Opened 2 years ago Closed 2 years ago

Crash in [@ RefPtr<T>::operator nsIGlobalObject* const | mozilla::dom::Promise::MaybeSomething<T>]

Categories

(Core :: DOM: Workers, defect, P2)

defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox107 --- unaffected
firefox108 --- wontfix
firefox109 --- fixed

People

(Reporter: mccr8, Assigned: yulia)

References

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main109+r])

Crash Data

Attachments

(1 file)

Crash report: https://crash-stats.mozilla.org/report/index/efe2e544-e29e-4d14-9a95-ee8510221103

Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS

Top 10 frames of crashing thread:

0  XUL  RefPtr<nsIGlobalObject>::operator nsIGlobalObject* const&  mfbt/RefPtr.h:299
0  XUL  mozilla::dom::Promise::MaybeSomething<nsresult&>  xpcom/base/nsCOMPtr.h:328
1  XUL  mozilla::dom::workerinternals::loader::WorkerScriptLoader::CancelMainThread  dom/workers/ScriptLoader.cpp:729
2  XUL  mozilla::detail::RunnableMethodArguments<nsTArray<mozilla::dom::WorkerLoadContext*>&&>::applyImpl<mozilla::dom::workerinternals::loader::WorkerScriptLoader, void   xpcom/ds/nsTArray.h:410
2  XUL  mozilla::detail::RunnableMethodArguments<nsTArray<mozilla::dom::WorkerLoadContext*>&&>::apply<mozilla::dom::workerinternals::loader::WorkerScriptLoader, void   xpcom/threads/nsThreadUtils.h:1153
2  XUL  mozilla::detail::RunnableMethodImpl<RefPtr<mozilla::dom::workerinternals::loader::WorkerScriptLoader> const, void   xpcom/ds/nsTArray.h:1029
3  XUL  mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal  mfbt/RefPtr.h:50
4  XUL  mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal  xpcom/threads/TaskController.cpp:684
5  XUL  nsThread::ProcessNextEvent  xpcom/base/nsCOMPtr.h:872
6  XUL  mozilla::ipc::MessagePump::Run  ipc/glue/MessagePump.cpp:182

Crash on a poison value, while doing something with a Promise in WorkerScriptLoader::CancelMainThread(). I don't know if this is a dupe of an existing issue.

Here's another crash from the same Nightly: bp-ad8af492-539c-4af0-ac51-f793f0221104

It is also happening while doing something with a Promise in WorkerScriptLoader::CancelMainThread(). The crash address is 0, but registers r12 and rdi contain the poison value, so maybe something similar is going wrong there.

This also looks like that prior crash, down to poison values in those two registers: bp-ffdc8588-9193-4f24-b461-559370221104

Another signature, with the poison value in rsi: bp-daf012e3-162b-4b64-8a21-70c720221104

Crash Signature: [@ RefPtr<T>::operator nsIGlobalObject* const | mozilla::dom::Promise::MaybeSomething<T>] → [@ RefPtr<T>::operator nsIGlobalObject* const | mozilla::dom::Promise::MaybeSomething<T>][@ RefPtr<T>::get | RefPtr<T>::operator nsIGlobalObject* | mozilla::dom::Promise::MaybeSomething<T> ]

Added Linux/Android signatures.

Crash Signature: [@ RefPtr<T>::operator nsIGlobalObject* const | mozilla::dom::Promise::MaybeSomething<T>][@ RefPtr<T>::get | RefPtr<T>::operator nsIGlobalObject* | mozilla::dom::Promise::MaybeSomething<T> ] → [@ RefPtr<T>::operator nsIGlobalObject* const | mozilla::dom::Promise::MaybeSomething<T>] [@ RefPtr<T>::get | RefPtr<T>::operator nsIGlobalObject* | mozilla::dom::Promise::MaybeSomething<T>] [@ RefPtr<T>::get | RefPtr<T>::operator nsIGlobalObject* const…

I think we touched this with Yulia's work on WorkerScriptLoader, not sure if it qualifies as regression of that, but you may have more context.

Flags: needinfo?(ystartsev)
Flags: needinfo?(bugmail)
Assignee: nobody → ystartsev
Flags: needinfo?(ystartsev)

This is partially tested by module tests, that are currently turned off. Such as /html/semantics/scripting-1/the-script-element/css-module/css-module-worker-test.html -- I'm investigating.

Severity: -- → S2
Priority: -- → P2
Depends on: 1800496

Since bug 1800496 landed there were no new crashes for nightly, AFAICS. Are we confident enough to close this?

Flags: needinfo?(bugmail) → needinfo?(ystartsev)

On this one I want to wait to track the crashes for a bit longer.

It looks like we no longer have crashes associated with the worker script loader for this crash signature -- but it exists for other code paths. If we want to treat this bug as being related to the crashes for worker script loader, then this can be closed.

Flags: needinfo?(ystartsev)

Hi Gabriele, could we prefix this better in order to look at the first frame after the promise things? Like:

RefPtr<nsIGlobalObject>::get() const
RefPtr<nsIGlobalObject>::operator nsIGlobalObject*() const&
mozilla::dom::Promise::MaybeSomething<nsresult&>(nsresult&, void (mozilla::dom::Promise::*)(JSContext*, JS::Handle<JS::Value>))
mozilla::dom::Promise::MaybeReject(nsresult)
>>>> nsImageLoadingContent::RejectDecodePromises(nsresult) <<<<
Flags: needinfo?(gsvelto)

(In reply to Jens Stutte [:jstutte] from comment #10)

Hi Gabriele, could we prefix this better in order to look at the first frame after the promise things? Like:

RefPtr<nsIGlobalObject>::get() const
RefPtr<nsIGlobalObject>::operator nsIGlobalObject*() const&
mozilla::dom::Promise::MaybeSomething<nsresult&>(nsresult&, void (mozilla::dom::Promise::*)(JSContext*, JS::Handle<JS::Value>))
mozilla::dom::Promise::MaybeReject(nsresult)
>>>> nsImageLoadingContent::RejectDecodePromises(nsresult) <<<<

Hmm, might be covered already, see bug 1802315 comment 2

Yes it would, but I'd rather put those prefixes in the irrelevant list rather than in the prefix one, because lots of methods in the prefix list is going to make signatures super-long.

Flags: needinfo?(gsvelto)
Depends on: 1802315
Assignee: ystartsev → nobody

IIUC the same as bug 1800446 comment 11 applies here, too? I still see some UAF instances in 108 beta that would probably amplify on release. Sure we cannot uplift?

Flags: needinfo?(ystartsev)

It looks like the sequence of patches landed in 1800496 have resolved the security issues with workers. When I was discussing uplifting 1800496 along with its regressions 1801833, 1801616, 1801344 -- it's a really large queue to uplift.

I'll cc dveditz to get his opinion here, I am not the best judge of what we should do. If we should uplift, I can make a combined patch.

Flags: needinfo?(ystartsev) → needinfo?(dveditz)

I would dearly love to get these crashes fixed: they are themselves a stability problem, but also are the kind of racy condition security bug hunters might be poking around at. At least one of the duplicate bugs in this set (bug 1800446) was found by an external contributor running ASAN builds (just the crash -- couldn't identify a testcase). But it's really late: we are already in RC week.

It's Dianna's call at this point. There's also a planned "security fix point release" mid-cycle now. Could be more reasonable to target that?

The patchset is large, but is it relatively straightforward? The 3 regression fixes on top of the big patch were small and to the point, so those don't worry me at all for uplift.

Flags: needinfo?(dveditz) → needinfo?(dsmith)

I've prepared the patch queue on top of beta, and i've pushed it against this bug to make the uplift easier. It applies cleanly. The combined patch is attached to this bug.

No one else has been really poking around in this code aside from me, so I think that in spite of the size, given how stable this has been on nightly, this is relatively safe to uplift, and it does fix all of the issues we had around this code.

combined patch for 1800496, and regressions 1801833, 1801616, 1801344;

Comment on attachment 9307040 [details]
Bug 1799201 - Uplift - combined patch of 1800496, and regressions 1801833, 1801616, 1801344;

Beta/Release Uplift Approval Request

  • User impact if declined: Stability of the browser with regards to be workers will be affected, with crashes occurring at worker shutdown intermittently. These have been the source of several security bugs.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): The patch, considering the size, is relatively not risky as it has been baking on nightly for over a week and all regressions were caught and fixed early on. No new regressions have come up in the last week. Since the landing of these patches, no new security issues have been opened in relation to worker cancellation.
  • String changes made/needed:
  • Is Android affected?: Yes
Attachment #9307040 - Flags: approval-mozilla-beta?

Comment on attachment 9307040 [details]
Bug 1799201 - Uplift - combined patch of 1800496, and regressions 1801833, 1801616, 1801344;

This might be too late/risky to consider for an RC respin for 108. Though I see the benefit of including this, I think it is better to let this ride the 109 trains.

I did want to point out that that due to the holiday schedule, there is no planned dot release for 108.

Flags: needinfo?(dsmith)
Attachment #9307040 - Flags: approval-mozilla-beta? → approval-mozilla-release-
Assignee: nobody → ystartsev
Status: NEW → ASSIGNED

In case someone changes their mind, we have the patch ready for uplift.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME
Whiteboard: [adv-main109+r]
Duplicate of this bug: 1801700
Group: dom-core-security → core-security-release

Copying crash signatures from duplicate bugs.

Crash Signature: const | mozilla::dom::Promise::MaybeSomething<T>] → const | mozilla::dom::Promise::MaybeSomething<T>] [@ RefPtr<nsIGlobalObject>::get() const]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: