Bugzilla

Calixte Denizet (:calixte)

Assignee

Comment 2

•

6 years ago

Hmm. I don't see anything in the backtrace that would implicate bug 1528146...

Do you have a link to the pushlog for the relevant nightly, by any chance?

Flags: needinfo?(bzbarsky) → needinfo?(sciguyryan)

Comment 3

•

6 years ago

The pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=826b59e57fe4&tochange=6924dd16f7b1

About the backtrace:
https://crash-stats.mozilla.com/report/index/3f23aea2-2732-4a9b-8281-682970190224

see frame 43:
https://hg.mozilla.org/mozilla-central/annotate/6924dd16f7b1e2e6ce71a8eb4cbe330d3e4745dc/dom/base/nsContentSink.cpp#l1565

In comparing backtrace and hg history, it's the only patch I see, I know that the position in the backtrace is pretty low, but the # of crashes is high enough so...

Pascal Chevrel:pascalc (PTO until August 21)

Comment 4

•

6 years ago

Tracking as it has a high volume of crashes on Nightly and may explode in beta.

tracking-firefox67: --- → +

Hsin-Yi Tsai (she/her) [:hsinyi]

Updated

•

6 years ago

Priority: -- → P1

Assignee

Comment 5

•

6 years ago

About the backtrace:

Ah, that's a quite different backtrace than the one in comment 0. ;)

Given that bug 1528146 was in for a few days last week before being backed out without this popping up, it's fairly unlikely that it's really involved. Especially since all it's doing in this case is triggering some script, that runs other script, that eventually does something.

Bug 1523843 landed in that same push and is a more likely candidate for issues here, in theory, in that it actually affects compartment boundaries.

There were also various spidermonkey changes, which might in theory be involved. I really wish the compartment checks gave a better idea of what fails the assert. :(

Jan, any idea how to go about debugging this? We could try disabling bug 1523843 and seeing if this goes away, I guess... Or we could make the InternalCallOrConstruct code provide more information about what compartment issues it's running into, on nightly?

Flags: needinfo?(sciguyryan) → needinfo?(jdemooij)

Assignee

Comment 6

•

6 years ago

So I poked at the crash reports some. If I search for "js::ContextChecks::check" on crash-stats, look at reports, sort by build id, I see the following:

All the Windows and Android stacks are cut off.
The Linux/Mac stacks mostly pass through PrecompiledScript::ExecuteInGlobal and that's triggered via nsContentSink::NotifyDocElementCreated invoking ExtensionPolicyService::CheckContentScripts. They fail the compartment checks in js::InternalCallOrConstruct.
There is a separate crash under ReportIsNotFunction that I just filed as bug 1530388.

As far as point 2 goes, a bunch of these are coming off the HTML5 parser, not the old parser, so they are NOT the notification I added in bug 1528146, but the notification we used to fire all along.

Assignee

Comment 7

•

6 years ago

So tcampbell found https://crash-stats.mozilla.com/report/index/41652358-8427-401a-9460-279470190225 which says we're coming through js::InternalCallOrConstruct with a callee that is a JSFunction backed by a JSNative. At that point, chances are we're failing a cx->check() call in CallJSNative.

What's not clear is whether it's the one for args or rval, and if the former which thing in CallArgs it's for...

I wonder whether it's worth adding explicit MOZ_CRASH diagnostics to CallJSNative that would say which (if any) of the checks is failing, temporarily.

Assignee

Comment 8

•

6 years ago

Kris notes that TamperMonkey seems likely to be related...

Comment 9

•

6 years ago

Ryan, can you give us a list of Tampermonkey user scripts you have installed, or at least those which would run on Facebook? A full list would be better, since it's possible some scripts may run in iframes.

Flags: needinfo?(sciguyryan)

Marco Zehe (:MarcoZ)

Comment 10

•

6 years ago

I also hit these, although they might also be cut off since I'm on Windows:

I do have TamperMonkey installed, although it only has one script on this machine ATM, which runs on github.com to make it a bit more accessible. The script is here.

Ryan Jones-Ward [:sciguyryan]

Assignee

Comment 11

•

6 years ago

Marco, were you loading any particular page(s) when you hit those?

The stacks being cut off is not really a problem; what I'd really love is a way to reproduce so I can debug. Reproducing just once is enough as long as I can manage to catch it under rr... ;)

Flags: needinfo?(mzehe)

Reporter

Comment 12

•

6 years ago

(In reply to Kris Maglione [:kmag] from comment #9)

Ryan, can you give us a list of Tampermonkey user scripts you have installed, or at least those which would run on Facebook? A full list would be better, since it's possible some scripts may run in iframes.

Hi Kris.

I do have TamperMonkey installed however none of the scripts are running on Facebook or any of its domains. Disabling TamperMonkey for the whole domain does not prevent the crashing either. However disabling TamperMonkey completely does prevent the crash, which is fascinating.

Flags: needinfo?(sciguyryan)

https://crash-stats.mozilla.org/report/index/a859bf2d-dab6-4367-a4fb-604b90190226

Comment 13

•

6 years ago

I noticed some crashes on support.microsoft.com on crash-stats that had the CanvasBlocker extension installed.

I can reproduce this 100% reliably with that extension on https://support.microsoft.com/nl-be/help/12415/windows-10-recovery-options

https://searchfox.org/mozilla-central/rev/b10ae6b7a50d176a813900cbe9dc18c85acd604b/js/src/vm/Interpreter-inl.h#501

Comment 14

•

6 years ago

So in a debug build we fail this debug-only check:

At this point our |obj| is a CrossOriginObjectWrapper proxy, |key| is "HTMLCanvasElement" and |res| is a function.

DumpJSStack:

0 preIntercept/forEachFunction/</<(object = "HTMLCanvasElement", 0, HTMLCanvasElement) ["moz-extension://183e0772-4c8f-1544-ad6c-c67925b21926/lib/intercept.js":81:24]

intercept.js:81 is this line:

var constructor = getWrapped(window)[object];

Flags: needinfo?(jdemooij) → needinfo?(bzbarsky)

https://searchfox.org/mozilla-central/rev/b10ae6b7a50d176a813900cbe9dc18c85acd604b/js/xpconnect/wrappers/WrapperFactory.cpp#51

Comment 15

•

6 years ago

We have a CrossOriginObjectWrapper => XrayWaiver => nsOuterWindowProxy. I think the JSAutoRealm in the XrayWaiver code switches compartments here (the wrapper and cx are probably in different compartments):

Julien Cristau [:jcristau]

Updated

•

6 years ago

Flags: needinfo?(mzehe)

Updated

•

6 years ago

Keywords: topcrash

Marco Zehe (:MarcoZ)

Comment 16

•

6 years ago

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #11)

Marco, were you loading any particular page(s) when you hit those?

No, just your plain old facebook.com which brings up the news feed. Takes a few seconds, then crashes the tab. I am logged in automatically. One extra bit is that I am using our Facebook Container add-on. So no special page, just the news feed that comes up by default.

Assignee

Comment 17

•

6 years ago

We have a CrossOriginObjectWrapper => XrayWaiver => nsOuterWindowProxy

Aha! That is not a thing that should be happening as I planned this stuff; certainly the code in XrayWaiver can't deal with it.

Should be pretty simple to write a testcase for this... Working on that and a fix.

Assignee: nobody → bzbarsky

Updated

•

6 years ago

Group: dom-core-security

Keywords: csectype-uaf, sec-high

Comment 18

•

6 years ago

This isn't a UAF

Keywords: csectype-uaf

Comment 19

•

6 years ago

Isn't this a compartment mismatch? We generally mark those as a UAF because they can cause the GC to not mark a live object.

Comment 20

•

6 years ago

(In reply to Andrew McCreight [:mccr8] from comment #19)

Isn't this a compartment mismatch? We generally mark those as a UAF because they can cause the GC to not mark a live object.

I think we only run into the compartment mismatch in special corner cases because of the weird double wrapper. I don't think this should affect GC behavior unless one of those corner cases lets through a cross-compartment object that doesn't hit a compartment check.

I could be wrong, though.

Assignee

Comment 21

•

6 years ago

OK, I've confirmed this is a regression from bug 1523843.

Blocks: 1523843
No longer blocks: 1528146

Assignee

Comment 22

•

6 years ago

Alright, so some more information on what's going on, now that I have the steps from comment 13 in rr.

We have a toplevel page at https://support.microsoft.com/nl-be/help/12415/windows-10-recovery-options which has a subframe at https://support.microsoft.com/api/content/SignedOut. The two pages are same-compartment (after bug 1523843) but are not same-origin, because the parent page sets document.domain to "microsoft.com".

We have an extension running in a sandbox targeting the subframe. It has an expanded principal, as expected, wrapping a moz-extension principal and the https://support.microsoft.com/api/content/SignedOut principal. The sandbox has the wantXrays flag so it can get same-origin Xrays and hence XrayWaivers.

We first seem to run into trouble in xpc::WrapperFactory::Rewrap trying to create a cross-compartment wrapper for an object. The object is an XrayWaiver for the subframe's WindowProxy. The principal of the XrayWaiver is the toplevel page's principal. Presumably that's because we first created an XrayWaiver for that WindowProxy when the subframe contained initial about:blank (which inherits the parent's principal). Then we did the actual subframe load, but since we're using the same compartment we kept the same WindowProxy object (though with a new global and principal) and hence also kept using the same XrayWaiver.

Since the two ContentPrincipals involved are not same-origin with each other for purposes of this code, we end up in the block at https://searchfox.org/mozilla-central/source/js/xpconnect/wrappers/WrapperFactory.cpp#698-704 and decide that our CCW should be a CrossOriginObjectWrapper. That's because both targetSubsumesOrigin and originSubsumesTarget are both false, so we never worry about the wantXrays flags, since they're not relevant to not-same-origin cases.

But really, the fact that targetSubsumesOrigin is false is really wrong here. "target" is the extension sandbox. "origin" really ought to be the window that sandbox is targeting, but we're getting it off the XrayWaiver, so we're ending up with the parent window, which the sandbox does not subsume...

This business of having a waiver that's not same-origin with the thing it's targeting seems pretty weird. Maybe we need a different approach for bug 1526624. Alternately, maybe Rewrap needs to strip off waivers and only reapply them if we're about to decide to use an Xray or something... I still need to double-check exactly how the sandbox is getting its hands on this not-same-origin waiver; that's coming up next.

Flags: needinfo?(bobbyholley)

Comment 23

•

6 years ago

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #22)

It has an expanded principal, as expected, wrapping a moz-extension principal and the https://support.microsoft.com/api/content/SignedOut principal. The sandbox has the wantXrays flag so it can get same-origin Xrays and hence XrayWaivers.

Ugh. I didn't realize we were still setting that flag on those sandboxes. The fact that they have expanded principals should be enough for them to get X-rays for the objects they want X-rays for. We don't really want them to have same-origin X-rays...

In any case, I'm not sure that's really relevant to the problem here. It seems like we'd still have it with just an expanded principal.

And, either way, the case of a content script created for a window before it's document.domain changes is already pretty broken, I think, since we don't update the principal of the sandbox. And when we did remap wrappers when document.domain changed, we ran into other issues (i.e., bug 1414783 where we lose expandos for the window wrappers).

Assignee

Comment 24

•

6 years ago

OK, so as expected the place the waiver comes from is the extension calling its getWrapped function which calls xpc::wrappedJSObject_getter which calls xpc::WrapperFactory::WaiveXrayAndWrap. AllowWaiver in there returns true, because there we're comparing the context compartment (the extension's sandbox) to the object compartment (the compartment both web pages are in) and doing a Subsumes() check, which ignores document.domain. And that passes in this case. And even if we compared Realms or did SubsumesConsideringDomain() there it would still pass, because at that point we're comparing the extension sandbox to the subframe's OuterWindowProxy.

Assignee

Comment 25

•

6 years ago

Right, the wantXrays flag is not really relevant here, except insofar as there are cases in which it suppresses the use of CrossOriginObjectWrapper, so I wanted to understand why that wasn't happening in this case.

the case of a content script created for a window before it's document.domain changes is already pretty broken

I think it actually works kinda fine in that case: the expanded principal stores a pointer to the principal that the document.domain change mutates, so SubsumesConsideringDomain() still passes.

The issue in this case is that the XrayWaivers are per-compartment, not per-realm, given how we fixed bug 1526624. Which means that we should really never be examining any Realm-dependent state (including principals) on them. Ideally js::GetNonCCWObjectRealm would just assert against being called on an XrayWaiver, as would JS::GetNonCCWObjectGlobal and maybe in general any place we assert !IsCrossCompartmentWrapper(). Maybe we need a semantically distinct !IsObjectWithoutRealm() assert or something.

Again, unless we change how waivers work to be per-Realm.

Assignee

Comment 26

•

6 years ago

Attached patch Mochitest for this bug (deleted) — Details — Splinter Review

Comment 27

•

6 years ago

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #25)

the case of a content script created for a window before it's document.domain changes is already pretty broken

I think it actually works kinda fine in that case: the expanded principal stores a pointer to the principal that the document.domain change mutates, so SubsumesConsideringDomain() still passes.

Oh, that's a good point. In that case, I guess just not remapping wrappers may have been enough to fix the existing problems. I was expecting to need to do more work after that.

Assignee

Comment 28

•

6 years ago

Alright, bholley and I talked about comment 22. The basic options seem to be:

Change XrayWaiver to be per-realm, not per-compartment. This basically means adjusting FixWaiverAfterTransplant to handle the same-compartment different-realm case.
Add asserts against getting realms, principals, etc off XrayWaivers (just like we already have for CCWs). Fix all the resulting fallout.

I took a brief stab at #2 and it's somewhat invasive and kinda annoying. So I'm going to try #1.

Bobby Holley (:bholley)

Updated

•

6 years ago

Flags: needinfo?(bobbyholley)

Assignee

Comment 29

•

6 years ago

Attached file Bug 1530146 part 1. Switch XrayWaiver to always being same-realm with its target. r=bholley (deleted) — Details

Assignee

Comment 30

•

6 years ago

Attached file Bug 1530146 part 2. Back out the fix for bug 1526624, since it's no longer needed. r=bholley (deleted) — Details

Assignee

Updated

•

6 years ago

Flags: needinfo?(bzbarsky)

Comment 31

•

6 years ago

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #29)

Created attachment 9047271 [details]
Bug 1530146 part 1. Switch XrayWaiver to always being same-realm with its target. r=bholley

I wonder if I should be concerned that the second email I got about this revision was plain text rather than "The content for this message can only be transmitted over a secure channel" ...

Comment 32

•

6 years ago

(In reply to Kris Maglione [:kmag] from comment #31)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #29)

Created attachment 9047271 [details]
Bug 1530146 part 1. Switch XrayWaiver to always being same-realm with its target. r=bholley

I wonder if I should be concerned that the second email I got about this revision was plain text rather than "The content for this message can only be transmitted over a secure channel" ...

I got the same thing. You should file a bug about it.

https://hg.mozilla.org/integration/autoland/rev/dadc02e71d59
https://hg.mozilla.org/integration/autoland/rev/fe2cba661d5e

Assignee

Comment 33

•

6 years ago

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 34

•

6 years ago

https://hg.mozilla.org/mozilla-central/rev/fe2cba661d5e
https://hg.mozilla.org/mozilla-central/rev/dadc02e71d59

Group: dom-core-security → core-security-release

Status: NEW → RESOLVED

Closed: 6 years ago

status-firefox67: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla67