Closed Bug 868715 Opened 12 years ago Closed 12 years ago

Better rooting for 'any' and 'object' in bindings code

Categories

(Core :: DOM: Core & HTML, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla24
Tracking Status
firefox23 + fixed
firefox24 --- fixed

People

(Reporter: bzbarsky, Assigned: bzbarsky)

References

Details

(Keywords: dev-doc-complete)

Attachments

(11 files, 3 obsolete files)

(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
Details | Diff | Splinter Review
(deleted), patch
peterv
: review+
sfink
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
In bug 865969 we added rooting here, but it was somewhat ad-hoc, and looks ugly to consumers. I just looked through the codegen for various cases, and here's where we are today: 1) 'any' and 'object' arguments: Rooted in bindings, callees get a Value or JSObject* or JSObject&. Except optional arguments, which expose the icky classnames we're using here. 2) Sequences of 'object' or 'any' as arguments. Currently disallowed. 3) Variadic 'any' or 'object': broken; a gc hazard right now, even without a moving or exact gc. 4) 'any' and 'object' return values are currently not rooted in bindings, but immediately assigned to *vp, so actually ok. 5) Sequences of 'object' and 'any' as return values. These are GC hazards right now. What I would like to do to fix this up: * On-stack any/object should become Rooted. C++ callees should be able to just take a Handle for direct object/any arguments. I don't think worrying about nullable vs not is worth it, so non-nullable 'object' would also be a Rooted and hence a JSObject* or a Handle<JSObject*> for the callee. * any/object return values can probably stay as-is for now. * Variadic and sequence arguments should use a new on-stack class that has a conversion operator to Sequence<Value> or Sequence<JSObject*>. This stack class would also include an AutoArrayRooter to handle the rooting. The other option, of course, is to just use AutoValue/ObjectVector here and have callees see that. But that would cause problems for C++ code that wants to call into callback codegen, actually; it's simpler to just have a Sequence that is guaranteed rooted here. * Sequence return values should probably just become an AutoValueVector or AutoObjectVector. C++ callers wanting to call a callback that returns one of these would need such a beastie, which I think is OK: such calls will be rare. Oh, and dictionaries would consider using the current lazy-constructed types. Any objections to the proposed types before I start redoing how we declare stack things to allow declaring them with a nontrivial constructor? ;)
And once we do this we should update the docs on what the types used here are....
Keywords: dev-doc-needed
#4 doesn't bother me at all. Assigning to a rooted location is a valid way of rooting. Perhaps CallArgs.rval().set... would make this cleaner/safer? For Variadic and Sequence arguments, could Sequence<> inherit from CustomAutoRooter? I don't know if that's cleaner than containing an AutoArrayRooter. They amount to the same thing. Nothing in your plan strikes me as wrong, though I haven't looked at your lazy-constructed types.
> Perhaps CallArgs.rval().set... I don't have CallArgs available (yet). Doing it will be a bit of work, and may need to be at least partially done by someone who is comfortable writing ion codegen for passing a CallArgs, since these methods are called directly from jitcode. > For Variadic and Sequence arguments, could Sequence<> inherit from CustomAutoRooter? Maybe. I'm not sure what that buys me; I'd have to write tracing code at that point, right? Good question about lazily-constructed lifetimes.... I guess there are four other cases that need to be considered: sequences of sequences of object/any, dictionaries containing sequences of object/any, sequences containing dictionaries which contain object/any, dictionaries which contain dictionaries which contain object/any. I'll look into those. Come to think of that last, actually, sequences of dictionaries containing object/any (as implemented in bug 820957) are already broken because LIFO ordering does NOT hold there; I'm not sure I realized back in December when I reviewed that that Rooted depended on that ordering. And actually, sequences of dictionaries _are_ allowed, so any time someone uses a sequence of dictionaries containing object/any we currently have a GC hazard, since those Rooted end up on the heap, and it's not like Rooted actually roots anything... I'm not actually sure what to do with all that yet, short of just manually rooting everything and screw performance. Need to think about it. :(
Oh, and to be clear, anything used in a Sequence must be memmovable. That means no Rooted inside a sequence, ever, and we need to enforce that no matter what we end up with for this setup.
Just to check, is an AutoArrayRooter memmovable?
To answer my own question, AutoArrayRooter inherits from AutoGCRooter, which adds "this" to a linked list in the JS engine. So it is certainly not memmovable. That means we need to disallow sequences of dictionaries that contain any/object. And perhaps also disallow variadic arguments of such... At least unless we change how any/object are represented in a dictionary.
So, there are a number of considerations here, but it seems that the hard case is Collection<JSObject>, where collection can be a Dict of Dicts, a Sequence of Dicts, a Dict of Sequences, etc, ad-nausea. Since there are no inter-object connections here, from the GC's perspective this is just a pile of things which need to be traced during GC. (Ignoring minor GC, which I'll get to in a moment.) We have two choices: (1) tell the GC about each and every object in the collection, or (2) tell the GC about the collection and teach it how to find the objects in the collection when it needs to. Which of these options is more efficient is probably going to depend on the collection size and complexity, but (2) is going to be way faster in the majority of situations. Currently there is a bit of confusion in the API about how you should accomplish (2), including different stories for the stack and the heap. I think this is the case now because the GC typically traces "roots" (Rooted, AutoGCRooter, etc), where the browser is responsible for tracing everything else (NS_IMPL_CYCLE_COLLECTION). I don't think this necessarily needs to be so confusing. Conceptually, when you want to trace $COLLECTION, you want to call JS_CallFooTracer on every GCThing in $COLLECTION. This should be a very simple function to write, even if, in C++ at least, you will need a different function for every shape of collection. Thus, it seems like all we need to do is arrange for this function to be called whenever (a) the thing is on the stack or (b) the thing in the C heap holding it is traced. We could do this easily today with CustomAutoRooter, Rooted, or NS_IMPL_CYCLE_COLLECTION_SHOUTY_MACROS with various amounts of extra work now or boilerplate later. We could also probably come up with something nicer and more consistent if we wanted. Does this all makes sense? How would you like to proceed?
Minor GC adds a bit of complexity, but hopefully not too much. First, the minor GC traces all Rooted and all AutoGCRooters, so anything that is rooted by one of these mechanisms is done. Secondly, the minor GC does /not/ trace the full heap, so it needs a way to find everything that is in the heap that points into the nursery. We do this by intercepting all writes to GC thing references (JSObject*, Value, etc) that live in the C heap (and the JS heap too, but this is internal and should not concern embeddings) and keeping a collection of all the cross generation references. Big important caveat: We only need these barriers if the reference may hold a pointer to a GC thing allocated in the nursery. If the thing is known to always have a finalizer, or is allocated with a NewObjectKind of TenuredObject (I don't think this is in the API yet), then we do not need barriers on it. My hope is that most complex collections in the browser will fall into this category. Possibly important detail: For performance, we prefer to bake in the direct address of the GC thing at the time when it is written (&reference). This plus a contiguous nursery makes the "is this pointer an inter-heap reference" test fast: !isInsideNursery(reference) && isInsideNursery(*reference). Naturally, this breaks terribly if someone memmoves the reference, so this technique is almost totally useless for collections. For collections, we have two different mechanisms with different performance tradeoffs and boilerplate requirements, so it is tractable if we need to deal with it. Of particular concern is where we are using the address of a mobile thing as an invariant of the collection: e.g. as a key in a hash table. In SpiderMonkey we have ended up with a semi-custom solution for each of our major collections. I'm hoping we don't have to do this for the browser, but we are prepared to, if required.
> This should be a very simple function to write, So from my point of view, the steps for tracing a sequence, say, fall into the following buckets: 1) Sequence of any or object: just trace the elements. 2) Sequence of sequences or dictionaries or unions: invoke the correct trace function on each member. I _think_ I can get there with some template specialization, though it will be ugly. Similar considerations apply to dictionaries and unions, though those are simpler because we're generating the code for them anyway so we can just emit calls to the trace on members. I can probably get away with using some sort of auto rooter on the stack to enter into the whole thing, I guess. Sound reasonable? > My hope is that most complex collections in the browser will fall into this category. The bindings collections will absolutely not: they're working with arbitrary page-provided objects...
Maybe I should expand on my proposal from comment 9. What I am proposing is that: 1) A type needs tracing if it is JSObject*, Value, or a collection type that needs tracing. 2) A collection type that needs tracing is a sequence of types that need tracing, a union containing a type that needs tracing, or a dictionary containing a type that needs tracing. 3) Any collection type that needs tracing has a (static? virtual? What arguments do these functions get? See item 4) trace function that knows how to trace that type. 4) Whenever we declare a collection type that needs tracing on the stack, we register its trace function somewhere so it'll get called if GC happens.
Flags: needinfo?(terrence)
Terrence, the set of types that we need is statically known, so we can statically generate trace hooks for each type (like, a dictionary of dictionaries, say). Maybe you were thinking that we'd have to handle at runtime arbitrary nestings of these things?
(In reply to Boris Zbarsky (:bz) from comment #10) > Maybe I should expand on my proposal from comment 9. What I am proposing is > that: > > 1) A type needs tracing if it is JSObject*, Value, or a collection type > that needs > tracing. > 2) A collection type that needs tracing is a sequence of types that need > tracing, > a union containing a type that needs tracing, or a dictionary containing > a type that > needs tracing. > 3) Any collection type that needs tracing has a (static? virtual? What > arguments do > these functions get? See item 4) trace function that knows how to trace > that type. > 4) Whenever we declare a collection type that needs tracing on the stack, > we register its > trace function somewhere so it'll get called if GC happens. Sounds good to me. As to registering a trace function, you currently have only one option and a second one that we can support relatively easily. 1) Derive a new CustomAutoRooter sub-class and override the |virtual void trace(JSTracer*)| method. Either include the collection inside the new class or instantiate the new class next to the collection on the stack. This class uses a similar linked list structure to Rooted, so should be quite fast. 2) Implement a template specialization for RootMethods<Collection> and use Rooted<Collection>. Right now every specialization of RootMethods has custom code inside of js/src/gc/RootMarking.h, but we could easily add a generic variant with a virtual base that you could mix in to Collection and override with the right tracing code. Do either of these sound reasonable? We are, of course, not averse to adding new API if there is a more elegant way to accomplish this. (In reply to Andrew McCreight [:mccr8] from comment #11) > Terrence, the set of types that we need is statically known, so we can > statically generate trace hooks for each type (like, a dictionary of > dictionaries, say). Maybe you were thinking that we'd have to handle at > runtime arbitrary nestings of these things? No, sorry, I should have been clearer that I just meant a static subset of possible nesting. What I was trying to say is that it is fairly low overhead to create custom code to trace for any possible layout. Specifically, I don't think we necessarily need to have any complex protocol where every Collection needs to be taught how to trace some subset of the things that can be put in them.
Flags: needinfo?(terrence)
> Do either of these sound reasonable? (1) actually should be, yes. I'll go ahead and give it a shot, at least. That might work for return values too, actually, since those are placed on the stack in bindings code. I'll see how that goes.
Blocks: 856911
Peter, I'd really like to get this into 23 as well, so would appreciate fast-ish reviews. I can steal some of your other reviews to help that!
Attachment #749115 - Flags: review?(peterv)
Attachment #749120 - Flags: review?(peterv)
Attached patch part 7. Add tracing to dictionaries. (obsolete) (deleted) — Splinter Review
Attachment #749121 - Flags: review?(peterv)
Attachment #749122 - Flags: review?(peterv)
Attachment #749123 - Flags: review?(peterv)
Attachment #749138 - Flags: review?(peterv)
Attachment #749120 - Attachment is obsolete: true
Attachment #749120 - Flags: review?(peterv)
Attachment #749139 - Flags: review?(peterv)
Attachment #749121 - Attachment is obsolete: true
Attachment #749121 - Flags: review?(peterv)
Attachment #749138 - Attachment is obsolete: true
Attachment #749138 - Flags: review?(peterv)
Oh, one more comment. I considered switching the construction order of the holder and decl and just passing a reference to the decl to the holder. Then I wouldn't need the various set* methods. But I'd need to audit all our decls to make sure they can be passed as a constructor argument to their holders, and I know that at least for XPConnect unwrapping the uninitialized raw pointer is not OK to pass to the nsCOMPtr. We could preinitialize to null there, of course... Let me know if you'd prefer that, either in this bug or a followup.
Whiteboard: [need review]
Comment on attachment 749115 [details] [diff] [review] part 1. Change instantiateJSToNativeConversionTemplate to return an object, to make its return value more extensible. Review of attachment 749115 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/bindings/Codegen.py @@ +2399,5 @@ > + for whether we have a JS::Value. Only used when > + defaultValue is not None. > + > + declType: A CGThing representing the native C++ type we're converting > + to. This is allowed to not be None if the conversion code is s/not//? @@ +2404,5 @@ > + supposed to be used as-is. > + > + holderType: A CGThing representing the type of a "holder" which will > + hold a possible reference to the C++ thing whose type we > + returned in #1, or not set or set to None if no such holder What's the difference between "not set" and "set to None" @@ +2407,5 @@ > + hold a possible reference to the C++ thing whose type we > + returned in #1, or not set or set to None if no such holder > + is needed. > + > + dealWithOptional: A boolean set to True indicating whether the caller s/set to True//
Attachment #749115 - Flags: review?(peterv) → review+
> What's the difference between "not set" and "set to None" The former is no longer possible; I think those comments come from an earlier patch iteration. Will fix. Will apply the other two changes.
Comment on attachment 749116 [details] [diff] [review] part 2. Add the ability to request that the declType or holderType be constructed with a JSContext. Review of attachment 749116 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/bindings/Codegen.py @@ +2427,5 @@ > + > + holderArgs: If not None, the arguments to pass to the ${holderName} > + constructor. These will have template substitution > + performed on them so you can use things like > + ${valHandle}. Maybe note that these are a string (as opposed to eg arrays of strings) @@ +3503,5 @@ > + if info.declArgs is not None: > + declArgs = CGGeneric(string.Template(info.declArgs).substitute(replacements)) > + else: > + declArgs = None > + Trailing whitespace.
Attachment #749116 - Flags: review?(peterv) → review+
Comment on attachment 749117 [details] [diff] [review] part 3. Use on-stack Rooted<Value> for 'any' arguments in WebIDL bindings. Review of attachment 749117 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/html/content/public/HTMLCanvasElement.h @@ +91,5 @@ > SetUnsignedIntAttr(nsGkAtoms::width, aWidth, aRv); > } > already_AddRefed<nsISupports> > GetContext(JSContext* aCx, const nsAString& aContextId, > + const Optional<JS::Rooted<JS::Value> >& aContextOptions, As discussed we should try specializing Optional<JS::Handle> to hold a JS::Rooted, so that this could just be |const Optional<JS::Handle<JS::Value> >&| ::: dom/bindings/Codegen.py @@ +4250,5 @@ > # nested sequences we don't use the same variable name to iterate over > # different sequences. > sequenceWrapLevel = 0 > > +def wrapTypeIntoCurrentCompartment(type, value, isNonMember=False): We generally use isMember, so I'd prefer to use that here too. @@ +4322,5 @@ > > raise TypeError("Unknown type; we don't know how to wrap it in constructor " > "arguments: %s" % type) > > +def wrapArgIntoCurrentCompartment(arg, value, isNonMember=False): isMember?
Attachment #749117 - Flags: review?(peterv) → review+
Comment on attachment 749118 [details] [diff] [review] part 4. Use on-stack Rooted<JSObject*> for 'object' arguments in WebIDL bndings. Review of attachment 749118 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/canvas/src/CanvasRenderingContext2D.h @@ +312,5 @@ > mozilla::ErrorResult& error); > JSObject* GetMozCurrentTransformInverse(JSContext* cx, > mozilla::ErrorResult& error) const; > + void SetMozCurrentTransformInverse(JSContext* cx, > + JS::Handle<JSObject*> currentTransform, Trailing whitespace. ::: dom/bindings/Codegen.py @@ -4185,5 @@ > args = CGList([CGGeneric(arg) for arg in argsPre], ", ") > for (a, name) in arguments: > - # This is a workaround for a bug in Apple's clang. > - if a.type.isObject() and not a.type.nullable() and not a.optional: > - name = "(JSObject&)" + name Woot! @@ +5849,5 @@ > self.type.flatMemberTypes) > structName = self.type.__str__() > > setters.extend(mapTemplate("${setter}", templateVars)) > + # Don't generate a SetASObject, since we don't use it s/AS/As/
Attachment #749118 - Flags: review?(peterv) → review+
Attachment #749119 - Flags: review?(peterv) → review+
Comment on attachment 749628 [details] [diff] [review] Part 6 without the JS APIs that people are helpfully removing Review of attachment 749628 [details] [diff] [review]: ----------------------------------------------------------------- I guess using typeNeedsCx to mean that the type needs tracing works fine, but it seems a bit fragile. IIRC it's mainly used to know if the caller is likely to need a cx to do anything useful with the argument. ::: dom/bindings/BindingUtils.h @@ +1717,5 @@ > + eInfallibleArray, > + eFallibleArray > + }; > + > + virtual void trace(JSTracer *trc) MOZ_OVERRIDE { { on next line
Attachment #749628 - Flags: review?(peterv) → review+
> Maybe note that these are a string (as opposed to eg arrays of strings) > Trailing whitespace. > We generally use isMember, so I'd prefer to use that here too. > Trailing whitespace. > s/AS/As/ > { on next line Fixed. > As discussed we should try specializing Optional<JS::Handle> to hold a JS::Rooted Will do a patch on top of these. > I guess using typeNeedsCx to mean that the type needs tracing works fine, but it seems > a bit fragile. Agreed; once we trace typed arrays we'll need another predicate. I'll do that at that point.
Attachment #750205 - Flags: review?(sphink) → review+
Comment on attachment 749139 [details] [diff] [review] Part 7 with the guard object notifier bits fixed Review of attachment 749139 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/bindings/BindingUtils.h @@ +1734,5 @@ > +{ > +public: > + explicit DictionaryRooter(JSContext *cx MOZ_GUARD_OBJECT_NOTIFIER_PARAM) > + : JS::CustomAutoRooter(cx MOZ_GUARD_OBJECT_NOTIFIER_PARAM_TO_PARENT), > + mDictionaryType(eNone) I wish we could make this a template argument, but it seems nontrivial :-(. @@ +1760,5 @@ > + > + virtual void trace(JSTracer *trc) MOZ_OVERRIDE { > + if (mDictionaryType == eDictionary) { > + mDictionary->TraceDictionary(trc); > + } else if (mDictionaryType == eNullableDictionary) { Do we actually expect to end up here with |mDictionaryType == eNone|? If not I'd just assert |mDictionaryType == eNullableDictionary|.
Attachment #749139 - Flags: review?(peterv) → review+
Attachment #749122 - Flags: review?(peterv) → review+
Attachment #749123 - Flags: review?(peterv) → review+
Comment on attachment 750205 [details] [diff] [review] part 10. Create specializations of Optional for 'any' and 'object' types so that we can have those look like Optional<Handle<Value> > and Optional<Handle<JSObject*> > respectively. Review of attachment 750205 [details] [diff] [review]: ----------------------------------------------------------------- This is all looking great.
Attachment #750205 - Flags: review?(peterv) → review+
> Do we actually expect to end up here with |mDictionaryType == eNone|? It really depends on the exact timing of GCs, but I guess given our current setup no. OK, I'll do that.
[Approval Request Comment] Bug caused by (feature/regressing bug #): Various WebIDL conversions. User impact if declined: GC hazards in bindings code that can lead to possibly-exploitable crashes. Testing completed (on m-c, etc.): Passes try. Risk to taking this patch (and alternatives if risky): I believe this is low-risk. It's certainly not going to make the rooting situation worse. String or IDL/UUID changes made by this patch: None NOTE: Do not check in this roll-up patch. It's just here to serve as an approval placeholder. I'll push the actual patch stack if/when I get approval.
Attachment #750511 - Flags: approval-mozilla-aurora?
Documented the signature changes and such.
Depends on: 874154
Comment on attachment 750511 [details] [diff] [review] Roll-up aurora patch. Do not check in! Trusting your call here on risk, this looks like a lot of work to land but it's Aurora and we can backout if need be.
Attachment #750511 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 873479
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: