Closed Bug 811168 Opened 12 years ago Closed 12 years ago

GC: Implement Raw<T> and make Return<T> stronger

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: terrence, Assigned: terrence)

References

Details

(Whiteboard: [js:t])

Attachments

(2 files, 5 obsolete files)

I'd like to be able to safely use Return<T> as a Raw<T> for cases where we have a tail GC. e.g.: ///////// Return<T> foo = js_NewFoo(cx); if (!foo) return false; { AutoAssertNoGC nogc; stuff.foo = foo.get(nogc); } return MaybeGC(cx, stuff); ///////// This example more or less works: the next MaybeCheckStackRoots will poison |foo|, ensuring no unsafe access. However, it is brittle feeling because the MayGC/NoGC scopes do not clearly segregate our lifetimes. Ideally, |foo| should be in the AutoAssertNoGC scope and go out of scope after we use it. This doesn't really work because js_NewGCThing will GC, and we can't just skip "the next" GC, because this could be a method that triggers many GCs. Am I over-thinking this? Should we just use unsafeGet() after this js_NewFoo, or is there some type-y thing we can do to make the MayGC/NoGC areas line up nicely with the variable lifetimes?
I think a decent solution here may be to implement the Raw<T> we've been meaning to implement for so long. //////// Stuff stuff; { Raw<T> foo = js_NewFoo(cx); if (!foo) return false; stuff.foo = foo; } return MaybeGC(cx, stuff); //////// This is a bit more awkward if MaybeGC takes |foo| directly. I'm not sure what to do in that case.
Actually, I'm over-thinking this. If we make Raw<T> an AutoAssertNoGC, then there is no way that we can pass a Raw argument to an AssertCanGC() method. So |foo| would necessarily be a HandleFoo. There are cases where we want a cast to a raw pointer value, e.g. for the GC, so we will need an UnsafeFoo typedef regardless. For cases where there is a valid GC hazard, but where we still want a bare pointer as input (and there are a couple of these), then we can do something simple with templates to snuff the RawFoo's AutoAssertNoGC zone and get out an UnsafeFoo in a single step. #ifdef DEBUG template <typename T> T AsUnsafe(const Raw<T> &raw) { T out = raw->asUnsafe(); raw->~Raw<T>(); return out; } #else // UnsafeT and RawT are both typedefs to T. template <typename T> T AsUnsafe(const T &t) { return t; } #endif
stuff.foo is a raw pointer, not a Rooted? I assume that's what you mean, since you're talking about tail GCs. Just brainstorming then. How about: { ReturnNoGC<T> rngc(js_NewFoo(cx)); if (!rngc.val) return false; stuff.foo = rngc.val.get(rngc.nogc); } return MaybeGC(cx, stuff); A ReturnNoGC<T> is basically struct { Return<T> val; AutoAssertNoGC nogc; }. Or you could make ReturnNoGC<T> behave like a Return<T>, so the .val is unnecessary. Or you could just use Return<T>, and give it a nogc field of type Maybe<AutoAssertNoGC>. Um... maybe that doesn't make sense? I was thinking you'd lazily construct the AutoAssertNoGC() when you called foo.nogc(), and MaybeCheckStackRoots would flag all Return<T>s on the stack so that .nogc() would assert. (The latter requires a linked list of Return<T> objects to be maintained, probably via a ReturnBase or something.) { Return<T> foo = js_NewFoo(cx); if (!foo) return false; // MaybeGC() would succeed here, but would make the following nogc() call crash stuff.foo = foo.get(foo.nogc()); // MaybeGC() would abort here } return MaybeGC(cx, stuff); Come to think of it, the .nogc() is redundant, and this would just be foo.get(). (Return<T>::get would take an optional AutoAssertNoGC object.) Or maybe the parameter-less version should be called getAndAssertNoGC(). I am probably completely defeating the purpose of Return<T> here somehow. Let's see... operator T& would fail before you called foo.get(). It would succeed after, but if you did anything that would GC, it would assert. So the danger would be T outer; // or Raw<T> outer { Return<T> foo = js_NewFoo(cx); stuff.foo = foo.getAndAssertNoGC(); outer = foo; } MaybeGC(); outer->anything(); but that's not really different from now, is it? I'm certain to be missing something major and obvious here.
mid-aired. Is my ReturnNoGC<T> the same as your beefed-up Raw<T>?
Yeah, I think so... I'm going to try to whip something up to demonstrate what I had in mind and give it to you for feedback, hopefully before lunch.
Attached patch v0 (obsolete) (deleted) — Splinter Review
This is what I have in mind. There are a few runtime test failure I need to look at, but I think this conceptually sound.
Attachment #681144 - Flags: feedback?(sphink)
Comment on attachment 681144 [details] [diff] [review] v0 Review of attachment 681144 [details] [diff] [review]: ----------------------------------------------------------------- I like it. After this patch, would Return<T> *only* be used for the types of return values? That would be a nice property, assuming it makes sense. (And if it does, should we add a stack-only assertion?) ::: js/src/jsscope.cpp @@ +78,5 @@ > { > JS_ASSERT(!shape->base()->isOwned()); > assertSameCompartment(cx, shape->compartment()); > > + RawBaseShape nbase = js_NewGCBaseShape(cx).unsafeGet(); There's nothing particularly unsafe about this line of code anymore, is there? Could we add a get() that returns a Raw<T>? Or even an operator Raw<T>()? @@ +645,5 @@ > /* > * Now that we've possibly preserved slot, check whether all members match. > * If so, this is a redundant "put" and we can return without more work. > */ > + if (shape->matchesParamsAfterId((BaseShape*)nbase, slot, attrs, flags, shortid)) Why is this cast needed?
Attachment #681144 - Flags: feedback?(sphink) → feedback+
(In reply to Steve Fink [:sfink] from comment #7) > Comment on attachment 681144 [details] [diff] [review] > v0 > > Review of attachment 681144 [details] [diff] [review]: > ----------------------------------------------------------------- > > I like it. > > After this patch, would Return<T> *only* be used for the types of return > values? That would be a nice property, assuming it makes sense. (And if it > does, should we add a stack-only assertion?) > > ::: js/src/jsscope.cpp > @@ +78,5 @@ > > { > > JS_ASSERT(!shape->base()->isOwned()); > > assertSameCompartment(cx, shape->compartment()); > > > > + RawBaseShape nbase = js_NewGCBaseShape(cx).unsafeGet(); > > There's nothing particularly unsafe about this line of code anymore, is > there? Could we add a get() that returns a Raw<T>? Or even an operator > Raw<T>()? Right, I had a constructor on Raw<T> taking a Return<T>, but the problem is that in opt builds the BaseShape* doesn't have this. I'm a bit wary of the idea given my relationship with unintended consequences lately, but it might make sense to give Return<T> an |#ifndef DEBUG operator T()| conversion for this case (and not for others, because, hopefully, debug builds would catch the error). > @@ +645,5 @@ > > /* > > * Now that we've possibly preserved slot, check whether all members match. > > * If so, this is a redundant "put" and we can return without more work. > > */ > > + if (shape->matchesParamsAfterId((BaseShape*)nbase, slot, attrs, flags, shortid)) > > Why is this cast needed? This one is a bit annoying. |nbase| is a Rooted<UnownedBaseShape*> and matchesParamsAfterId takes a Raw<BaseShape*>, which C++ was not able to naively convert to. I still need to try the IsConvertibleTo trick from Rooted, which I expect will work better.
(In reply to Steve Fink [:sfink] from comment #7) > After this patch, would Return<T> *only* be used for the types of return > values? That would be a nice property, assuming it makes sense. (And if it > does, should we add a stack-only assertion?) Also yes, that would be awesome. I'd go even farther and ensure that it is only ever used as a C++ temporary and never named. Is there a way in C++ to do this?
Attached patch v1: with more C++ automation. (obsolete) (deleted) — Splinter Review
This ends up in a much nicer place. I was able to get rid of /all/ of the extra casting I added. The down side is I had to go overboard with C++. In the deep end. I've probably just frankensteined together another hilariously bad foot-gun.
Attachment #681144 - Attachment is obsolete: true
Attachment #681264 - Flags: feedback?(sphink)
Attachment #681264 - Flags: feedback?(sphink)
After using to Raw<T> on UnownedBaseShape, I managed to find and address several weaknesses in the new class and undo all of the C++ wonkiness than I had wrongly introduced. This is actually a substantially nice cleanup where I applied it: it gets us real type-y GC safety for (Unowned)BaseShape, except in the 3 cases where we were explicitly using bare pointers for performance. I think this is effectively the second to last piece (Steve has Array/Vector rooters almost done) before we can just sit down and crank on exact rooting.
Attachment #681264 - Attachment is obsolete: true
Attachment #681598 - Flags: review?(wmccloskey)
Comment on attachment 681598 [details] [diff] [review] v2: including application of new class to UnownedBaseShape. Looks like the BaseShape patch got backed out, so I'm going to rebase that into this.
Attachment #681598 - Flags: review?(wmccloskey)
And the test it perma-oranged got disabled a day later because it was perma-orange. *facepalm*
Attached patch v3: This is ready. (obsolete) (deleted) — Splinter Review
I've used this for UnownedBaseShape and BaseShape (included) and for most of Shape (next patch) now. I think this is a huge improvement. With Luke's help, I found a way to sanely expose our static methods in a non-static way for the user sites. I introduced some static Shape methods in this patch but I'm going to hold the non-static wrapping until next patch and do Shape all at once. I think it will be easier to review that way.
Attachment #681598 - Attachment is obsolete: true
Attachment #682553 - Flags: review?(wmccloskey)
Summary: GC: Allow for better ergonomics when storing Return<T> on the stack → GC: Implement Raw<T> and make Return<T> stronger
Blocks: ExactRooting
Comment on attachment 682553 [details] [diff] [review] v3: This is ready. Review of attachment 682553 [details] [diff] [review]: ----------------------------------------------------------------- This all seems pretty good to me. We definitely need some higher-level documentation about when to use these, though. Please add a comment in Root.h that gives some advice. Could you add a drop() method to Raw<T>? It would revert the value back to 0x1 and leave the NoGC region. I think that would allow you to eliminate some of the Unsafe pointers related to StackShape. I realized that there's some question right now about whether a user-implemented trace callback is allowed to GC. I think we should declare that it cannot. That will allow us to use unrooted pointers throughout the marking paths, which will be simpler. To that end, can you add AssertNoGC to MarkInternal in gc/Marking.cpp? Mainly I'm concerned about the names. I would expect something called Raw to be an actual raw pointer. Also, I don't like the name Unsafe for a type because it doesn't tell you what's unsafe about it. I would propose that a RawT should always be a typedef for T*, while the new template you've created should be called Unrooted<T>. One thing that concerns me is that we've so far been using RawObject to mean that we've vetted that function and we don't expect it to GC. Eventually we'll want to convert most of those to UnrootedObject. However, I think that's a cost we'll have to pay. ::: js/src/gc/Root.h @@ +399,2 @@ > > + Return(NullPtr) : ptr_(NULL) I don't understand why you need this. Why can't you use Return<T>(NULL) instead? Is it to avoid having to type out T? @@ +407,5 @@ > + EnterAssertNoGCScope(); > + } > + > + ~Return() > + { In general, functions that are at the top level (unindented) have an opening brace on its own line. Indented functions should have a brace on the same line. The only exception is when the arguments and such span more than one line. This needs to be fixed in several places in this file. ::: js/src/jsscope.h @@ +525,5 @@ > void handoffTableTo(Shape *newShape); > > inline void setParent(js::Shape *p); > > bool ensureOwnBaseShape(JSContext *cx) { Shouldn't this be static too? It's only called once, so it doesn't seem like it would be difficult.
Attachment #682553 - Flags: review?(wmccloskey)
(In reply to Bill McCloskey (:billm) from comment #17) > Comment on attachment 682553 [details] [diff] [review] > v3: This is ready. > > Review of attachment 682553 [details] [diff] [review]: > ----------------------------------------------------------------- > > This all seems pretty good to me. \o/ > We definitely need some higher-level > documentation about when to use these, though. Please add a comment in > Root.h that gives some advice. I've been meaning to revisit bug 784880 (Improve the GC stack rooting comments) and update the wiki article(s) as well. > Could you add a drop() method to Raw<T>? It would revert the value back to > 0x1 and leave the NoGC region. I think that would allow you to eliminate > some of the Unsafe pointers related to StackShape. Good idea! > I realized that there's some question right now about whether a > user-implemented trace callback is allowed to GC. I think we should declare > that it cannot. That will allow us to use unrooted pointers throughout the > marking paths, which will be simpler. To that end, can you add AssertNoGC to > MarkInternal in gc/Marking.cpp? Right! I totally forgot about the trace hook. When I updated Marking, I think I was assuming that the AssertNoGC at the top of Collect would catch everything, but that's clearly not entirely true. This has the added benefit that with the next paragraph included, Marking and jsgc.cpp can remain pretty much untouched from their current state. > Mainly I'm concerned about the names. I would expect something called Raw to > be an actual raw pointer. Also, I don't like the name Unsafe for a type > because it doesn't tell you what's unsafe about it. I would propose that a > RawT should always be a typedef for T*, while the new template you've > created should be called Unrooted<T>. Awesome, this was exactly what I wanted feedback on most. Upon reflection, I agree that those are clearer. > One thing that concerns me is that we've so far been using RawObject to mean > that we've vetted that function and we don't expect it to GC. Eventually > we'll want to convert most of those to UnrootedObject. However, I think > that's a cost we'll have to pay. Agreed. We can lean on the rooting analysis to catch errors here in the meantime. > ::: js/src/gc/Root.h > @@ +399,2 @@ > > > > + Return(NullPtr) : ptr_(NULL) > > I don't understand why you need this. Why can't you use Return<T>(NULL) > instead? Is it to avoid having to type out T? I initially had that. I switched to NullPtr() for the reason you mentioned, and because it's much less cluttered looking in methods that use this common pattern: { if (foo) return Return<Shape*>(NULL); if (bar) return Return<Shape*>(NULL); if (frob) return Return<Shape*>(NULL); RootedShape shape(cx, stuff); .... return shape; } There's only one method like this in the BaseShape patch -- its been more useful for Shape. I guess the more poignant question is if this constructor is going to turn into a footgun for some reason. Do you see a potential problem? > ::: js/src/jsscope.h > @@ +525,5 @@ > > void handoffTableTo(Shape *newShape); > > > > inline void setParent(js::Shape *p); > > > > bool ensureOwnBaseShape(JSContext *cx) { > > Shouldn't this be static too? It's only called once, so it doesn't seem like > it would be difficult. The next patch in the series does so. I should probable have done all of them in one go, but I didn't know at the time that there would only need to be 5 static-for-rooting-purposes methods on Shape.
I guess the NullPtr thing is fine. I don't think it's likely to cause problems. I wonder if returning nullptr instead of NULL might avoid this problem? I think currently we aren't allowed to use nullptr, but I'm not sure if there's a good reason for that.
mozilla/NullPtr.h provides a nullptr macro/keyword as appropriate for the compiler and compilation mode. It'd be nice to switch JS over to using it; Gecko switched to it awhile ago, so we'd be consistent with them, which seems nice. The nullptr implementation provided by that header isn't as a C++11 separate type, not everywhere, so you can't have overloads that distinguish for nullptr yet. Sort of. To do that we need to also hack up our own std::nullptr_t, which is obviously more than a bit fragile. :-) It could well happen eventually, perhaps even soon, but it's not super-trivial. It would definitely help stuff, tho -- bug 792790 really wanted it, bug 779048 wanted it, this wants it, and presumably others will in the future. Partly it just requires someone determined enough to make it happen.
Bug 806546 is the likely bug where std::nullptr_t would be implemented.
Attached patch v4 (obsolete) (deleted) — Splinter Review
Attachment #682553 - Attachment is obsolete: true
Attachment #683777 - Flags: review?(wmccloskey)
Attached patch v5 (deleted) — Splinter Review
I just noticed that I forgot to remove the unsafeGet() on assignment to the variables I was able to change to UnrootedT with DropUnrooted. This also incorporates an important change to DropUnrooted that I found very helpful for Shape. Specifically, it returns its value before poisoning. This allows us to have something like: UnrootedFoo foo = MaybeGC(cx); ... non-gc stuff ... UnrootedBar bar = DropUnrooted(foo)->maybeGC(cx); ... more non-gc stuff ...
Attachment #683777 - Attachment is obsolete: true
Attachment #683777 - Flags: review?(wmccloskey)
Attachment #684187 - Flags: review?(wmccloskey)
Whiteboard: [js:t]
Comment on attachment 684187 [details] [diff] [review] v5 Review of attachment 684187 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/gc/Marking.cpp @@ +266,5 @@ > JS_ASSERT(*thingp); > JS_ASSERT(kind == GetGCThingTraceKind(*thingp)); > switch (kind) { > case JSTRACE_OBJECT: > + MarkInternal(trc, reinterpret_cast<RawObject *>(thingp)); It's valid to use static_cast on a void*, so I think you can use that here and below. ::: js/src/gc/Root.h @@ +48,5 @@ > + * always clear. For example, the following innocuous-looking actions can > + * cause a GC: allocation of any new GC thing; JSObject::hasProperty; > + * JS_ReportError and friends; and ToNumber, among many others. The following > + * dangerous-looking actions cannot trigger a GC: js_malloc, cx->malloc_, > + * rt->malloc_, and friends and JS_ReportOutOfMemory. This is a nice paragraph :-). @@ +59,5 @@ > * > * - Rooted<T> declares a variable of type T, whose value is always rooted. > * Rooted<T> may be automatically coerced to a Handle<T>, below. Rooted<T> > * should be used whenever a local variable's value may be held live across a > + * call which can trigger a GC. This is generally true of You kind of trailed @@ +103,5 @@ > + * > + * - UnrootedT is a typedef for a pointer to thing of type T. In DEBUG builds > + * it gets replaced by a class that additionally acts as an AutoAssertNoGC > + * guard. Since there is only minimal compile-time protection against > + * mis-use, UnrootedT should only be used in places where there is adequate I don't think you need a hyphen in misuse (here or the next line). @@ +109,5 @@ > + * is caught at runtime. > + * > + * - DropUnrooted(UnrootedT &v) will poison |v| and end its AutoAssertNoGC > + * scope. This can be used to force |v| out of scope before its C++ scope > + * would end naturally. The usage of braces C++ syntactical scopes |{...}| I think you meant to remove the word braces here. ::: js/src/jsfriendapi.cpp @@ +889,5 @@ > AutoMarkInDeadCompartment amn(cell->compartment()); > > uint32_t kind = gc::GetGCThingTraceKind(ptr); > if (kind == JSTRACE_OBJECT) > + JSObject::writeBarrierPre(reinterpret_cast<RawObject>(ptr)); It's valid to use static_cast on a void*, so please use that here and below.
Attachment #684187 - Flags: review?(wmccloskey) → review+
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
This bug appears to have added a number of warnings on Windows of the form c:\dev\mozilla-inbound\js\src\jsscope.h(231) : warning C4099: 'js::Shape' : type name first seen using 'struct' now seen using 'class' c:\dev\mozilla-inbound\js\src\jsprvtd.h(147) : see declaration of 'js::S hape'
Doh! I totally forgot about that warning. Will get a patch up in a bit.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch v0: simple cleanup (deleted) — Splinter Review
ForwardDeclare always assumes the type was declared as a class, rather than a struct. This triggers build warnings on Windows. The solution here is to just make all of our the GC Thing types classes. Windows try run up at: https://tbpl.mozilla.org/?tree=Try&rev=37f2f9ba8e8f
Attachment #691529 - Flags: review?(wmccloskey)
Comment on attachment 691529 [details] [diff] [review] v0: simple cleanup Review of attachment 691529 [details] [diff] [review]: ----------------------------------------------------------------- Did you look outside the engine?
No, I made the silly assumption that users would be using our type declarations rather than making their own. The Try run should tell us, either way.
Comment on attachment 691529 [details] [diff] [review] v0: simple cleanup Review of attachment 691529 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/gc/Root.h @@ +498,5 @@ > typedef Unrooted<type*> Unrooted##type; \ > typedef type * Raw##type > > # define ForwardDeclareJS(type) \ > + class JS##type; \ The backslash at the end needs to be lined up. @@ +507,4 @@ > } \ > + namespace JS { \ > + typedef Handle<JS##type*> Handle##type; \ > + typedef MutableHandle<JS##type*> MutableHandle##type; \ It seems weird that these are defined in ForwardDeclareJS but not ForwardDeclare. @@ +508,5 @@ > + namespace JS { \ > + typedef Handle<JS##type*> Handle##type; \ > + typedef MutableHandle<JS##type*> MutableHandle##type; \ > + } \ > + class JS##type Why is this here? Is it to handle the extra semicolon in the macro? If so, I don't think it's needed. Semicolon is a valid declaration in C++ as far as I know. @@ +547,4 @@ > } \ > + namespace JS { \ > + typedef Handle<JS##type*> Handle##type; \ > + typedef MutableHandle<JS##type*> MutableHandle##type; \ Same remarks apply here as above.
Attachment #691529 - Flags: review?(wmccloskey) → review+
(In reply to Bill McCloskey (:billm) from comment #32) > @@ +508,5 @@ > > + namespace JS { \ > > + typedef Handle<JS##type*> Handle##type; \ > > + typedef MutableHandle<JS##type*> MutableHandle##type; \ > > + } \ > > + class JS##type > > Why is this here? Is it to handle the extra semicolon in the macro? If so, I > don't think it's needed. Semicolon is a valid declaration in C++ as far as I > know. gcc gives: warning: extra ‘;’ [-pedantic] It's specifically because of a semicolon after the closing } of a namespace. Just about anywhere else is fine. Perhaps there's a specific warning we could disable instead.
> gcc gives: warning: extra ‘;’ [-pedantic] > > It's specifically because of a semicolon after the closing } of a namespace. > Just about anywhere else is fine. Perhaps there's a specific warning we > could disable instead. No, I'm wrong. The warning is for extra semicolons at the outer scope (outside of any functions). static int foo;; would give the warning too.
Looks like the bug 823080 fixed the warnings.
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → DUPLICATE
I think "Implement Raw<T> and make Return<T> stronger" is not a dupe.
Resolution: DUPLICATE → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: