Closed Bug 614138 Opened 14 years ago Closed 14 years ago

Remove the temp value rooting traceable quickstubs do

Categories

(Core :: XPConnect, defect, P1)

x86
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla2.0b8

People

(Reporter: bzbarsky, Assigned: bzbarsky)

References

(Blocks 1 open bug)

Details

(Keywords: perf, Whiteboard: fixed-in-tracemonkey)

Attachments

(3 files, 5 obsolete files)

I looked through the uses of xpc_qsArgValArray; the following places write to it: A) Conversion of the return value. This is converting into a JSObject on the stack that we then return, so we can rely on stack scanning here. B) |this| unwrapping (FromCcx or not). This calls getWrapper on the given JSObject (which hands back an obj2 which is on the parent or proto chain of the given object or something that object wraps) and then calls castNative, which just calls getNative, which just hands out the object it was given. So the value we write into the rooted array is something that seems to be reachable from the original object we had, and hence is being kept alive by it, unless I'm missing something. C) Argument unboxing for traceable natives. This works like so: 1) Take incoming |v| and converts to |JSObject *src| (no object crations here) 2) getWrapper gets an obj2 which is on the parent or proto chain of |src| or something |src| wraps. 3a) Create a wrapped JS (this always sets the argNref.ptr too, so that strong ref will keep things alive). or 3b) Call castNative on obj2. This calls one of getNativeFromWrapper or getNative. getNativeFromWrapper just calls getNative. getNative just hands out the obj that was passed in as *vp. So again, I think not storing to the array for this stuff is safe. I'd really appreciate a sanity check on the above! In fact, except for the return value none of these really care about the resulting jsval at all... The non-traceable versions might, though (they stick it into the actual vp, etc). But if not, can we just simplify some of these function signatures? In any case, patch to nix the memset+rooting coming up, since it seems to be unnecessary.
Blocks: 614145
Whiteboard: [need review]
Comment on attachment 492557 [details] [diff] [review] We don't need to temp-root our various JSObjects, since they're either reachable from our arguments, kept alive by strong refs to an nsXPCWrappedJS, or reachable via conservative stack scanning. Please review my assumptions here carefully!
Attachment #492557 - Flags: review?(peterv)
Attachment #492557 - Flags: review?(jorendorff)
Comment on attachment 492557 [details] [diff] [review] We don't need to temp-root our various JSObjects, since they're either reachable from our arguments, kept alive by strong refs to an nsXPCWrappedJS, or reachable via conservative stack scanning. >diff --git a/content/canvas/src/CustomQS_WebGL.h b/content/canvas/src/CustomQS_WebGL.h In content/canvas/src/CustomQS_WebGL.h: > static inline void FASTCALL > helper_nsIDOMWebGLRenderingContext_Uniform_x_iv_tn(JSContext *cx, JSObject *obj, JSObject *locationobj, > JSObject *arg, int nElements) > { > XPC_QS_ASSERT_CONTEXT_OK(cx); > > nsIDOMWebGLRenderingContext *self; > xpc_qsSelfRef selfref; >- xpc_qsArgValArray<3> vp(cx); >- if (!xpc_qsUnwrapThis(cx, obj, nsnull, &self, &selfref.ptr, &vp.array[0], nsnull)) { >- js_SetTraceableNativeFailed(cx); >- return; >+ { /* scope for |dummy| */ >+ jsval dummy; >+ if (!xpc_qsUnwrapThis(cx, obj, nsnull, &self, &selfref.ptr, &dummy, >+ nsnull)) { >+ js_SetTraceableNativeFailed(cx); >+ return; >+ } > } Hmm. Well, only obj is necessarily rooted, and obj might not be the XPCWrappedNative that keeps the COM object, self, alive. The XPCWrappedNative may be on obj's prototype chain. In the words of your comment 0, item B, this is "reachable from the original object we had"; but it won't necessarily stay that way the whole time we need self to stay alive. If we call back into JS code, obj.__proto__ could be assigned. The XPCWN could become garbage. GC could destroy self. I don't know if that could actually happen or not in this code. If self is an XPCWrappedJS, it probably gets AddRef'd and stored in selfref, right? So the only remaining question is whether any of the Uniform*iv_array methods, or any of the other junk we're calling (js_CreateTypedArrayWithArray and so on) could call user code. A quick look at traceableArgumentConversionTemplate in qsgen.py suggests that we can't reenter, and [traceable] is opt-in, so ... perhaps none of the methods that are [traceable] have any implementations that can possibly re-enter, even in unusual cases? I dunno. That's a pretty big assumption for me to swallow. How badly do we need to get rid of this rooting code? This seems precarious. I'd really rather not, unless I'm missing something that makes it safer, or measurement indicates this is much faster. But if we ever manage to ban assigning to __proto__, this would be OK. r- for now, and clearing r?peterv.
Attachment #492557 - Flags: review?(peterv)
Attachment #492557 - Flags: review?(jorendorff)
Attachment #492557 - Flags: review-
Whiteboard: [need review]
If you guarantee that a stack variable points to the object thats sufficient for rooting and cheaper.
> If self is an XPCWrappedJS, it probably gets AddRef'd and stored in selfref, > right? Yes. > If we call back into JS code I don't think that's a problem for the GL stuff, but for general quickstubs it could be, in theory. Think dispatchEvent, where the actual C++ method to be called can most definitely enter JS code. And for general argument conversions, of course, we can call various toString methods and whatnot on later arguments after we've unwrapped the earlier arguments, right? Though maybe we don't do that in the traceable natives. > How badly do we need to get rid of this rooting code? For some microbenchmarks (I was looking at dromaeo) it's 20% or more of total runtime. At least once I get dromaeo back on trace. I guess plan B might be to more carefully manage our rooting array so things are only rooted if really needed; for getElementById we create a rooted slot for the string, then don't use it, say. That'll give a smaller speedup, but still noticeable. > If you guarantee that a stack variable points to the object We can't guarantee that, can we? The whole point is that we have these JSObject* that aren't used directly (we're working with their private ptrs only), so the compiler could optimize them away...
The comments should say it all. This would need an MSVC port.
Attachment #494876 - Flags: review?(jorendorff)
There may need to be a "memory" operand to the asm there, too. I'm not sure we've really prevented all the reorderings we need to. G++ could still move (what it believes to be) irrelevant code past the asm.
(In reply to comment #5) > And for general argument conversions, of course, we can call various toString > methods and whatnot on later arguments after we've unwrapped the earlier > arguments, right? Though maybe we don't do that in the traceable natives. We don't currently. But it's hard to be sure, and hard to vouch for future changes to qsgen.py. jimb's approach is better, though the lack of certainty is mildly scary.
Comment on attachment 494876 [details] [diff] [review] Add an API for holding GC objects while we use values they own. Yes, this sounds good. feedback+. Who can provide the MSVC code? Surely other C++ programs that use conservative GC have precisely this problem. Whose code can we look at? FWIW, does the test fail without the Anchor? Did you look at the effect this has on generated code? Ultimately, we need to use Anchors in a bunch of places, I think -- or rather, Anchor-based RAII classes: StringChars, FunctionScript, that kind of thing.
Attachment #494876 - Flags: review?(jorendorff) → feedback+
(In reply to comment #9) > Yes, this sounds good. feedback+. Who can provide the MSVC code? Surely other > C++ programs that use conservative GC have precisely this problem. Whose code > can we look at? I'll check Guile, although I don't know that it's been ported to MSVC. > FWIW, does the test fail without the Anchor? Indeed it does. > Did you look at the effect this has on generated code? No, I have not. > Ultimately, we need to use Anchors in a bunch of places, I think -- or rather, > Anchor-based RAII classes: StringChars, FunctionScript, that kind of thing. Should there be 'typedef Anchor<JSString *> StringAnchor;' and things like that? I was thinking there should be some dummy 'traits' template to ensure it's only used at the types it makes sense for.
(In reply to comment #9) > Yes, this sounds good. feedback+. Who can provide the MSVC code? Surely other > C++ programs that use conservative GC have precisely this problem. Whose code > can we look at? Guile simply passes the value to a no-op function on non-GCC platforms. We need to do better than that.
Here's the effect on the machine code. Looking through it now...
So, it affected register allocation (as it should), which affected some prologue and epilogue code (more registers to save/restore). This is all expected. It also rearranged some of the cold branches (calls to JS_Assert). I didn't see any changes in their actual behavior. I didn't see any other effects on the code in this particular case. It certainly didn't deoptimize the world.
This version has the memory-clobbering notation, and it also prevents instantiating the template at types that the conservative GC wouldn't actually notice.
Attachment #494876 - Attachment is obsolete: true
Well, then. Very much let's get some MSVC code for this and land it.
Who can we shanghai into doing the MSVC port? Let's name names.
Would it make sense to be able to ask for the &hold of the Anchor so it can be passed to a function? Otherwise I end up needing to use this pattern: jsval temp; GetFoo(&temp); js::Anchor temp_anchor(temp);
Comment on attachment 495090 [details] [diff] [review] Add an API for holding GC objects while we use values they own. >Bug 614138: Add an API for holding GC objects while we use values they own. > >+ * >+ * The "memory" clobber operand ensures that G++ will not move prior memory >+ * accesses after the asm --- it's a barrier. Unfortunately, it also means that >+ * G++ will assume that all memory has changed after the asm, as it would for a >+ * call to an unknown function. I don't know of a way to >+ */ >+ asm volatile("":: "g" (hold) : "memory"); This comment trails off. Also, I'm naively wondering why the "hold" field can't be marked volatile with plain old C++? Does that not actually work?
Perhaps a dumb question, but why isn't it enough to use an RAII class that uses the value at destruction time? class Anchor<T> { ~Anchor() { if (!hold) *(int*)hold = 17; // Or ++static_nonsense_variable, or whatever. } } Is that too easy to reorder? Or am I missing something else? If reordering is a problem, maybe additionally making 'hold' volatile would help? (Probably not.) Or perhaps static void* global_hold; class Anchor<T> { T* hold; Anchor(T* v) : hold(v) { global_hold = (void*) v; } ~Anchor() { if (!hold) global_hold = NULL; } } though that obviously adds a small amount of useless work. But nothing reads global_hold, so the write will never stall the pipeline. gcc will think that the GC-triggering thing in the body might read global_hold, so it can't move the read of 'hold' up, and moving the store to 'hold' at the top wouldn't matter. Or I'm just confusing myself. I'll be quiet now.
(In reply to comment #18) > Also, I'm naively wondering why the "hold" field can't > be marked volatile with plain old C++? Does that not actually work? AIUI, volatile just means the value in memory can change without you writing to it, so you'd better re-fetch it every time you need it. But if you never use it, then that's irrelevant; you can still optimize it away entirely, because all zero uses are properly re-fetching it from memory.
> *(int*)hold = 17; // Or ++static_nonsense_variable, or whatever. |hold| is not necessarily a pointer type. In fact, in the case of jsval it most definitely is not.
Revised; comment fixes, permit const variants.
Attachment #495090 - Attachment is obsolete: true
(In reply to comment #21) > > *(int*)hold = 17; // Or ++static_nonsense_variable, or whatever. > > |hold| is not necessarily a pointer type. In fact, in the case of jsval it > most definitely is not. Ok. But that still seems like a relatively minor detail, if you can eliminate the compiler-specific construct. It's already enumerating the exact set of types that it supports, so it can just specialize the non-pointer type to do something else with a jsval (the only non-pointer it accepts.) Is it good enough to have the address of a jsval on the stack? (If so, you don't even need to specialize for jsval. You can make the general case take the address of the anchored value, and specialize for pointers by just grabbing the pointer.) FWIW, I modified the patch to just do ~Anchor() { if (!reinterpret_cast<int*>(hold)) { *reinterpret_cast<int*>(hold) = 17; } } and it passed the test (and failed if I commented out the destructor body.) I also made a version with a specialization for jsval. The test doesn't use an Anchor<jsval>, though, so I don't know if it really works. (I added a dummy declaration just to be sure it compiles, but I don't have a test for it.)
> Is it good enough to have the address of a jsval on the stack? No. Note that the idea here is to have zero impact, btw. Doing stuff in the dtor sorta fails that test. For my use case (the one this bug is about), the cost of the zero writes to the stack in the current setup is sufficient to cause a slowdown. Note that I'm trying to shave off processor cycles (as in, probably less than 10) per DOM call here; the relevant stupid benchmarks are measuring operations that are taking order of 100 processor cycles on my hardware.
Adds a portable alternative to the __GNUC__-only implementation. The tests can detect whether the 'volatile' qualifier is present in the destructor, so I think we'll at least have some warning if this fails.
Attachment #496203 - Attachment is obsolete: true
(In reply to comment #19) > Perhaps a dumb question, but why isn't it enough to use an RAII class that uses > the value at destruction time? > > class Anchor<T> { > ~Anchor() { > if (!hold) > *(int*)hold = 17; // Or ++static_nonsense_variable, or whatever. > } > } > > Is that too easy to reorder? Or am I missing something else? ... the code you wrote above writes a 17 into the JSObject, doesn't it? But in general, it's too easy to reorder, or to throw away entirely. Imagine the compiler simply substituting the destructor's body into the function at the appropriate point(s), and see if you can throw it away.
Add a T &get() member and a no-argument constructor, as requested in comment #17.
Attachment #496235 - Attachment is obsolete: true
Attached patch Merged on top of jimb's patch (deleted) — Splinter Review
Attachment #496267 - Flags: review?(jorendorff)
Attachment #492557 - Attachment is obsolete: true
Whiteboard: [need review]
Attachment #496250 - Flags: review?(jorendorff)
Comment on attachment 496250 [details] [diff] [review] Add an API for holding GC objects while we use values they own. I still say all this conservative GC stuff is a big mistake. Soon someone will want Anchor<js::Value>, but we can burn that bridge when we come to it.
Attachment #496250 - Flags: review?(jorendorff) → review+
Comment on attachment 496267 [details] [diff] [review] Merged on top of jimb's patch Phew. It's such a relief not to have to care about the kind of details I was worrying about in comment 3. >@@ -1366,19 +1369,21 @@ def writeTraceableQuickStub(f, customMet > # Call the method. > comName = header.methodNativeName(member) >- if getBuiltinOrNativeTypeName(member.realtype) == '[jsval]': >- argNames.append("&vp.array[0]") >- elif not isVoidType(member.realtype): >+ # If we start allowing [jsval] return types here, we need to >+ # tack the return value onto the arguments list... and handle >+ # properly returning it too. See bug 604198. >+ assert getBuiltinOrNativeTypeName(member.realtype) is not '[jsval]' >+ if not isVoidType(member.realtype): > argNames.append(outParamForm(resultname, member.realtype)) To me, it seems like that comment and assertion belong in outParamForm, not here. r=me either way.
Attachment #496267 - Flags: review?(jorendorff) → review+
Attachment #496250 - Flags: approval2.0+
Attachment #496267 - Flags: approval2.0+
> To me, it seems like that comment and assertion belong in outParamForm, Bah! There's jsval code there too! OK, I'll move the comment + assert there.
OK, pushed: http://hg.mozilla.org/tracemonkey/rev/929ed9d5d81b http://hg.mozilla.org/tracemonkey/rev/00524af0568d and then several attempts to make the anchor stuff compile on Windows and on both debug and opt and for all types: http://hg.mozilla.org/tracemonkey/rev/bac4a8d70492 http://hg.mozilla.org/tracemonkey/rev/ae03fdb249ad http://hg.mozilla.org/tracemonkey/rev/4d97e9955bfb jimb, I'd really appreciate you looking over |hg diff -r 00524af0568d -r 4d97e9955bfb| to make sure you're ok with those changes.
Whiteboard: [need review] → fixed-in-tracemonkey
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
(In reply to comment #32) > jimb, I'd really appreciate you looking over |hg diff -r 00524af0568d -r > 4d97e9955bfb| to make sure you're ok with those changes. That looks correct to me. I wonder if there isn't a simpler way to do it, but that could be a separate patch.
Status: RESOLVED → UNCONFIRMED
Ever confirmed: false
Resolution: FIXED → ---
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
Blocks: 618217
Filed neaten-up bug 618217.
Target Milestone: --- → mozilla2.0b8
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: