Closed Bug 561359 Opened 15 years ago Closed 13 years ago

"Assertion failure: &shape.methodObject() == &prev.toObject(),"

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla9
Tracking Status
firefox6 - ---
blocking2.0 --- .x+

People

(Reporter: gkw, Assigned: jorendorff)

References

Details

(Keywords: assertion, regression, testcase, Whiteboard: [softblocker][js-triage-done][inbound])

Attachments

(4 files, 8 obsolete files)

for (let z = 0; z < 2; z++) { with({ x: function() {} }) { for (l in [x]) {} } } asserts js debug shell on TM tip without -j at Assertion failure: sprop->methodValue() == prev, at ../jsscope.cpp:1167
autoBisect shows this is probably related to bug 558058: The first bad revision is: changeset: 40454:61de331861af user: Andreas Gal date: Thu Apr 08 07:53:09 2010 -0700 summary: No need to lookup parent/proto for iterator objects, and cache the last free one (bug 558058, r=brendan).
Blocks: 558058
blocking2.0: --- → ?
blocking2.0: ? → final+
Comment #0 now asserts in Assertion failure: &sprop->methodObject() == &prev.toObject(), at ../jsscope.cpp:1215
OS: Mac OS X → All
Hardware: x86 → All
Summary: "Assertion failure: sprop->methodValue() == prev, at ../jsscope.cpp" → "Assertion failure: &sprop->methodObject() == &prev.toObject(), at ../jsscope.cpp:1215"
Comment #0 has further morphed to Assertion failure: &shape.methodObject() == &prev.toObject(),
Summary: "Assertion failure: &sprop->methodObject() == &prev.toObject(), at ../jsscope.cpp:1215" → "Assertion failure: &shape.methodObject() == &prev.toObject(),"
Assignee: general → lw
If I put the assert JS_ASSERT(&methodObject() == &pobj->nativeGetSlot(slot).toObject()); in Shape::get(), in the isMethod() branch, I get an assert with just: for (let z = 0; z < 2; z++) with({ x: function() {} }) { x; } I can trace this assert back to JSOP_INITMETHOD on the second iteration where testForInit hits on the offending shape (whose methodObject is different than the value assigned to the obj slot).
Attached patch hack (deleted) — Splinter Review
This patch fixes comment 0 and comment 4 (with the tighter assert), but only for the interpreter and I'm not sure if this is even the right fix since I'm not very familiar with propcache invariants. The bug is this: 1. the first time around the loop, JSOP_INITMETHOD caches the transition from an empty object to an object with a single method property (function() {}). 2. the with creates a call object 3. the second time around the loop, JSOP_LAMBDA clones the compiled function object (b/c of the call object) 4. the propcache hits and we transition from an empty object to the same shape as (1), whose methodObject is the compiled object and whose slot value is the clone, hence the assert on read.
Attachment #491583 - Flags: feedback?(jorendorff)
Jason said there are some big changes to INITMETHOD/SETMETHOD coming in so I think this should wait and reevaluate after those land.
Attachment #491583 - Flags: feedback?(jorendorff)
Comment on attachment 491583 [details] [diff] [review] hack I was wrong; the main changes landed. There's still a bit left in bug 614051, but it doesn't affect this. If possible, filling more carefully would be better than this hack. But that depends on knowing ahead of time if the INITMETHOD optimization will fly or not. The distance in time between LAMBDA and SETMETHOD/INITMETHOD is an issue. I need to look closer at what JSOP_LAMBDA is doing, and understand the root cause of this bug better. Setting f?me.
Attachment #491583 - Flags: feedback?(jorendorff)
Comment on attachment 491583 [details] [diff] [review] hack This patch should be a last resort. More explanation coming in a moment.
Attachment #491583 - Flags: feedback?(jorendorff) → feedback-
Brendan, please take a look. To paraphrase Luke's comment 5: - First time through JSOP_LAMBDA, fp->scopeChain() is the global, so the method optimization is enabled. - JSOP_INITMETHOD fills the property cache with the method shape. - The with-statement forces the enclosing Block to be reified. - Second time through, fp->scopeChain() is the Block object, so the method optimization is disabled. The bug is that the method optimization depends on whether or not an enclosing Block happens to have been reified. That's not deterministic enough: for cache correctness, JSOP_LAMBDA to apply the method optimization based solely on criteria known at jsemit time and/or covered by the recipient object's shape. (JSStackFrame::scopeChain doesn't actually get the scope chain. It exposes the optimization that we reify scope objects lazily. JSStackFrame::notTheScopeChain would be a better name for it.) That we setMethodAtom at run time in JSOP_LAMBDA, instead of doing it statically in the emitter, also seems weird. One approach is to get rid of that `if (obj->parent == parent)` test and use `script->compileAndGo` instead. But it's not quite that simple: 1. There's a case where a JSFunction can be a null closure, or claim to be one, but its behavior actually depends on Call objects and such: function f(s) { var obj = {m: function () { return a; }}; eval(s); return obj; } var obj = f("var a = 'right';"); var a = 'wrong'; assertEq(obj.m(), 'right'); Here the function obj.m is a null closure, and we emit JSOP_INITMETHOD; but JSOP_LAMBDA sees a Call object and so the method optimization is inhibited: we pass the test. The change I propose would break this. But it seems to me obj.m shouldn't be a null closure. 2. Brendan mentioned on IRC that the `if (obj->parent == parent)` test in JSOP_LAMBDA protects against API abuse, where a compile-n-go script is incorrectly executed in a different scope from the one where it was compiled. But such abuse is an expressway to crashville. We should assert against it in JS_Execute*. We currently don't, because the scope object passed to JS_Compile* is discarded after compilation. In DEBUG builds, we should hang on to it. Luke, do you still want this one? I can take.
You got it partner; I was just blocker hunting.
Assignee: lw → jorendorff
Whiteboard: softblocker
Bug 494637 seems relevant -- if we fixed it, would we have anything to do here? /be
Yes, I think we'd still have work here. (Though, bug 494637 is not a good thing to have lying around either.) Fixing now.
This depends on the bug where some functions are marked as null closures when they shouldn't be (because they are inside a with-statement or a function that calls eval directly). Does anyone know that bug#? I can't find it. I think it'll be pretty easy to fix that bug as far as direct eval is concerned. To detect enclosing with-statements, I think I need a new tcflag or something, to be set in Parser::functionDef. Also fairly straightforward. Brendan, you are the likely reviewer here. Let me know if I'm off target!
methodjit's treatment of JSOP_LAMBDA could be better, I think -- more of the branches can be decided statically, leaving less to be done at run time -- but we can tidy that up in a follow-up bug.
(In reply to comment #14) > This depends on the bug where some functions are marked as null closures when > they shouldn't be (because they are inside a with-statement or a function that > calls eval directly). Does anyone know that bug#? I can't find it. bug 617609 maybe?
That's the one. Thanks.
(In reply to comment #17) > That's the one. Thanks. bug 617609 is now fixed-on-TM. :)
In the wake of all softblocker blocking2.0-final+ bugs soon-to-be switched to blocking2.0-, renominating this one in the hope of getting a ".x" flag. The seemingly endless switching of the assert message in this bug every now and then is annoying but nothing too severe for the fuzzers to handle.
blocking2.0: final+ → ?
blocking2.0: ? → .x
I've been stalling review to work on blockers. Let me know if that's wrong. /be
Attached patch v1 (obsolete) (deleted) — Splinter Review
There are basically three things going on here. 1. fun->joinable() should be false if we were not compiled with the COMPILE_N_GO flag. So fix that. 2. Don't emit JSOP_{INIT,SET}METHOD after a JSOP_LAMBDA for a function that isn't joinable. Duh. 3. Tidy up afterwards. Use fun->joinable() instead of wordier ways of saying the same thing. Eliminate a few branches. This part changed a lot of indentation, so I'll post a diff -b version in a sec.
Attachment #514963 - Flags: review?(brendan)
Attached patch v1, diff -b (obsolete) (deleted) — Splinter Review
Why the parent to scope chain (unreified) removal in the JSOP_LAMBDA-interp case? That seems desirable to detect a joined function object and avoid the further overhead, but if it is a bogus test (it may be already due to scope chain variability for a given op in a loop) a better test still seems worth doing. /be
It is definitely a bogus test. That's the line of code that causes the bug. (comment 9) > The bug is that the method optimization depends on whether or not an enclosing > Block happens to have been reified. That's not deterministic enough[...] But yes, a better test can be done. I can test fun->joinable() instead. How about that?
Seems like exactly the right test! Sorry, slow here due to cold. /be
Mentioned on IRC: compile-time "joinable" makes sense but once we've decided, "joined" for function objects is better. /be
Attachment #514963 - Attachment is obsolete: true
Attachment #514964 - Attachment is obsolete: true
Attachment #514963 - Flags: review?(brendan)
Attached patch v2, diff -b (obsolete) (deleted) — Splinter Review
Attachment #515698 - Flags: review?(brendan)
Comment on attachment 515698 [details] [diff] [review] v2 - Only the one change mentioned in comments 25-26. >diff --git a/js/src/jscompartment.cpp b/js/src/jscompartment.cpp >--- a/js/src/jscompartment.cpp >+++ b/js/src/jscompartment.cpp >@@ -213,18 +213,17 @@ JSCompartment::wrap(JSContext *cx, Value > JS_ASSERT(str->asCell()->compartment() == cx->runtime->atomsCompartment); > return true; > } > } > > /* > * Wrappers should really be parented to the wrapped parent of the wrapped > * object, but in that case a wrapped global object would have a NULL >- * parent without being a proper global object (JSCLASS_IS_GLOBAL). Instead >-, >+ * parent without being a proper global object (JSCLASS_IS_GLOBAL). Instead, > * we parent all wrappers to the global object in their home compartment. > * This loses us some transparency, and is generally very cheesy. > */ > JSObject *global; > if (cx->hasfp()) { > global = cx->fp()->scopeChain().getGlobal(); > } else { > global = cx->globalObject; >diff --git a/js/src/jsinterp.cpp b/js/src/jsinterp.cpp >--- a/js/src/jsinterp.cpp >+++ b/js/src/jsinterp.cpp >@@ -5598,17 +5598,17 @@ BEGIN_CASE(JSOP_LAMBDA) > JSObject *obj = FUN_OBJECT(fun); > > /* do-while(0) so we can break instead of using a goto. */ > do { > JSObject *parent; > if (FUN_NULL_CLOSURE(fun)) { > parent = &regs.fp->scopeChain(); > >- if (obj->getParent() == parent) { >+ if (fun->joinable()) { > jsbytecode *pc2 = AdvanceOverBlockchainOp(regs.pc + JSOP_LAMBDA_LENGTH); > JSOp op2 = JSOp(*pc2); > > /* > * Optimize var obj = {method: function () { ... }, ...}, > * this.method = function () { ... }; and other significant > * single-use-of-null-closure bytecode sequences. > * >diff --git a/js/src/jsparse.cpp b/js/src/jsparse.cpp >--- a/js/src/jsparse.cpp >+++ b/js/src/jsparse.cpp >@@ -304,17 +304,19 @@ Parser::newFunctionBox(JSObject *obj, JS > funbox->tcflags |= TCF_IN_WITH; > return funbox; > } > > bool > JSFunctionBox::joinable() const > { > return FUN_NULL_CLOSURE((JSFunction *) object) && >- !(tcflags & (TCF_FUN_USES_ARGUMENTS | TCF_FUN_USES_OWN_NAME)); >+ (tcflags & (TCF_FUN_USES_ARGUMENTS | >+ TCF_FUN_USES_OWN_NAME | >+ TCF_COMPILE_N_GO)) == TCF_COMPILE_N_GO; > } > > bool > JSFunctionBox::inAnyDynamicScope() const > { > for (const JSFunctionBox *funbox = this; funbox; funbox = funbox->parent) { > if (funbox->tcflags & (TCF_IN_WITH | TCF_FUN_CALLS_EVAL)) > return true; >@@ -4713,18 +4715,16 @@ CloneParseTree(JSParseNode *opn, JSTreeC > > #undef NULLCHECK > } > return pn; > } > > #endif /* JS_HAS_DESTRUCTURING */ > >-extern const char js_with_statement_str[]; >- > static JSParseNode * > ContainsStmt(JSParseNode *pn, TokenKind tt) > { > JSParseNode *pn2, *pnt; > > if (!pn) > return NULL; > if (PN_TYPE(pn) == tt) >diff --git a/js/src/jstracer.cpp b/js/src/jstracer.cpp >--- a/js/src/jstracer.cpp >+++ b/js/src/jstracer.cpp >@@ -15580,17 +15580,17 @@ TraceRecorder::record_JSOP_LAMBDA() > * via identity and mutation. But don't clone if our result is consumed by > * JSOP_SETMETHOD or JSOP_INITMETHOD, since we optimize away the clone for > * these combinations and clone only if the "method value" escapes. > * > * See jsinterp.cpp, the JSOP_LAMBDA null closure case. The JSOP_SETMETHOD and > * JSOP_INITMETHOD logic governing the early ARECORD_CONTINUE returns below > * must agree with the corresponding break-from-do-while(0) logic there. > */ >- if (FUN_NULL_CLOSURE(fun) && FUN_OBJECT(fun)->getParent() == &cx->fp()->scopeChain()) { >+ if (fun->joinable()) { > jsbytecode *pc2 = AdvanceOverBlockchainOp(cx->regs->pc + JSOP_LAMBDA_LENGTH); > JSOp op2 = JSOp(*pc2); > > if (op2 == JSOP_INITMETHOD) { > stack(0, w.immpObjGC(FUN_OBJECT(fun))); > return ARECORD_CONTINUE; > } > >diff --git a/js/src/methodjit/StubCalls.cpp b/js/src/methodjit/StubCalls.cpp >--- a/js/src/methodjit/StubCalls.cpp >+++ b/js/src/methodjit/StubCalls.cpp >@@ -1415,70 +1415,78 @@ stubs::RegExp(VMFrame &f, JSObject *rege > if (!obj) > THROWV(NULL); > return obj; > } > > JSObject * JS_FASTCALL > stubs::LambdaForInit(VMFrame &f, JSFunction *fun) > { >+ JS_ASSERT(fun->joinable()); >+ JS_ASSERT(FUN_NULL_CLOSURE(fun)); >+ JS_ASSERT(f.fp()->script()->compileAndGo); > JSObject *obj = FUN_OBJECT(fun); >- if (FUN_NULL_CLOSURE(fun) && obj->getParent() == &f.fp()->scopeChain()) { >+ JS_ASSERT(obj->getParent() != NULL); >+ >+ fun->setMethodAtom(f.fp()->script()->getAtom(GET_SLOTNO(f.regs.pc))); >+ return obj; >+} >+ >+JSObject * JS_FASTCALL >+stubs::LambdaForSet(VMFrame &f, JSFunction *fun) >+{ >+ JS_ASSERT(fun->joinable()); >+ JS_ASSERT(FUN_NULL_CLOSURE(fun)); >+ JS_ASSERT(f.fp()->script()->compileAndGo); >+ JSObject *obj = FUN_OBJECT(fun); >+ JS_ASSERT(obj->getParent() != NULL); >+ >+ const Value &lref = f.regs.sp[-1]; >+ if (lref.isObject() && lref.toObject().canHaveMethodBarrier()) { > fun->setMethodAtom(f.fp()->script()->getAtom(GET_SLOTNO(f.regs.pc))); > return obj; > } > return Lambda(f, fun); > } > > JSObject * JS_FASTCALL >-stubs::LambdaForSet(VMFrame &f, JSFunction *fun) >-{ >- JSObject *obj = FUN_OBJECT(fun); >- if (FUN_NULL_CLOSURE(fun) && obj->getParent() == &f.fp()->scopeChain()) { >- const Value &lref = f.regs.sp[-1]; >- if (lref.isObject() && lref.toObject().canHaveMethodBarrier()) { >- fun->setMethodAtom(f.fp()->script()->getAtom(GET_SLOTNO(f.regs.pc))); >- return obj; >- } >- } >- return Lambda(f, fun); >-} >- >-JSObject * JS_FASTCALL > stubs::LambdaJoinableForCall(VMFrame &f, JSFunction *fun) > { >+ JS_ASSERT(fun->joinable()); >+ JS_ASSERT(FUN_NULL_CLOSURE(fun)); >+ JS_ASSERT(f.fp()->script()->compileAndGo); > JSObject *obj = FUN_OBJECT(fun); >- if (FUN_NULL_CLOSURE(fun) && obj->getParent() == &f.fp()->scopeChain()) { >- /* >- * Array.prototype.sort and String.prototype.replace are >- * optimized as if they are special form. We know that they >- * won't leak the joined function object in obj, therefore >- * we don't need to clone that compiler- created function >- * object for identity/mutation reasons. >- */ >- int iargc = GET_ARGC(f.regs.pc); >+ JS_ASSERT(obj->getParent() != NULL); > >- /* >- * Note that we have not yet pushed obj as the final argument, >- * so regs.sp[1 - (iargc + 2)], and not regs.sp[-(iargc + 2)], >- * is the callee for this JSOP_CALL. >- */ >- const Value &cref = f.regs.sp[1 - (iargc + 2)]; >- JSObject *callee; >+ /* >+ * Array.prototype.sort and String.prototype.replace are >+ * optimized as if they are special form. We know that they >+ * won't leak the joined function object in obj, therefore >+ * we don't need to clone that compiler- created function >+ * object for identity/mutation reasons. >+ */ >+ int iargc = GET_ARGC(f.regs.pc); > >- if (IsFunctionObject(cref, &callee)) { >- JSFunction *calleeFun = callee->getFunctionPrivate(); >- Native native = calleeFun->maybeNative(); >+ /* >+ * Note that we have not yet pushed obj as the final argument, >+ * so regs.sp[1 - (iargc + 2)], and not regs.sp[-(iargc + 2)], >+ * is the callee for this JSOP_CALL. >+ */ >+ const Value &cref = f.regs.sp[1 - (iargc + 2)]; >+ JSObject *callee; > >- if (native) { >- if (iargc == 1 && native == array_sort) >- return obj; >- if (iargc == 2 && native == str_replace) >- return obj; >- } >+ if (IsFunctionObject(cref, &callee)) { >+ JSFunction *calleeFun = callee->getFunctionPrivate(); >+ Native native = calleeFun->maybeNative(); >+ >+ if (native) { >+ if (iargc == 1 && native == array_sort) >+ return obj; >+ if (iargc == 2 && native == str_replace) >+ return obj; > } > } > return Lambda(f, fun); > } > > JSObject * JS_FASTCALL > stubs::LambdaJoinableForNull(VMFrame &f, JSFunction *fun) > { >diff --git a/js/src/tests/js1_8_5/regress/jstests.list b/js/src/tests/js1_8_5/regress/jstests.list >--- a/js/src/tests/js1_8_5/regress/jstests.list >+++ b/js/src/tests/js1_8_5/regress/jstests.list >@@ -16,16 +16,20 @@ script regress-553778.js > script regress-555246-0.js > script regress-555246-1.js > script regress-559402-1.js > script regress-559402-2.js > script regress-559438.js > script regress-560101.js > script regress-560998-1.js > script regress-560998-2.js >+script regress-561359-1.js >+script regress-561359-2.js >+script regress-561359-3.js >+script regress-561359-4.js > script regress-563210.js > script regress-563221.js > script regress-566549.js > script regress-566914.js > script regress-567152.js > script regress-569306.js > script regress-569464.js > script regress-571014.js >diff --git a/js/src/tests/js1_8_5/regress/regress-561359-1.js b/js/src/tests/js1_8_5/regress/regress-561359-1.js >new file mode 100644 >--- /dev/null >+++ b/js/src/tests/js1_8_5/regress/regress-561359-1.js >@@ -0,0 +1,13 @@ >+// Any copyright is dedicated to the Public Domain. >+// http://creativecommons.org/licenses/publicdomain/ >+// Contributor: Gary Kwong <gary@rumblingedge.com> >+ >+for (let z = 0; z < 2; z++) { >+ with({ >+ x: function() {} >+ }) { >+ for (l in [x]) {} >+ } >+} >+ >+reportCompare(0, 0, 'ok'); >diff --git a/js/src/tests/js1_8_5/regress/regress-561359-2.js b/js/src/tests/js1_8_5/regress/regress-561359-2.js >new file mode 100644 >--- /dev/null >+++ b/js/src/tests/js1_8_5/regress/regress-561359-2.js >@@ -0,0 +1,14 @@ >+// Any copyright is dedicated to the Public Domain. >+// http://creativecommons.org/licenses/publicdomain/ >+// Contributor: Jason Orendorff <jorendorff@mozilla.com> >+ >+function f(s) { >+ var obj = {m: function () { return a; }}; >+ eval(s); >+ return obj; >+} >+var obj = f("var a = 'right';"); >+var a = 'wrong'; >+assertEq(obj.m(), 'right'); >+ >+reportCompare(0, 0, 'ok'); >diff --git a/js/src/tests/js1_8_5/regress/regress-561359-3.js b/js/src/tests/js1_8_5/regress/regress-561359-3.js >new file mode 100644 >--- /dev/null >+++ b/js/src/tests/js1_8_5/regress/regress-561359-3.js >@@ -0,0 +1,13 @@ >+// Any copyright is dedicated to the Public Domain. >+// http://creativecommons.org/licenses/publicdomain/ >+// Contributor: Jason Orendorff <jorendorff@mozilla.com> >+ >+function f(s) { >+ with (s) >+ return {m: function () { return a; }}; >+} >+var obj = f({a: 'right'}); >+var a = 'wrong'; >+assertEq(obj.m(), 'right'); >+ >+reportCompare(0, 0, 'ok'); >diff --git a/js/src/tests/js1_8_5/regress/regress-561359-4.js b/js/src/tests/js1_8_5/regress/regress-561359-4.js >new file mode 100644 >--- /dev/null >+++ b/js/src/tests/js1_8_5/regress/regress-561359-4.js >@@ -0,0 +1,10 @@ >+// Any copyright is dedicated to the Public Domain. >+// http://creativecommons.org/licenses/publicdomain/ >+// Contributor: Jason Orendorff <jorendorff@mozilla.com> >+ >+var x; >+for (let i = 0; i < 2; i++) >+ x = {m: function () {}, n: function () { i++; }}; >+x.m; >+ >+reportCompare(0, 0, 'ok');
Argh, how did that happen? Chrome bug, I think (my Minefield was rebuilding and I could not get Chrome to switch to edit the patch). Sorry for this mess. I think I wrote something like this in the small textarea, which was trumped by the (invisible) switch to patch-editing mode: >- if (FUN_NULL_CLOSURE(fun) && FUN_OBJECT(fun)->getParent() == &cx->fp()->scopeChain()) { >+ if (fun->joinable()) { Don't we need to keep the FUN_NULL_CLOSURE(fun) test to match the interpreter? /be
No, because only null closures are joinable.
(In reply to comment #32) > No, because only null closures are joinable. Right (I knew that, I swear! ;-)) but this suggests making the interpreter case for JSOP_LAMBDA test only fun->joinable() in the outer if that currently tests FUN_NULL_CLOSURE(fun), and get rid of the inner if. Right? The #ifdef DEBUG code that updates rt->unjoinedFunctionCountMap would kick in only for joined function objects that could not be optimized via the method barrier, but that seems better too. /be
Yes, that totally makes sense.
Attached patch v3 (obsolete) (deleted) — Splinter Review
Attachment #515698 - Attachment is obsolete: true
Attachment #516976 - Flags: review?(brendan)
Attachment #515698 - Flags: review?(brendan)
Attached patch b3, diff -b (obsolete) (deleted) — Splinter Review
Attachment #515700 - Attachment is obsolete: true
Comment on attachment 516976 [details] [diff] [review] v3 Thanks, r=me. /be
Attachment #516976 - Flags: review?(brendan) → review+
(In reply to comment #35) > Created attachment 516976 [details] [diff] [review] > v3 Can this pls be landed soon? (Just hoping to prevent bitrot)
I landed it Monday: http://hg.mozilla.org/tracemonkey/rev/39de74c74b20 but it bounced: http://hg.mozilla.org/tracemonkey/rev/11c970a9eacc It reliably causes windows to leak here when I start the browser. :-|
Is this bug going to get fixed? Can I help? Just LMK, thanks. /be
nominate with a reason for tracking-6
A patch in bug 666713 is hitting this. Updating the patch.
Attached patch v4 - same as v3, rebased to m-c tip (obsolete) (deleted) — Splinter Review
Attachment #516976 - Attachment is obsolete: true
This patch still causes DOMWindows to leak when I open and close the browser. How do you figure out this kind of problem? I can only think of two approaches: 1. Figure out where the printf is that says we leaked windows, and dump the JS heap at that point, then figure out what path through the graph is keeping windows alive and try and figure out what it might have to do with this patch. 2. Reduce the patch and figure out the minimum change that causes the leak. Try to figure out what extra edges that change adds. #2 sounds more appealing. Anyone else have any ideas?
peterv, what other tricks do you have for debugging leaking DOMWindows?
You can dump the cycle collector graph, then use it to see which object is making the CC hold on to the window. That object will be something where the number of known references the CC sees is less than the ref count. Sometimes it will just be the window itself that has an unknown reference, but it can be useful. Once you have that, you can do refcount logging on the object holding on to the window, to see what is holding onto it.
I've been meaning to write this up in detail, as I've had to explain it to a number of people so far, but I haven't gotten around to it yet...
This bug needs an owner. Please steal it. It's important to get bug 666713 landed quickly, and that means either fixing this bug or finding some kind of temporary hack so that it doesn't crash (see bug 666713 comment 16 through 19). Comment 46 and option 2 in comment 44 still seem like the best things to try next.
Assignee: jorendorff → general
Whiteboard: softblocker → [softblocker][js-triage-done]
I'll look into the leak caused by this patch.
There are two nsGlobalWindows that leak with this patch, just from opening and closing the browser. One is just the inner window, so this is really just one that is leaking. Here is what the cycle collector graph says is rooting this object: 0x118de3628 [JS Object (ChromeWindow) (global=118de3628)] --[xpc_GetJSPrivate(obj)]-> 0x118e66a00 [XPCWrappedNative (ChromeWindow)] --[mIdentity]-> 0x105694fe8 [nsGlobalWindow] In other words, there's a JS Object holding on to an XPCWrappedNative, which is holding onto the window. The A --[FOO]-> B notation means that object A holds onto object B via a field or whatever named FOO. The JS Object (ChromeWindow) is a root because the GC marked it, which means it is reachable from a JS root. I'll try a JS GC dump to figure out why the GC is rooting the JS Object (ChromeWindow).
I set up XPConnect to do a JS_DumpHeap every time the cycle collector invokes the GC. The JS Object (ChromeWindow) shows up exactly once on the left side (the address is different because this is a different run, but it is the same object): 0x11a814a60 ChromeWindow 1055b4570 via mScopeObject I assume this means that the ChromeWindow is being stored directly on mScopeObject, and that's why it is kept alive. I don't know what mScopeObject is, so I'm not sure what that means. The string mScopeObject doesn't appear in the patch anywhere. sfink said in IRC: "all I can tell is that it ends up as the aScope for CallEventHandler at http://mxr.mozilla.org/mozilla-central/source/dom/base/nsJSEnvironment.cpp#1827" In summary, the nsGlobalWindow that is leaking is being held alive in the cycle collector by a ChromeWindow. The ChromeWindow is (maybe) being held alive in the garbage collector because it is being held by mScopeObject. Unfortunately, I think this is all I will really be able to do, not knowing anything about the context. If anybody else wants to look into this, I can let you know how to reproduce at least what I have found so far.
This patch will dump a JS GC heap log to a file gc-edges-X.log every time the CC invokes the GC. The CC does this every time at shutdown, which is what we are interested in for shutdown leaks. The easiest way to dump a CC log at every CC is to apply the patch in Bug 679779, then uncomment the line //#define ALWAYS_LOG_CC_GRAPH Here's a rough guide to what I did. Apply these two patches, rebuild Firefox, then start up Firefox, and exit it. This will produce a number of cc-edges-X.log and gc-edges-X.log files. You are interested in the last ones produced, as they are from the final GC and CC before the CC gave up because it stopped finding anything. Look at the last cc-edges file, for the nsGlobalWindow. There will be two. One has no children. This is the inner window, and is not interesting. The other one is the one being held onto. Write down the address of this object. If you trace back a couple of steps, you'll find the Chrome window I describe. Write down the address of that window, then look it up in the GC log file.
jorendorff found what looks like the source of the mScopeObject rooting: http://mxr.mozilla.org/mozilla-central/source/dom/src/events/nsJSEventListener.cpp#118 I'll try to hunt down somebody who knows that code.
The ChromeWindow in question is the mScopeObject for a huge number of nsJSEventListeners. I'm not sure why nsJSEventListeners are acting as a black root (as opposed to a grey one) on the JS side.
I'm pretty sure the nsJSEventListener is just a red herring. I need to get JS_DumpHeap to dump all paths to the ChromeWindow, not just the first one...
Okay, after multiple rounds of hacking on things, I made JS_DumpHeap not display anything reachable only from grey roots, which helped clear out red herrings like nsJSEventListener. The remaining paths (of length <= 10) from black roots to the ChromeWindow look like this (after deleting addresses): ChromeWindow via machine stack(BackstagePass).PlacesUtils(Object)._shutdownFunctions(Array).element[0](Function).parent(Block).parent(Call).XYZ XYZ varies, but starts with either aURI, aCallback, or aScope. Here's an example: aURI(XPCWrappedNative_NoHelper).proto(XPC_WN_NoMods_NoCall_Proto_JSClass).parent So, it looks like a function is being stuck into the shutdown functions array of BackstagePass, and that function seems to be dragging along its entire context, which includes the ChromeWindow, which in turn drags along the nsGlobalWindow. Hopefully that helps somebody who understands the patch.
Andrew, thank you so much. Victory! Can you easily get the fun->script()->filename and fun->script()->lineno of the function or functions at issue here? If so, I could then look at the source and produce a small test case. If not, I'll ponder some more and come back to you if I get stuck. Before I forget: it sounds like your hacks (1) nailed the problem (2) involved a nontrivial amount of work to create. We should check that stuff in (enabled by a parameter to JS_DumpHeap or something, not an #ifdef) ...unless you can think of something even better we could do instead. I'll be happy to file a follow-up bug for that if you like. Taking this bug back. If I recall correctly, the patch was only intended to affect null closures, and it wasn't intended ever to turn a null closure into a "full" closure (parented to the enclosing Call). If so, the leak must stem from a bug in the patch. I will look at it next week.
Assignee: general → jorendorff
(In reply to Jason Orendorff [:jorendorff] from comment #57) > Andrew, thank you so much. Victory! Can you easily get the > fun->script()->filename and fun->script()->lineno of the function or > functions at issue here? If so, I could then look at the source and produce > a small test case. If not, I'll ponder some more and come back to you if I > get stuck. Sure, that sounds doable. Would that be from the "Function" part of the path I showed above? Or something deeper in? > Before I forget: it sounds like your hacks (1) nailed the problem (2) > involved a nontrivial amount of work to create. We should check that stuff > in (enabled by a parameter to JS_DumpHeap or something, not an #ifdef) > ...unless you can think of something even better we could do instead. I'll > be happy to file a follow-up bug for that if you like. It wasn't too bad in the end, it was mostly a matter of figuring out what to do. I filed Bug 680482 about a replacement for JS_DumpHeap, but a few smaller tweaks might be helpful in the mean time.
(In reply to Jason Orendorff [:jorendorff] from comment #57) > Andrew, thank you so much. Victory! Can you easily get the > fun->script()->filename and fun->script()->lineno of the function or > functions at issue here? The filename is resource://gre/modules/PlacesUtils.jsm and lineno is 2088. Which does not look very helpful to me: + "SELECT 1 FROM moz_items_annos a "
With this patch, every time the CC calls the GC, it will dump a heap to a file with a name like "gc-edges-1.log". This heap dump will not visit any grey roots. It also prints out the file and line number of (some) functions. (It also sets the thingToFind to the first ChromeWindow it finds, but I don't think that really matters as it turns out.) To reproduce what I've shown here with both my patch and jorendorff's patches applied: 1. start the browser and close it right away. 2. open the last gc-edges file, which should be gc-edges-3.log 3. search in the file for " ChromeWindow ". All lines with this string are paths from black roots to the leaking ChromeWindow. 4. Within one of those lines, look for the part of the path that looks like ".element[0](0x123066a28 Function).parent". 0x123066a28 (or whatever) is the address of the function. 5. Search for the line containing "0x123066a28 Function ", it will contain the file and line number like so: 0x123066a28 Function 11ae0d180 resource://gre/modules/PlacesUtils.jsm:2088
Attachment #553941 - Attachment is obsolete: true
(In reply to Andrew McCreight [:mccr8] from comment #59) > (In reply to Jason Orendorff [:jorendorff] from comment #57) > > Andrew, thank you so much. Victory! Can you easily get the > > fun->script()->filename and fun->script()->lineno of the function or > > functions at issue here? > > The filename is resource://gre/modules/PlacesUtils.jsm and lineno is 2088. > > Which does not look very helpful to me: > + "SELECT 1 FROM moz_items_annos a " There are some #ifdefs in this source file. In my build directory, the preprocessed .jsm file has this at line 2088: this.registerShutdownFunction(function () { this._asyncGetBookmarksStmt.finalize(); }); And note that there are indeed arguments named aURI, aCallback, and aScope in the enclosing function. This is the one.
Oh, I see. This patch was only supposed to affect the method optimization. The leak happens because it also completely disables the null closure optimization in non-compile-and-go code. Oops. This will be easy to fix, I think.
Attached patch v5 (deleted) — Splinter Review
The -b version will be a lot easier to review, give me a sec...
Attachment #516977 - Attachment is obsolete: true
Attachment #546893 - Attachment is obsolete: true
Attachment #555117 - Flags: review?(dvander)
OK. In the methodjit, this moves a few of the branches in JSOP_LAMBDA from runtime back to compile time. Nothing too tricky. LambdaJoinableForNull becomes a no-op; I don't have enough methodjit-fu to eliminate the stub call in that case. Maybe you can suggest a mini-patch that I can fold into this one.
Comment on attachment 555117 [details] [diff] [review] v5 Sorry for the delay here, and thanks for the whitespace-adusted diff!
Attachment #555117 - Flags: review?(dvander) → review+
Whiteboard: [softblocker][js-triage-done] → [softblocker][js-triage-done][inbound]
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla9
Getting this fixed does help, thanks all!
Depends on: 684178
Depends on: 684489
Depends on: 684779
Depends on: 684972
Depends on: 685864
A testcase for this bug was automatically identified at js/src/jit-test/tests/basic/bug561359-1.js.
Flags: in-testsuite+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: