Closed Bug 493657 Opened 16 years ago Closed 16 years ago

TM: Wrong callee is restored when side-exiting from a trace

Categories

(Core :: JavaScript Engine, defect, P1)

x86
macOS
defect

Tracking

()

VERIFIED FIXED
mozilla1.9.2a1

People

(Reporter: bzbarsky, Assigned: gal)

References

(Blocks 1 open bug)

Details

(Keywords: verified1.9.1, Whiteboard: fixed-in-tracemonkey)

Attachments

(5 files, 9 obsolete files)

This is the testcase from bug 460050 comment 70.  Output, with current tip m-c opt shell:

mozilla% ./js/src/opt-obj/js ~/bar.js
starting init() 
test 2 (findChainTouching...) run time: 186
test 3 (both) run time: 795
mozilla% ./js/src/opt-obj/js -j ~/bar.js
starting init() 
test 2 (findChainTouching...) run time: 87
/Users/bzbarsky/bar.js:140: TypeError: counts[chainId] is undefined

I get similar behavior in a debug shell.
Attached file The testcase (obsolete) (deleted) β€”
Blocks: 460050
Attachment #378183 - Attachment is obsolete: true
Attached file reduced testcase (obsolete) (deleted) β€”
Assignee: general → gal
Attachment #378187 - Attachment is obsolete: true
Attached file further reduced with output (deleted) β€”
Attachment #378191 - Attachment is obsolete: true
Attachment #378205 - Attachment is patch: false
Attached file reduced to 43 lines (obsolete) (deleted) β€”
Attached file 25 lines (obsolete) (deleted) β€”
Attachment #378210 - Attachment is obsolete: true
Attachment #378212 - Attachment mime type: application/octet-stream → text/plain
Attached file 17 lines (obsolete) (deleted) β€”
Attachment #378212 - Attachment is obsolete: true
Attached file 17 lines (deleted) β€”
Attachment #378214 - Attachment is obsolete: true
The callee gets burned into the FrameInfo, so when we restore from within the second closure, we restore the wrong callee.

    fi->callee = JSVAL_TO_OBJECT(fval);
    fi->block = fp->blockChain;
    fi->pc = fp->regs->pc;
Why is callee in FrameInfo at all, when it's on the native stack and tracked by the tracker? If we need to synthesize frames by restoring the correct callee, why can't we get that object ref from the native stack?

/be
Attached file 38 lines based on 43 lines (deleted) β€”
I didn't see the problem using the first 17 line version on my build (changeset:  28465:c5c18dde8bb9 Mac), so I tried re-reducing starting at the 43 line version.

To me, it looks like the counts array that's inside the anonymous function (lines 23-24) isn't the same array as on the outside (lines 26-27).

Another weird thing is that line 8 does something.  If I comment it out using strict mode and JIT, I get: strict warning: reference to undefined property counts[i.chainId].
Here's the output that I get running this without and with JIT:
--start--
Mac-mini:GoBug aaron$ ~/mozilla/js/src/js -s -f 17b_lines_am.js 
in: 1
in: 1
in: 1
out: 1
in: 2
in: 2
in: 2
out: 2
Mac-mini:GoBug aaron$ ~/mozilla/js/src/js -s -j -f 17b_lines_am.js 
in: 1
in: 1
in: 1
out: 1
in: 1
in: 1
in: 1
out: 2
--end--
Attached patch patch (obsolete) (deleted) β€” β€” Splinter Review
Attachment #378251 - Flags: review?(brendan)
Attachment #378251 - Flags: review?(brendan) → review+
Comment on attachment 378251 [details] [diff] [review]
patch

>@@ -3529,17 +3535,17 @@ js_SynthesizeFrame(JSContext* cx, const 
>     /* Claim space for the stack frame and initialize it. */
>     JSInlineFrame* newifp = (JSInlineFrame *) newsp;
>     newsp += nframeslots;
> 
>     newifp->frame.callobj = NULL;
>     newifp->frame.argsobj = NULL;
>     newifp->frame.varobj = NULL;
>     newifp->frame.script = script;
>-    newifp->frame.callee = fi.callee;
>+    newifp->frame.callee = fi.callee; /* Roll with a potential wrong callee for now. */

s/potential/&ly/

Could cite bug 471425 here with a FIXME comment. r=me with that. Worse is better!

/be
Comment on attachment 378251 [details] [diff] [review]
patch

Whoops, just noticed that the synthesized frame's scopeChain was computed from the wrong callee, and that will make for bad security and scope bugs!

/be
Attachment #378251 - Flags: review+ → review-
Attached patch v2 (obsolete) (deleted) β€” β€” Splinter Review
Attachment #378251 - Attachment is obsolete: true
Attached patch v2, I wish I could attach patches right (obsolete) (deleted) β€” β€” Splinter Review
Attachment #378254 - Attachment is obsolete: true
Attachment #378255 - Flags: review?(brendan)
Comment on attachment 378255 [details] [diff] [review]
v2, I wish I could attach patches right

>+    newifp->frame.callee = fi.callee; /* Roll with a potential wrong callee for now. */

Use // style comment here, as you did later?

>-    newifp->frame.scopeChain = OBJ_GET_PARENT(cx, fi.callee);
>+    newifp->frame.scopeChain = NULL; // will be updated in FlushNativeStackFrame

...

>@@ -4587,34 +4596,34 @@ LeaveTree(InterpState& state, VMSideExit
>         JS_ASSERT(ti->nGlobalTypes() > innermost->numGlobalSlots);
>         globalTypeMap = (uint8*)alloca(ngslots * sizeof(uint8));
>         memcpy(globalTypeMap, getGlobalTypeMap(innermost), innermost->numGlobalSlots);
>         memcpy(globalTypeMap + innermost->numGlobalSlots,
>                ti->globalTypeMap() + innermost->numGlobalSlots,
>                ti->nGlobalTypes() - innermost->numGlobalSlots);
>     }
> 
>+    /* write back native stack frame */
>+#ifdef DEBUG
>+    int slots =
>+#endif
>+        FlushNativeStackFrame(cx, innermost->calldepth,
>+                              getStackTypeMap(innermost),
>+                              stack, NULL);
>+    JS_ASSERT(unsigned(slots) == innermost->numStackSlots);
>+
>+    if (innermost->nativeCalleeWord)
>+        SynthesizeSlowNativeFrame(cx, innermost);
>+
>     /* write back interned globals */
>     double* global = (double*)(&state + 1);
>     FlushNativeGlobalFrame(cx, ngslots, gslots, globalTypeMap, global);
>     JS_ASSERT(*(uint64*)&global[STOBJ_NSLOTS(JS_GetGlobalForObject(cx, cx->fp->scopeChain))] ==
>               0xdeadbeefdeadbeefLL);
> 
>-    /* write back native stack frame */
>-#ifdef DEBUG
>-    int slots =
>-#endif
>-        FlushNativeStackFrame(cx, innermost->calldepth,
>-                              getStackTypeMap(innermost),
>-                              stack, NULL);
>-    JS_ASSERT(unsigned(slots) == innermost->numStackSlots);
>-
>-    if (innermost->nativeCalleeWord)
>-        SynthesizeSlowNativeFrame(cx, innermost);
>-

Comment the reason for doing FlushNativeStackFrame before FlushNativeGlobalFrame.

Pre-existing nit: shouldn't the former by plural-Frames?

/be
Attachment #378255 - Flags: review?(brendan) → review+
I applied the patch from comment 13 to changeset: 28527:aac8141341bf.

The first testcaes "The testcase" gives me these results:
Mac-mini:GoBug aaron$ ~/mozilla/js/src/js -f the_testcase.js 
starting init() 
test 2 (findChainTouching...) run time: 263
test 3 (both) run time: 1461
Mac-mini:GoBug aaron$ ~/mozilla/js/src/js -j -f the_testcase.js 
starting init() 
test 2 (findChainTouching...) run time: 120
test 3 (both) run time: 1033

It looks like the original error is gone.

The following test cases now produce the same output with or without JIT.
 reduced testcase
 17 lines   (274 bytes)
 17 lines   (256 bytes)
 38 lines based on 43 lines
 15 lines based on the second 17 lines testcase
Words are not enough to express how much we want to block on this. Bad correctness bug.
Severity: major → critical
Flags: blocking1.9.1?
Priority: -- → P1
Summary: Incorrect execution on attached testcase with jit on → TM: Wrong callee is restored when side-exiting from a trace
Flags: blocking1.9.1? → blocking1.9.1+
Both try server runs died for unclear reasons (timeout/infrastructure problems I believe). Maybe we get more lucky on the actual tree. I don't want to wait another day.

http://hg.mozilla.org/tracemonkey/rev/cec8ee353407
Whiteboard: fixed-in-tracemonkey
backed out

http://hg.mozilla.org/tracemonkey/rev/c4cea7365f4e
Whiteboard: fixed-in-tracemonkey
re-landed

http://hg.mozilla.org/tracemonkey/rev/8f6c242a75ff
Whiteboard: fixed-in-tracemonkey
backed out again

http://hg.mozilla.org/tracemonkey/rev/a18035c7c3d2
Whiteboard: fixed-in-tracemonkey
Fired off a try server build that flips around the order in which stacks and globals are written back, which this patch does. I want to see whether that has any side effects.
Just flipping around the reconstruction code seems to have passed the try server.
Attached patch v3 (deleted) β€” β€” Splinter Review
Only set scopeChain to parent(callee) if we actually synthesized that frame, otherwise we might trample over some with object or similar monstrosities.
Attachment #378255 - Attachment is obsolete: true
Passed try server. This better sticks this time since its 4am and I really don't feel like backing it out again.

http://hg.mozilla.org/tracemonkey/rev/ca3db3beaa41
Whiteboard: fixed-in-tracemonkey
Looks like it stuck, woo! Onward to m-c and 1.9.1!
http://hg.mozilla.org/mozilla-central/rev/ca3db3beaa41
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Depends on: 494045
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/554965ebd0f1
Keywords: fixed1.9.1
Verified fixed by running the xpshell test with the following debug builds:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre)
Gecko/20090522 Minefield/3.6a1pre ID:20090522133810

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1pre)
Gecko/20090522 Shiretoko/3.5pre ID:20090522153422
Status: RESOLVED → VERIFIED
Flags: in-testsuite?
Target Milestone: --- → mozilla1.9.2a1
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: