Closed Bug 592791 Opened 14 years ago Closed 14 years ago

Talos regressions from JM landing

Categories

(Core :: JavaScript Engine, defect)

Platform: x86 macOS
Type: defect
Priority: Not set
Severity: normal

Tracking

RESOLVED FIXED
Tracking Status
blocking2.0 --- beta7+

People

(Reporter: sayrer, Assigned: dmandelin)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Win7
Improvement: SunSpider decrease 12.6% on Win7 TraceMonkey
Improvement: Dromaeo (jslib) increase 44.9% on Win7 TraceMonkey
Improvement: V8 decrease 51.6% on Win7 TraceMonkey
Regression: Dromaeo (DOM) decrease 8.75% on Win7 TraceMonkey
Regression: Dromaeo (SunSpider) decrease 26.2% on Win7 TraceMonkey

XP
Improvement: Dromaeo (V8) increase 17.2% on XP TraceMonkey
Improvement: V8 decrease 50.3% on XP TraceMonkey
Improvement: SunSpider decrease 11.2% on XP TraceMonkey
Improvement: Dromaeo (jslib) increase 43.2% on XP TraceMonkey
Regression: Dromaeo (DOM) decrease 8.01% on XP TraceMonkey
Regression: Dromaeo (SunSpider) decrease 25.5% on XP TraceMonkey
Regression: Tp4 (Private Bytes) increase 3.58% on XP TraceMonkey
Regression: Tp4 (Memset) increase 3.41% on XP TraceMonkey
Regression: Ts, MIN Dirty Profile increase 0.541% on XP TraceMonkey

Win2k3
Regression: Trace Malloc Allocs increase 4.23% on WINNT 5.2 TraceMonkey

Linux x86-32
Improvement: V8 decrease 43.1% on Linux TraceMonkey
Improvement: SunSpider decrease 11.3% on Linux TraceMonkey
Improvement: Dromaeo (jslib) increase 27.3% on Linux TraceMonkey
Improvement: Dromaeo (V8) increase 5.25% on Linux TraceMonkey
Regression: Dromaeo (DOM) decrease 5.44% on Linux TraceMonkey
Regression: Dromaeo (String/Array/Eval/Regex) decrease 3.4% on Linux TraceMonkey
Regression: Dromaeo (SunSpider) decrease 33% on Linux TraceMonkey

Linux x64
Improvement: V8 decrease 25.3% on Linux x64 TraceMonkey
Improvement: SunSpider decrease 2.25% on Linux x64 TraceMonkey
Improvement: Dromaeo (CSS) increase 4% on Linux x64 TraceMonkey
Improvement: Dromaeo (jslib) increase 19.6% on Linux x64 TraceMonkey
Regression: Dromaeo (DOM) decrease 8.98% on Linux x64 TraceMonkey
Regression: Dromaeo (V8) decrease 23% on Linux x64 TraceMonkey
Regression: Dromaeo (String/Array/Eval/Regex) decrease 21.5% on Linux x64 TraceMonkey
Regression: Dromaeo (SunSpider) decrease 46.9% on Linux x64 TraceMonkey
Regression: Tp4 increase 3.49% on Linux x64 TraceMonkey
Regression: Tp4 (Private Bytes) increase 2.13% on Linux x64 TraceMonkey
Regression: Tp4 (RSS) increase 4.61% on Linux x64 TraceMonkey

MacOS 10.5 (32-bit)
Improvement: SunSpider decrease 10.6% on MacOSX 10.5.8 TraceMonkey
Improvement: V8 decrease 43.1% on MacOSX 10.5.8 TraceMonkey
Improvement: Dromaeo (jslib) increase 28.6% on MacOSX 10.5.8 TraceMonkey
Improvement: Dromaeo (V8) increase 3.92% on MacOSX 10.5.8 TraceMonkey
Improvement: GFX decrease 41.7% on MacOSX 10.5.8 TraceMonkey
Regression: Dromaeo (String/Array/Eval/Regex) decrease 9.84% on MacOSX 10.5.8 TraceMonkey
Regression: Dromaeo (SunSpider) decrease 33.3% on MacOSX 10.5.8 TraceMonkey

MacOS 10.6 (64-bit)
Improvement: V8 decrease 21.5% on MacOSX 10.6.2 TraceMonkey
Improvement: SunSpider decrease 2.74% on MacOSX 10.6.2 TraceMonkey
Improvement: Dromaeo (CSS) increase 2.71% on MacOSX 10.6.2 TraceMonkey
Improvement: Dromaeo (jslib) increase 18.5% on MacOSX 10.6.2 TraceMonkey
Improvement: GFX decrease 55.6% on MacOSX 10.6.2 TraceMonkey
Regression: Dromaeo (DOM) decrease 9.9% on MacOSX 10.6.2 TraceMonkey
Regression: Dromaeo (V8) decrease 20.9% on MacOSX 10.6.2 TraceMonkey
Regression: Dromaeo (String/Array/Eval/Regex) decrease 22.1% on MacOSX 10.6.2 TraceMonkey
Regression: Dromaeo (SunSpider) decrease 41.8% on MacOSX 10.6.2 TraceMonkey
Blocks: JaegerSpeed
It looks to me like there is a problem with the Dromaeo harness: we regressed SunSpider there but improved on the standalone version. I think we should investigate that problem first and see where that leaves us on the rest of these issues.
Did we stop tracing the Dromaeo loops in some cases when we shouldn't have? For Dromaeo DOM, if so, we might now be hitting fast natives instead of traceable natives...
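(For context, a sketch of the distinction being drawn here, recalled from the SpiderMonkey API of this era; the function names are invented for illustration, and the trace-info registration macros are named from memory rather than verified against this tree.)

    // A "fast native": callable from the interpreter via JSOP_CALL, but the
    // tracer cannot inline a call to it, so hitting one hurts traced code.
    static JSBool
    my_clamp(JSContext *cx, uintN argc, jsval *vp)
    {
        jsdouble d;
        // vp[2] is the first actual argument in the fast-native convention.
        if (!JS_ValueToNumber(cx, argc ? vp[2] : JSVAL_VOID, &d))
            return JS_FALSE;
        return JS_NewNumberValue(cx, d < 0 ? 0 : d, vp);
    }

    // A "traceable native" pairs the fast native with a specialized, unboxed
    // entry point that compiled trace code can call directly. The two are
    // tied together via the JS_DEFINE_TRCINFO_* / JS_TN machinery in
    // jsbuiltins.h (exact argument-descriptor format omitted here).
    static jsdouble JS_FASTCALL
    my_clamp_tn(jsdouble d)
    {
        return d < 0 ? 0 : d;
    }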
Assignee: general → dmandelin
An easy check is this one: http://dromaeo.com/?sunspider-bitops-3bit-bits-in-byte

Roughly, on this one:

    pre-JM, no tracejit      50
    pre-JM, tracejit       5000
    JM, no tracejit         500
    JM, tracejit            500
    JM, no jit at all        50

So, it looks like the tracer just isn't turning on.
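(To unpack "turning on": TM only starts recording once a loop header has been crossed a threshold number of times, the HOTLOOP constant in jstracer.cpp. A simplified, partly hypothetical sketch; the counter plumbing and the StartRecording name are invented for illustration.)

    // At each backward jump (loop edge), bump a per-loop-header counter;
    // only a loop that has become "hot" gets a recorder attached.
    if (++hitCounts[pc] >= HOTLOOP) {
        // The scores above (500 with JM+tracejit vs. 5000 pre-JM) suggest
        // that either this branch is never reached for the benchmark loop,
        // or recording aborts immediately after it starts.
        StartRecording(cx, pc);   // hypothetical name
    }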
Oh, and I forgot: in JM, if I have the methodjit off and tracejit on, I get a score of about 50. So it is not tracing even when the methodjit is off.
David, thanks! On that test, after disabling methodjit, I get:

    Abort recording of tree sunspider-bitops-3bit-bits-in-byte.html:32@20 at sunspider-bitops-3bit-bits-in-byte.html:33@29: Global object not owned by this context.

three times, and, possibly related, right after that third abort:

    trace stopped: 13148: JSOP_CALL or JSOP_NEW crosses global scopes

The abort happens because cx->fp()->getScopeChain()->getGlobal()->title.ownercx != cx in recordLoopEdge. That check has been around (modulo mechanical changes) for a good long while (I stopped when I tracked it back to January 2010), so I presume that what changed is the inputs.

Examining this stuff in a debugger: cx is the JSContext for the main test page. cx->fp()->getScopeChain()->getGlobal()->title.ownercx is... <sigh>. It's not a JSContext associated with a window, because it doesn't have JSOPTION_PRIVATE_IS_NSISUPPORTS (and has a null context private anyway). It has jit disabled on it. Its globalObject is a BackstagePass. The cx->fp()->getScopeChain()->getGlobal() itself is the inner window for the actual test subframe.
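(For reference, the failing guard has roughly this shape in TraceRecorder::recordLoopEdge; the condition is the one quoted above, while the abort call and return value are a paraphrase of the surrounding code, not an exact copy of the tree.)

    // jstracer.cpp, paraphrased: refuse to record through a loop edge if the
    // global on the current scope chain has its title owned by some other
    // JSContext than the one we are recording on.
    if (cx->fp()->getScopeChain()->getGlobal()->title.ownercx != cx) {
        AbortRecording(cx, "Global object not owned by this context");
        return false;
    }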
fwiw, cx->globalObject->title.ownercx is the JSContext for the subframe. So somehow just the title of the subframe's inner window seems confused.
It looks like the title for the inner window gets claimed by session store... and then never gets claimed by anyone else?
blocking2.0: --- → ?
TM uses CheckGlobalObjectShape, which locks the global object if it doesn't have OWN_SHAPE set in its flags. Does JM do any locking at all? Title-based locking is going away, so we could break harder to unregress, but it might be better in the interim to claim that title, once. /be
Note that in this case I disabled the methodjit. Should that not have taken me through the TM codepath described in comment 8?
Thanks, that helps a lot. I confirmed that in the new code the global object already has its own shape when we start trace recording, so we don't do the lock/unlock pair, and so we get the mismatch. I'm pretty sure that claiming the title every time (lock/unlock when we start recording? js_ClaimTitle? not sure which is preferred) will fix it. I want to find out why the global-own-shape property changed. My guess is that it has something to do with predefining properties on the global object.
Oops -- we always had a latent bug in jstracer.cpp.

Dave, one JS_LOCK_OBJ/JS_UNLOCK_OBJ pair is enough, and better: if by some bad circumstance the global is shared among threads, this will find out the hard way, and the check

    if (cx->fp()->getScopeChain()->getGlobal()->title.ownercx != cx) {

in TraceRecorder::recordLoopEdge will abort recording.

What globalObj->flags were set besides OWN_SHAPE? Probably there were methods branded into shapes for the global.

/be
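(A minimal sketch of the one-time claim Brendan describes; the placement, just before recording starts, is an assumption, but the macros are the ones he names.)

    // Claim the global's title for this context once, up front, with a single
    // lock/unlock pair. If the global really is shared among threads, taking
    // the lock will surface that, and the ownercx check in recordLoopEdge
    // still aborts recording as a backstop.
    JS_LOCK_OBJ(cx, globalObj);
    JS_UNLOCK_OBJ(cx, globalObj);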
(In reply to comment #5)
> It's not a JSContext associated with a window, because it doesn't have
> JSOPTION_PRIVATE_IS_NSISUPPORTS (and has a null context private anyway). It
> has jit disabled on it. Its globalObject is a BackstagePass.

Do we want to look, in a separate bug, at why the JIT is disabled for this (session-restore?) context?
In my case, probably because I disabled chrome tracejit to reduce the spew noise while debugging this issue.
Updates:

- In some test runs just now, the cx-mismatch problems didn't occur. It seems that the mismatch was probably being triggered by my adding more instrumentation. So the lock/unlock cure for the aborts still seems valid.

- More importantly, even if I fix the abort, the perf is not good. But I found out why: in the new, JM-enabled version, we try to avoid the jump on/off-trace problem (when an inner loop traces but its outer loop does not) by blacklisting once a trace is executed 300 times. In 3bit, the benchmark workload is a 2-loop nest, and the harness calls a function containing that nest many times, so the outer loop there gets blacklisted. Thus, the new heuristic actually *causes* a jump on/off-trace problem in Dromaeo benchmarks that trace well and get a score of 300+ (see the sketch below).

dvander says the main effect of taking out that heuristic entirely is to kill our earley-boyer score. So we could just take it out, but we should probably spend some time brainstorming new tricks to use here.
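(A sketch of the heuristic as described; the threshold is the "300 times" from the comment above, but every identifier here is invented for illustration and is not TM's actual code.)

    // Once a compiled trace has been entered this many times, blacklist its
    // loop header: the assumption is that by then the surrounding outer loop
    // is never going to trace, so repeatedly entering and leaving trace code
    // costs more than just staying in the method JIT.
    static const uint32 TRACE_EXECUTION_LIMIT = 300;

    if (++fragment->executionCount > TRACE_EXECUTION_LIMIT)
        Blacklist(fragment->loopHeaderPC);

    // For a Dromaeo test that traces well, the harness itself is the outer
    // loop, and a score of 300+ runs means the fast, correct inner trace
    // trips the limit -- so the heuristic manufactures exactly the jump
    // on/off-trace thrash it was meant to avoid.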
> - In some test runs just now, the cx-mismatch problems didn't occur.

If, as I suspect, it's just triggered by sessionstore, then it will be timing-dependent: sessionstore runs itself off a timer.
Depends on: 593195
Depends on: 593444
Depends on: 593495
Depends on: 593532
New in-browser scores for TM repo:

                                    V8    Dromaeo
    (a) before JM, TM-only        1918        488
    (b) JM+TM, end of last week   1916        478
    (c) JM+TM, today              3011        505

(b) represents the earlier blacklisting/tuning efforts, which got us to within 2% of the pre-JM scores. The difference with (c) is probably due to ICs for scripted calls speeding up all our calls. Anyway, it looks like we are no longer regressing in the real tests. What does Talos show?
Are the numbers in comment 16 for the sunspider part of dromaeo? Or the whole benchmark?

How do the dromaeo-DOM and dromaeo-String/Array/Eval/Regex numbers look?
blocking2.0: ? → beta6+
(In reply to comment #17)
> Are the numbers in comment 16 for the sunspider part of dromaeo? Or whole
> benchmark?
>
> How do the dromaeo-DOM and dromaeo-String/Array/Eval/Regex numbers look?

Finally got around to testing dromaeo-DOM at least. That one seems to be OK: 1030 in the beta vs. 1014 in today's TM nightly. TM is a bit faster on mods and queries, but a bit slower on attrs and traversal. I think this is OK for now -- the tuning project should hopefully make this better, though, assuming the differences aren't entirely noise.
(In reply to comment #17)
> Are the numbers in comment 16 for the sunspider part of dromaeo? Or whole
> benchmark?
>
> How do the dromaeo-DOM and dromaeo-String/Array/Eval/Regex numbers look?

On the dromaeo group (String/etc), I was not able to get a full test run with a beta (sometimes Dromaeo tests just stop), but for the 4 tests that did run correctly in the beta, the JS preview builds are faster.
> That one seems to be OK: 1030 in the beta vs. 1014 in today's TM nightly.

Yeah, OK. So a 2% regression, not 8%... that might in fact be noise. If we're that close, I assume we're tracing it all; last I measured, methodjit-only ran 2x slower than tracejit on the attr and traversal tests. So tuning is not likely to help much.
OK. I'm going to presume this is now good enough for landing, and thus that this bug can be considered fixed. If that is not so, feel free to reopen, with an explanation of exactly what still needs improvement.
Whiteboard: fixed-in-tracemonkey
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED