Closed
Bug 516075
Opened 15 years ago
Closed 15 years ago
Move shape into JSObjectMap from JSScope, const-ipate and use LIR_ldc* to get to it
Categories
(Core :: JavaScript Engine, defect, P1)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla1.9.2
Tracking | Status | |
---|---|---|
status1.9.2 | --- | beta1-fixed |
People
(Reporter: brendan, Assigned: brendan)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(3 files, 3 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
More from the big patch in bug 504587.
/be
Flags: wanted1.9.2?
Assignee | ||
Comment 1•15 years ago
|
||
No tracer-level shape guard CSE here, just a TR::guardShape helper and LIR_ldc* usage for nanojit-level CSE.
This triggers 75 traces instead of 69, which must be due to CSE enabled via LIR_ldc*, right? Stats with patch:
recorder: started(5), aborted(3), completed(32), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(1), breaks(0), returns(0), unstableInnerCalls(1), blacklisted(0)
monitor: triggered(75), exits(75), type mismatch(0), global mismatch(0)
and without:
recorder: started(5), aborted(3), completed(32), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(1), breaks(0), returns(0), unstableInnerCalls(1), blacklisted(0)
monitor: triggered(69), exits(69), type mismatch(0), global mismatch(0)
Same "leaving trace at ..." lines for the first 69, modulo address differences; in particular, same .js file and line number / bytecode offset. Then these six with the patch to get to 75 traces triggered from 69:
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7854914
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7806855
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=8912820
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7811465
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7932582
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7950128
I will dig deeper into what is causing branch exits, but we are not hitting the MAX_BRANCHES limit.
Julian, please feel free to fragment-profile this.
/be
Attachment #400233 -
Flags: review?(gal)
Assignee | ||
Comment 2•15 years ago
|
||
Bizarre. These are all the LIR_ldc uses with this bug's patch:
$ grep -n LIR_ldc jstracer.cpp
8145: ops_ins = addName(lir->insLoad(LIR_ldcp, map_ins, int(offsetof(JSObjectMap, ops))), "ops");
8146: LIns* n = lir->insLoad(LIR_ldcp, ops_ins, op_offset);
8303: addName(lir->insLoad(LIR_ldcp, cx_ins, offsetof(JSContext, runtime)),
8458: * LIR_ldcp is ok to use here even though Array classword can
8465: lir->insLoad(LIR_ldcp, v_ins, offsetof(JSObject, classword)),
10555: LIns* type_ins = lir->insLoad(LIR_ldcb, typep_ins, 0);
Lines 814[56] are in TR::map_is_native, line 8303 is for cx->runtime (pre-exists this patch), 8465 is from this patch with comment about tricksy-ness, and line 10555 pre-exists (typemap byte load).
I went from bottom of file up changing each of the three LIR_ldcp uses added by this bug's patch to LIR_ldp and got 8 extra traces changing the classword load to LIR_ldp, then undoing that and changing either of the map_is_native ones got 5 (not six) extra traces.
These extras all of course were at richards.js:190 and were LOOP exits.
Smells like a latent bug involving CSE and treecalls. The loop at line 190 has interesting structure:
while (this.currentTcb != null) {
if (this.currentTcb.isHeldOrSuspended()) {
this.currentTcb = this.currentTcb.link;
} else {
this.currentId = this.currentTcb.id;
this.currentTcb = this.currentTcb.run();
}
}
/be
Assignee | ||
Comment 3•15 years ago
|
||
Er, the cx->runtime LIR_ldcp is from this bug's patch. Anyway, taking out the three other new LIR_ldcp uses gets me two extra traces at richards.js:190 that LOOP exit. Changing the cx->runtime load to a LIR_ldp gets me three!
Going to bed, this is vexing me.
/be
Assignee | ||
Comment 4•15 years ago
|
||
Something between e4944123b960 and 74c704b49e7f made them go away. Need to confirm this and find the relevant cset. Help welcome, I've been on a plane all day.
/be
Attachment #400233 -
Attachment is obsolete: true
Attachment #400233 -
Flags: review?(gal)
Assignee | ||
Updated•15 years ago
|
Attachment #400423 -
Attachment description: merged up to tm tip -- extra traces gone!? → merged up to tm tip -- extra trace runs gone!?
Assignee | ||
Comment 5•15 years ago
|
||
4 extra traces now that I'm off the plane. Stinks of UMR. Anyone valgrinding?
/be
Comment 6•15 years ago
|
||
The cx->runtime ldc change causing extra traces is insane. We should zoom in on that and investigate.
Comment 7•15 years ago
|
||
(In reply to comment #5)
> 4 extra traces now that I'm off the plane. Stinks of UMR. Anyone valgrinding?
I compared (tip = 32243:2a3c66f0a36c) against (same, + patch in comment #4).
I didn't see any memory errors running v8-richards, even with the
various paranoia-levers set to max.
I am wondering if CseFilter is 100% reliable. At the moment we don't
have any way to verify its claims, but we could easily do so, by
formalising the LIR_assert thing that was part of one of the bug
504587 patches, and using that to actually check equality on
expressions claimed to be CSEs.
In any case I plain to start hacking on CSE later this week, to do
alias set stuff, and I could do that as the first thing to do.
Comment 8•15 years ago
|
||
(In reply to comment #7)
> 504587 patches, and using that to actually check equality on
> expressions claimed to be CSEs.
Actually .. that doesn't distinguish the cases
CseFilter is somehow subtly broken
vs
CseFilter is correct, but the assurances about memory CSEs you're
giving it via LIR_ldc* turn out not to be so, for whatever reason
Verifying the latter would entail runtime checking for locations so
claimed, in order to verify that they really are read-only for the
lifetime of the trace. Hmm.
Comment 9•15 years ago
|
||
If it's any use, bug 502778 has a patch that extensively overhauls CseFilter.
Comment 10•15 years ago
|
||
(In reply to comment #8)
> Verifying the latter would entail runtime checking for locations so
> claimed, in order to verify that they really are read-only for the
> lifetime of the trace. Hmm.
I hacked up nanojit so as to generate code to pass a hint to Memcheck
every time a LIR_ldc* is executed, saying "this location is now
read-only until further notice". Then I hacked up Memcheck to take
notice of said hints. I verified this works by doing a store of the
data immediately after an LIR_ldc and checking that Memcheck
complains.
Unfortunately after all that, when running richards, there is no
complaint. Well, I guess that removes the "bogus ldc* annotations"
theory as a cause of badness.
Assignee | ||
Comment 11•15 years ago
|
||
Attachment #400423 -
Attachment is obsolete: true
Assignee | ||
Comment 12•15 years ago
|
||
Ok, this is very strange. I tested by applying pieces of the patch, showing no regressions, until I got to everything except the guardShape changes and the LIR_ldcp changes. No problems. Adding the LIR_ldcp changes did not cause any regression, although I was over-confidently testing only one trial. I then took out the LIR_ldcp selections and added the guardShape factoring. Still good.
Adding LIR_ldcp regressed things, I saw four extra traces. Then six. Reverted to test without this patch (LIR_ldp instead of LIR_ldcp in those four places where we load const pointers). Got zero extra traces on first run, then six again on second and repeated runs!
I'm using DiffMerge and dvander's suggestion to diff the grep '^leaving trace' lines of the full JIT spew. It shows word diffs on line as well as line diffs. I trust it saw my grep output file changes, as it prompted me to reload based on the mtime changing.
It seems the guardShape changes in addition to LIR_ldcp are needed.
I have no idea why the results would be non-deterministic if valgrind gives a clean bill of health.
/be
Assignee | ||
Comment 13•15 years ago
|
||
You've seen this in bug 504587.
Sorry for all the noise, Julian pointed out that v8/run-richards.js is typical v8 hide-the-compiler-costs "run as many times in a given time period as you can" benchmarketing, which means it is non-deterministic. I think that's all that was going on here. Grrr!
/be
Attachment #400507 -
Flags: review?(jorendorff)
Assignee | ||
Comment 14•15 years ago
|
||
My results now do look like warm-up effects: more traces on 2nd run. Hate those non-deterministic benchmarks! Is the SunSpider v8-richards.js really free of all ND, including as invoked by the sunspider driver?
/be
Assignee | ||
Comment 15•15 years ago
|
||
Even if we run an ND-free richards, why did Julian see extra fragment runs when profiling? My attempts to reproduce what he profiled led to my confusing TMFLAGS stats differences running the ND version of richards.
Assuming that source of ND is ruled out, either the extra frag runs are gone, or I can't see them. With the combined patch I see 71 traces triggered, from 33 compmleted, same as without the patch.
/be
Assignee | ||
Comment 16•15 years ago
|
||
(In reply to comment #15)
> Assuming that source of ND is ruled out, either the extra frag runs are gone,
> or I can't see them. With the combined patch I see 71 traces triggered, from 33
> completed, same as without the patch.
$ TMFLAGS=stats ./Darwin_DBG.OBJ/js -j !$
TMFLAGS=stats ./Darwin_DBG.OBJ/js -j ~/Hacking/webkit.org/WebKit/SunSpider/tests/v8-richards.js
recorder: started(5), aborted(2), completed(33), different header(0), trees trashed(4), slot promoted(0), unstable loop variable(1), breaks(0), returns(0), unstableInnerCalls(1), blacklisted(0)
/be
Comment 17•15 years ago
|
||
Comment on attachment 400507 [details] [diff] [review]
combined patch
> JSScope *
> JSScope::createEmptyScope(JSContext *cx, JSClass *clasp)
> {
> JS_ASSERT(!emptyScope);
>
>- JSScope *scope = (JSScope *) cx->malloc(sizeof(JSScope));
>- if (!scope)
>+ void *mem = cx->malloc(sizeof(JSScope));
>+ if (!mem)
> return NULL;
>
>- scope->map.ops = map.ops;
>+ JSScope *scope = new (mem) JSScope(ops);
> scope->object = NULL;
Why not cx->create<JSScope>(ops) here?
>+static inline JSScope *
>+OBJ_SCOPE(JSObject *obj)
>+{
>+ JS_ASSERT(OBJ_IS_NATIVE(obj));
>+ return (JSScope *) obj->map;
>+}
>+
>+static inline uint32
>+OBJ_SHAPE(JSObject *obj)
>+{
>+ JS_ASSERT(obj->map->shape != JSObjectMap::SHAPELESS);
>+ return obj->map->shape;
>+}
`static` isn't necessary on these functions since the header is already C++-only.
In jstracer.h:
>+#include "jsdhash.h"
This one isn't necessary in this patch, right? And when you land the other half you'll presumably use js::HashMap.
Great patch.
Attachment #400507 -
Flags: review?(jorendorff) → review+
Comment 18•15 years ago
|
||
I renew my call to reduce richards, especially the rate measurement part of it (which might explain the weird ND behavior).
Assignee | ||
Comment 19•15 years ago
|
||
(In reply to comment #18)
> I renew my call to reduce richards, especially the rate measurement part of it
> (which might explain the weird ND behavior).
No need for reduction. I was running the wrong benchmark, which will run extra loops some of the time.
Julian said he'd have fragment profile confirmation by end of today PDT.
/be
Assignee | ||
Comment 20•15 years ago
|
||
Just enough change here to want another r+. I booted the LIR_dbreak/insAssert to bug 504587's patch (to be updated).
Got another idea I have been haunted by to try for avoiding shape-based branch exits -- details in bug 504587 with patch.
/be
Attachment #400507 -
Attachment is obsolete: true
Attachment #400595 -
Flags: review?(jorendorff)
Updated•15 years ago
|
Attachment #400595 -
Flags: review?(jorendorff) → review+
Comment 21•15 years ago
|
||
Comment on attachment 400595 [details] [diff] [review]
patch to commit
You can avoid the (JSObject *)NULL cast, I think, by making the second parameter default to NULL.
Comment 22•15 years ago
|
||
(In reply to comment #13)
>
> Sorry for all the noise, Julian pointed out that v8/run-richards.js is typical
> v8 hide-the-compiler-costs "run as many times in a given time period as you
> can" benchmarketing, which means it is non-deterministic.
The ./sunspider --v8 tests use a consistent number of iterations.
Comment 23•15 years ago
|
||
Comparing 32246:7eff6f4aee73 vs 32246:7eff6f4aee73 + comment #20 patch
All looks good to me.
No change in #frag entries -- 12863820
Static code measures are good:
before:
Total displayed code bytes = 87087, exit bytes = 110283
Total displayed static exits = 3678
after:
Total displayed code bytes = 85726, exit bytes = 107427
Total displayed static exits = 3583
Cachegrindage is good:
before:
I refs: 3,519,034,997
I1 misses: 27,635,209
L2i misses: 5,933
I1 miss rate: 0.78%
L2i miss rate: 0.00%
D refs: 1,724,507,309 (1,145,431,422 rd + 579,075,887 wr)
D1 misses: 186,091 ( 158,617 rd + 27,474 wr)
L2d misses: 44,502 ( 23,156 rd + 21,346 wr)
D1 miss rate: 0.0% ( 0.0% + 0.0% )
L2d miss rate: 0.0% ( 0.0% + 0.0% )
L2 refs: 27,821,300 ( 27,793,826 rd + 27,474 wr)
L2 misses: 50,435 ( 29,089 rd + 21,346 wr)
L2 miss rate: 0.0% ( 0.0% + 0.0% )
Branches: 543,251,301 ( 543,152,745 cond + 98,556 ind)
Mispredicts: 12,475,007 ( 12,447,678 cond + 27,329 ind)
Mispred rate: 2.2% ( 2.2% + 27.7% )
after:
I refs: 3,460,187,860
I1 misses: 26,172,205
L2i misses: 5,944
I1 miss rate: 0.75%
L2i miss rate: 0.00%
D refs: 1,694,400,527 (1,116,362,352 rd + 578,038,175 wr)
D1 misses: 193,432 ( 165,750 rd + 27,682 wr)
L2d misses: 44,591 ( 23,371 rd + 21,220 wr)
D1 miss rate: 0.0% ( 0.0% + 0.0% )
L2d miss rate: 0.0% ( 0.0% + 0.0% )
L2 refs: 26,365,637 ( 26,337,955 rd + 27,682 wr)
L2 misses: 50,535 ( 29,315 rd + 21,220 wr)
L2 miss rate: 0.0% ( 0.0% + 0.0% )
Branches: 528,853,452 ( 528,755,712 cond + 97,740 ind)
Mispredicts: 11,968,920 ( 11,941,676 cond + 27,244 ind)
Mispred rate: 2.2% ( 2.2% + 27.8% )
Assignee | ||
Comment 24•15 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/f723616174c1
Onward to bug 504587! Thanks to all,
/be
Whiteboard: fixed-in-tracemonkey
Comment 25•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 26•15 years ago
|
||
status1.9.2:
--- → beta1-fixed
Flags: wanted1.9.2? → wanted1.9.2+
You need to log in
before you can comment on or make changes to this bug.
Description
•