Closed
Bug 831268
Opened 12 years ago
Closed 12 years ago
IonMonkey: Investigate V8's --inline_new optimization
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
INVALID
People
(Reporter: h4writer, Unassigned)
References
(Blocks 1 open bug)
Details
Like I found out yesterday, V8 has a mode called "--inline_new".
js run-raytrace.js
RayTrace: 11248
d8 --inline_new run-raytrace.js
RayTrace: 17207
d8 --noinline_new run-raytrace.js
RayTrace: 9719
Disabling this feature makes V8 slower than IM. Therefore we need this optimization to be faster than V8 on raytrace.
Dvander told me and as far as I understand this optimization creates a stub for the constructing call, therefore removing the need to call the constructing call.
From the tracelogging profile, this is the time spend in the constructing calls (initialize) and they remove them.
ion_cannon,raytrace.js:84: 212292605 cycles (7%), called: 1726808, 122 cycles/call
ion_cannon,raytrace.js:223: 211970214 cycles (7%), called: 1820609, 116 cycles/call
ion_cannon,raytrace.js:531: 168080960 cycles (5%), called: 859969, 195 cycles/call
ion_cannon,raytrace.js:285: 33788262 cycles (1%), called: 180317, 187 cycles/call
So this optimization would remove 20% of our calling time. As the tracelogger profiler takes some overhead when also reporting ion->ion calls I actually assume if we implement this we would gain the same improvement as V8: 77%. Improving our score to 19900
Comment 1•12 years ago
|
||
Why aren't we inlining the 'new' calls fully in v8-raytrace already? (Or is that what this bug is about, I'm confused by comment 0.) That doesn't seem like it needs any fancy new optimizations, and it seems like the only additional thing such a special optimization mode would add would be to avoid using ICs for adding properties in the constructor. That latter bit could be a significant cost, but there are several good ways of fixing that, including improving the definite properties analysis to handle Class.create (probably simple to do) or using JM IC info to handle property adds with inline MIR nodes as we do for normal get/set.
Reporter | ||
Comment 2•12 years ago
|
||
We do inline constructing calls. On V8 that would be --inline_construct.
Note that on raytrace we now also inlining the constructing calls since bug 813784. This bug isn't about that.
--inline_new is something totally different. They have a special constructor scheme:
<dvander`home> I just meant, v8 takes function slike
<dvander`home> function f(x) { this.x = x; }
<dvander`home> if you do |new f(5)| they don't use either compiler
<dvander`home> they generate a one-off stub for just that closure
<dvander`home> it might be shared but it's definitely specially generated
Comment 3•12 years ago
|
||
OK, but it doesn't follow at all that exactly copying v8's optimization is what we should be doing too, nor that copying the optimization will give us the performance gains you discuss in comment 0 (there could be some perf fault in v8 that bites them on v8-raytrace without that optimization, because they didn't optimize for that case and the constructing stubs are a holdover from pre-crankshaft, which I suspect).
Put another way, there isn't any reason this can't be fixed by staying in the SSA based compiler. Can you write a microbenchmark which illustrates the performance flaw you're trying to fix?
Reporter | ||
Comment 4•12 years ago
|
||
I don't say we need to copy that optimization word by word, but we should do something similar. That fits our system ;). At least it should get investigated ... I'm not starting to implement yet, as I don't know how it works exactly and what we need exactly to fix this issue ;). More a reminder there are some gains here
Reporter | ||
Comment 5•12 years ago
|
||
To have more reason to look into this:
DeltaBlue: 17052 to 19221 (12%)
RayTrace: 9990 to 17890 (79%)
EarleyBoyer: 17857 to 30124 (68%)
----
Score (version 7): 11998 to 14447 (20%)
Reporter | ||
Updated•12 years ago
|
Summary: IonMonkey: Create constructing stubs like v8 → IonMonkey: Investigate V8's --inline_new optimization
Reporter | ||
Comment 6•12 years ago
|
||
(In reply to Hannes Verschore [:h4writer] from comment #5)
> To have more reason to look into this:
>
> DeltaBlue: 17052 to 19221 (12%)
> RayTrace: 9990 to 17890 (79%)
> EarleyBoyer: 17857 to 30124 (68%)
> ----
> Score (version 7): 11998 to 14447 (20%)
that's v8 difference with/without --inline_new
Comment 7•12 years ago
|
||
Looking into this a little more, I don't find this bug credible at all. On this microbenchmark:
function A() {
this.f = 0;
this.g = 1;
}
function foo() {
for (var i = 0; i < 10000000; i++)
var x = new A();
}
foo();
js: 312
d8: 57
d8 --noinline_new: 436
The reason that d8 is so much faster than us at this sort of test is that it uses a generational GC. The reason it is so much slower with --noinline_new is that it is doing something hilariously slow instead of generating its normal optimized code for constructing the A objects.
We are already generating basically optimal code for this test. The only way we can catch v8 is to build a generational GC or (in addition, really, since GGC will happen) add an escape analysis and stack allocate the object.
Reporter | ||
Comment 8•12 years ago
|
||
"inline_new" is indeed more than creating a stub for constructing calls.
I've now measured how much the stub creating alone accounts for:
// V8, with stubs
Richards: 3657
DeltaBlue: 4465
Crypto: 4586
RayTrace: 4452
EarleyBoyer: 7547
RegExp: 1199
Splay: 1509
NavierStokes: 5238
----
Score (version 7): 3525
// V8 without stubs
Richards: 3633
DeltaBlue: 4423
Crypto: 4548
RayTrace: 4168
EarleyBoyer: 6692
RegExp: 1201
Splay: 1483
NavierStokes: 5243
----
Score (version 7): 3427
Raytrace and earleyboyer both take a hit of 12%.
I was maybe to eagerly to say we could maybe hit a 70% increase. But I think it should be doable to get the 12%.
I already explained that we spend around 20% in the constructing functions. I don't know how long the constructing stub in V8 takes, but I assume it will be much lower than the 20%. I don't see a reason that we can't get the 12% V8 get's on it. But to be sure, I'll try to make a quick and dirty prototype and see what it gives us. (It will also give me ideas on how to do this properly)
@Brain, the testcase provided doesn't use the generate stubs. There is no speed difference with enabling/disabling it.
d8 --inline_new /tmp/test.js // Without stubs
0m0.323s
d8 --inline_new /tmp/test.js // With stubs
0m0.326s
d8 --noinline_new /tmp/test.js // Inline new totally disabled
0m1.100s
js /tmp/test.js
0m0.538s
Reporter | ||
Comment 9•12 years ago
|
||
With "stubs", I mean the optimization with specialized constructor stubs only. Not stubs in general.
Comment 10•12 years ago
|
||
OK, not trying to hound on this, but can you write a microbenchmark which illustrates the performance flaw you're trying to fix?
Basically, what we do and what v8 does are different in these constructors because of TI's definite properties analysis. If we can determine a constructor will be creating an object with specific properties in specific slots, we pregenerate its shape and do direct writes to the object's properties during the constructor. v8 doesn't have this, and I believe uses these constructor stubs for the same purpose. So adding constructor stubs ourselves would just be a second way to do the same thing, and I'd like to see why we don't have definite property info on v8-raytrace because I think this is an easy fix now that bug 638794 is done. Having definite properties would also improve speed for later property accesses as well, unlike constructor stubs.
Having a microbenchmark to look at would make this job easier, both to make sure we are talking about the same performance flaw, and to ensure the fix is addressing that flaw.
Reporter | ||
Comment 11•12 years ago
|
||
Ah sorry I didn't report this earlier, but I've discarded the idea the "constructors stubs" are the observed performance difference. I made some wrong assumptions when going through the V8 code base. Looks like the "constructors stubs" alone have no performance impact whatsoever on raytrace :(.
I think it's best to just close the bug, as too much wrong information is in here!
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → INVALID
You need to log in
before you can comment on or make changes to this bug.
Description
•