Closed
Bug 769765
Opened 12 years ago
Closed 10 years ago
2.5x Slower than Chrome on Compiled Sundown
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: azakai, Unassigned)
References
(Depends on 1 open bug, )
Details
(Whiteboard: [js:t])
Attachments
(1 obsolete file)
Someone compiled the Sundown markdown parser with Emscripten. It runs much faster on Chrome than Nightly or IonMonkey Nightly, Firefox gets
emscript x 2,517 ops/sec ±3.66% (8 runs sampled) (IonMonkey is slightly faster, 2,759 ops/sec)
pagedown x 750 ops/sec ±2.16% (8 runs sampled)
while Chrome gets
emscript x 6,062 ops/sec ±0.55% (9 runs sampled)
pagedown x 521 ops/sec ±2.26% (9 runs sampled)
emscript is Emscripten compiled Sundown, pagedown is the handwritten pagedown parser. For more background see
https://github.com/github/gollum/issues/404#issuecomment-6670367
So Firefox is a bit faster on the handwritten one, but on the high-performance one we are more than 2x slower.
Reporter | ||
Updated•12 years ago
|
Comment 1•12 years ago
|
||
Per the JIT inspector (https://addons.mozilla.org/en-US/firefox/addon/jit-inspector/) the problem here is that we're taking a bazillion stub calls for some x >>> 0 operations. Looks like some more fast paths would be handy.
Comment 2•12 years ago
|
||
So the issue here is that we always stub x >>> 0 accesses, regardless of the type of x or of the result. That could be improved to inline the opcode and convert to double if necessary, but the fastest thing would be to sniff the uint32 comparison which is happening here, e.g. x >>> 0 >= y >>> 0, generate plain int32s for the lhs/rhs and do an unsigned comparison. Alon, what are the possible cases when Emscripten generates x >>> 0 ?
Reporter | ||
Comment 3•12 years ago
|
||
x >>> 0 is generated
* When we need to use an i32 value in an unsigned manner, so for unsigned i32 / or % as well as comparisons <>=!=. This is the most important case. Here you see things like x >>> 0 >= y >>> 0.
* When converting floats to unsigned i32s and other conversions.
Comment 4•12 years ago
|
||
Peephole optimize comparisons of the form (X >>> 0) op (Y >>> 0). The >>> 0 operations are treated as if they are | 0, and unsigned comparisons are used for the operation on the resulting signed int32s. This will need a bit of port work to merge to IM.
function foo(n) {
for (var i = 0; i >>> 0 <= n >>> 0; i++) {}
}
foo(1000000);
Before: 479
After: 2
Assignee: general → bhackett1024
Attachment #638120 -
Flags: review?(dvander)
Comment on attachment 638120 [details] [diff] [review]
patch (8cc01c494f6a)
It's unfortunate that optimizing this requires lots of new instrumentation in the interpreter. It seems like it could be done entirely within a compiler like IonMonkey. Is there something about type inference that makes this necessary?
Comment 6•12 years ago
|
||
Keeping information in sync between the compiler and interpreter is good for ensuring a consistent representation in the presence of invalidation. What happens with an IM optimization on code like:
i >>> 0 >= somethingComplicated() >>> 0
If somethingComplicated() invalidates the calling script then we end up in the interpreter for the comparison, and if the interpreter's representation of i >>> 0 does not account for an overflow that occurred then the comparison may compute the wrong result.
For IonMonkey at least, that shouldn't be a problem; we could choose to not speculate if the whole expression wasn't idempotent. Otherwise, we could introduce uint32 types in the compiler, and then bailouts could be taught to handle those pretty easily.
Comment 8•12 years ago
|
||
Eh, choosing to not optimize this for anything other than the simplest ops seems a hack and will hurt robustness. Adding uint32 types doesn't seem any simpler than what's already happening here. This is a sideshow though, this bug is about the performance of the browser we're shipping today.
The difference is that solving this completely in the JIT wouldn't need to add new invariants to the VM.
Comment 10•12 years ago
|
||
I'm not sure why JIT complexity is a better thing to have than VM complexity. This change is pretty isolated, mirrors stuff we've done in the past (e.g. lazy arguments optimization) and preserves coherence between the JIT and interpreter, which *is* a good property to have both for cleanliness of implementation and simplicity of design.
Comment 11•12 years ago
|
||
The lazy arguments optimization is not a good model of simplicity. If it wasn't for the gnarly interactions with scopes in the VM, I would've tried to make it a simple jit-local thing. Having done the rest of the scope changes now, I wouldn't be surprised if there was a simple jit-local optimization strategy... I should revisit this...
Updated•12 years ago
|
Attachment #638120 -
Attachment is obsolete: true
Attachment #638120 -
Flags: review?(dvander)
Updated•12 years ago
|
Assignee: bhackett1024 → general
Reporter | ||
Comment 12•12 years ago
|
||
Seems v8 is making progress on optimizing uint32s,
http://code.google.com/p/v8/issues/detail?id=2097
which should make them even faster on this type of code I believe.
Comment 13•12 years ago
|
||
Would the potential performance improvements resulting from this bug be considered "low hanging fruit"? If so, what is preventing the fruit from being picked? There are a multitude of Ion performance bugs, and Ion will be shipping in a released browser relatively soon.
Comment 14•10 years ago
|
||
Right now:
Nightly 32.0a1 (2014-05-28)
emscript x 3.49 ops/sec ±19.57% (8 runs sampled)
pagedown x 0.23 ops/sec ±0.89% (6 runs sampled)
markdown deep x 2.37 ops/sec ±20.22% (9 runs sampled)
Chrome 35
emscript x 2.18 ops/sec ±5.81% (8 runs sampled)
pagedown x 0.24 ops/sec ±1.76% (6 runs sampled)
markdown deep x 7.01 ops/sec ±1.56% (9 runs sampled)
Should this bug remain open because of the 'markdown deep'?
Flags: needinfo?(azakai)
Reporter | ||
Comment 15•10 years ago
|
||
I think markdown deep is not emscripten-compiled? In any case, unless these were refreshed, I assume the compiled code here is from 2 years ago, and firefox does well on the emscripten version anyhow, so I think we can close this.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(azakai)
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•