Closed Bug 606648 Opened 14 years ago Closed 11 years ago

typeof(this.data[x]) != "undefined" is slow

Categories

(Core :: JavaScript Engine, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: developer, Assigned: jandem)

References

(Blocks 2 open bugs, )

Details

(Keywords: perf, Whiteboard: [jsperf])

Attachments

(4 files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.11) Gecko/20101012 Firefox/3.6.11 Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0b8pre) Gecko/20101022 Firefox/4.0b8pre In the peacekeeper performance tests, the "ripple" tests use typeof(this.data[x]) != "undefined" and typeof(this.data[x][y]) != "undefined" over and over again. If I swap out those checks for another equivalent test (and comment out the canvas render code), the fps almost double (27 fps to 51 fps) on Windows XP. I've made a page that shows the slowdown. If you go to http://www.peacekeeper.therichins.net/test10b.html, you'll see the time it took to do just 100 iterations using the typeof operator. If you go to http://www.peacekeeper.therichins.net/test10c.html, you'll see the same test without the typeof operator. On my box, it goes from about 2s to sub 1s. Chrome is sub 1s for both of the versions. The lines that are modified are: if (typeof(this.data[x]) != "undefined" && typeof(this.data[x][y]) != "undefined") { to: if (x > 0 && x < this.width && y > 0 && y < this.height) { Both are equivalent in this test. I originally though that this was the same as bug 470143 but they said it looked like it was a different bug. This should probably block 499198 as it is related to Peacekeeper. Reproducible: Always Steps to Reproduce: 1. Load http://www.peacekeeper.therichins.net/test10b.html 2. Load http://www.peacekeeper.therichins.net/test10c.html 3. Compare the times Actual Results: Half the time spent in test10c Expected Results: For test10b to much faster.
Thanks for the report! Having the typeof on the out-of-bounds access means we have to generate a separate trace due to the guard on being out-of-bounds. Which in this case means we end up having a somewhat branchy tree, I bet, since there are lots of get() calls in a row. That said, over here I see us take about 240ms on the slow version, 180ms on the fast one, if we trace the whole thing. That's much faster than Chrome (over 2x faster). I can confirm that in Firefox 3.6 those testcases are a lot slower (750ms for the fast one, 1100ms for the slow one over here). I filed bug 606650 on the fact that right this second this isn't tracing due to the heuristics patch.
Blocks: peacekeeper
Oh, in practice I guess we never end up in the undefined case.... So we don't end up with a branchy trace after all. Note bug 591172.
Status: UNCONFIRMED → NEW
Depends on: 591172
Ever confirmed: true
Whiteboard: [jsperf]
Keywords: perf
So at this point we do _faster_ on test10b than on test10c, because jitprofiling makes us trace the former but not the latter (see bug 607201). The tracer kicks V8's butt here, but JM is way slower than V8. I tried JM+TI and it's faster than just JM, but nowhere close to V8, much less the tracer.... Brian, want to take a look at this for TI? Per comments in bug 607201 at least the test10c case should be doable with better interval analysis.
From profiling the JS, we seem to spend most of the time in processBuffers. After editing the testcase to see what helps, I think doing the below would get JM+TI roughly on par with TM. - We need to inline the get() and set() calls --- JM+TI only inlines when the called function is a singleton (there can only ever be one in existence). This shouldn't be hard to fix, we just need to make sure we can expand the inlined frames correctly. - 'this.a' and 'this.b' are inferred as either bools or Buffers (due to the 'false' initializations of 'a' and 'b'). We need read barriers on these accesses as the only type that could be observed here is Buffer. - The 'this.a' and 'this.b' accesses in the processBuffers loop need to be LICM'ed, along with the read barriers on these accesses. - The Math.round call in the 'set' function can overflow (it can produce -0 at the very least), which causes us to never inline this call. We should be more robust.
So, I'm using Firefox 9.0.1 and both testcases take the same amount of time.
Yes. The bug sort of morphed once the tracer went away... Maybe it's better to file a new bug on the remaining issue.
So, there was a problem, the problem partially went away, and now it's back again. Nightly 26.0a1 (2013-09-05) test10b - 1362ms test10c - 385ms
Guilherme, are you willing to hunt down when 10b became slower than 10c (the nightly range should be fine)?
Flags: needinfo?(guijoselito)
I didn't do exactly what you asked. I tested a few versions, trying the avaliable combinations. The results are (10b/10c) Firefox 17 - 990/885 (JM+TI) - 1275/1210 (TI-OFF) Firefox 18 - 2400/1230 (JM+TI+ION) - 2100/910 (JM+TI) - 1700/920 (ION+TI) Firefox 23 - 1480/860 (BC+ION) - 2700/920 (JM) Firefox 24 - 1500/800 (BC+ION) - 5200/3700 (BC) Nightly 26 - 1360/380 (BC+ION) It seems to me that Ion just never was fast on 10b. And recently something made 10c much faster. If you want me to get one of the exactly regressions, just ask.
Flags: needinfo?(guijoselito)
Thanks. That's good enough. ;) So what I see in a profiler is a js::TypeOfOperation stubcall. I wonder whether we can get rid of that, esp in the != undefined case.....
Flags: needinfo?(jdemooij)
Attached file test10b shell testcase (deleted) —
Attached file test10c shell testcase (deleted) —
For the shell tests I get: 10b: SM : 584 ms d8 : 375 ms 10c: SM : 127 ms d8 : 305 ms So we're doing really well on the modified version, but 10b is the one that matters. Looking into that now.
Number of things we can do here, most important first: (1) LTypeOfV should not use a VM call for objects. Thanks to Tom, the object typeof hook is gone, so the remaining interesting cases are: obj->isCallable() => "function" EmulatesUndefined(obj) => "undefined" We could inline these checks, but thanks to TypeObject determining Class now, we should be able to guard against these statically as well and just use "object" if the type is an object. I will try this first, hopefully it will work here and in most cases. (2) We should optimize typeof x == "string" (bug 725966). (3) LTypeOfV should not emit code for types that are statically known to be impossible.
Depends on: 725966
Depends on: 914132
Clearing needinfo. With the patches in bug 914132 the difference is only 12 ms and almost 3x faster than v8.
Flags: needinfo?(jdemooij)
(In reply to Guilherme Lima from comment #11) > Nightly 26 - 1360/380 (BC+ION) Previous nightly - 557/112 ms Latest nightly - 78/112 ms
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Resolution: WORKSFORME → FIXED
Assignee: general → jdemooij
Target Milestone: --- → mozilla26
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: