Closed Bug 606648 Opened 14 years ago Closed 11 years ago

typeof(this.data[x]) != "undefined" is slow

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla26

People

(Reporter: developer, Assigned: jandem)

References

(Blocks 2 open bugs,
URL
)

Details

(Keywords: perf, Whiteboard: [jsperf])

Attachments

(4 files)

test10b: the original peacekeeper test, with typeof 14 years ago Boris Zbarsky [:bzbarsky] (deleted), text/html		Details
test10c: The modified test with comparison bounds-checks 14 years ago Boris Zbarsky [:bzbarsky] (deleted), text/html		Details
test10b shell testcase 11 years ago Jan de Mooij [:jandem] (deleted), application/x-javascript		Details
test10c shell testcase 11 years ago Jan de Mooij [:jandem] (deleted), application/x-javascript		Details

Trev

Reporter

Description

•

14 years ago

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.11) Gecko/20101012 Firefox/3.6.11 Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0b8pre) Gecko/20101022 Firefox/4.0b8pre In the peacekeeper performance tests, the "ripple" tests use typeof(this.data[x]) != "undefined" and typeof(this.data[x][y]) != "undefined" over and over again. If I swap out those checks for another equivalent test (and comment out the canvas render code), the fps almost double (27 fps to 51 fps) on Windows XP. I've made a page that shows the slowdown. If you go to http://www.peacekeeper.therichins.net/test10b.html, you'll see the time it took to do just 100 iterations using the typeof operator. If you go to http://www.peacekeeper.therichins.net/test10c.html, you'll see the same test without the typeof operator. On my box, it goes from about 2s to sub 1s. Chrome is sub 1s for both of the versions. The lines that are modified are: if (typeof(this.data[x]) != "undefined" && typeof(this.data[x][y]) != "undefined") { to: if (x > 0 && x < this.width && y > 0 && y < this.height) { Both are equivalent in this test. I originally though that this was the same as bug 470143 but they said it looked like it was a different bug. This should probably block 499198 as it is related to Peacekeeper. Reproducible: Always Steps to Reproduce: 1. Load http://www.peacekeeper.therichins.net/test10b.html 2. Load http://www.peacekeeper.therichins.net/test10c.html 3. Compare the times Actual Results: Half the time spent in test10c Expected Results: For test10b to much faster.

Boris Zbarsky [:bzbarsky]

Comment 1

•

14 years ago

Thanks for the report! Having the typeof on the out-of-bounds access means we have to generate a separate trace due to the guard on being out-of-bounds. Which in this case means we end up having a somewhat branchy tree, I bet, since there are lots of get() calls in a row. That said, over here I see us take about 240ms on the slow version, 180ms on the fast one, if we trace the whole thing. That's much faster than Chrome (over 2x faster). I can confirm that in Firefox 3.6 those testcases are a lot slower (750ms for the fast one, 1100ms for the slow one over here). I filed bug 606650 on the fact that right this second this isn't tracing due to the heuristics patch.

Blocks: peacekeeper

Boris Zbarsky [:bzbarsky]

Comment 2

•

14 years ago

Oh, in practice I guess we never end up in the undefined case.... So we don't end up with a branchy trace after all. Note bug 591172.

Status: UNCONFIRMED → NEW

Depends on: 591172

Ever confirmed: true

David Mandelin [:dmandelin]

Updated

•

14 years ago

Whiteboard: [jsperf]

Dão Gottwald [:dao]

Updated

•

14 years ago

Keywords: perf

Boris Zbarsky [:bzbarsky]

Comment 3

•

14 years ago

So at this point we do _faster_ on test10b than on test10c, because jitprofiling makes us trace the former but not the latter (see bug 607201). The tracer kicks V8's butt here, but JM is way slower than V8. I tried JM+TI and it's faster than just JM, but nowhere close to V8, much less the tracer.... Brian, want to take a look at this for TI? Per comments in bug 607201 at least the test10c case should be doable with better interval analysis.

Boris Zbarsky [:bzbarsky]

Comment 4

•

14 years ago

Attached file test10b: the original peacekeeper test, with typeof (deleted) — Details

Boris Zbarsky [:bzbarsky]

Comment 5

•

14 years ago

Attached file test10c: The modified test with comparison bounds-checks (deleted) — Details

Brian Hackett [Laid off!]

Updated

•

14 years ago

Blocks: infer-perf-regress

Brian Hackett [Laid off!]

Comment 6

•

14 years ago

From profiling the JS, we seem to spend most of the time in processBuffers. After editing the testcase to see what helps, I think doing the below would get JM+TI roughly on par with TM. - We need to inline the get() and set() calls --- JM+TI only inlines when the called function is a singleton (there can only ever be one in existence). This shouldn't be hard to fix, we just need to make sure we can expand the inlined frames correctly. - 'this.a' and 'this.b' are inferred as either bools or Buffers (due to the 'false' initializations of 'a' and 'b'). We need read barriers on these accesses as the only type that could be observed here is Buffer. - The 'this.a' and 'this.b' accesses in the processBuffers loop need to be LICM'ed, along with the read barriers on these accesses. - The Math.round call in the 'set' function can overflow (it can produce -0 at the very least), which causes us to never inline this call. We should be more robust.

Guilherme Lima

Comment 7

•

13 years ago

So, I'm using Firefox 9.0.1 and both testcases take the same amount of time.

Boris Zbarsky [:bzbarsky]

Comment 8

•

13 years ago

Yes. The bug sort of morphed once the tracer went away... Maybe it's better to file a new bug on the remaining issue.

Guilherme Lima

Comment 9

•

11 years ago

So, there was a problem, the problem partially went away, and now it's back again. Nightly 26.0a1 (2013-09-05) test10b - 1362ms test10c - 385ms

Boris Zbarsky [:bzbarsky]

Comment 10

•

11 years ago

Guilherme, are you willing to hunt down when 10b became slower than 10c (the nightly range should be fine)?

Flags: needinfo?(guijoselito)

Guilherme Lima

Comment 11

•

11 years ago

I didn't do exactly what you asked. I tested a few versions, trying the avaliable combinations. The results are (10b/10c) Firefox 17 - 990/885 (JM+TI) - 1275/1210 (TI-OFF) Firefox 18 - 2400/1230 (JM+TI+ION) - 2100/910 (JM+TI) - 1700/920 (ION+TI) Firefox 23 - 1480/860 (BC+ION) - 2700/920 (JM) Firefox 24 - 1500/800 (BC+ION) - 5200/3700 (BC) Nightly 26 - 1360/380 (BC+ION) It seems to me that Ion just never was fast on 10b. And recently something made 10c much faster. If you want me to get one of the exactly regressions, just ask.

Flags: needinfo?(guijoselito)

Boris Zbarsky [:bzbarsky]

Comment 12

•

11 years ago

Thanks. That's good enough. ;) So what I see in a profiler is a js::TypeOfOperation stubcall. I wonder whether we can get rid of that, esp in the != undefined case.....

Flags: needinfo?(jdemooij)

Jan de Mooij [:jandem]

Assignee

Comment 13

•

11 years ago

Attached file test10b shell testcase (deleted) — Details

Jan de Mooij [:jandem]

Assignee

Comment 14

•

11 years ago

Attached file test10c shell testcase (deleted) — Details

Jan de Mooij [:jandem]

Assignee

Comment 15

•

11 years ago

For the shell tests I get: 10b: SM : 584 ms d8 : 375 ms 10c: SM : 127 ms d8 : 305 ms So we're doing really well on the modified version, but 10b is the one that matters. Looking into that now.

Jan de Mooij [:jandem]

Assignee

Comment 16

•

11 years ago

Number of things we can do here, most important first: (1) LTypeOfV should not use a VM call for objects. Thanks to Tom, the object typeof hook is gone, so the remaining interesting cases are: obj->isCallable() => "function" EmulatesUndefined(obj) => "undefined" We could inline these checks, but thanks to TypeObject determining Class now, we should be able to guard against these statically as well and just use "object" if the type is an object. I will try this first, hopefully it will work here and in most cases. (2) We should optimize typeof x == "string" (bug 725966). (3) LTypeOfV should not emit code for types that are statically known to be impossible.

Depends on: 725966

Jan de Mooij [:jandem]

Assignee

Updated

•

11 years ago

Depends on: 914132

Jan de Mooij [:jandem]

Assignee

Comment 17

•

11 years ago

Clearing needinfo. With the patches in bug 914132 the difference is only 12 ms and almost 3x faster than v8.

Flags: needinfo?(jdemooij)

Jan de Mooij [:jandem]

Assignee

Comment 18

•

11 years ago

(In reply to Guilherme Lima from comment #11) > Nightly 26 - 1360/380 (BC+ION) Previous nightly - 557/112 ms Latest nightly - 78/112 ms

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → WORKSFORME

Boris Zbarsky [:bzbarsky]

Updated

•

11 years ago

Resolution: WORKSFORME → FIXED

Ryan VanderMeulen [:RyanVM]

Updated

•

11 years ago

Assignee: general → jdemooij

Target Milestone: --- → mozilla26

You need to log in before you can comment on or make changes to this bug.