Closed Bug 1821183 Opened 2 years ago Closed 1 year ago

EmberJS-TodoMVC spends a lot of time in Ion-compiled code for Cache.prototype.get

Categories: Core :: JavaScript Engine, task, P2

Tracking: RESOLVED WONTFIX

People: (Reporter: mstange, Unassigned)

References: (Blocks 1 open bug)

Details

In the function Cache.prototype.get in the EmberJS-TodoMVC benchmark, we spend 1149 samples of "JS-only self time", 957 of which are in the Ion-generated code itself. V8 spends 556 samples of "JS-only self time" in the same function, 374 of which are in the TurboFan-generated code.

To reproduce, run index.js inside https://github.com/mozilla/Speedometer/tree/ab0e3b8406bbc0356244f8cd430900e9a346737b/resources/todomvc/architecture-examples/emberjs/dist in the JS shell.

In SpiderMonkey, the function's full name is Cache</Cache.prototype.get

Profiles (focused on Cache.prototype.get's "JS-only self time"):
Spidermonkey: https://share.firefox.dev/3ZxdRk5
V8: https://share.firefox.dev/3Zxfz4Z

I don't have a way to share the generated assembly code at the moment.

I took a look at the generated assembly for this. Nothing stands out in particular. The hot spot is the code generated for this.store.get(key): after inlining it's this.store.data[key], and in particular the MegamorphicLoadSlotByValue we generate for data[key] accounts for 87 of the 153 self samples in my profile.

    Cache.prototype.get = function get(obj) {
      var key = this.key === undefined ? obj : this.key(obj);
      var value = this.store.get(key);
      if (value === undefined) {
        this.misses++;
        value = this._set(key, this.func(obj));
      } else if (value === UNDEFINED) {
        this.hits++;
        value = undefined;
      } else {
        this.hits++;
        // nothing to translate
      }

      return value;
    };

MegamorphicGetPropertyByValue has been pretty aggressively optimized already.
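
For reference, here is a minimal JS-shell sketch of the same kind of access (hypothetical, not taken from the Ember sources; the object and key names are made up): a string-keyed read from an object with many properties, which should end up on the megamorphic by-value load path much like this.store.data[key] does.

    // Hypothetical micro-benchmark (not from the benchmark sources).
    // Build an object with many distinct properties so that a keyed read
    // can't be specialized to a single shape/property and stays megamorphic.
    var store = Object.create(null);
    var keys = [];
    for (var i = 0; i < 10000; i++) {
      var k = "key" + i;
      keys.push(k);
      store[k] = i;
    }

    var sum = 0;
    for (var iter = 0; iter < 1000; iter++) {
      for (var j = 0; j < keys.length; j++) {
        sum += store[keys[j]]; // value-keyed load, analogous to data[key]
      }
    }
    print(sum); // shell built-in; expected 49995000000

Profiling a loop like this in the shell should show roughly how much of the cost sits in the megamorphic lookup itself versus the surrounding Ion code.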

It's a little interesting that Markus's profile has so many more samples overall than mine. Mine ran for 1.1s with 13K samples; Markus's profile ran for 1.5s with 150K samples. That might just be differences in our perf setup, though.

Markus, can you confirm that this was an apples-to-apples comparison (e.g., we aren't missing places where V8 inlined this function)? Our codegen here seems reasonably good; I'm not sure how V8 could be twice as fast as us.

Thanks for taking a look!

> Markus, can you confirm that this was an apples-to-apples comparison (e.g., we aren't missing places where V8 inlined this function)? Our codegen here seems reasonably good; I'm not sure how V8 could be twice as fast as us.

I filed this bug before we had understood that the high sampling frequency introduces rather high stack-depth-dependent skew. Since our stacks are deeper than V8's, this is not as much of an apples-to-apples comparison as I initially assumed.

I will compare again once I have profiles with uniform overhead.

The new profiles tell a similar story to the old profiles.

SM: https://share.firefox.dev/3YEmUOW
V8: https://share.firefox.dev/3ZSe2X2

634 of the 1061 total JS-self-time samples are spent in Ion-generated code. V8 had a total JS-self-time of 566 samples for this function.

Something similar seems to happen in CachedTag.prototype.value (code is here): 5333 samples in SM vs 2322 in V8, with 3457 of the SM samples in Ion-generated code:

SM: https://share.firefox.dev/3larGpx
V8: https://share.firefox.dev/3ywd9Yl

> e.g., we aren't missing places where V8 inlined this function

This was running V8 with inlining turned off. I have also scoured the profile for places where V8 spends more time than we do, and haven't found any that seemed related to the work done in Cache.prototype.get.

> Markus's profile ran for 1.5s with 150K samples. That might just be differences in our perf setup, though.

We were able to set the sampling frequency to 100kHz by rebooting the machine and setting /proc/sys/kernel/perf_cpu_time_max_percent to zero.

Blocks: 1801194
Type: defect → task
Priority: -- → P2

EmberJS has been removed from sp3. Does the issue show up in other tests?

Flags: needinfo?(mstange.moz)

Unclear. This particular function is unique to this test, so this bug no longer blocks sp3.

No longer blocks: speedometer3
Flags: needinfo?(mstange.moz)
Whiteboard: [sp3]

I'm going to go ahead and close this, then. Realistically we're never going to get around to optimizing one particular function in a benchmark we don't care about, particularly since my initial investigation didn't turn up anything interesting. If there's an important issue here, it will show up elsewhere.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → WONTFIX
Whiteboard: [sp3]