Calling filter() on a proxied array is quite a bit slower in SM than V8
Categories
(Core :: JavaScript Engine: JIT, enhancement, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox118 | --- | fixed |
People
(Reporter: jrmuizel, Assigned: dthayer, NeedInfo)
References
(Blocks 2 open bugs, Regressed 1 open bug)
Details
(Whiteboard: [sp3])
Attachments
(10 files)
(deleted),
text/html
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details |
This example comes from the updated vue SP3 test.
958 vs 459 samples (diff 499, 2.09x): https://share.firefox.dev/3K0JMnG vs https://share.firefox.dev/3JK33se
The difference is about 3% of the SM time and 5% of the V8 time
All the time is spent in js::ProxyGetPropertyByValue and js::ProxyHas.
The following code is a rough approximation of what Vue is doing.
It runs in 3.29ms vs V8's 1.7ms
function wrap(o) {
return new Proxy(o, {
get: function (target, propKey) {
var p = Reflect.get(target, propKey);
return p
},
has: function (target, P) {
return Reflect.has(target, P)
},
});
}
function bench() {
var todos = []
for (var i = 0; i < 100; i++) {
todos.push(i)
}
todos = wrap(todos)
for (var i = 0; i < 100; i++) {
todos = todos.filter(t => t !== i)
todos = wrap(todos)
}
}
var start = performance.now();
bench();
var end = performance.now()
console.log(`took ${end - start}ms`)
FWIW, JSC does even worse (3.9ms)
Reporter | ||
Comment 1•2 years ago
|
||
The non-proxy case is also quite a bit worse in SM:
function bench() {
var todos = []
for (var i = 0; i < 100; i++) {
todos.push(i)
}
for (var i = 0; i < 100; i++) {
todos = todos.filter(t => t !== i)
}
}
0.78ms vs 0.26ms
Updated•2 years ago
|
Updated•2 years ago
|
Reporter | ||
Comment 2•2 years ago
|
||
For the given profiles it looks like we spend 1682 samples vs V8's 374 in the proxy machinery. If we can reduce the cost to match V8's that would give an improvement of about 8.5% on the shell version of this test.
Reporter | ||
Comment 3•2 years ago
|
||
JSC has also been working on Proxy performance recently:
https://github.com/WebKit/WebKit/commit/1d89fdbdb2bae963b88585f0b94091cc2e3e590c
https://github.com/WebKit/WebKit/commit/fbc9cd14c68a3a4f93790bf31543dcf5180da4f3
https://github.com/WebKit/WebKit/commit/fe0d0e1818d0267e38495a3935584229e1625525
https://github.com/WebKit/WebKit/commit/0f0092e17fdbdd3127963eeb9eb45e8db4506726
https://github.com/WebKit/WebKit/commit/b45d2e9c3d34a09f74191f07275eef8cf57f6f4d
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Updated•1 years ago
|
Comment 4•1 years ago
|
||
Putting this here just because this was one of the linked commits, and to write this down somewhere.
Spent some time looking into the microbenchmarks the webkit team provided for Reflect.get on this commit where they found an improvement of 1.3% on Jetstream3/proxy-vue
You can see a huge performance improvement between Safari and Safari Technical Preview today:
Safari: reflect_get times: 106.0, 86.0, 85.0, 85.0, 85.0, 85.0, 85.0, 85.0, 85.0, 86.0
Safari reflect_get_with_reciever times: 336.0, 312.0, 311.0, 313.0, 313.0, 311.0, 312.0, 311.0, 310.0, 310.0
STP reflect_get times: 15.0, 12.0, 4.0, 3.0, 4.0, 3.0, 3.0, 4.0, 3.0, 3.0
STP reflect_get_with_reciever times: 51.0, 39.0, 46.0, 39.0, 46.0, 47.0, 46.0, 47.0, 47.0, 46.0
However, Firefox is already pretty decent here:
Firefox Nightly reflect_get times: 18.0, 19.0, 17.0, 18.0, 18.0, 19.0, 16.0, 19.0, 17.0, 17.0
Firefox Nightly reflect_get_with_reciever times: 67.0, 40.0, 37.0, 39.0, 38.0, 38.0, 66.0, 66.0, 66.0, 67.0
And Chrome is between the two:
Chrome reflect_get times: 78.2, 51.4, 51.0, 46.7, 46.6, 46.5, 46.6, 46.5, 46.5, 46.6
Chrome reflect_get_with_reciever: 142.0, 114.1, 114.1, 116.5, 114.0, 114.2, 113.9, 114.2, 114.0, 113.6
My conclusion here is that we needn't investigate Reflect.get
Comment 5•1 years ago
|
||
Here's some straight microbenchmark of the proxy machinery:
Safari 16.4
Safari proxy_set times 77.0, 54.0, 55.0, 55.0, 54.0, 55.0, 54.0, 55.0, 54.0, 55.0
Safari proxy_set_miss_handler times 114.0, 114.0, 114.0, 114.0, 115.0, 114.0, 114.0, 114.0, 115.0, 114.0
Safari proxy_get times 15.0, 16.0, 17.0, 15.0, 16.0, 16.0, 16.0, 15.0, 16.0, 16.0
Safari proxy_get_miss_handler times 5.0, 6.0, 5.0, 5.0, 6.0, 5.0, 5.0, 6.0, 5.0, 5.0
Safari Technical Preview
SafariTP proxy_set times 27.0, 19.0, 18.0, 17.0, 16.0, 12.0, 12.0, 13.0, 12.0, 12.0
SafariTP proxy_set_miss_handler times 15.0, 14.0, 15.0, 14.0, 14.0, 15.0, 14.0, 14.0, 14.0, 15.0
SafariTP proxy_get times 12.0, 12.0, 14.0, 15.0, 16.0, 11.0, 12.0, 11.0, 12.0, 11.0
SafariTP proxy_get_miss_handler times 3.0, 1.0, 1.0, 2.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0
Chrome (114)
Chrome proxy_set times 14.8, 14.6, 14.8, 14.8, 14.7, 14.7, 14.9, 14.7, 14.9, 14.7
Chrome proxy_set_miss_handler times 125.7, 126.1, 125.4, 125.5, 125.1, 125.5, 125.4, 125.3, 125.3, 125.3
Chrome proxy_get times 13.1, 13.0, 13.1, 13.0, 13.2, 13.0, 13.0, 13.2, 13.0, 13.0
Chrome proxy_get_miss_handler times 20.4, 20.3, 20.2, 20.3, 20.2, 20.3, 20.3, 20.2, 20.3, 20.4
Firefox (2023-06-05)
Firefox proxy_set times 80.0, 68.0, 69.0, 68.0, 68.0, 68.0, 69.0, 69.0, 68.0, 68.0
Firefox proxy_set_miss_handler times 102.0, 101.0, 101.0, 101.0, 101.0, 102.0, 101.0, 102.0, 101.0, 102.0
Firefox proxy_get times 79.0, 78.0, 78.0, 79.0, 79.0, 78.0, 78.0, 78.0, 78.0, 78.0
Firefox proxy_get_miss_handler times 46.0, 46.0, 46.0, 45.0, 46.0, 45.0, 46.0, 46.0, 45.0, 46.0
Confirms earlier analysis that there's some gaps here that we may be able to resolve.
Assignee | ||
Comment 6•1 year ago
|
||
I'm putting this up for review now, but there are a few caveats. This passes
tests on x64, but it's broken on ARM and x86. x86 is difficult because of the
number of registers, and ARM I just haven't debugged yet. But, assuming this
plan sounds all right to you, my plan is to try to get this in just for x64 and
immediately work on ARM/maybe x86 in follow-ups. Anyway, I can also just
support everything in one bug, but I figured I should put it up now anyway as
it's a pretty chonky patch and might take a bit to get through.
Assignee | ||
Comment 7•1 year ago
|
||
Assignee | ||
Comment 8•1 year ago
|
||
Depends on D182380
Assignee | ||
Comment 9•1 year ago
|
||
Depends on D182381
Updated•1 year ago
|
Comment 10•1 year ago
|
||
Do you have some data on which cases we care about for sp3? How often do we skip/perform result validation? What kind of JSClass
es does the benchmark access through proxies? Do these objects typically have accessor properties or just data properties?
Also this probably needs a lot of jit-tests for various corner cases involving bailouts, GC etc because it likely isn't covered well by existing jit-tests as this is the first time we're optimizing them?
Assignee | ||
Comment 11•1 year ago
|
||
Alright, so, tests are incoming, but regarding numbers:
In sp3, of ~45000 get trap calls per run, this patch lets us skip 100% of them. The only two classes we see in the context of sp3 are ArrayObjects and PlainObjects. 38k get trap calls were for PlainObjects, and 7k were for ArrayObjects. In the wild just browsing around popular sites I see something similar. I did see a handful of other JSClasses, but overwhelmingly it's PlainObjects, with a smaller portion of ArrayObjects. In the wild, again with very unscientific browsing around the web, I still saw the patch skip 100% of get trap validations.
Assignee | ||
Comment 12•1 year ago
|
||
Depends on D182133
Updated•1 year ago
|
Updated•1 year ago
|
Assignee | ||
Comment 13•1 year ago
|
||
Depends on D182506
Assignee | ||
Comment 14•1 year ago
|
||
So, this patch is very tentative. The reason for this is that if we just change
the alias sets for these ops, we end up failing this assertion: https://searchfox.org/mozilla-central/rev/85269d4444c2553e7f4c669fe4de72d64f4fe438/js/src/jit/CodeGenerator.cpp#332.
Our solution to this is to give the ops resume points, and... blow right past
the warnings telling us not to have multiple effectful operations for a single
transpiled IC. Iain suggested that I rope you in here, Jan, to get your take on
whether this is safe or not. Nothing seems to blow up in tests.
Depends on D184721
Comment 15•1 year ago
|
||
Comment 16•1 year ago
|
||
Backed out for causing multiple failures.
- Push with failures - jsreftests failures
- Failure Log
- Failure line: Assertion failure: trap->hasJitEntry(), at /builds/worker/checkouts/gecko/js/src/jit/CacheIRWriter.h:629
- Push with failures - another jsreftests failure
- Failure Log
- Failure line: REFTEST PROCESS-CRASH | application crashed [@ None + None] | js/src/tests/non262/Intl/NumberFormat/formatRangeToParts-approximately-sign-compact.js
- Push with failures - mochitests failures on Android
- Failure Log
- Failure line: No tests run or test summary not found
- Push with failures - mochitests failures on Linux
- Failure Log
- Failure line: SUMMARY: AddressSanitizer: SEGV (<unknown module>)
Comment 17•1 year ago
|
||
Comment 18•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5066066a54b1
https://hg.mozilla.org/mozilla-central/rev/05cd60595797
https://hg.mozilla.org/mozilla-central/rev/75eed2f4f607
https://hg.mozilla.org/mozilla-central/rev/0e618ce1438d
https://hg.mozilla.org/mozilla-central/rev/630e159488e2
https://hg.mozilla.org/mozilla-central/rev/028f628b7ddc
https://hg.mozilla.org/mozilla-central/rev/ca712509847a
Comment 19•1 year ago
|
||
Backed out for causing frequent JavaScript crashes (bug 1848391)
Backout: https://hg.mozilla.org/mozilla-central/rev/807120c9e29aa700f975e2739512c1934aa7722a
Assignee | ||
Comment 20•1 year ago
|
||
Fairly straightforward patch. I didn't add a test for this specifically,
because it turns out the previous tests would have been failing if I had just
correctly loaded the JSClass out of the BaseShape in the first place because
PlainObject has null class ops. We were just actually loading the proto
pointer before off the BaseShape, and then loading an offset from that which
just happened to usually be non-null, causing us to take the more conservative
path.
Depends on D185350
Comment 21•1 year ago
|
||
Comment 22•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/08e262b73951
https://hg.mozilla.org/mozilla-central/rev/ea3a9279bb6e
https://hg.mozilla.org/mozilla-central/rev/4eba7945d082
https://hg.mozilla.org/mozilla-central/rev/29a2d71f75f7
https://hg.mozilla.org/mozilla-central/rev/cb261e0fd7ff
https://hg.mozilla.org/mozilla-central/rev/642835f86462
https://hg.mozilla.org/mozilla-central/rev/08e79d0af023
https://hg.mozilla.org/mozilla-central/rev/7a44fa1875aa
Comment 23•1 year ago
|
||
bug 1848277 was also apparently caused by this, maybe it can just be closed now if the new patches above don't regress it.
Comment 24•1 year ago
|
||
Rerunning the benchmark from Comment #5 shows our improvement
Firefox proxy_set times 94.0, 68.0, 68.0, 69.0, 68.0, 68.0, 69.0, 68.0, 68.0, 69.0
Firefox proxy_set_miss_handler times 101.0, 100.0, 100.0, 100.0, 101.0, 100.0, 100.0, 100.0, 100.0, 100.0
Firefox proxy_get times 5.0, 4.0, 4.0, 4.0, 5.0, 5.0, 4.0, 4.0, 4.0, 5.0
Firefox proxy_get_miss_handler times 3.0, 1.0, 3.0, 2.0, 2.0, 3.0, 2.0, 2.0, 2.0, 3.0
the proxy_get times dropped from 79 -> ~5, and proxy_get_miss_handler times dropped from 46 -> ~2
Looking at Safari Tech Preview's numbers today:
Safari proxy_set times 2.0, 4.0, 5.0, 5.0, 5.0, 2.0, 1.0, 2.0, 1.0, 1.0
Safari proxy_set_miss_handler times 63.0, 63.0, 63.0, 63.0, 63.0, 62.0, 63.0, 63.0, 63.0, 63.0
Safari proxy_get times 3.0, 3.0, 4.0, 5.0, 5.0, 2.0, 1.0, 2.0, 1.0, 1.0
Safari proxy_get_miss_handler times 3.0, 1.0, 1.0, 2.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0
We've matched them on get times, but still are slower on set and set miss-handler.
Comment 25•1 year ago
|
||
The improvement here is visible in the backfilled numbers for the Vue TodoMVC test. See: https://docs.google.com/spreadsheets/d/1FM5HnjupceXchdoL5zrWXPVuobDl71iaE4VIRf5hOaM/edit
There was a ~40% improvement over the last 8 months, with this bug alone explaining about 6pp of that.
Description
•