Closed
Bug 497788
Opened 15 years ago
Closed 3 years ago
[meta]Improve performance on Ben Galbraith's linked bubblemark
Categories
(Core :: JavaScript Engine, defect)
RESOLVED
FIXED
People
(Reporter: bzbarsky, Unassigned)
Details
(Keywords: meta, perf)
Attachments
(6 files, 1 obsolete file)
On the url in the url field, when set to 128 balls, we get about 23fps on my machine. Safari 4 gets closer to 80.
I profiled this on m-c, and the time split on a high level (ignoring the time spent on the URI-classifier thread, which seems to be 7% of the profiler hits) is:
22% in js_Interpret (we mostly fail to trace this testcase, at least in part because of non-stub getters/setters, but there are other issues too)
15% under js_MonitorLoopEdge, breaking down as:
    10% js_ExecuteTree (largely LeaveTree, JS_ArenaAllocate, BuildNativeStackFrame, etc).
    2% js_CheckEntryTypes
    1% Running jitted code
    2% various other small stuff (attempting to extend trees, etc)
10% under js_GetPropertyHelper, breaking down as:
    2% self
    4% js_FillPropertyCache
    4% js_LookupPropertyWithFlags (in that function, and under it in js_SearchScope)
8% under js_SetPropertyHelper, breaking down as:
    3.5% DOM SetTop
    3.5% DOM SetLeft
    1% js_LookupPropertyWithFlags and js_FillPropertyCache
6% under js_ValueToNumber, mostly calling js_strtod.
4% under js_FullTestPropertyCache
2% under js_SetProperty
2% under js_ValueToString (mostly calling js_NumberToString)
2% allocating jsdoubles
2% js_FindPropertyHelper
~5% in other smaller js things (js_NativeGet, js_fun_call, math_abs, js_GetProperty, etc).
Non-JS bits:
15% painting
4% style recomputation
2% reflow
Reporter
Comment 5•15 years ago
Oh, the attached JS differs from the original in one important way. In the first JS file, in this part:
    // process collisions
    for (i=0; i<_this._N; i++) {
      for (var j=i+1; j<_this._N; j++) {
        _this._ballsO[i].doCollide(_this._ballsO[j]);
      }
    }
the original testcase has just |j=i+1| (so j is a global variable).
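For anyone following along, the reason the missing |var| matters: in non-strict code, assigning to an undeclared name creates a property on the global object, while |var| keeps the counter local to the function. A minimal shell-runnable sketch of just that difference, not the testcase itself. The function names and the counter |k| are made up for illustration, and the bodies are built with the Function constructor only so that they run as non-strict code wherever the snippet is loaded:

```javascript
// Loop shaped like the original testcase: the inner counter has no var,
// so the assignment creates (or overwrites) a global named j.
const collideImplicit = new Function("n", `
  var count = 0;
  for (var i = 0; i < n; i++) {
    for (j = i + 1; j < n; j++) {
      count++;
    }
  }
  return count;
`);

// Loop shaped like the attached JS: the inner counter is declared with var,
// so it stays local to the function. The name k is illustrative only.
const collideLocal = new Function("n", `
  var count = 0;
  for (var i = 0; i < n; i++) {
    for (var k = i + 1; k < n; k++) {
      count++;
    }
  }
  return count;
`);

const pairsImplicit = collideImplicit(4);
const pairsLocal = collideLocal(4);

// Both loops visit the same pairs; only the first one leaks its counter.
const leakedJ = typeof globalThis.j; // "number"
const leakedK = typeof globalThis.k; // "undefined"
```

Both functions do the same work; the only observable difference is whether the counter ends up on the global object.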
Reporter
Comment 6•15 years ago
Filed bug 497789 on the other thing that makes us fall off trace here, and do so on the O(N^2) part of the benchmark, which seems to dominate for N==128 per above profile data.
Keywords: perf
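To put a number on the O(N^2) part: with the nested loop shape from comment 5, every unordered pair of balls gets one doCollide call per frame. A small sketch that just counts those calls (the function name is made up for illustration; it does no actual collision work):

```javascript
// Counts how many (i, j) pairs with i < j the nested collision loop visits.
// This equals n * (n - 1) / 2.
function collisionChecksPerFrame(n) {
  var checks = 0;
  for (var i = 0; i < n; i++) {
    for (var j = i + 1; j < n; j++) {
      checks++;
    }
  }
  return checks;
}

var checksAt16 = collisionChecksPerFrame(16);   // 120
var checksAt128 = collisionChecksPerFrame(128); // 8128
```

So at 128 balls there are 8128 pair checks per frame against only 128 per-ball position updates, which fits the collision loop dominating the profile.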
Reporter
Comment 7•15 years ago
Attachment #382905 - Attachment is obsolete: true
Reporter
Comment 9•15 years ago
So as an experiment, I tried just commenting out the guard that leads to the aborts in bug 497789. That helps a bit: fps goes from 23 to 50.... for the first 10-15 seconds. Then it collapses back to 23. And there are more "inner tree is trying to grow" aborts. Maybe the commenting out is just too hacky to really work...
Reporter
Comment 10•15 years ago
This is a totally nonminimal JS shell version of the testcase. It still outputs fps, and gcs every so often just like the browser.
This testcase gets 60fps or so until the first gc on m-c right now, then drops off to 45fps. With the patch in bug 497789 it goes up to 400fps or so... until the first gc. Then it drops to about 45fps. That's what comment 9 is about. I'll be filing a bug on that.
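For reference, an fps readout of this kind can be produced by counting frames over a time window. This is a rough sketch under my own assumptions, not the code in the attached testcase; |makeFpsCounter| and its |windowMs| parameter are hypothetical names, and timestamps are passed in explicitly so it runs the same way in a shell or a browser:

```javascript
// Returns a tick(nowMs) function. Call it once per frame with the current
// time in milliseconds; it returns the measured fps once at least windowMs
// have elapsed since the window started, and null otherwise.
function makeFpsCounter(windowMs) {
  var frames = 0;
  var windowStart = null;
  return function tick(nowMs) {
    if (windowStart === null) {
      // The first call only establishes the start of the window.
      windowStart = nowMs;
      return null;
    }
    frames++;
    var elapsed = nowMs - windowStart;
    if (elapsed < windowMs) {
      return null;
    }
    var fps = frames * 1000 / elapsed;
    frames = 0;
    windowStart = nowMs;
    return fps;
  };
}

// Feed synthetic timestamps 10ms apart for two seconds.
var tick = makeFpsCounter(1000);
var reported = [];
for (var t = 0; t <= 2000; t += 10) {
  var fps = tick(t);
  if (fps !== null) {
    reported.push(fps);
  }
}
// reported is [100, 100]
```

A drop like the one after the first gc would show up as a lower value in a later window.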
Reporter
Comment 11•15 years ago
OK, with the fix for bug 497789 (and all other pending patches for bugs blocking this one applied) the new profile looks like:
js-related stuff:
7% running jitted code, boxing and unboxing doubles, etc.
5% js_ValueToString
4% getting .style off nodes (JS-wrapping, unwrapping, etc). Slimwrapper might help.
2.5% interpreter time from js_fun_call. Might get better once we trace getters/setters.
1% other.
Total js-related: 20% or so. A big improvement over the 68% in comment 0.
Non-js-related:
36% painting (see bug 498579, though it's not happening as much in this profile; compositor might help here)
15% setting style.top/left (at least 3/4 under DeclarationChanged)
10% style recomputation (see bug 479655)
4% reflow
3% js_LookupPropertyWithFlags
The remaining 10% or so looks like mostly profiler artifacts (dtrace_get_cpu_int_stack_top, I'm looking at you).
Reporter
Comment 13•15 years ago
Quick update: We now hit 40fps on trunk here (compared to 28fps back in June). If I hack the JS to avoid bug 497789, we hit 68fps.
Safari 4 on the same hardware is at 90fps on the original testcase; 95fps on the hacked one. Chrome is at 130fps, but cheating on the timeouts.
With that hack, 15% of the time is spent in jit-generated code or libmozjs. Also some xpconnect-ish time around. So pretty similar to comment 11...
Updated•15 years ago
QA Contact: general → brendan
Reporter
Comment 14•15 years ago
In case it wasn't clear, the 68fps cap from comment 13 was due to bug 528208. With that fixed, and still with the hackaround for bug 497789 in place, m-c is at 100fps on my machine. If I drop the interval clamping in core to below 10ms the fps actually goes down; I think the timer thread is screwing that over somehow. We could try more balls to see whether we can get a useful head-to-head with Chrome. I'll probably do that once bug 497789 lands and reprofile.
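The 100fps figure is what you would expect from a 10ms clamp: a timer that cannot fire more often than every 10ms cannot drive more than 1000/10 frames per second. A trivial sketch of that arithmetic (both helper names are made up for illustration):

```javascript
// Upper bound on frames per second when each frame is driven by a timer
// that cannot fire more often than once every clampMs milliseconds.
function maxFpsForClamp(clampMs) {
  return 1000 / clampMs;
}

// Timer interval, in milliseconds, needed to sustain a given frame rate.
function intervalForFps(fps) {
  return 1000 / fps;
}

var capAt10ms = maxFpsForClamp(10);         // 100
var intervalFor130fps = intervalForFps(130); // a bit under 7.7
```

By the same arithmetic, the 130fps Chrome number in comment 13 needs timers firing more often than every 10ms.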
Reporter
Comment 15•15 years ago
OK, I redid a profile with the current patch in bug 497789 and the DOM's rate-limiting on timeouts removed (so we could go over the 100fps we were hitting).
General breakdown:
js-related:
1% quickstubs glue setting style.top/left
9% js_NumberToStringWithBase
4.5% js_UnboxDouble
3.5% getting .style (unwrapping this, tearoffs, wrapping the decl)
1% js_ConcatStrings
0.5% js_BoxDouble
12% jit-generated code
1% js_ValueToString
0.4% in js_Interpret (yay!)
Total JS-related: 33%
non-js-related:
23.5% setting style.left/top
14% processing restyles
3% reflow
21% painting
Total non-JS-related: 62%
The remaining 5% looks to mostly be cocoa widgetry stuff or something.
We'll likely win a few % here if painting moves to refresh driver; about 1/7 of the restyling time was happening off WillPaint.
In general, the time spent under the setTimeout call is about 58% of total, with JS accounting for a bit over half of that.
In the style attr setting there's some COM stuff to be killed off; zwol is working on that.
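The js_NumberToStringWithBase and js_ConcatStrings entries are what you would expect if each ball writes a numeric position into style.left/top as a string on every frame. A sketch of that pattern, assuming the testcase does something like |x + "px"| (an assumption on my part, not something shown in this bug). |FakeStyle| is a hypothetical stand-in for a DOM style declaration so the snippet runs in a shell, and |moveBall| / |readPosition| are illustrative names:

```javascript
// Defines an accessor property whose setter counts writes and stores a string,
// loosely mimicking a non-stub setter such as style.left or style.top.
function defineCountedProp(proto, name) {
  Object.defineProperty(proto, name, {
    get: function () { return this["_" + name]; },
    set: function (value) {
      this.writes++;
      this["_" + name] = String(value);
    }
  });
}

// Hypothetical stand-in for element.style.
function FakeStyle() {
  this.writes = 0;
  this._left = "";
  this._top = "";
}
defineCountedProp(FakeStyle.prototype, "left");
defineCountedProp(FakeStyle.prototype, "top");

// Each write converts a number to a string and concatenates the unit.
function moveBall(style, x, y) {
  style.left = x + "px";
  style.top = y + "px";
}

// Reading a position back out of the style goes the other way (string to number).
function readPosition(style) {
  return { x: parseFloat(style.left), y: parseFloat(style.top) };
}

var style = new FakeStyle();
moveBall(style, 12.5, 40);
var pos = readPosition(style);
// style.left is "12.5px", style.top is "40px", style.writes is 2
```

Keeping the positions as numbers on the JS side and only converting when writing the style avoids the string-to-number step, but the number-to-string conversion on each write remains.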
Reporter
Comment 16•15 years ago
Oh, and when not running under the profiler, for the comment 15 setup we hit 133fps on my machine. Chrome hits about 148fps.
Updated•14 years ago
QA Contact: brendan → general
Reporter
Comment 17•14 years ago
For what it's worth, current numbers for the comment 15 setup on my current machine are:
m-c: 175fps
Opera: 116fps
Chrome: 180fps
Updated•10 years ago
Assignee: general → nobody
Comment 18•3 years ago
Meta bug with all dependent bugs fixed. Shell test case is fast, too.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED