7.97% dromaeo_dom (linux64) regression on push b7be8fab1b5e497b75a372844a7f02e8ce3d538e (Thu Jan 31 2019)
Categories
(Core :: JavaScript: GC, defect, P1)
Tracking
()
People
(Reporter: Bebe, Unassigned)
References
Details
(Keywords: perf, regression, talos-regression)
Talos has detected a Firefox performance regression from push:
As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
8% dromaeo_dom linux64 opt e10s stylo 2,990.16 -> 2,751.94
You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=19107
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests
For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Running
*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Performance_sheriffing/Talos/RegressionBugsHandling
Reporter | ||
Updated•6 years ago
|
Reporter | ||
Comment 1•6 years ago
|
||
[:sfink] bug 1442481 created a regression can you take a look please
Reporter | ||
Comment 2•6 years ago
|
||
Gecko profile:
Before:
Comment 3•6 years ago
|
||
(For what it's worth, bug 1442481 is indeed very likely to be responsible. I am looking at it.)
Comment 4•6 years ago
|
||
I wonder if I can pay for the dromaeo_dom 8% slowdown with a dromaeo_css 4-5% improvement... :-)
What job runs dromaeo_dom? I'm only seeing the dromaeo_css one.
Comment 5•6 years ago
|
||
(In reply to Steve Fink [:sfink] [:s:] from comment #4)
I wonder if I can pay for the dromaeo_dom 8% slowdown with a dromaeo_css 4-5% improvement... :-)
What job runs dromaeo_dom? I'm only seeing the dromaeo_css one.
Run mach try fuzzy --no-artifact and pick all opt-talos-g3-e10s jobs (there are only 3 of that type).
Comment 6•6 years ago
|
||
Yes, thanks, I found it by scanning through the other logs.
I don't have a solution here yet, but those profiles were valuable. The main difference I see between them is that the memory usage is higher with nursery strings on. Which is unexpected; if anything, it should be the other way around. I will be looking into that. https://bugzilla.mozilla.org/show_bug.cgi?id=1524435 seems to be showing higher memory usage as well.
I'm not sure how long you're ok with me investigating this before backing out. I am actively working on it. If bug 1442481 is backed out, please backout just b7be8fab1b5e. The other one is a bugfix that only applies when nursery strings are on.
But if it does need to backed out, it's fine. All of the code to use nursery strings is already present, just never used without the default toggle in b7be8fab1b5e. I'm hoping I can keep it in and then tune it to provide a net positive change, but given that it's just a matter of toggling a flag, it doesn't really change the investigative work I'm doing either way. This bug's slowdown is too high and I need to fix it.
Updated•6 years ago
|
Comment 7•6 years ago
|
||
I think I'd like to ignore this one.
One noticeable difference in the profiles is that there are many more minor GCs with nursery strings on the dromaeo_dom attr test. The reason is simple: every iteration of the inner loop allocates one object and one string. Without nursery strings, 32K objects get allocated in the nursery before it fills up. With nursery strings, you're allocating both objects and strings in the nursery, so you only get 16K of each before the nursery fills. As a result, you do many more nursery collections, and each one has a small fixed overhead.
To test this hypothesis (on jonco's suggestion), I tried doubling the size of the nursery. That should equalize the number of minor GCs. These two pushes test this:
- nursery strings off: https://treeherder.mozilla.org/#/jobs?repo=try&tier=1&revision=098c23e413e52bd0383cbf82e1fcc818ccd60a59
- nursery strings on, double the nursery size: https://treeherder.mozilla.org/#/jobs?repo=try&tier=1&revision=13c2c6426ba6a76feba7ad5c98ae49c8284434b6
Perfherder isn't showing me anything from those two pushes (???), but the scores for the dromaeo_dom test (restricted to only dom-attr.html) are
- nursery strings off: 3343 3343 3346 3353 3361 3363 3372 mean 3354
- nursery strings on : 3276 3322 3323 3334 3337 3342 3433 mean 3338
That's a 1% slowdown.
The reason I don't want to just double the nursery size is that it seems like this test is the only one showing a noticeable decline. But doubling the size of the nursery would use more space all the time, so I'd rather not do that without good reason. (We will be looking at splitting up the object and string nurseries, which might have the same effect, but only if needed.)
All that said, I still need to look further into this regression. With nursery strings off, we do far fewer minor GCs -- but far more major GCs, and they take about 3ms each, so I don't know why this isn't a significant improvement for nursery strings even with the nursery at its original size. (With nursery strings off, lots of dead strings get tenured, and that triggers major GCs.) Also, I don't have an explanation for why the process size is larger with nursery strings on. I was thinking it might be because fewer major GCs meant less stuff got cleared out, but there are two major GCs even with nursery strings on, and the major GCs don't seem to show up in the size graph anyway (which also makes no sense, at least in the nursery strings off case -- some number of dead strings should be building up and getting cleared out on every major GC.)
Comment 8•6 years ago
|
||
Steve, have you managed to look further into this regression?
Comment 9•6 years ago
|
||
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #8)
Steve, have you managed to look further into this regression?
I was hoping to just accept the hit. The further exploration I mentioned in the last paragraph of comment 7 doesn't affect that outcome.
It seems like the natural fixes for this particular test would regress other things (in either performance or memory usage), and leaving nursery strings in as they are now allows other fixes that should improve things generally.
Updated•6 years ago
|
Description
•