Closed
Bug 1483172
Opened 6 years ago
Closed 6 years ago
2.21 - 21.07% sessionrestore / sessionrestore_no_auto_restore / tsvgx (linux64, osx-10-10, windows10-64, windows7-32) regression on push e6005ad88309 (Fri Aug 10 2018)
Categories
(Firefox Build System :: General, defect)
Tracking
(firefox-esr52 unaffected, firefox-esr60 unaffected, firefox61 unaffected, firefox62 unaffected, firefox63 wontfix, firefox64 wontfix)
RESOLVED
WONTFIX
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox-esr60 | --- | unaffected |
firefox61 | --- | unaffected |
firefox62 | --- | unaffected |
firefox63 | --- | wontfix |
firefox64 | --- | wontfix |
People
(Reporter: igoldan, Unassigned)
References
Details
(Keywords: perf, regression, talos-regression)
Talos has detected a Firefox performance regression from push:
https://hg.mozilla.org/integration/autoland/rev/e6005ad88309
As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
21% sessionrestore linux64 opt e10s stylo 274.50 -> 332.33
21% sessionrestore_no_auto_restore linux64 opt e10s stylo296.17 -> 358.08
21% sessionrestore_no_auto_restore linux64 pgo e10s stylo281.08 -> 339.17
21% sessionrestore linux64 pgo e10s stylo 263.00 -> 316.92
17% sessionrestore_no_auto_restore windows10-64 opt e10s stylo265.17 -> 311.00
17% sessionrestore_no_auto_restore osx-10-10 opt e10s stylo599.42 -> 701.42
17% sessionrestore windows7-32 opt e10s stylo 238.08 -> 278.33
17% sessionrestore_no_auto_restore windows10-64 pgo e10s stylo248.00 -> 289.33
16% sessionrestore windows7-32 pgo e10s stylo 231.75 -> 269.92
15% sessionrestore_no_auto_restore windows7-32 pgo e10s stylo265.67 -> 306.67
15% sessionrestore_no_auto_restore windows7-32 opt e10s stylo276.50 -> 317.08
14% sessionrestore osx-10-10 opt e10s stylo 562.08 -> 643.33
10% sessionrestore windows10-64 opt e10s stylo229.54 -> 253.00
10% sessionrestore windows10-64 pgo e10s stylo214.46 -> 235.67
You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=14867
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests
For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running
*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Reporter | ||
Updated•6 years ago
|
Product: Testing → Firefox Build System
Reporter | ||
Updated•6 years ago
|
Flags: needinfo?(aswan)
Reporter | ||
Updated•6 years ago
|
status-firefox61:
--- → unaffected
status-firefox62:
--- → unaffected
status-firefox63:
--- → affected
status-firefox-esr52:
--- → unaffected
status-firefox-esr60:
--- → unaffected
Reporter | ||
Comment 1•6 years ago
|
||
I'll post the Gecko profiles soon.
Reporter | ||
Comment 2•6 years ago
|
||
Gecko profiles from Windows 10:
sessionrestore:
before: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FPIlBl8r4T8WUl-qg-IMviQ%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_sessionrestore.zip
after: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FcUxB7x_BSTGM_ipTRFGvjQ%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_sessionrestore.zip
sessionrestore_no_auto_restore:
before: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FPIlBl8r4T8WUl-qg-IMviQ%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_sessionrestore_no_auto_restore.zip
after: https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FcUxB7x_BSTGM_ipTRFGvjQ%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_sessionrestore_no_auto_restore.zip
Comment 3•6 years ago
|
||
How did you attribute this to the push from comment 0? From a quick glance at some perfherder graphs, this regression was present on autoland before my patch landed, it looks like it came from the previous merge from central.
Flags: needinfo?(aswan) → needinfo?(igoldan)
Comment 4•6 years ago
|
||
:aswan, it showed up on all the merges, so we worked on pushes to try bisecting the merges and this ended up being the culprit:
https://treeherder.mozilla.org/#/jobs?repo=try&author=igoldan@mozilla.com&filter-searchStr=talos%20other&fromchange=ebee17803f232bac510c2d8a1f60420ab5825995
you can look at the results:
https://treeherder.mozilla.org/perf.html#/graphs?timerange=86400&series=try,1536960,1,1&zoom=1534227879872.951,1534238294403.6885,220,274.65855287701896
since this was a try bisection where we removed one patch at a time individually (not cumulative), we can easily see where the numbers are different, in this case it is this push:
https://hg.mozilla.org/try/pushloghtml?fromchange=8a26688268b167ffdea25c8e6dad0164ff1f7088&tochange=78c05420710573942c40d5e8a51f7db12c4b96b9
I am not sure why this was difficult to show up, maybe this is a reaction with another patch that landed and when merged around problems began.
Flags: needinfo?(igoldan)
Comment 5•6 years ago
|
||
(In reply to Joel Maher ( :jmaher ) (UTC+2) from comment #4)
> I am not sure why this was difficult to show up, maybe this is a reaction
> with another patch that landed and when merged around problems began.
I suspect this was the combination of this patch and bug 1480610. From eyeballing the profiles, with those two patches both present it looks like we're not doing any extra work, but that we are starting the extension process earlier. This is a cost that I think we're going to have to bear at some point (unless we just stick with bootstrapped extensions running in the parent process indefinitely)
Comment 6•6 years ago
|
||
(In reply to Andrew Swan [:aswan] from comment #5)
> (In reply to Joel Maher ( :jmaher ) (UTC+2) from comment #4)
> > I am not sure why this was difficult to show up, maybe this is a reaction
> > with another patch that landed and when merged around problems began.
>
> I suspect this was the combination of this patch and bug 1480610. From
> eyeballing the profiles, with those two patches both present it looks like
> we're not doing any extra work, but that we are starting the extension
> process earlier. This is a cost that I think we're going to have to bear at
> some point (unless we just stick with bootstrapped extensions running in the
> parent process indefinitely)
That's probably accurate. Also, this is probably a cost we're already paying for users who have extensions installed.
It is interesting, though, that the fxmonitor extension seems to be completely empty. That suggests that this overhead is all from just loading the extension framework. The profiles aren't especially illuminating, though.
Comment 7•6 years ago
|
||
(In reply to Andrew Swan [:aswan] from comment #5)
> (In reply to Joel Maher ( :jmaher ) (UTC+2) from comment #4)
> > I am not sure why this was difficult to show up, maybe this is a reaction
> > with another patch that landed and when merged around problems began.
>
> I suspect this was the combination of this patch and bug 1480610. From
> eyeballing the profiles, with those two patches both present it looks like
> we're not doing any extra work, but that we are starting the extension
> process earlier. This is a cost that I think we're going to have to bear at
> some point (unless we just stick with bootstrapped extensions running in the
> parent process indefinitely)
FWIW, I decided to backout that patch: please see bug 1483430 comment 0.
Comment 8•6 years ago
|
||
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #2)
> Gecko profiles from Windows 10:
Do you have profiles from Linux? The sampling rate on Windows is much lower, and these profiles don't have native symbols.
Flags: needinfo?(igoldan)
Comment 9•6 years ago
|
||
please see bug 1449055, when that was landed no perf changes, but when it was backed out it was as if all of these regression on this bug were fixed...now that it is landed again the regressions are back. I suspect something in code from this bug and code from bug 1449055 are tightly coupled.
Comment 10•6 years ago
|
||
(In reply to Joel Maher ( :jmaher ) (UTC+2) from comment #9)
> please see bug 1449055, when that was landed no perf changes, but when it
> was backed out it was as if all of these regression on this bug were
> fixed...now that it is landed again the regressions are back. I suspect
> something in code from this bug and code from bug 1449055 are tightly
> coupled.
That pretty much confirms my guess that this is just the overhead of loading the framework, and is not per-extension overhead.
Comment 11•6 years ago
|
||
given that igoldan did backouts of each patch on try, I am not sure how to gather performance profiles- maybe we can use the new regression:
https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-inbound,1640643,1,1&series=autoland,1646280,1,1&selected=mozilla-inbound,1640643,365881,528096659,1
:kmag, would that be useful?
Comment 12•6 years ago
|
||
this is the summary of alerts that came in when Bug 1449055 was relanded
== Change summary for alert #14867 (as of Fri, 10 Aug 2018 02:49:49 GMT) ==
Regressions:
21% sessionrestore linux64 opt e10s stylo 274.50 -> 332.33
21% sessionrestore_no_auto_restore linux64 opt e10s stylo296.17 -> 358.08
21% sessionrestore_no_auto_restore linux64 pgo e10s stylo281.08 -> 339.17
21% sessionrestore linux64 pgo e10s stylo 263.00 -> 316.92
17% sessionrestore_no_auto_restore windows10-64 opt e10s stylo265.17 -> 311.00
17% sessionrestore_no_auto_restore osx-10-10 opt e10s stylo599.42 -> 701.42
17% sessionrestore windows7-32 opt e10s stylo 238.08 -> 278.33
17% sessionrestore_no_auto_restore windows10-64 pgo e10s stylo248.00 -> 289.33
16% sessionrestore windows7-32 pgo e10s stylo 231.75 -> 269.92
15% sessionrestore_no_auto_restore windows7-32 pgo e10s stylo265.67 -> 306.67
15% sessionrestore_no_auto_restore windows7-32 opt e10s stylo276.50 -> 317.08
14% sessionrestore osx-10-10 opt e10s stylo 562.08 -> 643.33
10% sessionrestore windows10-64 opt e10s stylo229.54 -> 253.00
10% sessionrestore windows10-64 pgo e10s stylo214.46 -> 235.67
2% tsvgx linux64 opt e10s stylo 179.95 -> 183.93
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=14867
Comment 13•6 years ago
|
||
I posted the wrong summary in comment 12, this is the new one:
== Change summary for alert #15086 (as of Thu, 16 Aug 2018 19:33:48 GMT) ==
Regressions:
27% sessionrestore linux64 pgo e10s stylo 263.75 -> 333.75
26% sessionrestore linux64 opt e10s stylo 272.00 -> 343.92
26% sessionrestore_no_auto_restore linux64 pgo e10s stylo283.42 -> 357.67
25% sessionrestore_no_auto_restore linux64 opt e10s stylo291.40 -> 365.67
24% sessionrestore windows10-64 pgo e10s stylo215.17 -> 267.83
24% sessionrestore_no_auto_restore windows10-64 opt e10s stylo267.75 -> 332.92
23% sessionrestore windows10-64 opt e10s stylo234.67 -> 289.75
23% sessionrestore_no_auto_restore windows10-64 pgo e10s stylo248.58 -> 304.92
21% sessionrestore windows7-32 opt e10s stylo 241.83 -> 293.75
21% sessionrestore_no_auto_restore osx-10-10 opt e10s stylo605.25 -> 729.67
19% sessionrestore osx-10-10 opt e10s stylo 571.58 -> 681.08
19% sessionrestore windows7-32 pgo e10s stylo 235.50 -> 280.58
18% sessionrestore_no_auto_restore windows7-32 pgo e10s stylo269.92 -> 319.58
16% sessionrestore_no_auto_restore windows7-32 opt e10s stylo288.17 -> 334.75
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=15086
Comment 14•6 years ago
|
||
(In reply to Joel Maher ( :jmaher ) (UTC+2) from comment #11)
> given that igoldan did backouts of each patch on try, I am not sure how to
> gather performance profiles- maybe we can use the new regression:
> https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-inbound,
> 1640643,1,1&series=autoland,1646280,1,1&selected=mozilla-inbound,1640643,
> 365881,528096659,1
>
> :kmag, would that be useful?
I think so, yeah
Comment 15•6 years ago
|
||
treeherder view of profiles:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&group_state=expanded&filter-searchStr=linux%20talos%20other%20o-p&tochange=39dc85d0cc5a9255768fd0147d6909e69193b4b3&fromchange=fd0d6079d0d2eeac50d856d4322e99b86fd9c597&selectedJob=194572495
linux64 opt sessionrestore before:
https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FFHU-Q8FfSjOX5N86D3P1Ng%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_sessionrestore.zip/calltree/?globalTrackOrder=&localTrackOrderByPid=&thread&v=3
linux64 opt sessionrestore after:
https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FDhzpDJIXTDWmWa4CpxXyOQ%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_sessionrestore.zip/calltree/?thread&threadOrder=&v=3
Flags: needinfo?(igoldan)
Comment 17•6 years ago
|
||
Kris, should we assign this bug to you?
Comment 18•6 years ago
|
||
I've been working on removing the legacy bootstrap.js from Screenshots, and have found similar regressions in the same two Talos tests, see bug 1491997.
Comment 19•6 years ago
|
||
If this is what we get for now with bootstrap removal, then this is the cost to bear until we specifically target address the performance of loading the framework. And that would not happen for 63 or 64 (at this point).
I assume 64 would be affected as well.
Flags: needinfo?(kmaglione+bmo)
Comment 20•6 years ago
|
||
This affects our 2 webcompat system addons too https://bugzilla.mozilla.org/show_bug.cgi?id=1451485#c36 and https://bugzilla.mozilla.org/show_bug.cgi?id=1451484
Updated•6 years ago
|
Comment 21•6 years ago
|
||
This bug is about measurements for sessionrestore in code that has already landed in mozilla-central, the measurements from the webcompat patches are additional changes on top of those tracked in this bug. The issues may be related, but I don't think there's a dependency between these bugs.
Updated•6 years ago
|
Reporter | ||
Comment 22•6 years ago
|
||
Shouldn't we mark this as wontfix for Firefox 64 as well?
Reporter | ||
Updated•6 years ago
|
Flags: needinfo?(jaws)
Comment 23•6 years ago
|
||
At this point I believe we should mark it as wontfix for 64, and if we want it fixed for 65 we would need to find someone to work on this ASAP.
status-firefox64:
--- → wontfix
Flags: needinfo?(jaws)
Reporter | ||
Comment 24•6 years ago
|
||
Tim, can you find someone who can fix this regression? This is the last bug that's blocking bug 1470850.
Flags: needinfo?(tspurway)
Comment 25•6 years ago
|
||
:aswan, can you take this one?
Flags: needinfo?(tspurway) → needinfo?(aswan)
Comment 26•6 years ago
|
||
The cause here was identified back in comment 5 and apparently (?) confirmed in comment 10.
Note that we got a little bit of the original hit back with bug 1498027.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(aswan)
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•