Closed Bug 1295502 Opened 8 years ago Closed 8 years ago

2.1 - 4.09% tart / tresize / tsvgx (linux64, windows7-32, windows8-64) regression on push 6af49d08884d76de44efd9387bbe340921652364 (Fri Aug 12 2016)

Categories: Firefox :: Theme, defect, P1
Version: 51 Branch
Status: RESOLVED FIXED
Target Milestone: Firefox 51
Tracking Status:
  firefox48 --- unaffected
  firefox49 --- unaffected
  firefox50 --- unaffected
  firefox51 --- fixed
Reporter: ashiue
Assignee: dao
Keywords: perf, regression, talos-regression
Attachments: 1 file

Talos has detected a Firefox performance regression from push 6af49d08884d76de44efd9387bbe340921652364. As you are the author of one of the patches included in that push, we need your help to address this regression.

Summary of tests that regressed:

  tresize windows8-64 opt: 11.66 -> 13.5 (15.83% worse)
  tart summary linux64 opt e10s: 6.25 -> 6.38 (2.1% worse)
  tart summary windows7-32 pgo e10s: 4.86 -> 4.97 (2.3% worse)
  tsvgx summary windows8-64 opt: 154.84 -> 161.17 (4.09% worse)
  tart summary windows8-64 opt: 5.47 -> 5.66 (3.53% worse)
  cart summary windows8-64 opt: 30.63 -> 31.84 (3.94% worse)
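
For anyone checking the percentages, here is a minimal sketch of how the "% worse" figures relate to the before/after values. It assumes lower is better for these tests and that the percentage is the plain relative change; the function name `percent_change` is made up for illustration and is not part of Talos or Perfherder. The alert shows rounded values, so recomputing from them gives slightly different numbers than the alert reports.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent.

    Assumption: lower is better, so a positive result means "worse".
    """
    return (new - old) / old * 100.0


# Values copied from the summary above (already rounded in the alert).
examples = {
    "tart summary linux64 opt e10s": (6.25, 6.38),
    "tresize windows8-64 opt": (11.66, 13.5),
}

for name, (old, new) in examples.items():
    print(f"{name}: {percent_change(old, new):.2f}% worse")
```

The recomputed tresize figure comes out a little below the 15.83% in the alert, which is consistent with the alert having been computed from unrounded values.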


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=2480

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
This issue might be caused by one of the following changesets:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=2069e3662c134694bbab8db555b633faaddf3549&tochange=6af49d08884d76de44efd9387bbe340921652364

Hi Dão, as you are the patch author, can you take a look at this and determine what is the root cause? Thanks!
Flags: needinfo?(dao+bmo)
Probably bug 1022601. I'll look into it.
Assignee: nobody → dao+bmo
Flags: needinfo?(dao+bmo)
thanks Dao!
note that we have added tp5o_scroll, sessionrestore*, tabpaint and tps to the list as we have more data

Summary of tests that regressed:

  tp5o_scroll summary windows8-64 opt: 4.02 -> 4.14 (2.98% worse)
  tresize windows8-64 opt: 11.66 -> 13.5 (15.83% worse)
  tresize windows7-32 opt e10s: 12.08 -> 12.39 (2.52% worse)
  tart summary linux64 opt e10s: 6.25 -> 6.38 (2.1% worse)
  sessionrestore windows8-64 opt e10s: 712.25 -> 727.83 (2.19% worse)
  sessionrestore_no_auto_restore windows8-64 opt e10s: 746.71 -> 764.08 (2.33% worse)
  tresize windows8-64 opt e10s: 10.62 -> 12.27 (15.47% worse)
  tsvgx summary windows8-64 opt e10s: 117.78 -> 120.95 (2.69% worse)
  tart summary windows8-64 opt e10s: 6.18 -> 6.4 (3.49% worse)
  cart summary windows8-64 opt e10s: 31.79 -> 33.12 (4.19% worse)
  tart summary windows7-32 pgo e10s: 4.86 -> 4.97 (2.3% worse)
  tart summary linux64 pgo: 4.02 -> 4.11 (2.13% worse)
  tart summary windows7-32 pgo: 4.4 -> 4.53 (2.8% worse)
  tsvgx summary windows8-64 opt: 154.84 -> 161.17 (4.09% worse)
  tart summary windows8-64 opt: 5.47 -> 5.66 (3.53% worse)
  cart summary windows8-64 opt: 30.63 -> 31.84 (3.94% worse)
  cart summary linux64 opt e10s: 31.4 -> 37.55 (19.59% worse)
  tabpaint summary windows8-64 opt: 84.82 -> 88.87 (4.78% worse)
  sessionrestore windows8-64 opt: 822.92 -> 840.75 (2.17% worse)
  sessionrestore_no_auto_restore windows8-64 opt: 859.5 -> 878.25 (2.18% worse)
  cart summary linux64 opt: 29.33 -> 35.29 (20.3% worse)
  sessionrestore windows8-64 pgo e10s: 574.33 -> 591 (2.9% worse)
  sessionrestore_no_auto_restore windows8-64 pgo e10s: 603.42 -> 619.5 (2.67% worse)
  tabpaint summary windows8-64 pgo: 71.44 -> 76.2 (6.65% worse)
  sessionrestore windows8-64 pgo: 656.17 -> 674.92 (2.86% worse)
  sessionrestore_no_auto_restore windows8-64 pgo: 683.79 -> 704.83 (3.08% worse)
  tresize windows8-64 pgo: 12.32 -> 13.47 (9.3% worse)
  tresize windows8-64 pgo e10s: 9.85 -> 11.25 (14.14% worse)
  tart summary windows8-64 pgo e10s: 5.06 -> 5.23 (3.44% worse)
  cart summary windows8-64 pgo e10s: 26.6 -> 28.14 (5.78% worse)
  tsvgx summary windows8-64 pgo: 132.18 -> 138.47 (4.76% worse)
  cart summary windows8-64 pgo: 25 -> 26.44 (5.79% worse)
  tart summary windows8-64 pgo: 4.52 -> 4.7 (3.98% worse)
  tresize windowsxp opt e10s: 12.52 -> 13.21 (5.51% worse)
  tp5o_scroll summary windowsxp opt: 4.17 -> 4.27 (2.38% worse)
  cart summary linux64 pgo: 21.73 -> 26.14 (20.27% worse)
  cart summary linux64 pgo e10s: 23.36 -> 27.89 (19.41% worse)
  tart summary linux64 pgo e10s: 4.7 -> 4.8 (2.03% worse)
  tart summary windowsxp pgo e10s: 4.72 -> 4.81 (1.95% worse)
  tps summary windows8-64 pgo: 57.87 -> 59.51 (2.84% worse)
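
With this many entries it can help to sort the list mechanically. The following is a rough sketch, not a Talos or Perfherder tool: it parses summary lines in the format shown above and reports the largest regression per platform. The regular expression and the names `parse_alert_line` and `worst_by_platform` are assumptions made for this example, based only on the line format in this bug.

```python
import re

# Pattern for lines such as:
#   "cart summary linux64 opt e10s: 31.4 -> 37.55 (19.59% worse)"
# Assumption: the build type is always "opt" or "pgo", optionally followed by "e10s".
LINE_RE = re.compile(
    r"^\s*(?P<test>\S+)(?: summary)? (?P<platform>\S+) (?P<build>opt|pgo)"
    r"(?P<e10s> e10s)?: (?P<old>[\d.]+) -> (?P<new>[\d.]+) "
    r"\((?P<pct>[\d.]+)% worse\)\s*$"
)


def parse_alert_line(line):
    """Return a dict describing one summary line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {
        "test": m.group("test"),
        "platform": m.group("platform"),
        "build": m.group("build"),
        "e10s": m.group("e10s") is not None,
        "old": float(m.group("old")),
        "new": float(m.group("new")),
        "pct": float(m.group("pct")),
    }


def worst_by_platform(lines):
    """Map each platform to its entry with the largest reported regression."""
    worst = {}
    for line in lines:
        entry = parse_alert_line(line)
        if entry is None:
            continue  # skip anything that is not a summary line
        current = worst.get(entry["platform"])
        if current is None or entry["pct"] > current["pct"]:
            worst[entry["platform"]] = entry
    return worst


# A few lines copied from the list above.
SAMPLE = [
    "  tresize windows8-64 opt: 11.66 -> 13.5 (15.83% worse)",
    "  cart summary linux64 opt: 29.33 -> 35.29 (20.3% worse)",
    "  tart summary linux64 pgo: 4.02 -> 4.11 (2.13% worse)",
    "  tps summary windows8-64 pgo: 57.87 -> 59.51 (2.84% worse)",
]

for platform, entry in worst_by_platform(SAMPLE).items():
    print(platform, entry["test"], entry["build"], f'{entry["pct"]}%')
```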

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=2480
The backout of Bug 1022601 showed an improvement on Windows 7 for tresize; that is the only non-PGO Windows 7 regression we have.

I am confused why linux64 results are not showing up there. I specifically added some windows8 jobs and did retriggers. There is a queue of about 400 prior jobs, so it might be 6 hours or so before these start running; I can try to bump up the priority in a few hours when I have more time.
I never really understood tresize and how/why it's an important test for performance.

If it's mostly PGO-only, then this might be an artifact of those optimizations rather than some part of my patch being inherently costly, right?
Most of the failures are not PGO; the majority are on windows8. It just so happens that windows7 only has a few changes related to this patch. As for tresize, feel free to start a conversation if you feel the test is not a good use of our resources:
https://wiki.mozilla.org/Buildbot/Talos/Tests#tresize

As it stands, people have determined that all the tests we run are important, and we should all treat them as such until there is a reason not to care about them.

About half of the jobs are either run or bumped up in priority- I imagine in the next 2 hours most of the win8 jobs will be completed.
ok, the win8 data we have is enough to show big wins:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=0c871ee2a7d0&newProject=try&newRevision=46b9692ed186&framework=1

the linux tart/cart regressions are now showing up with that specific patch, possibly it is the other patch, or they got accidentally mixed in.
Joel, how do these numbers look to you?
Flags: needinfo?(jmaher)
this looks like a lot of improvements!  Here is what I don't see with this patch:
  cart summary linux64 opt e10s: 31.4 -> 37.55 (19.59% worse)
  cart summary linux64 opt: 29.33 -> 35.29 (20.3% worse)
  cart summary linux64 pgo: 21.73 -> 26.14 (20.27% worse)
  cart summary linux64 pgo e10s: 23.36 -> 27.89 (19.41% worse)

Tests that were not run on these pushes, so improvements are not yet confirmed for:
  tp5o_scroll summary windows8-64 opt: 4.02 -> 4.14 (2.98% worse)
  tresize windows8-64 opt: 11.66 -> 13.5 (15.83% worse)
  tresize windows8-64 opt e10s: 10.62 -> 12.27 (15.47% worse)
  tresize windows8-64 pgo: 12.32 -> 13.47 (9.3% worse)
  tresize windows8-64 pgo e10s: 9.85 -> 11.25 (14.14% worse)
  tresize windowsxp opt e10s: 12.52 -> 13.21 (5.51% worse)
  tp5o_scroll summary windowsxp opt: 4.17 -> 4.27 (2.38% worse)
  tart summary windowsxp pgo e10s: 4.72 -> 4.81 (1.95% worse)
  tps summary windows8-64 pgo: 57.87 -> 59.51 (2.84% worse)


The most concerning ones are the linux cart regressions, I looked at these and it seems with high certainty that they are caused by the original patches.
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher) from comment #13)
> The most concerning ones are the linux cart regressions, I looked at these
> and it seems with high certainty that they are caused by the original
> patches.

Based on the comparisons from comment 5, those are from bug 1022573. So we're dealing with two batches of regressions from two bugs. They'll need separate fixes (and should probably have separate bugs filed).
Attachment #8783628 - Flags: review?(felipc)
Component: Untriaged → Theme
filed bug 1297806 to track the linux cart regressions!
Summary: 2.1 - 15.83% cart / tart / tresize / tsvgx (linux64, windows7-32, windows8-64) regression on push 6af49d08884d76de44efd9387bbe340921652364 (Fri Aug 12 2016) → 2.1 - 4.09% tart / tresize / tsvgx (linux64, windows7-32, windows8-64) regression on push 6af49d08884d76de44efd9387bbe340921652364 (Fri Aug 12 2016)
Priority: -- → P1
Attachment #8783628 - Flags: review?(felipc) → review+
No longer blocks: 1022573
Pushed by dgottwald@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/30f6fa5980fa
Undo performance regressions from bug 1022601 by avoiding the fill filter. r=felipe
https://hg.mozilla.org/mozilla-central/rev/30f6fa5980fa
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 51
the tests which seem to not be fixed are:
* linux64 tart e10s: https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bmozilla-inbound,34025020e068f8204ca2174832747ef815aa3b65,1%5D&series=%5Bautoland,34025020e068f8204ca2174832747ef815aa3b65,0%5D&series=%5Bfx-team,34025020e068f8204ca2174832747ef815aa3b65,0%5D&selected=%5Bmozilla-inbound,34025020e068f8204ca2174832747ef815aa3b65%5D
* winxp tp5o_scroll: https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bmozilla-inbound,fa572b9b7f13d45f5eba63731ffdc8dfda465469,1%5D&series=%5Bautoland,fa572b9b7f13d45f5eba63731ffdc8dfda465469,0%5D&series=%5Bfx-team,fa572b9b7f13d45f5eba63731ffdc8dfda465469,0%5D&selected=%5Bmozilla-inbound,fa572b9b7f13d45f5eba63731ffdc8dfda465469%5D
* win8 tp5o_scroll: https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bmozilla-inbound,0f802cbf06209bc97d1300e62ac5964a0cd5eb8b,1%5D&series=%5Bautoland,0f802cbf06209bc97d1300e62ac5964a0cd5eb8b,0%5D&series=%5Bfx-team,0f802cbf06209bc97d1300e62ac5964a0cd5eb8b,0%5D&selected=%5Bmozilla-inbound,0f802cbf06209bc97d1300e62ac5964a0cd5eb8b%5D
* winxp tresize: https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bmozilla-inbound,3cdf3142f6e02274b13503439d78a800c1b0feab,1%5D&series=%5Bautoland,3cdf3142f6e02274b13503439d78a800c1b0feab,0%5D&series=%5Bfx-team,3cdf3142f6e02274b13503439d78a800c1b0feab,0%5D&selected=%5Bmozilla-inbound,3cdf3142f6e02274b13503439d78a800c1b0feab%5D
* win7 tresize e10s: https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bmozilla-inbound,60fdd6543300dbe223fee7f05d91ecfa5b540cff,1%5D&series=%5Bautoland,60fdd6543300dbe223fee7f05d91ecfa5b540cff,0%5D&series=%5Bfx-team,60fdd6543300dbe223fee7f05d91ecfa5b540cff,0%5D&selected=%5Bmozilla-inbound,60fdd6543300dbe223fee7f05d91ecfa5b540cff%5D

possibly some of these will be fixed in bug 1297806?
(In reply to Joel Maher ( :jmaher) from comment #18)

So I wanted to check if backing out the rest of bug 1022601 would still show a difference, but somehow all results claim that there was only one base run and one new run, so they all have low confidence. What did I do wrong?

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=a2883d74c3af&newProject=try&newRevision=e9077496131c&framework=1&showOnlyImportant=0
Flags: needinfo?(jmaher)
thanks for checking this out.  I have retriggered the talos jobs to collect more data (which is needed to increase our confidence in the comparison).  It might take a few hours to get these scheduled and run, but when you have more data points it will be much clearer!
Flags: needinfo?(jmaher)
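To illustrate why one base run and one new run give low confidence: with a single data point per side there is no variance estimate, so no test statistic can be computed at all. The sketch below uses a generic Welch's t-statistic from the Python standard library. How Perfherder actually computes its confidence value is not described in this bug, so this should not be read as its implementation, and the run values are invented for the example.

```python
import math
import statistics


def welch_t(base, new):
    """Welch's t-statistic for two samples of run results.

    Returns None when either side has fewer than two runs, because the
    sample variance is undefined for a single data point.
    """
    if len(base) < 2 or len(new) < 2:
        return None
    var_base = statistics.variance(base)
    var_new = statistics.variance(new)
    std_err = math.sqrt(var_base / len(base) + var_new / len(new))
    if std_err == 0:
        return None  # identical runs on both sides; the statistic is undefined
    return (statistics.mean(new) - statistics.mean(base)) / std_err


# Hypothetical numbers, not taken from the try pushes in this bug.
single_run = welch_t([12.1], [12.6])  # None: nothing to estimate noise from
with_retriggers = welch_t(
    [12.1, 12.0, 12.2, 12.1, 11.9, 12.0],
    [12.6, 12.7, 12.5, 12.6, 12.8, 12.6],
)
print(single_run, with_retriggers)
```

The larger the absolute value of the statistic, the less likely it is that the difference is just run-to-run noise, which is why retriggering several times on both pushes makes the comparison usable.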
Ugh, I thought I had checked automatic retriggers on trychooser, but it looks like I missed it...? It would be helpful if this was checked by default when running talos jobs, since afaik comparetalos is the most common use case for running talos on try.
Depends on: 1301945