1259923 - 2 - 15.37% almost all talos tests (windows7-32, windows8-64, windowsxp) regression on push 559a80645f20 (Thu Mar 24 2016)

Reporter

Description

•

9 years ago

Talos has detected a Firefox performance regression from push 559a80645f20. Since you are the author of one of the patches included in that push, we need your help to address this regression.

This is a list of all known regressions and improvements related to the push:
https://treeherder.mozilla.org/perf.html#/alerts?id=600

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests#a11y
https://wiki.mozilla.org/Buildbot/Talos/Tests#ts_paint
https://wiki.mozilla.org/Buildbot/Talos/Tests#tpaint
https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5
https://wiki.mozilla.org/Buildbot/Talos/Tests#Dromaeo_Tests
https://wiki.mozilla.org/Buildbot/Talos/Tests#tsvg-opacity
https://wiki.mozilla.org/Buildbot/Talos/Tests#TART.2FCART
https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5o_scroll
https://wiki.mozilla.org/Buildbot/Talos/Tests#installer size
https://wiki.mozilla.org/Buildbot/Talos/Tests#CanvasMark
https://wiki.mozilla.org/Buildbot/Talos/Tests#tabpaint
https://wiki.mozilla.org/Buildbot/Talos/Tests#tsvgx
https://wiki.mozilla.org/Buildbot/Talos/Tests#xperf
https://wiki.mozilla.org/Buildbot/Talos/Tests#DAMP
https://wiki.mozilla.org/Buildbot/Talos/Tests#tps
https://wiki.mozilla.org/Buildbot/Talos/Tests#sessionrestore.2Fsessionrestore_no_auto_restore

Reproducing and debugging the regression:

If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p win32,win64 -u none -t all[Windows 7,Windows 8,Windows XP] --rebuild 5 # add "mozharness: --spsProfile" to generate profile data
(we suggest --rebuild 5 to be more confident in the results)

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
talos --develop -e [path]/firefox -a a11yr:ts_paint:tpaint:tp5o:dromaeo_css:tsvgr_opacity:tart:cart:tp5o_scroll:tcanvasmark:tabpaint:tsvgx:tp5n:damp:tps:sessionrestore_no_auto_restore:sessionrestore
(add --e10s to run tests in e10s mode)

Making a decision:

Since you are the patch author, we need your feedback to help us handle this regression.
*** Please let us know your plans by Tuesday, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 1

•

9 years ago

:gps, this is showing quite a difference from what we saw on try server. I believe we should back this out and work on figuring out why our perf regressed so much on just about every test. Here is a comparison of the vs2015 push vs the previous one: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=8d59e338a5bd&newProject=mozilla-inbound&newRevision=559a80645f20&framework=1 If it was just the 3 tests that we saw on try regressing, I would be more comfortable with it. As Windows is our top platform for desktop users, making an across the board perf hit on what we ship (pgo, we really don't see regressions on opt!) doesn't seem like a good win. Possibly there are other thoughts or ideas on how to reduce/resolve these regressions?

Flags: needinfo?(gps)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

9 years ago

Component: Untriaged → Build Config

Product: Firefox → Core

Gregory Szorc [:gps]

Comment 2

•

9 years ago

The difference between Try and non-Try results appears alarming, I agree.

Percentage-wise, the biggest regression is in a11yr, with a 5.5%-17% decrease. Looking into this deeper, I think something is wonky with PGO and this test. The base for a11yr opt windows7-32 is 747.78 ± 0.92%. Base for PGO is 393.22 ± 0.40%. That's nearly a 2x difference. Percentage-wise, that's much larger than the benefit we typically see from PGO. There is something fishy going on.

cart is showing a <5% regression. I /think/ in bug 1254767 you were only reporting regressions >5%, so cart didn't make the list there. I suspect it was always regressing, but just under the reporting threshold in the try runs. Ditto for some other tests like tps, which also didn't regress by more than 2-3%.

As for backing out VS2015 because of perf, that's not my call and I'm not sure whose it is. Perhaps we should escalate to RelMan?

It's worth pointing out that in all cases PGO results are better than non-PGO results. It's just that VS2015u1 PGO isn't as good as VS2013 PGO, it appears.

Apparently there are a number of improvements coming in VS2015u2 (which is currently in RC). We could do a try push with VS2015u2 and compare against VS2015u1. If Update 2 claws back the perf, perhaps we can live with Update 1 for a few weeks until Update 2 is officially released. We are already tentatively planning on aggressively adopting Update 2 (see bug 1259782). It would be a difficult pill to swallow to back out VS2015u1 and wait for Update 2 final, as I feel having central/Nightly on VS2015u1 is valuable.

If we spend the engineering time to investigate why there are PGO regressions, we could report them to Microsoft and see if they can improve PGO performance in future Visual Studio releases. Google has had success getting Microsoft to listen. If we adopt and test Visual Studio releases quicker, we can catch these regressions and hopefully get them fixed sooner. Perhaps we should be standing up automation that periodically builds with the next pre-release versions of Microsoft's toolchains. We can certainly do that with VS2015u2RC right now...
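
For readers who want to check the a11yr opt-versus-PGO gap quoted in this comment, here is a minimal arithmetic sketch in Python. The two base scores are the ones given above; the variable names are illustrative, and the "lower is better" reading is an assumption based on the comment's statement that PGO results are better than non-PGO results.

```python
# Base a11yr scores for windows7-32 as quoted in the comment above.
# Assumption: lower scores are better for this test.
opt_base = 747.78
pgo_base = 393.22

# How many times larger the opt score is than the PGO score.
ratio = opt_base / pgo_base

# How much lower the PGO score is, as a percentage of the opt score.
reduction_pct = (opt_base - pgo_base) / opt_base * 100

print(f"opt/PGO ratio: {ratio:.2f}x")
print(f"PGO score is {reduction_pct:.1f}% lower than opt")
```

The ratio comes out at roughly 1.9, which matches the "nearly a 2x difference" described in the comment.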

Flags: needinfo?(gps)

Gregory Szorc [:gps]

Comment 3

•

9 years ago

I was able to build with Visual Studio 2015 Update 2 RC on Try. Here are PGO results comparing VS2015u1 and VS2015u2 RC: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=56540b2bbe8a&newProject=try&newRevision=df78bff64a89&framework=1&showOnlyImportant=0

Somewhat surprisingly, there were only 2 changes of statistical significance. There goes my theory that VS2015u2 would claw back some performance losses :/

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 4

•

9 years ago

Thanks for trying out the vs2015u2 release, it is too bad that didn't change our pgo performance. If we do think we can get somewhere in the next few weeks with other options, then I wouldn't see harm in leaving this in. Regarding pgo vs opt- We only ship pgo (at least for Windows), so the biggest concern for performance is the pgo numbers of vs2013 vs pgo numbers of vs2015. I agree that pgo numbers are noticeably better than non pgo- that is a good thing! In the past we have had pgo regressions which pop up related to random code added/removed- I suspect this is an outcome of the overall pgo process. Are there other flags/options that we could use? Should we talk to Microsoft's compiler team and see if they have suggestions? Maybe we could try different actions while running the browser to generate a different profile.

Gregory Szorc [:gps]

Comment 5

•

9 years ago

I did something different with yesterday's Try experiment. First, I triggered multiple Talos jobs from the initial build job. This showed 2 "important" changes (>2%). Later, I triggered whole new *build* jobs. These in turn scheduled Talos jobs. So, instead of Talos jobs derived from a single build job, we have ~6 Talos jobs derived from a single build job and another 10-11 Talos jobs derived from separate build jobs. The end result is the "important" changes disappeared! What I suspect is happening is that variations between each PGO profile run manifest in statistically different performance characteristics of the produced binary and these manifest in differences in Talos results. I think I'll conduct the same experiment with VS2013. This should introduce more variance in the VS2013 base numbers and should hopefully paint a clearer picture of what the change in behavior between VS2013 and VS2015 actually is.
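
The suspicion in this comment, that separate PGO builds differ from each other by more than repeated Talos jobs on one build do, can be illustrated with a small variance split. This is only a sketch: the function name `variance_components` and the scores are hypothetical and are not taken from the Try pushes discussed here.

```python
from statistics import mean, variance


def variance_components(runs_by_build):
    """Return (within_build, between_build) sample variances.

    runs_by_build maps a build label to the Talos scores measured on
    that one build (at least two scores per build).
    """
    per_build_variance = [variance(scores) for scores in runs_by_build.values()]
    per_build_mean = [mean(scores) for scores in runs_by_build.values()]
    # Average noise seen when re-running Talos on the same binary.
    within = mean(per_build_variance)
    # Spread of the per-build averages, i.e. build-to-build (PGO) variation.
    between = variance(per_build_mean) if len(per_build_mean) > 1 else 0.0
    return within, between


# Hypothetical scores: three separate PGO builds, three Talos jobs each.
scores = {
    "build_a": [100.0, 102.0, 101.0],
    "build_b": [110.0, 111.0, 109.0],
    "build_c": [95.0, 96.0, 94.0],
}
within, between = variance_components(scores)
print(f"within-build variance:  {within}")
print(f"between-build variance: {between}")
```

When the between-build figure dominates, as it does for these made-up numbers, retriggering Talos on a single build understates the real noise, and a comparison based on one build per toolchain can flag changes that vanish once more builds are added.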

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 6

•

9 years ago

thanks for pointing that out- I would imagine we would see variance per build, maybe this will help answer some of the unknowns.

Gregory Szorc [:gps]

Comment 7

•

9 years ago

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=66a0b196cbe7&newProject=try&newRevision=56540b2bbe8a&framework=1&showOnlyImportant=0 compares VS2013 and VS2015u1 using multiple PGO builds. The results appear more or less consistent with the regressions reported with a single PGO build :/ Although I didn't dig into that data too deeply.

(not currently active) Ted Mielczarek

Updated

•

9 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1254767

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

9 years ago

Blocks: 1267562
No longer blocks: 1256666

status-firefox48: --- → disabled

status-firefox49: --- → affected

Version: unspecified → 49 Branch

Liz Henry (:lizzard Please n-i to RyanVM, jcristau, or pascal)

Comment 9

•

9 years ago

It sounds like we have a lot of good reasons to make these changes but aren't sure what impact it may have on Firefox users. So, good for developers to have a faster build time and better toolchain, but I worry about a 15% perf hit across the board for users, so it's a tradeoff. Tracking for 49 for now.

tracking-firefox49: --- → +

Liz Henry (:lizzard Please n-i to RyanVM, jcristau, or pascal)

Updated

•

9 years ago

tracking-firefox49: + → blocking

(no longer active)

Comment 10

•

9 years ago

There are a few options on what we can investigate/do here that I can think of, in no particular order.

1. Give up attempting to switch to Visual Studio 2015 and retry with a newer version of Visual Studio later.
2. Verify that we're actually PGOing. I remember something about a bug at some point which caused us to actually not PGO stuff as part of a PGO build, which manifested itself as Talos regressions. For example, using pgomgr to verify that we're collecting runtime statistics: <https://msdn.microsoft.com/en-us/library/2kw46d8w.aspx>
3. Enhance the set of things that we run in the profiling phase of the PGO build in the hopes of getting the PGO compiler to optimize more things.
4. Take the Talos regressions and move on.
5. Investigate other large projects such as Chromium to see if they have encountered similar issues when switching to 2015.
6. Try to investigate what's going on and report to Microsoft, similar to <https://randomascii.wordpress.com/2016/03/24/compiler-bugs-found-when-porting-chromium-to-vc-2015/> for example.

Liz Henry (:lizzard Please n-i to RyanVM, jcristau, or pascal)

Comment 11

•

9 years ago

I'm not sure if we need to decide this and act before the merge (next Monday) or if we can let 49 go to aurora and decide and change our approach then. Let's meet and discuss it today.

Liz Henry (:lizzard Please n-i to RyanVM, jcristau, or pascal)

Comment 12

•

8 years ago

We followed up by looking at other ways to test performance and nothing else showed a performance hit. So I don't think this needs to block vs2015 builds for 49. We should keep an eye out for bugs in aurora and beta. I'll mention that to QA and in the channel meeting. However, I'm not sure what that means for our tests. I'll leave that to Joel and the perf/Talos team.

For external benchmarking: arewefastyet, looking at mostly JS but also DOM, WebGL & Web Audio, showed no noticeable slowdown for vs2015.

Jukka tested for the openwebgames project: Overall it looks like most demos have a tiny win for vs2015.

From manual testing by Andrei and Engineering QA: The only difference we saw in terms of performance, following the comparison of these builds, is related to page scrolling and switching between tabs, both in favour of the vs2015 build. Scenarios covered: loading pages, opening new windows and new tabs, scrolling pages, switching between tabs, etc. We've also benchmarked these builds using Dromaeo and CanvasMark.

status-firefox49: affected → wontfix

tracking-firefox49: blocking → +

Ryan VanderMeulen [:RyanVM]

Comment 13

•

8 years ago

Is this bug a wontfix at this point?

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 14

•

8 years ago

I would say wontfix, I will let :gps make the final call.

Flags: needinfo?(gps)

Gregory Szorc [:gps]

Comment 15

•

8 years ago

Yeah, it's a wontfix given comment #12.

Status: NEW → RESOLVED

Closed: 8 years ago

Flags: needinfo?(gps)

Resolution: --- → WONTFIX

BMO Automation

Updated

•

7 years ago

Product: Core → Firefox Build System


Categories

(Firefox Build System :: General, defect)

Tracking

(firefox48 disabled, firefox49+ wontfix)

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Crash Data

Security

(public)
