Closed Bug 1246443 Opened 9 years ago Closed 9 years ago

3-5% tp5o_scroll regression on inbound (v.47) on Feb 5th, from push 0dc0dc424f06

Categories

(Testing :: Talos, defect)

defect
Not set
normal

Tracking

(firefox47 fixed)

VERIFIED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: jmaher, Assigned: jnicol)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Attachments

(2 files)

Talos has detected a Firefox performance regression from your commit 0dc0dc424f066d349b7624f6e88ad42c98bec5d8. We need you to address this regression. This is a list of all known regressions and improvements related to your bug: http://alertmanager.allizom.org:8080/alerts.html?rev=0dc0dc424f066d349b7624f6e88ad42c98bec5d8&showAll=1 On the page above you can see Talos alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format. To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5o_scroll Reproducing and debugging the regression: If you would like to re-run this Talos test on a potential fix, use try with the following syntax: try: -b o -p linux64,win64,win32 -u none -t g1 # add "mozharness: --spsProfile" to generate profile data To run the test locally and do a more in-depth investigation, first set up a local Talos environment: https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code Then run the following command from the directory where you set up Talos: talos --develop -e <path>/firefox -a tp5o_scroll Making a decision: As the patch author we need your feedback to help us handle this regression. *** Please let us know your plans by Tuesday, or the offending patch will be backed out! *** Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
here is a compare view showing all the differences: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=2ada62724f2a&newProject=mozilla-inbound&newRevision=0dc0dc424f06&framework=1&filter=scroll there are two different bugs here, I don't have time in the next 2 days to investigate this much, so lets start with bug 1241917 :jnicol, can you take a look at this and let us know if you think this is your patch causing this talos regression?
Flags: needinfo?(jnicol)
Bug 1245550 is definitely not involved in this. It does not apply to any of the affected platforms.
It's not entirely surprising that bug 1241917 would have caused this - reducing the display port size could decrease scroll performance. That change is vital, however. Without it we're OOM crashing on lots of sites. Can we leave it in, and I'll have a more in depth look to see if there's a way to avoid this regression with it in?
Flags: needinfo?(jnicol)
thanks for the explanation! Is there anything actionable here (in the future), or should we mark this as wontfix?
From the links it seems we've regressed both e10s and non-e10s? Bug 1241917 should definitely not have regressed non-e10s. As for e10s, the scrollframe that we scroll is most likely the root scroll frame, and so bug 1241917 should not have an effect, so that seems odd as well.
Hmm yes that's a good point. The fact my patch shouldn't affect the root scroll frame I thought was a bit odd, and I hadn't even considered the non-e10s case. But Lee is also right that the other patch shouldn't have caused this. I'd suggest we keep the patch in, and keep this ticket open while I investigate further?
My guess is that transforming a rect from the root frame to every scroll frame in the document is the culprit. We should probably avoid that (the expensive transform) when 1) we don't have apz enabled, and 2) the scroll frame doesn't WantAsyncScroll(), and 3) maybe even if the scrollframe doesn't have a displayport.
Bug 1241917 made it so that a subframe's displayport base is restricted to the root composition bounds (in addition to its previous restrictions). This involved an expensive coordinate transformation causing a scrolling performance regression. This avoids restricting the displayport base to the root composition bounds unless: APZ is enabled, the frame is async-scrollable, and the frame has a display port. Review commit: https://reviewboard.mozilla.org/r/34195/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/34195/
Attachment #8717522 - Flags: review?(tnikkel)
Okay, the above patch implements what :tnikkel suggests. I think it might help, but I'm having a bit of trouble with talos and try and perfherder. My local results seem inconsistent. I did a try run and I now have this: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=2dfb45d74f42&newProject=try&newRevision=02d055409ed4&framework=1&showOnlyImportant=0 But since there's only 1 run the confidence is low. Joel, what can I do to make it do more runs / increase the confidence in the results?
Flags: needinfo?(jmaher)
You just need to retrigger the talos jobs a bunch more times so that there's more data samples. I retriggered your try push ones a bunch but you might need to do the inbound version you're comparing against as well.
(In reply to Jamie Nicol [:jnicol] from comment #9) > Okay, the above patch implements what :tnikkel suggests. I think it might > help, but I'm having a bit of trouble with talos and try and perfherder. My > local results seem inconsistent. I did a try run and I now have this: > https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla- > inbound&originalRevision=2dfb45d74f42&newProject=try&newRevision=02d055409ed4 > &framework=1&showOnlyImportant=0 > > But since there's only 1 run the confidence is low. Joel, what can I do to > make it do more runs / increase the confidence in the results? Click on the revisions in the header (e.g. https://treeherder.mozilla.org/#/jobs?repo=try&revision=02d055409ed4), this will bring you to the treeherder jobs view. From there, you can click on the jobs you want to retrigger and do so from the jobs panel (using the refresh button on the left). There's a bug open to streamline this process at some point. Feel free to ask questions like this on #treeherder, if jmaher's not around chances are someone else will be able to help during normally NA/Europe hours.
one trick is to use |--rebuild 5| at the end of your try syntax. As for the base revision, that needs to be done manually (I did that, jobs should show up in a while). In addition, I triggered some e10s jobs as the regression was there as well. This is great that you have a fix testing on try and working on verifying this! Thanks for the quick work:)
Flags: needinfo?(jmaher)
Thanks for the help everyone. It looks like that patch definitely helps with this regression. But unfortunately it caused windows7-32 glterrain to regress slightly.
Attachment #8717522 - Flags: review?(tnikkel) → review+
Comment on attachment 8717522 [details] MozReview Request: Bug 1246443 - Only restrict displayport base to root composition bounds if content has displayport; r?tnikkel https://reviewboard.mozilla.org/r/34195/#review31173 Okay, I change my mind. The only condition we want to check is if we have a displayport, because if we do we definitly want to do the more accurate displayport base calculation, and if we don't then we don't care. There is one un-obvious problem with this patch as is though. Which is that we check wasUsingDisplayPort before we call MaybeCreateDisplayPort, which (you guessed it) might create a displayport. I'll upload a patch to this bug that splits out the displayport base setting from the "maybe created display port" part so that you can fix this.
Comment on attachment 8718241 [details] [diff] [review] Split "setting displayport base" part out of MaybeCreateDisplayPort I don't think anything in MaybeCreateDisplayPort depends on the displayport base being set. I looked through the code. Correct me if I'm wrong though.
Attachment #8718241 - Flags: review?(botond) → review+
(In reply to Jamie Nicol [:jnicol] from comment #13) > Thanks for the help everyone. It looks like that patch definitely helps with > this regression. But unfortunately it caused windows7-32 glterrain to > regress slightly. I'm guessing the glterrain regression is probably not real.
do you have a link to the glterrain regression? I suspect this might be on try and that could be hard to confirm depending on what you are comparing to.
leave open til Jamie lands his revised patch.
Keywords: leave-open
Joel, I saw the glterrain regression in the link in comment 9
Comment on attachment 8717522 [details] MozReview Request: Bug 1246443 - Only restrict displayport base to root composition bounds if content has displayport; r?tnikkel Review request updated; see interdiff: https://reviewboard.mozilla.org/r/34195/diff/1-2/
Attachment #8717522 - Attachment description: MozReview Request: Bug 1246443 - Only restrict displayport base to root composition bounds if necessary; r?tnikkel → MozReview Request: Bug 1246443 - Only restrict displayport base to root composition bounds if content has displayport; r?tnikkel
the data in compare view shows a <1% difference in glterrain, I wouldn't worry about it as it is a bimodal test on win7 and all it takes is 1 of the many data points to fall high or low and it throws the geometric mean off. Thanks for getting a patch up for review!
Latest version on mozreview is rebased on the other patch and makes the change :tnikkel suggests in comment 14, so the r+ carries over. Here is a try run with that patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f51039416fa3 Requesting checkin please.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
thanks for fixing this, I have verified this on the graphs!
Status: RESOLVED → VERIFIED
Depends on: 1346121
Assignee: nobody → jnicol
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: