Closed Bug 1169756 Opened 9 years ago Closed 9 years ago

4% Linux* tsvgx regression on Mozilla-Inbound-Non-PGO on May 26, 2015 from push 507b6aba4555

Categories

(Core :: Panning and Zooming, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox41 --- affected

People

(Reporter: jmaher, Assigned: kats)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression][gfx-noted])

Talos has detected a Firefox performance regression from your commit 507b6aba4555 in bug 889085. We need you to address this regression. This is a list of all known regressions and improvements related to your bug: http://alertmanager.allizom.org:8080/alerts.html?rev=507b6aba4555&showAll=1 On the page above you can see Talos alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format. To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#tsvg.2C_tsvgx Reproducing and debugging the regression: If you would like to re-run this Talos test on a potential fix, use try with the following syntax: try: -b o -p linux -u none -t svgr # add "mozharness: --spsProfile" to generate profile data To run the test locally and do a more in-depth investigation, first set up a local Talos environment: https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code Then run the following command from the directory where you set up Talos: talos --develop -e <path>/firefox -a tsvgx Making a decision: As the patch author we need your feedback to help us handle this regression. *** Please let us know your plans by Monday, or the offending patch will be backed out! *** Our wiki page oulines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
I did a few retriggers: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=a7c9a6e1394e&tochange=93943a21b457&filter-searchStr=Ubuntu%20HW%2012.04%20x64%20mozilla-inbound%20talos%20svgr svgx went from 280-289 range up to 288-297 range. there are a series of patches landed (for the same bug): http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=d551aa12ebb1&tochange=507b6aba4555 :dvander, can you take a look at this and see if we can fix this or explain it?
Flags: needinfo?(dvander)
These patches only affect APZ, so it seems unlikely any regression here is really related.
Flags: needinfo?(dvander)
It's still possible though, and I think you are on the hook for proving that they are not assuming the range Joel got is correct. A few try pushes to bisect within the four patches might be a cheap first step.
I did a series of pushes (hg update <rev>): prior to push d551aa12ebb1 : https://treeherder.mozilla.org/#/jobs?repo=try&revision=a1f918e801f4 first patch caa1f9974731 : https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb95c7e0e8e7 second patch 498b1d8f9d71 : https://treeherder.mozilla.org/#/jobs?repo=try&revision=691afd429c15 third patch 21393b5c6963 : https://treeherder.mozilla.org/#/jobs?repo=try&revision=e38fe4e94c7a fourth patch 507b6aba4555 : https://treeherder.mozilla.org/#/jobs?repo=try&revision=3cf0599b555e in this case the fourth patch has everything applied. we will need to retrigger once the builds are done. I prefer 6 data points for each one.
ok, 498b1d8f9d71 is where we increase our numbers: http://hg.mozilla.org/integration/mozilla-inbound/rev/498b1d8f9d71
Certainly looks like that patch might have affected non-APZ codepaths.
Assignee: nobody → dvander
Whiteboard: [talos_regression] → [talos_regression][gfx-noted]
Can we get a timeline for fixing this?
dvander, let me know if you have cycles to look into this. If not maybe tn or mstange could, or I can look as well (on Monday).
I won't have time until after Whistler, at least. This doesn't look too high priority since it's a small delta, Linux-only.
:mstange, could you take a look at this apz patch: http://hg.mozilla.org/integration/mozilla-inbound/rev/498b1d8f9d71 and see why it might be affecting linux tsvgx performance. I only call you out as :kats mentioned you might be a good candidate this upcoming week to look at the issue. Maybe someone else can jump on it. We should fix this before the next uplift (June 29th)
Flags: needinfo?(mstange)
this bug is going silent- I want to make sure we have a decision on this before we uplift to aurora (monday after whistler)
I'll do some try pushes to see if I can determine the problem.
Assignee: dvander → bugmail.mozilla
Flags: needinfo?(mstange)
Thanks! Can you also do two try pushes (before and after) with "mozharness: --spsProfile"? I'm curious whether the profiles show the problem.
So I looked at the changes in the regressing patch (498b1d8f9d71) and did a couple of try pushes, reverting the main changes in that patch. The revert that I made at [1] was the winner, and fixed the regression (you can compare [2] and [3] to see this; [2] is the try push that joel did at 498b1d8f9d71 and [3] is that plus my revert). The at some point during the tsvgx test, !mShouldBuildScrollableLayer || mIsScrollableLayerInRootContainer is evaluating to true and the change dvander made causes the *aClipRect out-param to not get populated in this case. However the next patch in the original bug [4] removes that out-param entirely, and moves some of the code around. If my interpretation of this change is correct, then my new try push at [5], based on m-c, should fix the regression. [1] https://hg.mozilla.org/try/rev/3570daa8fb57 [2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=691afd429c15 [3] https://treeherder.mozilla.org/#/jobs?repo=try&revision=8a786479f4f4 [4] http://hg.mozilla.org/integration/mozilla-inbound/rev/21393b5c6963 [5] https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b83738ca10f
(In reply to Markus Stange [:mstange] from comment #13) > Thanks! Can you also do two try pushes (before and after) with "mozharness: > --spsProfile"? I'm curious whether the profiles show the problem. Before: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e7ee4b75225f After: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ce7ab5a1a13 Note that in both of these tsvgx is pretty high (the "before" should really have values < 350), so you may want to do a couple of retriggers to get a good profile of the "before" case.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #14) > If my interpretation of this > change is correct, then my new try push at [5], based on m-c, should fix the > regression. > [5] https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b83738ca10f That change was deliberate. If we have an abs pos child of the scroll frame with opacity we would wrongly clip it. This could lead to perf regressions because we don't draw something we should be drawing for correctness. I don't know if tsvgx is actually triggering this case though.
(In reply to Timothy Nikkel (:tn) from comment #16) > That change was deliberate. If we have an abs pos child of the scroll frame > with opacity we would wrongly clip it. This could lead to perf regressions > because we don't draw something we should be drawing for correctness. I > don't know if tsvgx is actually triggering this case though. tsvgx must be triggering that case or the regression wouldn't have gone away when I reverted that bit of the patch. It sounds like the change is intentional then, and the perf regression is acceptable because we get correctness in return. So I think we can close this bug as WONTFIX. Joel, any objections?
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
(In reply to Timothy Nikkel (:tn) from comment #16) > That change was deliberate. If we have an abs pos child of the scroll frame > with opacity we would wrongly clip it. This could lead to perf regressions > because we don't draw something we should be drawing for correctness. I > don't know if tsvgx is actually triggering this case though. So you've fixed bug 1145263 with this? The testcase in there seems to work now.
We fixed that specific test case for non-apz cases only.
thanks for looking into this guys!
You need to log in before you can comment on or make changes to this bug.