Closed Bug 1315490 Opened 8 years ago Closed 8 years ago

2.33 - 3.82% tps (linux64, windows8-64) regression on push ef3d294a1cf6 (Wed Nov 2 2016)

Categories

(Toolkit :: Reader Mode, defect, P1)

52 Branch
defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox52 --- affected
firefox53 --- affected

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push ef3d294a1cf6 . As author of one of the patches included in that push, we need your help to address this regression. Regressions: 4% tps summary windows8-64 pgo e10s 32.17 -> 33.4 3% tps summary windows8-64 opt e10s 35.77 -> 36.96 2% tps summary linux64 opt e10s 41.93 -> 42.91 You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=3980 On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format. To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running *** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
we have pretty much an across the board tps regression with this change: https://treeherder.mozilla.org/perf.html#/compare?originalProject=autoland&originalRevision=22dcdb840e8a&newProject=autoland&newRevision=ef3d294a1cf6&framework=1 ^ that compares the commit to the previous one with 6 data points for each test. you can see some subtests are the root cause of this (osx 10.10 for example): https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=autoland&originalRevision=22dcdb840e8a&newProject=autoland&newRevision=ef3d294a1cf6&originalSignature=d96c356354e1d01835f87142841749ee6d353184&newSignature=d96c356354e1d01835f87142841749ee6d353184&showOnlyImportant=1&framework=1 :evanxd, I see you are the patch author, can you take a look at this and help determine why this is happening and what we should do (accept the regression, fix it as much as we can, back out, etc.)?
Flags: needinfo?(evan)
Component: Untriaged → Reader Mode
Product: Firefox → Toolkit
Where do the test pages live? The subtest results are kind of meaningless if we can't see what pages they're based on. :-\
Flags: needinfo?(jmaher)
(I tried to look at the wiki, which has no links to them, and the bug that got filed, and dxr for build-central, and couldn't find anything anywhere. :-\ )
Gijs, thanks for looking into this. the page set is referenced a few times here: https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5 and we do mention it here as well: https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5n_pages_set One thing to keep in mind is that you need to extract these into: 'testing/talos/talos/tests/tp5n', like so: cd testing/talos/talos/tests wget https://people-mozilla.org/~jmaher/taloszips/zips/tp5n.zip unzip tp5n.zip then you can run |mach talos-test -a tps| sorry for not making that clear, if I could make the wiki clearer, please let me know!
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher) from comment #4) > Gijs, thanks for looking into this. > > the page set is referenced a few times here: > https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5 They should be referenced by the tests that use them, in this case, the tps test. > and we do mention it here as well: > https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5n_pages_set The links should ideally be https, which I would assume we have set up for people.m.o. > One thing to keep in mind is that you need to extract these into: > 'testing/talos/talos/tests/tp5n', like so: > cd testing/talos/talos/tests > wget https://people-mozilla.org/~jmaher/taloszips/zips/tp5n.zip > unzip tp5n.zip > > then you can run |mach talos-test -a tps| Do we not have a searchable web repo with them? That would be easier when it's less about running the tests and more about inspecting the test pages to see why particular subtests have regressed and not others. If I wanted to run them locally, I assume running ./mach talos-test would download these pages automatically and securely - if not, we should fix that. FWIW, there are a number of reasons why these tests would have regressed. One is that the changes from bug 1177619 regressed it, the other that some of the other changes that were ported into m-c at the same time have. It's not really clear to me from the wiki description what exactly these tests are supposed to be measuring. Is it pageload time or tab switch time, or somehow both? Ideally we should somehow ensure that the reader mode detection doesn't affect page load times at all. I'm not sure how to do that without regressing when we show the icon for pages that take a long time to fully load (ie we really don't want to wait for the "load" event, rather for DOMContentLoaded and initial layout).
Flags: needinfo?(jmaher)
Thanks :Gijs, I have made sure the wiki page is updated, including https updates. there is not a searchable repo for these pages, that would make things easier- but for now I am happy to just have this publicly available. Regarding running via mach, we have bug 1269040 filed and that might get fixed this month as we need it for other purposes.
Flags: needinfo?(jmaher)
There are two parts in patch[1] already landed in m-c, one is Bug 1177619 patch and another one is from other bugs/fixes. I did `mach talos-test -a tps` for each item for 3 times. The `Origin` item mean there is no change(patch) for m-c, `Bug 1177619 Patch` means that based on `Origin` and add the Bug 1177619 patch, and `Other Patches` means that based on `Origin` and add the other patches. | | Origin | Bug 1177619 Patch | Other Patches | |----------------------|--------|-------------------|---------------| | 1. | 36.81 | 39.04 | 37.27 | | 2. | 38.57 | 39.72 | 36.51 | | 3. | 37.46 | 37.00 | 36.47 | | AVG. | 37.61 | 38.59 | 36.75 | | Time Increasing Rate | 0% | 2.61% | -2.29% | In the experiment, we can find out that Bug 1177619 patch made 2.61% slower and other patches made 2.29% faster(maybe we should say the performance almost not change here). Looks like there is some performance issue for Bug 1177619 patch. But just like we discussed in the review process[2], we already knew the performance will be changed(a little bit badly?) and we chose to accept/take it because we think there might be no other better way to fix the issue at this stage. So looks like if we want to choose to fix Bug 1177619 now(at this stage), I think we should just take the performance change. What do you think? [1]: https://hg.mozilla.org/try/rev/ef3d294a1cf6 [2]: https://github.com/mozilla/readability/pull/310#issuecomment-256280549
Flags: needinfo?(evan) → needinfo?(jmaher)
Flags: needinfo?(gijskruitbosch+bugs)
I am not clear what the other patches are, looking on perfherder I see the regression and no [partial] fix. If we determine this is expected and there are no obvious wins, then most likely we need to accept this. Evan, when you tested this, what platform did you test on?
Flags: needinfo?(jmaher)
(In reply to Evan Tseng [:evanxd][:愛聞插低] from comment #7) > Looks like there is some performance issue for Bug 1177619 patch. But just > like we discussed in the review process[2], we already knew the performance > will be changed(a little bit badly?) We knew it would affect the performance of the reader mode readability algorithm, yes. Looking at the test pages used on the github issue, the average impact was single digits. I would not have expected that to have any measurable impact on any talos tests, because they test much more than just the readability algorithm. Especially the impact on the select sites jmaher listed earlier is much, much too large to be acceptable (30% regressions). > and we chose to accept/take it because > we think there might be no other better way to fix the issue at this stage. > So looks like if we want to choose to fix Bug 1177619 now(at this stage), I > think we should just take the performance change. What do you think? I disagree. We need to understand why the readability algorithm impacts this talos test so badly (for the heavily affected pages) and what we can do to reduce impact. As I tried to note already in comment #5, it should be possible for the readability stuff to not affect this talos test at all without regressing when we show the icon, e.g. by running it out-of-band with the relevant events we listen for that trigger the test. That said, for us to do that it would be helpful to have a better understanding of what these tests measure: (In reply to :Gijs Kruitbosch from comment #5) > It's not really clear to me from the wiki description what exactly these > tests are supposed to be measuring. Is it pageload time or tab switch time, > or somehow both? Joel? Tab switch time shouldn't be impacted by these changes at all.
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(jmaher)
I will redirect to mconley- I really don't know much about this test
Flags: needinfo?(jmaher) → needinfo?(mconley)
(In reply to :Gijs Kruitbosch from comment #9) > (In reply to Evan Tseng [:evanxd][:愛聞插低] from comment #7) > > Looks like there is some performance issue for Bug 1177619 patch. But just > > like we discussed in the review process[2], we already knew the performance > > will be changed(a little bit badly?) > > We knew it would affect the performance of the reader mode readability > algorithm, yes. Looking at the test pages used on the github issue, the > average impact was single digits. I would not have expected that to have any > measurable impact on any talos tests, because they test much more than just > the readability algorithm. Especially the impact on the select sites jmaher > listed earlier is much, much too large to be acceptable (30% regressions). > > > and we chose to accept/take it because > > we think there might be no other better way to fix the issue at this stage. > > So looks like if we want to choose to fix Bug 1177619 now(at this stage), I > > think we should just take the performance change. What do you think? > > I disagree. We need to understand why the readability algorithm impacts this > talos test so badly (for the heavily affected pages) and what we can do to > reduce impact. As I tried to note already in comment #5, it should be > possible for the readability stuff to not affect this talos test at all > without regressing when we show the icon, e.g. by running it out-of-band > with the relevant events we listen for that trigger the test. > > That said, for us to do that it would be helpful to have a better > understanding of what these tests measure: Sure, let's do it. > (In reply to :Gijs Kruitbosch from comment #5) > > It's not really clear to me from the wiki description what exactly these > > tests are supposed to be measuring. Is it pageload time or tab switch time, > > or somehow both? > > Joel? The "tps" test is for measuring "the time between switching a tab and the first paint to the content area"[1]. > Tab switch time shouldn't be impacted by these changes at all. I agreed. We do `ReaderMode.isProbablyReaderable`[2] after `MozAfterPaint`[3] is triggered, which means the changes in Bug 1177619 cannot impact that. The first paint should be occurred before `MozAfterPaint` event. So maybe there might be something wrong here? [1]: https://wiki.mozilla.org/Buildbot/Talos/Tests#tps [2]: http://searchfox.org/mozilla-central/source/browser/base/content/tab-content.js#387 [3]: https://developer.mozilla.org/en-US/docs/Web/Events/MozAfterPaint (In reply to Joel Maher ( :jmaher) from comment #8) > I am not clear what the other patches are, looking on perfherder I see the > regression and no [partial] fix. If we determine this is expected and there > are no obvious wins, then most likely we need to accept this. > > Evan, when you tested this, what platform did you test on? Joel, I tested that on Mac OS.
Evan, can you do a trypush to check that backing out the specific changes from bug 1177619 from aurora reverts this performance regression? If so, I think we should land that while we figure out how to fix this properly.
Flags: needinfo?(evan)
Priority: -- → P1
(In reply to :Gijs Kruitbosch from comment #12) > Evan, can you do a trypush to check that backing out the specific changes > from bug 1177619 from aurora reverts this performance regression? > If so, I think we should land that while we figure out how to fix this properly. Hi Gijs, Just want to ensure one thing here, you want me back out the patch for Bug 1177619 first if after I do the trypush then the performance regress is reverted, right? Then we can continue to figure out the "tps" performance regression.
Flags: needinfo?(evan) → needinfo?(gijskruitbosch+bugs)
(In reply to Evan Tseng [:evanxd][:愛聞插低] from comment #13) > (In reply to :Gijs Kruitbosch from comment #12) > > Evan, can you do a trypush to check that backing out the specific changes > > from bug 1177619 from aurora reverts this performance regression? > > If so, I think we should land that while we figure out how to fix this properly. > > Hi Gijs, > > Just want to ensure one thing here, you want me back out the patch for Bug > 1177619 first if after I do the trypush then the performance regress is > reverted, right? > > Then we can continue to figure out the "tps" performance regression. No, I would like you to do: 1) trypush from aurora with talos runs 2) trypush from aurora + a patch that backs out specifically the changes from bug 1177619, with talos runs and then check with perfherder that the difference in talos results between those trypushes show that the backout actually fixes the problem that caused this bug to be reported. If that is the case, we can then land that backout patch on aurora.
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(evan)
(I guess if it's easier you could do both trypushes based on central/nightly as well.)
Just so we're clear here on what tps does: 1) The test opens a new window and loads a subset of the tp5 set in a bunch of separate tabs 2) The test waits until page load has completed for each tab 3) The test measures how long it takes to switch between the initial tab and each one of the tp5 tabs. It does this by waiting for the TabSwitchDone event, and also waiting for the next MozAfterPaint event to be fired for the web content. (If I had to guess, I'd say this is where the regression is being introduced). 4) The test closes the window Here's the part where tps injects a framescript to wait for MozAfterPaint in the tab: http://searchfox.org/mozilla-central/rev/8562d3859b89ac89f46690b9ed2c473e0728d6c0/testing/talos/talos/tests/tabswitch/bootstrap.js#161 If the problem is that we're adding more event handlers for MozAfterPaint, then this shouldn't be a user-visible regression imo. What we should really do is fix bug 1264798 so that we know when the composite occurred instead of taking a timestamp at MozAfterPaint (since that can be influenced by MozAfterPaint handlers).
Flags: needinfo?(mconley)
(In reply to Mike Conley (:mconley) from comment #16) > Just so we're clear here on what tps does: > > 1) The test opens a new window and loads a subset of the tp5 set in a bunch > of separate tabs > 2) The test waits until page load has completed for each tab > 3) The test measures how long it takes to switch between the initial tab and > each one of the tp5 tabs. It does this by waiting for the TabSwitchDone > event, and also waiting for the next MozAfterPaint event to be fired for the > web content. (If I had to guess, I'd say this is where the regression is > being introduced). > 4) The test closes the window Thanks, this is really helpful. Talking to mconley, it seems that the MozAfterPaint event is used by the tps framework but not by the actual tab switch. So we're not impacting tab switch times, we're only impacting when another mozafterpaint listener is invoked, because we have our own mozafterpaint listener that in some cases now takes up to 10ms longer to run (on the hardware talos is testing on). To confirm that this is the case, I'll push a baseline off trunk to try, and a 'fix' that puts our mozafterpaint handling in a setTimeout(, 0), which will unblock the other handler but not actually change "real" tab switch times. Assuming that these trypushes show that our hypothesis is right, we believe we should wontfix this bug, as the regression does not represent actual user impact.
Flags: needinfo?(evan)
(In reply to :Gijs Kruitbosch (PTO Nov. 17-20 inclusive) from comment #18) > Baseline: > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=0042e7872a705c1211703c57c11b29016cd8eda1 > > Fix: > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=3f8f4cedd12588c4c7eaca9199c7ec3895084254 > > Perfherder link (in anticipation, I only just pushed this): > > https://treeherder.mozilla.org/perf.html#/ > compare?originalProject=try&originalRevision=0042e7872a705c1211703c57c11b2901 > 6cd8eda1&newProject=try&newRevision=3f8f4cedd12588c4c7eaca9199c7ec3895084254& > framework=1&showOnlyImportant=0 In the tps tests, looks like the hypothesis is right. So should we wontfix here? | Tests | Base | New | Time Increasing Rate | |--------------|---------------|---------------|----------------------| | tps opt | 62.59 ± 1.21% | 61.88 ± 1.17% | -1.14% | | tps opt e10s | 41.97 ± 0.63% | 37.78 ± 0.93% | -9.97% |
Yes, I think so.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
thanks for the work on this!
You need to log in before you can comment on or make changes to this bug.