Closed Bug 1773629 Opened 2 years ago Closed 2 years ago

149.21 - 105.95% imdb fcp / imdb ContentfulSpeedIndex (Windows) regression on Wed June 8 2022

Categories

(Testing :: Raptor, defect, P2)

Firefox 103
defect

Tracking


RESOLVED FIXED
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox101 --- unaffected
firefox102 --- unaffected
firefox103 --- wontfix
firefox104 --- fixed

People

(Reporter: alexandrui, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(1 file)

Perfherder has detected a browsertime performance regression from push 643fad8de973ede4c003baf665639de5052a829d. Since you authored one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
149% imdb fcp windows10-64-shippable-qr fission warm webrender 242.67 -> 604.75
106% imdb ContentfulSpeedIndex windows10-64-shippable-qr fission warm webrender 343.17 -> 706.75

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
31% google-slides LastVisualChange linux1804-64-shippable-qr fission warm webrender 2,563.33 -> 1,766.67
22% google-slides LastVisualChange windows10-64-shippable-qr fission warm webrender 2,547.50 -> 1,997.50
13% google-docs LastVisualChange windows10-64-shippable-qr fission warm webrender 5,716.92 -> 4,989.17
6% youtube-watch loadtime android-hw-a51-11-0-arm7-shippable-qr warm webrender 971.40 -> 910.50
5% youtube-watch LastVisualChange android-hw-a51-11-0-arm7-shippable-qr warm webrender 1,558.00 -> 1,483.75
5% youtube-watch PerceptualSpeedIndex android-hw-a51-11-0-arm7-shippable-qr warm webrender 939.33 -> 896.33
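
For reference, the "Ratio" column above is simply the percentage change relative to the old absolute value. Here is a minimal Python sketch (illustrative only, not part of the Perfherder tooling) that reproduces a few of the numbers from the tables:

```python
# Reproduce the "Ratio" column from the old/new absolute values above.
# The old/new numbers are copied from the tables; the helper itself is
# just for illustration.

def change_pct(old, new):
    """Percentage change relative to the old value."""
    return abs(new - old) / old * 100

# Regressions (metric went up)
print(round(change_pct(242.67, 604.75)))    # ~149 (imdb fcp)
print(round(change_pct(343.17, 706.75)))    # ~106 (imdb ContentfulSpeedIndex)

# Improvements (metric went down)
print(round(change_pct(2563.33, 1766.67)))  # ~31 (google-slides LastVisualChange)
```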

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(eitimielo)

Set release status flags based on info from the regressing bug 1741787

Has Regression Range: --- → yes

Hi Denis, these are regressions/changes from the warm pageload bytecache changes.
Do you see anything concerning here?

Flags: needinfo?(eitimielo) → needinfo?(dpalmeiro)
Severity: -- → S4
Priority: -- → P2

Could we add some profiles or videos to the bug, please? It would also be good practice to do this automatically for any regression bug. Thanks!

Flags: needinfo?(dpalmeiro) → needinfo?(eitimielo)

Hi Alex, as advised by sparky, could you please provide a side-by-side video recording along with a couple of Treeherder links to before/after profiler runs? (I think it's for imdb.)

Flags: needinfo?(eitimielo) → needinfo?(aionescu)
Attached video warm-side-by-side.mp4 (deleted) —
Flags: needinfo?(aionescu)

Thanks, Alex.

Still pending the profiler links, AFAICT.

Flags: needinfo?(aionescu)

:denispal, let us know if you need anything else. Thanks!

Flags: needinfo?(dpalmeiro)

Set release status flags based on info from the regressing bug 1741787

See this bug for another possible regression: https://bugzilla.mozilla.org/show_bug.cgi?id=1774420

:arai/:denispal, could either of you look into these regressions? I'm considering requesting a backout of this patch, as it seems to have made the variance in our tests much worse.

Here's a graph for ny-times showing how the noise went from being minor to very large: https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&selected=3777014,1540090385&series=autoland,3777014,1,13&timerange=7776000&zoom=1654682517687,1654715156439,980.5071408827089,1147.7607521690918

The tests in the alert summary also show large increases in variance.
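
One way to quantify the "noise went from minor to very large" observation is to compare the coefficient of variation (stddev / mean) of the replicates before and after the push. The sketch below is illustrative only; the sample values are made up, and the real replicates are in the Perfherder graphs linked above.

```python
# Compare run-to-run noise before and after a push using the coefficient
# of variation. The replicate values below are hypothetical placeholders.
import statistics

def cv(samples):
    return statistics.stdev(samples) / statistics.mean(samples)

before = [1010, 1005, 1018, 1002, 1012]  # hypothetical pre-push replicates
after  = [1000, 1150, 990, 1120, 1060]   # hypothetical post-push replicates

print(f"CV before: {cv(before):.3f}")
print(f"CV after:  {cv(after):.3f}")
```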

Flags: needinfo?(dpalmeiro) → needinfo?(arai.unmht)
Flags: needinfo?(dpalmeiro)

Just to make sure (because no one has mentioned this here): the patch changes which situation the "warm" score targets.

Before the patch, it was equivalent to the 2nd pageload in regular browsing outside of the test, where files are cached but the JavaScript is compiled again.
After the patch, it is equivalent to the 5th pageload in regular browsing outside of the test, where files are cached, the JavaScript bytecode is also cached, and the JavaScript isn't compiled.

So this is not actually a regression; rather, it means either:

  • the JavaScript bytecode cache doesn't work well for this case, or
  • there's something wrong with the test harness.

Given that, I now think we should revert "warm" to its previous behaviour and add another variant for the bytecode cache case.
That way we keep tracking the same thing as before for "warm", track the bytecode cache case separately, and can also check, for each run, whether the bytecode cache is actually improving performance.
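
To make the distinction concrete, here is a toy sketch (purely illustrative; this is not how Gecko or the harness models pageloads) of which work each pageload still has to do under the description above: the 1st load pays for the network fetch, loads 2 through 4 reuse the cached files but still compile the JavaScript, and from the 5th load onward the bytecode cache lets the compile step be skipped.

```python
# Toy model of the scenarios described above. The load numbers (2nd vs. 5th)
# come from the comment; the step names are illustrative.

FETCH, COMPILE, EXECUTE = "fetch", "compile", "execute"

def steps_for_load(n):
    steps = []
    if n == 1:
        steps.append(FETCH)    # cold load: files come from the network
    if n < 5:
        steps.append(COMPILE)  # source is cached, but bytecode is not yet
    steps.append(EXECUTE)
    return steps

for n in (1, 2, 5):
    print(f"load {n}: {steps_for_load(n)}")
```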

Flags: needinfo?(arai.unmht)

:arai, sounds good. I'll back out this patch and file a new bug to add an extra test type for the bytecode cache. We can run a similar but smaller set of tests with that variant.

Actually, we might just switch to a new test suite instead of backing out. I have a possible patch for it here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7a187e51823db646f36d35b805c7da3fbfeac187

Depends on: 1779468

This is resolved by bug 1779468. We created a new set of tests called browsertime-tp6-bytecode that runs with the prepopulated bytecode cache on warm pageloads. Our existing tests will revert to the previous behaviour.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Flags: needinfo?(dpalmeiro)