Closed Bug 1220179 Opened 9 years ago Closed 7 years ago

3-6% Linux 64/Win* tps regression on Mozilla-Inbound on October 30, 2015 from push 2e19045ba652ca2a5a5fc0e20d6f95293acfa32d

Categories

(Core :: Layout: Images, Video, and HTML Frames, defect)

45 Branch
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox45 --- affected

People

(Reporter: jmaher, Assigned: seth)

References

(Depends on 1 open bug)

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Talos has detected a Firefox performance regression from your commit 2e19045ba652ca2a5a5fc0e20d6f95293acfa32d in bug 1207355.  We need you to address this regression.

This is a list of all known regressions and improvements related to your bug:
http://alertmanager.allizom.org:8080/alerts.html?rev=2e19045ba652ca2a5a5fc0e20d6f95293acfa32d&showAll=1

On the page above you can see Talos alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#tps

Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux64,win64,win32 -u none -t g2  # add "mozharness: --spsProfile" to generate profile data

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
talos --develop -e <path>/firefox -a tps

Making a decision:
As the patch author, your feedback is needed to help us handle this regression.
*** Please let us know your plans by Monday, or the offending patch will be backed out! ***

Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Here is a comparison view:
https://treeherder.allizom.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=99a4fb4ba5c1&newProject=mozilla-inbound&newRevision=2e19045ba652

Many of these tests are noisy, and some of the reported changes are just noise, but the compare view shows the differences behind the alerts we got.

I have also done retriggers to ensure this is the problem:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=41d905020339&tochange=cb5ed62e361e&filter-searchStr=mozilla-inbound%20talos%20g2

:seth, what are your thoughts on this?  Is this regression expected on these platforms?  Are there any ways you can think of to improve this?  Do you need help pushing to try to bisect it to a specific patch?
Flags: needinfo?(seth)
I have a feeling the chunk in nsImageLoadingContent that was removed was added was back when because of talos. Adding it back to see could be informative.
(In reply to Timothy Nikkel (:tn) from comment #2)
> I have a feeling the chunk in nsImageLoadingContent that was removed was
> added was back when because of talos. Adding it back to see could be
> informative.

...was added *way* back when...
Seth mentioned some ideas about this on IRC.  When this was added *way* back, was that prior to July 15th (when tps was turned on)?
Yes, much before, 2009 to be specific. I didn't know that this was a new talos test.
So my theory, and I'm pretty sure it's correct, is that the regression comes from us no longer triggering decoding for CSS images immediately on tab switch. I'm working on patches that will fix that. It's much easier to fix this with bug 1157546 in place, so I'm going to make it block this bug.
Depends on: 1157546
Flags: needinfo?(seth)
Actually, I should just make bug 1218990 block this, since that's the real fix.

I pushed a baseline Talos run (which is posted in bug 1157546) so we can compare with bug 1218990 and verify the fix. I'd expect our tps performance to return to the level before bug 1207355 landed, or perhaps even get better than that.
Depends on: 1218990
No longer depends on: 1157546
Setting assignee to reflect above comments.
Assignee: nobody → seth
This is now live on Aurora; we see regressions on:
win7
winxp
win7-e10s
linux64
linux64-e10s

:seth, is there an expected time to land the fixes for this?
Flags: needinfo?(seth)
Just wanted to post an update here: this hasn't gotten fixed because I've had a hard time getting bug 1157546 landed. We're finally seeing some movement there, so I hope to get this fixed soon.
Version: Trunk → 45 Branch
I assume we can close this out now that bug 1157546 is landed?
(In reply to Joel Maher ( :jmaher) from comment #11)
> I assume we can close this out now that bug 1157546 is landed?

No, bug 1157546 was just some infrastructure work. It didn't change how we handle CSS images; it just re-implemented what we already had. That is, at least, if we assume comment 6 is correct in pinpointing the cause of this regression.

So this regression is probably still in, and we should fix it.
Seth hasn't replied in 2 years, so I think it is safe to assume he is not going to reply on this topic. I recommended closing this out 7 months ago and nobody has worked on it.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(seth.bugzilla)
Resolution: --- → WONTFIX
Product: Core → Core Graveyard
Product: Core Graveyard → Core