Closed
Bug 1165693
Opened 10 years ago
Closed 9 years ago
2.72% Linux* tp5o regression on Mozilla-Inbound-Non-PGO (v.41) on May 12, 2015 from push d6d25651b082
Categories
(Testing :: Talos, defect)
Tracking
(firefox41 fixed)
RESOLVED
FIXED
mozilla41
Tracking | Status | |
---|---|---|
firefox41 | --- | fixed |
People
(Reporter: vaibhav1994, Assigned: jfkthame)
References
Details
Attachments
(2 files)
(deleted),
patch
|
jtd
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jtd
:
review+
|
Details | Diff | Splinter Review |
Talos has detected a Firefox performance regression from your commit d6d25651b082 in bug 1056479. We need you to address this regression.
This is a list of all known regressions and improvements related to your bug:
http://alertmanager.allizom.org:8080/alerts.html?rev=d6d25651b082&showAll=1
On the page above you can see Talos alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5
Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux -u none -t tp5o # add "mozharness: --spsProfile" to generate profile data
To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code
Then run the following command from the directory where you set up Talos:
talos --develop -e <path>/firefox -a tp5o
Making a decision:
As the patch author we need your feedback to help us handle this regression.
Our wiki page oulines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Reporter | ||
Comment 1•10 years ago
|
||
John, the revisions causing the regression is one of: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=d6d25651b082
Here are the re-triggered pushes on mozilla-inbound-non-pgo: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=2d8698ff73b6&tochange=c48d5ce78376&filter-searchStr=Ubuntu%20HW%2012.04%20x64%20mozilla-inbound%20talos%20tp5o
Note that, it is not necessary to backout the offending patches. Since it is 2.72% only, it will be good if we can fix it else we can mark it as wontfix if it is too time consuming.
Depends on: 1056479
Flags: needinfo?(jdaggett)
Updated•10 years ago
|
Assignee: nobody → jdaggett
Flags: needinfo?(jdaggett)
Comment 2•10 years ago
|
||
(In reply to Vaibhav (:vaibhav1994) from comment #1)
> John, the revisions causing the regression is one of:
> http://hg.mozilla.org/integration/mozilla-inbound/
> pushloghtml?changeset=d6d25651b082
Yes, this is the change in the font backend for Linux. We're now using a larger amount of shared code across platforms with less Linux-specific code.
I'll take and look and see how we can tune this a bit.
Comment 3•10 years ago
|
||
That sounds great, thanks John!
Updated•10 years ago
|
Comment 5•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #4)
> John, any luck with this? It has been a while.
I was hoping some of the other fixes would bring down the tp5o numbers but it looks like it didn't. I'll do some more tuning.
Flags: needinfo?(jdaggett)
Comment 6•9 years ago
|
||
Hey :jtd, I am checking in to see if this is still an active bug for you or not. If there is a timeline to work on this great, if not- I prefer to close this as wontfix.
Flags: needinfo?(jdaggett)
Assignee | ||
Comment 7•9 years ago
|
||
To get a 2.7% regression on tp5o, it seems like the underlying regression in the font backend must be considerably greater, as font management/selection is only a small part of tp5o's overall work. So I really hope we can identify what's behind this (maybe profiling a Linux tp5o run would identify hotspots in the font code?) and find a way to improve it.
Assignee | ||
Comment 8•9 years ago
|
||
I pushed a try run with profiling, and from looking at a few of the resulting profiles, it appears that the FcConfigSubstitute() calls in gfxFcPlatformFontList::FindFamily can be pretty expensive. E.g. in
http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try-Non-PGO/sha512/6768d1d0db0589a8b56f292293f99b3ab65fb5d115b15cd6671b24a1755f4b28f63abab161c91e0cc956678d36ee8da0af6904cc1a8be2dd39b411ed8d9eef44&pathInZip=profile_tp5o/huffingtonpost.com/cycle_0.sps
it looks like 3% of the total running time is spent in those calls alone.
So if we can find some alternative that avoids calling FcConfigSubstitute (twice, even!), this should help significantly. Or at least cache the sentinelFirstFamily instead of recomputing it each time, and only invalidate the cached value if the font list changes. That should virtually halve the cost of FindFamily().
Comment 9•9 years ago
|
||
oh awesome stuff!
Assignee | ||
Comment 10•9 years ago
|
||
This should provide a small but (probably) measurable improvement, afaict; though if we could avoid the other FcConfigSubstitute call too, that would be more better.
Attachment #8616179 -
Flags: review?(jdaggett)
Assignee | ||
Comment 11•9 years ago
|
||
Tryserver tp5o run with the patch above gives an average of 304.213:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=954c9ddefc70
Tryserver tp5o run on the parent of the above (i.e. without the patch), average 314.47:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=de0e7394c67d
Which suggests an improvement of slightly over 3%. :)
Comment 12•9 years ago
|
||
well done!
Comment 13•9 years ago
|
||
Comment on attachment 8616179 [details] [diff] [review]
Cache the result of calling FcConfigSubstitute for our sentinel font name, to make gfxFcPlatformFontList::FindFamily less expensive
Review of attachment 8616179 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good!
Attachment #8616179 -
Flags: review?(jdaggett) → review+
Assignee | ||
Comment 14•9 years ago
|
||
Assignee | ||
Comment 15•9 years ago
|
||
We should be able to get an additional boost of a similar amount by caching the actual familyname lookups in FindFamily, so that we can bypass the expensive FcConfigSubstitute call on any subsequent lookups for the same family name. Tryserver job https://treeherder.mozilla.org/#/jobs?repo=try&revision=23c1f410f8ed shows a tp5o average of 292.14 with this added patch.
Attachment #8616771 -
Flags: review?(jdaggett)
Assignee | ||
Updated•9 years ago
|
Assignee: jdaggett → jfkthame
Status: NEW → ASSIGNED
Assignee | ||
Comment 16•9 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #15)
> tp5o average of 292.14 with this added patch.
Err... I mean an average of 293.574, but that's still worthwhile. :)
Comment 17•9 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #16)
> (In reply to Jonathan Kew (:jfkthame) from comment #15)
>
> > tp5o average of 292.14 with this added patch.
>
> Err... I mean an average of 293.574, but that's still worthwhile. :)
The tp5o numbers tend to vary widely, it's actually better to use the PGO build numbers because those tend to vary a whole lot less and regressions/fixes are easier to see.
Flags: needinfo?(jdaggett)
Comment 18•9 years ago
|
||
Comment on attachment 8616771 [details] [diff] [review]
patch 2 - Cache family-name lookups in gfxFcPlatformFontList::FindFamily, to avoid repeating expensive calls to FcConfigSubstitute
Review of attachment 8616771 [details] [diff] [review]:
-----------------------------------------------------------------
> // Example:
> //
> // serif ==> DejaVu Serif, ...
> // Helvetica, serif ==> Helvetica, TeX Gyre Heros, Nimbus Sans L, DejaVu Serif
> //
> // In this case fontconfig is including Tex Gyre Heros and
> // Nimbus Sans L as alternatives for Helvetica.
>
> + NS_ConvertUTF16toUTF8 familyToFind(familyName);
> + gfxFontFamily* cached = mFcSubstituteCache.GetWeak(familyToFind);
> + if (cached) {
> + return cached;
> + }
> +
We should probably rewrite the comment here to explain what's going on
below, since at this point that comment is outdated.
Attachment #8616771 -
Flags: review?(jdaggett) → review+
Comment 19•9 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox41:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Comment 20•9 years ago
|
||
verified on graph server!
Assignee | ||
Comment 21•9 years ago
|
||
Comment 22•9 years ago
|
||
Assignee | ||
Comment 23•9 years ago
|
||
FTR, the first landing here resulted in:
>>>>>
Improvement: Mozilla-Inbound - Tp5 Optimized - Ubuntu HW 12.04 x64 - 2.91% decrease
-----------------------------------------------------------------------------------
Previous: avg 244.216 stddev 1.925 of 12 runs up to revision edaec128d6ec
New : avg 237.105 stddev 1.470 of 12 runs since revision e03144903268
Change : -7.112 (2.91% / z=3.694)
Graph : http://mzl.la/1I0GTZm
Changeset range: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=edaec128d6ec&tochange=e03144903268
<<<<<
and then the second patch gave us in addition:
>>>>>
Improvement: Mozilla-Inbound-Non-PGO - Tp5 Optimized - Ubuntu HW 12.04 - 2.73% decrease
---------------------------------------------------------------------------------------
Previous: avg 348.147 stddev 1.815 of 12 runs up to revision 5389f1e4087c
New : avg 338.638 stddev 2.570 of 12 runs since revision 0c35e99b2167
Change : -9.509 (2.73% / z=5.240)
Graph : http://mzl.la/1GyBNsw
Changeset range: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=5389f1e4087c&tochange=0c35e99b2167
<<<<<
You need to log in
before you can comment on or make changes to this bug.
Description
•