Closed Bug 1298608 Opened 8 years ago Closed 8 years ago

2.63 - 20.65% sessionrestore_no_auto_restore / tart / tp5n main_normal_fileio / tp5o Main_RSS / tp5o Private Bytes / tp5o_scroll / tsvgr_opacity (windows7-32, windowsxp) regression on push 209072396aa5ab5c5a0a28109a980dbbcd884922 (Thu Aug 25 2016)

Categories

(Core :: Graphics, defect)

51 Branch
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox51 --- affected

People

(Reporter: jmaher, Assigned: mchang)

References

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push 209072396aa5ab5c5a0a28109a980dbbcd884922. As author of one of the patches included in that push, we need your help to address this regression. Regressions: 21% tp5n main_normal_fileio windows7-32 pgo 52101054.17 -> 62861425.83 18% tp5n main_normal_fileio windows7-32 opt 52715503.92 -> 62263896.33 17% tart summary windowsxp opt 6.03 -> 7.07 15% tsvgr_opacity summary windows7-32 opt 695 -> 797.92 15% tsvgr_opacity summary windowsxp opt 689.07 -> 789.59 14% tp5o_scroll summary windowsxp pgo 3.17 -> 3.62 14% tart summary windowsxp pgo 4.72 -> 5.39 14% tsvgr_opacity summary windowsxp opt 686.62 -> 782.7 10% tp5o_scroll summary windowsxp opt 4.37 -> 4.79 5% tp5o Main_RSS windows7-32 pgo e10s 103321503.91 -> 108475935.56 5% tp5o Main_RSS windows7-32 opt 170334630.52 -> 178443877.92 4% tp5o Private Bytes windows7-32 pgo 200766365.94 -> 207989087.29 4% tp5o Main_RSS windowsxp pgo 218783527.86 -> 226472317.73 3% tp5o Private Bytes windowsxp opt 204362350.89 -> 211405027.36 3% tp5o Private Bytes windowsxp pgo 202393707.92 -> 209308076.8 3% tp5o Main_RSS windowsxp opt 222326566.25 -> 229492442.28 3% tp5o Main_RSS windows7-32 pgo 169171122.74 -> 174158572.74 3% sessionrestore_no_auto_restore windowsxp pgo 694.92 -> 713.17 Improvements: 58% tsvgr_opacity summary windows7-32 pgo e10s 934.8 -> 388.74 57% tsvgr_opacity summary windowsxp pgo e10s 901.89 -> 388.68 39% tsvgr_opacity summary windows7-32 opt e10s 1086.24 -> 665.17 38% tsvgr_opacity summary windowsxp opt e10s 1039.3 -> 641.29 32% tsvgx summary windowsxp opt e10s 272.79 -> 186.83 28% tsvgx summary windows7-32 pgo 400.21 -> 287.02 28% tsvgx summary windows7-32 opt e10s 274.5 -> 198.56 27% tsvgx summary windowsxp pgo e10s 232.65 -> 168.79 27% tsvgx summary windowsxp pgo 405.4 -> 296.45 26% tps summary windows7-32 pgo e10s 46.95 -> 34.55 26% tsvgx summary windows7-32 opt 464.12 -> 344.55 26% tsvgx summary windows7-32 pgo e10s 221.85 -> 164.86 25% tsvgx summary windowsxp opt 465.3 -> 348.81 23% cart summary windowsxp pgo 40.64 -> 31.23 22% tps summary windows7-32 opt e10s 53.58 -> 41.56 22% tps summary windowsxp pgo e10s 40.91 -> 31.87 21% tsvgr_opacity summary windows7-32 pgo 563.79 -> 445.88 18% tsvgr_opacity summary windowsxp pgo 550.86 -> 449.83 18% tps summary windowsxp opt e10s 48.54 -> 39.89 14% cart summary windowsxp opt 44.82 -> 38.71 13% tps summary windows7-32 pgo 64.11 -> 55.65 10% tps summary windows7-32 opt 79.38 -> 71.52 8% tcanvasmark summary windowsxp pgo e10s 8656.02 -> 9317.25 7% tps summary windowsxp pgo 59.92 -> 55.63 7% tcanvasmark summary windows7-32 pgo e10s 8892.38 -> 9503.71 7% tcanvasmark summary windows7-32 pgo 8853.67 -> 9432.17 6% tps summary windowsxp opt 77.03 -> 72.31 2% tp5o summary windows7-32 opt e10s 363.52 -> 354.56 You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=2749 On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format. To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running *** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
I did a series of retriggers and generated a compare view: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=024e7dc7a219&newProject=mozilla-inbound&newRevision=209072396aa5&framework=1 these match up well with the alerts. I see winxp having tart, tp5o_scroll, and svg_opacity regressions, otherwise memory and fileio for win7 is increased. Oddly most seem to be non-e10s, maybe that makes sense given the nature of the patch. Mason, can you look into this?
Component: Untriaged → Graphics
Flags: needinfo?(mchang)
Product: Firefox → Core
Assignee: nobody → mchang
Flags: needinfo?(mchang)
From irc, if you could please have a patch or something to get some memory profiling data from this push, that'd be great! As for the logs, the amount of bytes read on main thread from before (https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-inbound-win32/1472152998/mozilla-inbound_win7_ix_test-xperf-bm112-tests1-windows-build637.txt.gz) and after (https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-inbound-win32/1472153839/mozilla-inbound_win7_ix_test-xperf-bm126-tests1-windows-build567.txt.gz), the amount of mainthread_readbytes is roughly the same, and actually the amount of readbytes from xul.dll is exactly the same.
Flags: needinfo?(jmaher)
I figured out how to get the data, here are some try pushes: https://treeherder.mozilla.org/#/jobs?repo=try&author=jmaher@mozilla.com&fromchange=cae7eabc8434a4e4380cc5dd7761fb1dffa8ee6e&tochange=0beee12cd65a565d551e0d57742116faf3a1e1b2 I suspect tomorrow I will be able to pinpoint the issue with about 10 minutes looking into the log files.
ok, this is really odd, but it looks like this patch is showing a difference in fonts?!? here is the differences in fileIO access for the main thread during normal running (non startup/shutdown): filename bad good diff C:\windows\Fonts\simsun.ttc 8773632 3686400 5087232 C:\windows\Fonts\msmincho.ttc 3731456 1785856 1945600 C:\windows\Fonts\msgothic.ttc 5828608 4231168 1597440 C:\windows\Fonts\gulim.ttc 3190784 2408448 782336 C:\windows\Fonts\timesbd.ttf 356352 163840 192512 C:\windows\Fonts\lucon.ttf 98304 98304 C:\windows\Fonts\times.ttf 401408 327680 73728 C:\windows\Fonts\segoeuil.ttf 65536 65536 C:\windows\Fonts\tahoma.ttf 380928 319488 61440 C:\windows\Fonts\verdanab.ttf 139264 90112 49152 C:\windows\Fonts\ariali.ttf 241664 196608 45056 C:\windows\Fonts\timesi.ttf 229376 184320 45056 C:\windows\Fonts\cour.ttf 98304 65536 32768 C:\windows\Fonts\l_10646.ttf 131072 98304 32768 C:\windows\Fonts\tahomabd.ttf 356352 331776 24576 C:\Users\cltbld\AppData\Local\Temp\TMPDIR\profile\cache2\index.log 83096 64984 18112 C:\windows\Fonts\arialbi.ttf 217088 204800 12288 C:\windows\Fonts\shruti.ttf 270336 262144 8192 C:\windows\Fonts\msyhbd.ttf 3448832 3444736 4096 C:\windows\Fonts\pala.ttf 315392 311296 4096 C:\windows\Fonts\seguisym.ttf 299008 294912 4096 C:\windows\Fonts\trebucbd.ttf 102400 98304 4096 C:\$Extend\$UsnJrnl:$J 4096 8192 -4096 C:\windows\Fonts\malgun.ttf 32768 49152 -16384 C:\windows\Fonts\segoeuib.ttf 32768 65536 -32768 C:\Users\cltbld\AppData\Local\Temp\TMPDIR\profile\cookies.sqlite 819400 950472 -131072 C:\windows\Fonts\meiryo.ttc 2183168 2334720 -151552 C:\windows\Fonts\msyh.ttf 7987200 8183808 -196608 C:\windows\Fonts\msjh.ttf 32768 327680 -294912 C:\windows\Fonts\mingliu.ttc 303104 618496 -315392 C:\$LogFile 32768 385024 -352256 it is possible that we are forcing fonts to default to different families? Maybe we are displaying errors/css/info that uses certain fonts? Maybe to render we prefer a different font?
Flags: needinfo?(jmaher)
:mchang, you mentioned over IRC some ideas that you had, did you have any luck?
Flags: needinfo?(mchang)
(In reply to Joel Maher ( :jmaher) from comment #5) > :mchang, you mentioned over IRC some ideas that you had, did you have any > luck? No not yet, hoping to get to it this week, but skia windows is nightly only after bug 1301111 and I might back out, not sure yet.
Flags: needinfo?(mchang)
Bug 1298484 fixes the Windows XP Main_RSS and Private Bytes regression.
Depends on: 1298484
The xperf on my local windows machine that came with the SDK ( Microsoft (R) Windows (R) Performance Analyzer Version 6.3.9600) doesn't seem to work with the way it's being run in talos. Do you know what version of xperf is on try?
Flags: needinfo?(jmaher)
I am not sure of the version on the talos machines, but here are the instructions for setting up a new machine: https://wiki.mozilla.org/index.php?title=ReferencePlatforms/Test/Win7#Xperf_and_mozprofilerprobe.mof
Flags: needinfo?(jmaher)
Skia windows is being backed out so the numbers should reset soon.
Joel, do you know if this update is installed on the Windows 7 try machines? - https://support.microsoft.com/en-us/kb/2505438
Flags: needinfo?(jmaher)
/me redirects to Q. Q, do you know if this update is installed on the Windows 7 try machines? - https://support.microsoft.com/en-us/kb/2505438
Flags: needinfo?(jmaher) → needinfo?(q)
I think we're going to accept the regressions. The tp5n main_normal_fileio windows7-32 regressions don't make a lot of sense and aren't related to using dwrite fonts. From my investigation, we can't control when windows decides to read from a font file versus hitting the windows cache. Even with GDI fonts, both cairo and skia use ExtTextOut and native GDI system calls to get the font, and we're not requesting any extra fonts over cairo. The svgr_opacity tests only get worse in non-e10s. Running it through a profiler, all the time is being spent in the blitter, but we're not using any different blitter in e10s and nothing else stands out in the profile. Most of the RSS regressions were fixed in bug 1298484. Finally for the tart regressions, we accepted these regressions in bug 1294337. Since this is layers acceleration disabled, we use cairo to composite with basic layers and all the hot path is in the compositor. I didn't really see anything in skia that stood out. At least we get a lot of nice wins.
:mchang, are we going to reland skia for windows?
(In reply to Joel Maher ( :jmaher) from comment #15) > :mchang, are we going to reland skia for windows? Not in 51, but in 52. We're going to do a telemetry experiment first in bug 1299605, then reland sometime in the middle of October.
Flags: needinfo?(q)
Blocks: 1302124
No longer blocks: 1291351
While looking at a different Skia regression with Lee, I saw that tp5n nonmain_startup_fileio op are vastly different on inbound versus try. See: https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,2f3af3833d55ff371ecf01c41aeee1939ef3a782,1,1%5D&series=%5Btry,2f3af3833d55ff371ecf01c41aeee1939ef3a782,1,1%5D&series=%5Btry,e5f5eaa174ef22fdd6b6e150e8c450aa827c2ff6,1,1%5D&series=%5Bmozilla-inbound,e5f5eaa174ef22fdd6b6e150e8c450aa827c2ff6,1,1%5D Do you have any idea why this is Joel? Or am I looking at the data right? Thanks!
Flags: needinfo?(jmaher)
yes, that is tracked in bug 1274018. I am not sure what to do here, but that is what is happening.
Flags: needinfo?(jmaher)
Looks like the tsvgr_opacity summary windows7-32 opt is fixed by bug 1299435. Also resolving this as wontfix.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.