Closed Bug 1299070 Opened 8 years ago Closed 8 years ago

2.07 - 4.51% sessionrestore / sessionrestore_no_auto_restore / ts_paint (linux64, osx-10-10, windows7-32, windows8-64, windowsxp) regression on push 294c959e23689bdc9c3276c552bd91db8ae462a4 (Sun Aug 28 2016)

Categories

(DevTools :: General, defect, P1)

51 Branch
defect

Tracking

(firefox48 unaffected, firefox49 unaffected, firefox50 unaffected, firefox51 fixed)

RESOLVED FIXED
Firefox 51
Iteration:
51.3 - Sep 19
Tracking Status
firefox48 --- unaffected
firefox49 --- unaffected
firefox50 --- unaffected
firefox51 --- fixed

People

(Reporter: ashiue, Assigned: jdescottes)

References

Details

(Keywords: perf, regression, talos-regression, Whiteboard: [devtools-html])

Attachments

(2 files)

Talos has detected a Firefox performance regression from push 294c959e23689bdc9c3276c552bd91db8ae462a4. As author of one of the patches included in that push, we need your help to address this regression. Regressions: 5% sessionrestore windows7-32 pgo 686.29 -> 717.25 4% sessionrestore windows7-32 opt e10s 787.33 -> 819.83 4% sessionrestore windows7-32 opt 899.96 -> 936.58 4% sessionrestore_no_auto_restore windows8-64 opt 858.25 -> 889.92 4% sessionrestore windows8-64 opt e10s 710.08 -> 735.92 4% sessionrestore windows8-64 opt 820 -> 849.58 4% sessionrestore_no_auto_restore windows8-64 opt e10s 746 -> 772.67 4% sessionrestore_no_auto_restore windows7-32 opt 957.21 -> 990.92 3% ts_paint linux64 opt 1532.46 -> 1581.42 3% sessionrestore osx-10-10 opt 1040.08 -> 1072.25 3% sessionrestore_no_auto_restore windowsxp opt 912.62 -> 940 3% sessionrestore_no_auto_restore windows7-32 opt e10s 833.67 -> 858.42 3% ts_paint linux64 opt e10s 1455.08 -> 1494.83 3% sessionrestore linux64 opt e10s 1047.42 -> 1075.67 3% sessionrestore_no_auto_restore osx-10-10 opt 1083.83 -> 1112.5 2% sessionrestore linux64 opt 1185.42 -> 1210 You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=2875 On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format. To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running *** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
After doing some retriggers, this issue might be caused by one of following patches: https://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=087a40f5eaa909ceea0d41f0357c50ee3df47a4d&tochange=294c959e23689bdc9c3276c552bd91db8ae462a4 Hi Julian, as you are the patch author, can you take a look at this and determine what is the root cause? Thanks!
Blocks: 1291351, 1295608
Flags: needinfo?(jdescottes)
Flags: qe-verify-
Whiteboard: [devtools-html] [triage]
Testing locally, I could reproduce a performance regression on the sessionrestore tests (~2%). I think the root cause of the regression is that since Bug 1295608, devtools are parsing localization properties files using javascript. And on browser startup, the following files are being parsed for devtools: # loaded by l10n.js devtools-shared/locale/shared.properties # loaded by definitions.js devtools/locale/toolbox devtools/locale/inspector devtools/locale/webconsole devtools/locale/debugger devtools/locale/styleeditor devtools/locale/shadereditor devtools/locale/canvasdebugger devtools/locale/performance devtools/locale/memory devtools/locale/netmonitor devtools/locale/storage devtools/locale/webaudioeditor devtools/locale/scratchpad.properties devtools/locale/dom.properties devtools/locale/toolbox.properties # loaded by browser-menus.js devtools/locale/menus.properties As we can see most of them come from definitions.js, which was updated in https://hg.mozilla.org/integration/fx-team/rev/294c959e2368. After backing out this patch, the results for sessionrestore seem to be back to the baseline (currently running tests on try to verify that). Moving on, I can see three options here: - backout 294c959e2368 - improve the performance of the javascript parser - introduce some lazy loading to avoid localizing all strings at startup time I don't think we should be investing too much time in performance improvements of the parsing library: after 1294186 lands, each property file will only ever be parsed once. I would prefer to go for the 3rd option and either lazyload definitions.js in devtools.js, or lazyload the localized strings inside of definitions.js. I triggered the following try tests: - backout of 294c9599e2368 : https://treeherder.mozilla.org/#/jobs?repo=try&revision=3835bb78c18c - lazyloading : https://treeherder.mozilla.org/#/jobs?repo=try&revision=5efdb82d00c5
As discussed with bgrins, here is another approach using getters for each of the definitions.js strings : - try https://treeherder.mozilla.org/#/jobs?repo=try&revision=1b1e6e9c1374 - perf compare https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=1b1e6e9c1374
Flags: needinfo?(jdescottes)
Results from previous tests =========================== - adding lazyloading for strings in definitions.js only partially solves the issue. browser-menus.js will still access the label & accessKey for each registered tools on startup in order to build the developer menu. So even with lazy loading, we are still loading properties files for 6 tools (+1 for ellipsis, + 1 for the other menu strings). - backout of the definitions.js change on a recent fx-team still seems to be ~2% slower on sessionrestore tests. ts_paint tests seem to be fine. Applying the backout on top of the regression itself, performances seem extremely similar to the "good" reference. The only properties file parsed in this case is the one for the ellipsis. [I have lots of over test results but detailing all of them here would be too long, ping me if interested.] There is no way to completely avoid loading 7 devtools properties files (6 default tools registered by default, plus the menu). Introducing lazyloading will at most get down the number of parsed properties files from 13 to 7. From the results I get on perfherder so far it doesn't translate into a 50% perf improvement though (https://treeherder.mozilla.org/#/jobs?repo=try&revision=b33f0e319e70). And I still don't really understand why. Local profiling =============== Locally I measured the time spent to localize all strings needed by devtools on browser startup - ~2ms using Services.string (similar to the situation before Bug 1295608) - ~25ms using require + js properties parser The 23ms increase seems to make sense. For instance, the sessionrestore test takes a little under 700ms on my machine, which means string localization is responsible for a bit more than 3% of that time. This is consistent with the initial regression figures. My assumption is that the regression solely comes from this. If I analyze where the time is spent between requiring & parsing. On average requiring takes 6/25ms and parsing takes the remaining 19/25ms. (implementing a simpler parser, I got this down to 7ms for overall parsing, but some edge cases are missing and not sure how fast it can be while still being complete enough for our needs). The time is evenly spent amongst the 13 properties files, so ~2ms for each. So I would expect the regression to be completely proportional to the number of properties files downloaded. As I said, that's not what I have seen so far on perfherder. I am trying to get more profiling data from perfherder, I will know later today if what I'm seeing locally correlates with what happens on perfherder. [I also noticed a strange behavior. After doing a fresh build of Firefox, the two first times I open the browser, I will get the ~23ms figure mentioned above, but the 3rd time, the first require takes 70ms. I have not observed this kind of variance with the Services.string implementation. This also doesn't seem to occur when I run the Talos tests locally.] Next steps ========== Further improvements to investigate/discuss: - regroup all definitions strings in a single properties file - use a faster parser - keep using Services.string when available
> - regroup all definitions strings in a single properties file Here is a try push implementing this : https://treeherder.mozilla.org/#/jobs?repo=try&revision=d95b6dab4a7f And the perfherder comparison with the good baseline: https://treeherder.mozilla.org/perf.html#/compare?originalProject=fx-team&originalRevision=087a40f5eaa909ceea0d41f0357c50ee3df47a4d&newProject=try&newRevision=d95b6dab4a7f&framework=1&showOnlyImportant=0 There is still one orange on sessionrestore opt e10s for Linux64. And I see a lot of ~+1% on ts_paint and sessionrestore tests even if they don't popup as "important" changes. Alison, could you have a look at the perfherder link above? In my opinion this is still not OK, but I'm not used to work with Perfherder and maybe you have a different take on this.
Flags: needinfo?(ashiue)
thanks Julian for coming up with a fix and getting great data on try. I took some time this morning looking at this, and overall I think these changes are fixing the large majority of any regressions :) Looking at the graph links available on the compare view, you can see many of the values from your try push is posting lower than the values on fx-team, very similar to what you see prior to the regression. Another way to look at this is looking at the raw values seen in the compare view vs the raw values from comment 0, things are much closer to the original.
Flags: needinfo?(ashiue)
Thanks for the feedback Joel! I've tried some other variations (https://treeherder.mozilla.org/#/jobs?repo=try&revision=79908fa3180c & https://treeherder.mozilla.org/#/jobs?repo=try&revision=361b5d153b45) but I'm not seeing a huge difference compared to https://treeherder.mozilla.org/#/jobs?repo=try&revision=d95b6dab4a7f. One issue with this patch is that it involves moving a lot of localized strings from one file to another. I guess we will again need a kind of migration script for localization teams (see Bug 1295157).
Assignee: nobody → jdescottes
Status: NEW → ASSIGNED
Iteration: --- → 51.3 - Sep 12
Priority: -- → P1
Whiteboard: [devtools-html] [triage] → [devtools-html]
Attachment #8787247 - Flags: review?(bgrinstead) → review+
Axel, to fix this performance regression, I want to move ~60 strings from existing properties files to a new properties file. What do you advise to do wrt localization teams? Would an email explaining what changed and why be ok here?
Flags: needinfo?(l10n)
Is it possible to see what the string move looks like? I don't think there's much else that we can do at the moment, besides sending an email. I can take care of that in the report I send at the beginning of the cycle. It's also still not completely clear what's the timeline for bug 1295157 is, and if it's going to happen for all devtools at the same time.
Flags: needinfo?(l10n)
Comment on attachment 8787248 [details] Bug 1299070 - Merge all startup strings in startup.properties; https://reviewboard.mozilla.org/r/76092/#review74074 Good find
Attachment #8787248 - Flags: review?(bgrinstead) → review+
(In reply to Francesco Lodolo [:flod] from comment #12) > Is it possible to see what the string move looks like? > You can check https://bugzilla.mozilla.org/attachment.cgi?id=8787248 > I don't think there's much else that we can do at the moment, besides > sending an email. I can take care of that in the report I send at the > beginning of the cycle. > Ok, thanks a lot! > It's also still not completely clear what's the timeline for bug 1295157 is, [Will also reply directly in Bug 1295157] I don't think we will have a script ready for the next merge day (Sep 12), so I'll wait after that to start landing DTDs -> properties patches. However I think we will have to start landing in Firefox 52 anyway. > and if it's going to happen for all devtools at the same time. For now we focus on the inspector, webconsole and overall toolbox. So the migration will not happen for all devtools at the same time. Other modules will be migrated later (TBD).
Comment on attachment 8787248 [details] Bug 1299070 - Merge all startup strings in startup.properties; Francesco, I'll let you review the file and let me know if you have any concern before landing this. Thanks!
Attachment #8787248 - Flags: feedback?(francesco.lodolo)
Comment on attachment 8787248 [details] Bug 1299070 - Merge all startup strings in startup.properties; The patch looks good, since it moves existing strings from one file to the other. This also made me notice one bad approach to strings, and it's mostly an FYI for the future. ToolboxDebugger.label=Debugger debuggerMenu.accesskey=D The two strings have IDs completely unrelated, you need a wild guess to know which string an accesskey is related to, unless you look at the code. Proper and intuitive naming schema: debuggerMenu.label=Debugger debuggerMenu.commandkey=S debuggerMenu.accesskey=D
Attachment #8787248 - Flags: feedback?(francesco.lodolo) → feedback+
Component: Untriaged → Developer Tools
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 51
Found one weirdness only after copying strings in my own localization: storage.tooltip3 is now in both storage.properties and startup.properties, I assume that's not wanted?
Flags: needinfo?(jdescottes)
(In reply to Francesco Lodolo [:flod] (slow to reply, mostly away Sep 3-10) from comment #21) > Found one weirdness only after copying strings in my own localization: > storage.tooltip3 is now in both storage.properties and startup.properties, I > assume that's not wanted? Thanks for catching that! Incidentally, I have an opened bug for cleaning up unused strings (Bug 1299580), I attached a patch to remove it there.
Flags: needinfo?(jdescottes)
Thanks for this fix! The performance improvement already shown on talos. Improvements: 4% sessionrestore_no_auto_restore windowsxp opt e10s 804.46 -> 773.42 4% sessionrestore linux64 pgo e10s 959.79 -> 924.58 3% sessionrestore_no_auto_restore osx-10-10 opt e10s 1025.67 -> 992.67 3% sessionrestore_no_auto_restore windows8-64 opt 902.42 -> 875 3% sessionrestore windows7-32 opt 942.29 -> 913.92 3% sessionrestore windows8-64 opt 863.42 -> 837.75 3% sessionrestore osx-10-10 opt e10s 980.25 -> 951.58 3% sessionrestore linux64 opt e10s 1087.75 -> 1059.33 3% sessionrestore_no_auto_restore linux64 opt e10s 1138.33 -> 1109.17 3% ts_paint linux64 pgo e10s 1329.42 -> 1295.83 2% ts_paint linux64 pgo 1412.83 -> 1384.42 2% sessionrestore_no_auto_restore windowsxp opt 950.33 -> 935.83 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=2987
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: