Closed
Bug 1386867
Opened 7 years ago
Closed 7 years ago
3.31% quantum_pageload_google (windows10-64) regression on push 58b579b4ef4e1fb938297bc43a7fc7e4b2168a4a (Wed Aug 2 2017)
Categories
(Firefox :: Session Restore, defect, P1)
Firefox
Session Restore
Tracking
()
RESOLVED
FIXED
Firefox 57
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox55 | --- | unaffected |
firefox56 | --- | unaffected |
firefox57 | blocking | fixed |
People
(Reporter: jmaher, Assigned: ttaubert)
References
Details
(Keywords: perf, regression, talos-regression)
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
nika
:
review+
|
Details | Diff | Splinter Review |
Talos has detected a Firefox performance regression from push:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=58b579b4ef4e1fb938297bc43a7fc7e4b2168a4a
As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
3% quantum_pageload_google summary windows10-64 opt e10s stylo 527.56 -> 545.04
You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=8494
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests
For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running
*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Reporter | ||
Updated•7 years ago
|
Component: Untriaged → Session Restore
Reporter | ||
Comment 1•7 years ago
|
||
this appears to be a similar regression for the non stylo case, as well as the pgo case.
:mystor, I see that you authored 2 of the 3 patches in the bug, could you take a look at this and determine why this is happening and help figure out a resolution?
Flags: needinfo?(michael)
Comment 2•7 years ago
|
||
I wrote two of the patches, but they were minor fixups - ttaubert probably will have a better idea of what's going on here.
Flags: needinfo?(michael) → needinfo?(ttaubert)
Updated•7 years ago
|
status-firefox55:
--- → unaffected
status-firefox56:
--- → unaffected
status-firefox57:
--- → affected
status-firefox-esr52:
--- → unaffected
tracking-firefox57:
--- → ?
Version: 53 Branch → Trunk
Assignee | ||
Comment 3•7 years ago
|
||
Ok let's start with the obvious: I have no idea. I profiled talos locally and it looks like the only thing we're doing differently and that Google is heavy on might be DOMSessionStorage?
1) Here's the backout as a baseline:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0
2) A tiny modification:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ee522a878a2022a8ee73a7a8da8c773279014596
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=ee522a878a2022a8ee73a7a8da8c773279014596&framework=1&filter=tp6_google&showOnlyImportant=0
[No improvement here.]
3) Commenting out SessionStore.collect():
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e138ccaf815ba905d1e3dd1886a5a0afede4f542
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=e138ccaf815ba905d1e3dd1886a5a0afede4f542&framework=1&filter=tp6_google&showOnlyImportant=0
[Surprisingly, no improvement here either.]
4) For fun, deactive the SessionStorageListener:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b580b8cbbf5dc7d1f089733453a90a0791fdc56b
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=b580b8cbbf5dc7d1f089733453a90a0791fdc56b&framework=1&filter=tp6_google&showOnlyImportant=0
[That's better than the base revision, but not really something we can land. Probably good to concentrate on DOMSessionStorage some more?]
Flags: needinfo?(ttaubert)
Assignee | ||
Comment 4•7 years ago
|
||
It looks like the MozStorageChanged event handler is somewhat problematic and I don't understand why. Especially not why suddenly after we fixed bug 1373672.
Here's a patch that removes differential updates with "storagechange", that seems to bring us back to the previous tp6_google numbers:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2459dc16c5c66bea2c7c0b8f0cb1a8b5bc4af9da
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=2459dc16c5c66bea2c7c0b8f0cb1a8b5bc4af9da&framework=1&filter=tp6_google&showOnlyImportant=0
Assignee | ||
Comment 5•7 years ago
|
||
Michael, wdyt? I've also used the chance to remove `createLazy()`, we don't actually need to cache these functions' results as they're only called once.
Attachment #8894500 -
Flags: review?(michael)
Assignee | ||
Comment 6•7 years ago
|
||
BTW, differential updates were introduced, I think, to deal with large DOMSessionStorage messages. These should hopefully not be a problem anymore since bug 1362058 limited what we store to 2KB.
Comment 7•7 years ago
|
||
Comment on attachment 8894500 [details] [diff] [review]
0001-Bug-1386867-Fix-quantum_pageload_google-talos-regres.patch
Review of attachment 8894500 [details] [diff] [review]:
-----------------------------------------------------------------
::: browser/components/sessionstore/SessionStorage.jsm
@@ -40,5 @@
> * session storage data as strings. For example:
> * {"https://example.com^userContextId=1": {"key": "value", "my_number": "123"}}
> */
> - collect(docShell, frameTree) {
> - return SessionStorageInternal.collect(docShell, frameTree);
Were we passing a unused frameTree argument to SessionStorageInternal before?
::: browser/components/sessionstore/content/content-sessionStore.js
@@ -640,5 @@
> - return;
> - }
> -
> - let {url, key, newValue} = event;
> - let uri = Services.io.newURI(url);
I'm guessing that this callback (probably URL parsing is the most expensive part if I had to guess from eyeballing this) is the expensive bit here?
I don't really like removing the partial session storage updating because I worry it will hurt real-world usage in favour of helping this talos test. I can see why doing this for every SessionStorage update callback could slow down the browser though.
We could probably use native code here to keep the partial updates and speed this up, but it's probably not worth it if we have a maximum storage limit which is fairly low.
Could you try doing something like:
a) Don't parse the string URI on the event,
b) just use the URL as the key in the _changes map,
c) In collect() we actually parse the URIs and fix up the data to actually be useful?
Potentially we could even avoid the URI parsing by instead using some native helper code (on SessionStorageUtils or DOMWindowUtils, iunno), which gets the Principal off of the event.storageArea object, which we could use instead of parsing the URL ourselves.
(NOTE: This is all assuming that the URL parsing is the slow part).
Attachment #8894500 -
Flags: review?(michael) → review-
Comment 8•7 years ago
|
||
(I am not sure if I made it clear from the comment but I'm not opposed to landing this fundamentally, I just think it might be good to explore some alternatives first because it would be nice to not lose incremental sessionstore.)
Assignee | ||
Comment 9•7 years ago
|
||
Moving the Services.io.newURI() call to the TabStateCache doesn't seem to solve it either :(
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8733b2add7fb1fd3862f442fe070bd45adcb41a8
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=8733b2add7fb1fd3862f442fe070bd45adcb41a8&framework=1&filter=tp6_google&showOnlyImportant=0
Assignee | ||
Comment 10•7 years ago
|
||
Interestingly, the patch above together with removing the storage quota retrieval seems only slightly worse than the baseline:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=066491f1cd1eeb74a4eea1439994c3f890b78ad8
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=066491f1cd1eeb74a4eea1439994c3f890b78ad8&framework=1&filter=tp6_google&showOnlyImportant=0
(I still don't understand why this is suddenly so problematic :/)
Comment 11•7 years ago
|
||
(In reply to Tim Taubert [:ttaubert] from comment #10)
> Interestingly, the patch above together with removing the storage quota
> retrieval seems only slightly worse than the baseline:
>
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=066491f1cd1eeb74a4eea1439994c3f890b78ad8
> https://treeherder.mozilla.org/perf.html#/
> compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207
> bf39d4e0&newProject=try&newRevision=066491f1cd1eeb74a4eea1439994c3f890b78ad8&
> framework=1&filter=tp6_google&showOnlyImportant=0
>
> (I still don't understand why this is suddenly so problematic :/)
What does it look like if you keep that check, but stash nsIDOMWindowUtils in a global?
- let usage = content.QueryInterface(Ci.nsIInterfaceRequestor)
- .getInterface(Ci.nsIDOMWindowUtils)
- .getStorageUsage(event.storageArea);
^ I took a quick look at the definition of GetStorageUsage, and it looks like it should be pretty cheap, so I imagine that the expensive part is the interface lookups?
Assignee | ||
Comment 12•7 years ago
|
||
This seems slightly better than what we currently have, but it's still a ~3% regression:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=00a6699122fbff5c846ff761bc5ca745b2333495
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=00a6699122fbff5c846ff761bc5ca745b2333495&framework=1&filter=tp6_google&showOnlyImportant=0
Assignee | ||
Comment 13•7 years ago
|
||
Let me try this again with a global nsIDOMWindowUtils and deferring the .newURI() call.
Assignee | ||
Comment 14•7 years ago
|
||
Hmmm, no.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=70c862236337f166940297fb6eb160264d1213eb
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=70c862236337f166940297fb6eb160264d1213eb&framework=1&filter=tp6_google&showOnlyImportant=0
Page load is a Quantum release criteria and Google.come is a top site. I'd like to track this as a 57 blocker until we have a resolution.
Updated•7 years ago
|
Assignee: nobody → ttaubert
Priority: -- → P1
Assignee | ||
Comment 16•7 years ago
|
||
The latest unsuccessful attempt:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b2b848ea74325fba1525b2d1c919389617829986
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=b2b848ea74325fba1525b2d1c919389617829986&framework=1&filter=tp6_google&showOnlyImportant=0
Assignee | ||
Comment 17•7 years ago
|
||
I think this is better and looks promising.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=63fb7531c4a16d7283715c1ddfb182cc684d342e
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=be582e2778b1bbb0e60a0fb1fc912207bf39d4e0&newProject=try&newRevision=63fb7531c4a16d7283715c1ddfb182cc684d342e&framework=1&filter=tp6_google&showOnlyImportant=0
Please note that the comparison is against the baseline (the backout). So if I'm not mistaken we might be even faster than before.
Attachment #8894500 -
Attachment is obsolete: true
Attachment #8902685 -
Flags: review?(michael)
Assignee | ||
Comment 18•7 years ago
|
||
(In reply to Michael Layzell [:mystor] from comment #7)
> ::: browser/components/sessionstore/SessionStorage.jsm
> @@ -40,5 @@
> > * session storage data as strings. For example:
> > * {"https://example.com^userContextId=1": {"key": "value", "my_number": "123"}}
> > */
> > - collect(docShell, frameTree) {
> > - return SessionStorageInternal.collect(docShell, frameTree);
>
> Were we passing a unused frameTree argument to SessionStorageInternal before?
Forgot to answer. Yes, the second argument was unused since the API changed. I modified only the *Internal object's method, but not the public one.
Comment 19•7 years ago
|
||
Comment on attachment 8902685 [details] [diff] [review]
0001-Bug-1386867-Fix-quantum_pageload_google-talos-regres.patch
Review of attachment 8902685 [details] [diff] [review]:
-----------------------------------------------------------------
::: browser/components/sessionstore/content/content-sessionStore.js
@@ +797,5 @@
> * A function that returns the value that will be sent to the parent
> * process.
> */
> push(key, fn) {
> + this._data.set(key, fn);
This seems like it would affect other collectors as well as just SessionStorage. Do we see any perf changes for them?
Attachment #8902685 -
Flags: review?(michael) → review+
Assignee | ||
Comment 20•7 years ago
|
||
(In reply to Michael Layzell [:mystor] from comment #19)
> > * A function that returns the value that will be sent to the parent
> > * process.
> > */
> > push(key, fn) {
> > + this._data.set(key, fn);
>
> This seems like it would affect other collectors as well as just
> SessionStorage. Do we see any perf changes for them?
I removed it because it actually shouldn't affect any of the collectors. I'm not sure why I ever added it back then, but we never call any of the functions twice. They're called once, or overridden, and then the map is cleared.
Comment 21•7 years ago
|
||
Pushed by ttaubert@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a972e0fb330b
Fix quantum_pageload_google talos regression r=mystor
Assignee | ||
Updated•7 years ago
|
Status: NEW → ASSIGNED
Comment 22•7 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 57
Comment 23•7 years ago
|
||
This improved the tp6 tests:
== Change summary for alert #9110 (as of August 30 2017 14:55 UTC) ==
Improvements:
5% tp6_google summary windows7-32 opt e10s 544.50 -> 515.21
5% tp6_google summary windows10-64 opt e10s 533.06 -> 506.25
5% tp6_google summary windows10-64 pgo e10s 423.92 -> 404.42
5% tp6_google summary windows10-64 opt e10s stylo526.88 -> 502.75
5% tp6_google summary windows10-64 pgo 1_thread e10s stylo441.38 -> 421.46
4% tp6_google summary windows7-32 opt e10s stylo525.12 -> 502.42
4% tp6_google summary windows10-64 opt 1_thread e10s stylo532.83 -> 510.79
4% tp6_google summary windows7-32 pgo e10s 413.44 -> 397.29
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=9110
You need to log in
before you can comment on or make changes to this bug.
Description
•