Closed
Bug 1257584
Opened 9 years ago
Closed 8 years ago
2 - 4% sessionrestore / sessionrestore_no_auto_restore / tp5o_scroll / tpaint / ts_paint windows* regression on push 1e05c8d75716 (Thu Mar 10 2016)
Categories
(Core :: JavaScript: GC, defect)
Core
JavaScript: GC
Tracking
()
RESOLVED
WONTFIX
Tracking | Status | |
---|---|---|
firefox48 | --- | affected |
People
(Reporter: jmaher, Unassigned)
References
Details
(Keywords: perf, regression, Whiteboard: [talos_regression])
Talos has detected a Firefox performance regression from push 1e05c8d75716. As author of one of the patches included in that push, we need your help to address this regression.
This is a list of all known regressions and improvements related to the push:
https://treeherder.mozilla.org/perf.html#/alerts?id=406
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests#sessionrestore.2Fsessionrestore_no_auto_restore
https://wiki.mozilla.org/Buildbot/Talos/Tests#tpaint
https://wiki.mozilla.org/Buildbot/Talos/Tests#ts_paint
Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p win64,win32 -u none -t other-e10s,svgr-e10s,g1-e10s,other,g2,svgr --rebuild 5 # add "mozharness: --spsProfile" to generate profile data
(we suggest --rebuild 5 to be more confident in the results)
To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.lorg/Buildbot/Talos/Running#Running_locally_-_Source_Code
Then run the following command from the directory where you set up Talos:
talos --develop -e [path]/firefox -a sessionrestore_no_auto_restore:tpaint:sessionrestore:ts_paint
(add --e10s to run tests in e10s mode)
Making a decision:
As the patch author we need your feedback to help us handle this regression.
*** Please let us know your plans by Monday, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Reporter | ||
Comment 1•9 years ago
|
||
here is a compare view:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=c138edad150f&newProject=mozilla-inbound&newRevision=676b6db552c8&framework=1
there are some false positives there, but what regressed is:
win7 - ts_paint, opt + opt e10s
winxp - sessionrestore_no_auto_restore opt e10s
winxp - sessionrestore opt + opt e10s
:terrence, can you take a look at this?
Flags: needinfo?(terrence)
Reporter | ||
Comment 2•9 years ago
|
||
in addition we have damp regressions which are pgo only on win7, stronger regression on e10s, here are the subtests:
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-inbound&originalRevision=c138edad150f&newProject=mozilla-inbound&newRevision=676b6db552c8&originalSignature=708d28fdea12b324282e5b53b581fdcba93c7e58&newSignature=708d28fdea12b324282e5b53b581fdcba93c7e58
Comment 3•9 years ago
|
||
The only new code added here is an exact duplicate of the code here:
http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/md/windows/w95cv.c#314
According to the comment there, every PRLock in the system will leak if we don't call PR_Init before calling PR_NewLock; windows caches library loads at several levels, so I initially thought that this must be a measurement error. Looking closer, however, we never actually call PR_Init or PR_Initialize, so I think we're currently just leaking a ton of memory on Windows. Whoops! These sort of hidden gotchas are exactly the reason why I'm trying to move us away from NSPR.
Given that the landed patch has already saved us hundreds of lines of code in subsequent landings, and given that long-running memory leaks are a huge, ongoing problem in windows, I think we should just accept a small regression here.
Flags: needinfo?(terrence)
Comment 4•9 years ago
|
||
(In reply to Terrence Cole [:terrence] from comment #3)
> we never actually call PR_Init or PR_Initialize, so I think we're
> currently just leaking a ton of memory on Windows.
PR_NewLock will initialize NSPR if needed:
if (!_pr_initialized) _PR_ImplicitInitialization();
(I've been in NSPR land recently for bug 1257522...)
Reporter | ||
Comment 5•9 years ago
|
||
is there something we can try out here then? If there is nothing, I think closing this as wontfix is reasonable.
Comment 6•9 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #4)
> (In reply to Terrence Cole [:terrence] from comment #3)
> > we never actually call PR_Init or PR_Initialize, so I think we're
> > currently just leaking a ton of memory on Windows.
>
> PR_NewLock will initialize NSPR if needed:
>
> if (!_pr_initialized) _PR_ImplicitInitialization();
>
> (I've been in NSPR land recently for bug 1257522...)
In that case, what is the regression from? I cannot image that we're not creating any locks, given that we have to run JS to restore the session. Maybe we changed the DLL load order and are getting hit by disk caching that has learned a different order?
(In reply to Joel Maher (:jmaher) from comment #5)
> is there something we can try out here then? If there is nothing, I think
> closing this as wontfix is reasonable.
Given the size of the slowdown, I think this is almost certainly an artifact of windows disk caching. I'll give some thought to how we might be able to test this.
Reporter | ||
Comment 7•9 years ago
|
||
this is now on mozilla-beta for sessionrestore (winxp, win8):
https://treeherder.mozilla.org/perf.html#/alerts?id=1012&filter=session
Comment 8•8 years ago
|
||
This bug is simply moving overhead to a different location. Considering that these tests do not interlock at any fixed point, I think that this is not actually a regression.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•