Closed
Bug 1238427
Opened 9 years ago
Closed 9 years ago
nsGeolocationRequest::mTimeoutTimer can cause windows to leak
Categories
(Core :: DOM: Geolocation, defect)
Tracking
()
RESOLVED
FIXED
mozilla46
People
(Reporter: karlt, Assigned: mccr8)
References
(Blocks 1 open bug, )
Details
(Keywords: memory-leak, Whiteboard: [MemShrink:P1])
Attachments
(1 file, 4 obsolete files)
(deleted),
patch
|
jdm
:
review+
Sylvestre
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
1. Load about:memory
2. Open new tab and load http://www.bunnings.co.nz/coolaroo-bird-netting-4-x-4m_p00293358
3. Close tab.
4. Minimize memory use.
5. Measure.
Reproduced on nightly without e10s:
4 (100.0%) -- ghost-windows
├──2 (50.00%) ── http://s7.addthis.com/static/sh.63ce9099e9ce7d094433c330.html#rand=0.927425674208481&iit=1452480348184&tmr=load%3D1452480347536%26core%3D1452480347580%26main%3D1452480348172%26ifr%3D1452480348197&cb=0&cdn=0&md=0&kw=&ab=-&dh=www.bunnings.co.nz&dr=&du=http%3A%2F%2Fwww.bunnings.co.nz%2Fcoolaroo-bird-netting-4-x-4m_p00293358&href=http%3A%2F%2Fwww.bunnings.co.nz%2Fcoolaroo-bird-netting-4-x-4m_p00293358&dt=Coolaroo%20Bird%20Netting%204%20x%204m%20%7C%20Bunnings%20Warehouse&dbg=0&cap=tc%3D0%26ab%3D0&inst=1&jsl=1&prod=undefined&lng=en&ogt=image%2Curl%2Csite_name%2Cdescription%2Ctype%3Dwebsite%2Ctitle&pc=men&pub=ra-4e2f690a7922f66b&ssl=0&sid=5693175b03ab2bd7&srpl=1&srd=1&srf=0.01&srx=1&ver=300&xck=0&xtr=0&og=title%3DCoolaroo%2520Bird%2520Netting%25204%2520x%25204m%26type%3Dwebsite%26description%3DFind%2520Coolaroo%2520Bird%2520Netting%25204%2520x%25204m%2520for%2520the%2520lowest%2520prices%2520at%2520Bunnings%2520Warehouse.%2520Visit%2520your%2520local%2520store%2520for%2520the%2520widest%2520range%2520of%2520GARDEN%2520LIFESTYLE%2520%253E%2520GARDEN%2520CARE%2520%253E%2520GARDEN%2520CLOTH%2520%253E%2520BIRD%2520NETTING%2520%253E%2520PACK%2520products.%26site_name%3DBunnings%2520New%2520Zealand%26url%3Dhttp%253A%252F%252Fwww.bunnings.co.nz%252Four-range%252Fgarden%252Flandscaping%252Fplant-protection%252Fbird-netting%252Fcoolaroo%252520bird%252520netting%2525204%252520x%2525204m_00293358%26image%3Dhttp%253A%252F%252Fwww.bunnings.co.nz%252Fassets%252Fimg%252Fbunnings-logo-small-img.jpg&csi=undefined&toLoJson=uvs%3D5693175b36e4cfd1000%26chr%3DUTF-8%26vcl%3D0&rev=v4.0.3-wp&ct=1&xld=1&xd=1 [2]
└──2 (50.00%) ── http://www.bunnings.co.nz/coolaroo-bird-netting-4-x-4m_p00293358 [2]
Reporter | ||
Comment 1•9 years ago
|
||
First noticed on 43.0b2. Location service is involved in some:
26 (100.0%) -- ghost-windows
├───6 (23.08%) ── http://stirlingsports.co.nz/store-locator [6]
├───2 (07.69%) ── http://epsonenau.ugc.bazaarvoice.com/5625-en_nz/crossdomain.htm?format=embedded#origin=http%3A%2F%2Fwww.epson.co.nz [2]
├───2 (07.69%) ── http://www.bunnings.co.nz/coolaroo-bird-netting-4-x-4m_p00293358 [2]
├───2 (07.69%) ── http://www.bunnings.co.nz/gardman-garden-netting-4-x-6m_p00153918 [2]
├───2 (07.69%) ── http://www.bunnings.co.nz/gardman-garden-netting-6-x-4m_p00220177 [2]
├───2 (07.69%) ── http://www.bunnings.co.nz/search/products?q=bird%20netting&redirectFrom=Any [2]
├───2 (07.69%) ── http://www.torpedo7.co.nz/products/T7W4RN5YM/title/torpedo7-girl-s-mystic-long-sleeve-rash-top [2]
├───2 (07.69%) ── https://disqus.com/embed/comments/?base=default&version=f3e1717b71e7256da258d3a504e56865&f=openprintingorg&t_u=http%3A%2F%2Fwww.openprinting.org%2Fprinter%2FBrother%2FBrother-MFC-j615w&t_d=Printer%3A%20Brother%20MFC-j615w%20%7C%20OpenPrinting%20-%20The%20Linux%20Foundation&t_t=Printer%3A%20Brother%20MFC-j615w%20%7C%20OpenPrinting%20-%20The%20Linux%20Foundation&s_o=default [2]
├───2 (07.69%) ── https://disqus.com/embed/comments/?base=default&version=f3e1717b71e7256da258d3a504e56865&f=openprintingorg&t_u=http%3A%2F%2Fwww.openprinting.org%2Fprinter%2FEpson%2FEpson-WF-3620_Series&t_d=Printer%3A%20Epson%20WF-3620%20Series%20%7C%20OpenPrinting%20-%20The%20Linux%20Foundation&t_t=Printer%3A%20Epson%20WF-3620%20Series%20%7C%20OpenPrinting%20-%20The%20Linux%20Foundation&s_o=default [2]
├───2 (07.69%) ── https://e1.fanplayr.com/tunnel.html?v3 [2]
├───1 (03.85%) ── http://www.supercheapauto.com.au/online-store/products/Stanfred-Bike-Carrier-Single-Pole-Twist-4-Clamp-Velcro.aspx?pid=326853
└───1 (03.85%) ── http://www.supercheapauto.com.au/online-store/products/Stanfred-Bike-Carrier-Single-Pole-Twist-4-Clamp-Velcro.aspx?pid=326853#Recommendations
Reporter | ||
Comment 2•9 years ago
|
||
GC and CC logs at http://people.mozilla.org/~karlt/bugs/1238427/
Updated•9 years ago
|
Flags: needinfo?(continuation)
Comment 3•9 years ago
|
||
(In reply to Karl Tomlinson (ni?:karlt) from comment #0)
> 1. Load about:memory
> 2. Open new tab and load
> http://www.bunnings.co.nz/coolaroo-bird-netting-4-x-4m_p00293358
Am I expected to "Share location"? Or does that not matter?
Comment 4•9 years ago
|
||
I guess yes, since I can't reproduce if location is not shared.
Comment 5•9 years ago
|
||
Hmm, but can't reproduce with location shared either.
Comment 6•9 years ago
|
||
Another question:
can you reproduce on debug build? If so, does the leak show up as a shutdown leak or is it a runtime leak?
export XPCOM_MEM_LEAK_LOG=1 showing leaks means shutdown leaks. (it is just that runtime leaks tend to be much easier to debug.)
Assignee | ||
Comment 7•9 years ago
|
||
I can reproduce with e10s disabled. I had to share my location on the drop down thing.
I looked at the CC logs, but it just says that the Bunnings nsGlobalWindow is being held alive.
Assignee: nobody → continuation
Assignee | ||
Updated•9 years ago
|
Assignee | ||
Updated•9 years ago
|
Assignee | ||
Comment 8•9 years ago
|
||
In a local debug build, when I click on allow on the location door hanger, I get an additional modal dialogue asking if I want to share my location with the web page. If I close the page, then click "allow" on the modal dialogue, then I get a ghost window. On my regular profile, in Nightly, I don't see the additional confirmation dialogue, which is odd. I was still able to reproduce the leak one time, somehow.
Assignee | ||
Comment 9•9 years ago
|
||
The ghost window is freed before shutdown, but the browser does leak some objects. 1 CoreLocationLocationProvider, 1 nsGeolocationService, 8 nsStringBuffer, 8 nsTArray_base.
Assignee | ||
Comment 10•9 years ago
|
||
At least for this leak I'm seeing in debug builds, the missing reference seems to be from nsGeolocationRequest.
Assignee | ||
Comment 11•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #10)
> At least for this leak I'm seeing in debug builds, the missing reference
> seems to be from nsGeolocationRequest.
More specifically, an nsGeolocationRequest gets destroyed in shutdown that is holding alive an nsContentPermissionRequester that is holding alive the window.
Assignee | ||
Comment 12•9 years ago
|
||
It sounds like Google geolocation is broken right now (bug 1232995). Maybe we're hitting some odd failure case.
Reporter | ||
Comment 13•9 years ago
|
||
The ghost window doesn't show up every time, but I see it around half the time in an optimized build and in a new profile. Sometimes I need to Minimize memory use twice before it shows up as a ghost window after Measure.
(In reply to Olli Pettay [:smaug] from comment #3)
> (In reply to Karl Tomlinson (ni?:karlt) from comment #0)
> > 1. Load about:memory
> > 2. Open new tab and load
> > http://www.bunnings.co.nz/coolaroo-bird-netting-4-x-4m_p00293358
> Am I expected to "Share location"? Or does that not matter?
I ignore the door hanger and close the tab with the mouse button.
This dismissed the door hanger before starting the tab-close animation.
(In reply to Olli Pettay [:smaug] from comment #6)
> Another question:
> can you reproduce on debug build? If so, does the leak show up as a shutdown
> leak or is it a runtime leak?
> export XPCOM_MEM_LEAK_LOG=1 showing leaks means shutdown leaks. (it is just
> that runtime leaks tend to be much easier to debug.)
It does reproduce in a debug build. I don't yet know re frequency because it takes much longer to run.
XPCOM_MEM_LEAK_LOG=1 did not report any object leaked for me. I did get
[18597] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /mnt/ssd1/karl/moz/dev/xpcom/base/nsCycleCollector.cpp, line 3563
Reporter | ||
Updated•9 years ago
|
OS: Unspecified → Linux
Reporter | ||
Updated•9 years ago
|
Component: DOM → Geolocation
Assignee | ||
Comment 15•9 years ago
|
||
This patch changes the leak from a missing reference to the nsGlobalWindow to a missing reference to two callbacks, which I think is an improvement.
Assignee | ||
Comment 16•9 years ago
|
||
The two callbacks are being held alive by an nsGeolocationRequest. That in turn is being held alive by the timer thread.
The request is added to a timer in nsGeolocationRequest::SetTimeoutTimer(). The timeout is based on a timeout value from an options object that is passed in. The default value is 2147483647, which is apparently infinity, because according to bug 741131 comment 0, "If timeout was not specified, set the internal timeout variable to Infinity.". I don't know why we bother setting a timer that will never finish, but that's kind of a separate issue, because as smaug pointed out, a web page could just specify an absurdly large timeout, and we don't want to leak the page in that case.
So the bug here is that we need to cancel the timeout somehow when the page goes away. Given that this is an infinite timer, and we don't always leak, we must cancel it in some circumstances and not others.
Assignee | ||
Comment 17•9 years ago
|
||
I filed bug 1238801 about not starting a timer that will never finish, but like I said that is orthogonal to the leak.
Assignee | ||
Comment 18•9 years ago
|
||
I don't know if this is needed to fix the leak but it seems like a good idea.
Assignee | ||
Comment 19•9 years ago
|
||
This alone fixes the leak, though it is kind of a bandaid. Maybe it is good to have anyways?
Assignee | ||
Comment 20•9 years ago
|
||
This fixes the leak even without the previous patch, and should be more general. If there's a timeout timer going, then it has a strong reference to the request, but we don't want to keep the request alive if the only thing keeping it alive is the timer. So we tell the CC about reference from the timer, except we pretend it is from the object itself.
Olli, what do you think about this? I like how this patch is general (the approach in the previous patch really requires that you audit every place that calls release on a request, which is scary), but these fake CC edges are always odd.
Attachment #8707163 -
Flags: feedback?(bugs)
Assignee | ||
Updated•9 years ago
|
Summary: ghost windows from bunnings → nsGeolocationRequest::mTimeoutTimer can cause windows to leak
Comment 21•9 years ago
|
||
Comment on attachment 8707163 [details] [diff] [review]
Pretend the strong timer reference to nsGeolocationRequest is a self reference.
oh, fancy.
And normally nsGeolocationRequests are kept alive by Geolocation object, right?
(and Geolocation is kept alive by Navigator which is kept alive by Window)
Attachment #8707163 -
Flags: feedback?(bugs) → feedback+
Assignee | ||
Comment 22•9 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #21)
> And normally nsGeolocationRequests are kept alive by Geolocation object,
> right?
Yes. It is possible that some code somewhere implicitly relies on that, but I can't imagine anything important happens to a geolocation object after its page goes away.
Updated•9 years ago
|
Whiteboard: [MemShrink] → [MemShrink:P1]
Assignee | ||
Comment 23•9 years ago
|
||
The patch is currently crashing during CC in the xpcshell test test_geolocation_reset_accuracy.js. I think the callback is being cleared from the timer (I guess by the timers being shut down) before the final CC, so we end up reporting more references to the request than the refcount. The obvious solution of checking the callback of the timer does not work because the best you can do is to return the callback, which AddRefs the request during a CC, which is bad.
Comment 24•9 years ago
|
||
Add HasCallback() to nsITimer ?
Assignee | ||
Comment 25•9 years ago
|
||
The timeout timer of a geolocation request holds a strong reference to the request. This can cause the window to leak if the request is not completed before the tab containing the window is closed.
To fix this, I made the timer instead hold a strong reference to a wrapper class that has only a weak reference to the request. The request destructor must now cancel the timeout timer.
I also outlined a call to StopTimeoutTimer() in nsGeolocationRequest::Shutdown().
This adds a browser-chrome test, because those already detect leaks-until-shutdown. I verified that the test leaks without my patch and doesn't leak with it. I don't really know what I'm doing, so please take a look at it.
try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ede1cf265db5
Attachment #8709643 -
Flags: review?(josh)
Assignee | ||
Updated•9 years ago
|
Attachment #8706680 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8707158 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8707159 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8707163 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
status-firefox43:
--- → wontfix
status-firefox44:
--- → wontfix
status-firefox45:
--- → affected
status-firefox46:
--- → affected
status-firefox-esr38:
--- → wontfix
Updated•9 years ago
|
Attachment #8709643 -
Flags: review?(josh) → review+
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Assignee | ||
Updated•9 years ago
|
Blocks: GhostWindows
Comment 27•9 years ago
|
||
Keywords: checkin-needed
Comment 28•9 years ago
|
||
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/f87d296b0528 for "/home/worker/workspace/build/src/dom/geolocation/nsGeolocation.cpp:119:5: error: bad implicit conversion constructor for 'TimerCallbackHolder'", https://treeherder.mozilla.org/logviewer.html#?job_id=20170911&repo=mozilla-inbound
Comment 29•9 years ago
|
||
So we need this at least on Aurora.
(I would even consider to Beta, but that is unlikely to be approved)
Assignee | ||
Comment 30•9 years ago
|
||
I added the missing explicit to the ctor, which should fix the issue in comment 28.
Comment 31•9 years ago
|
||
Comment 32•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Assignee | ||
Comment 33•9 years ago
|
||
Comment on attachment 8709643 [details] [diff] [review]
Avoid a strong reference from the timeout timer to nsGeolocationRequest.
This is an approval request for beta 45.
Approval Request Comment
[Feature/regressing bug #]: old as far as I can tell
[User impact if declined]: bad leaks on pages with geolocation if the tab is closed before the geolocation request finishes.
[Describe test coverage new/current, TreeHerder]: There's some testing.
[Risks and why]: low. This should only substantially change behavior when there would have otherwise been a leak.
[String/UUID change made/needed]: none
Attachment #8709643 -
Flags: approval-mozilla-beta?
Comment 34•9 years ago
|
||
Comment on attachment 8709643 [details] [diff] [review]
Avoid a strong reference from the timeout timer to nsGeolocationRequest.
Fix a leak + has tests, taking it.
Attachment #8709643 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 35•9 years ago
|
||
bugherder uplift |
You need to log in
before you can comment on or make changes to this bug.
Description
•