Closed
Bug 1171127
Opened 9 years ago
Closed 9 years ago
Ghost windows on washingtonpost.com with e10s enabled
Categories
(Core :: Networking, defect)
Core
Networking
Tracking
()
RESOLVED
FIXED
mozilla41
People
(Reporter: mccr8, Assigned: valentin)
References
(Blocks 1 open bug)
Details
(Keywords: memory-leak, perf, Whiteboard: [MemShrink:P1])
Attachments
(1 file)
(deleted),
patch
|
jduell.mcbugs
:
review+
Sylvestre
:
approval-mozilla-aurora-
|
Details | Diff | Splinter Review |
With e10s, if you go to a story on washingtonpost.com, the scroll all the way to the bottom so various comment junk starts loading, then navigate to a new page while the loading is in progress, then you leak the page forever. I can't reproduce without e10s enabled. I'm still investigating, but the problem seems to be an orphan object element being held alive by a networking channel. (The problem could be in how the DOM handles the channels, of course.)
Reporter | ||
Updated•9 years ago
|
tracking-e10s:
--- → ?
Reporter | ||
Comment 1•9 years ago
|
||
For at least one of the orphan object elements, it seems like what is holding it alive is a ObjectInterfaceRequestorShim. The only thing referring to that is something allocated in mozilla::net::nsHttpHandler::NewProxiedChannel2(), which I think is an HttpChannelChild. As for the channel, it looks like there's an nsDocument http://js.washingtonpost.com/pb/resources/js/echo2/sdk/v3/gui.pack.css that is holding it alive, something allocated under HttpChannelChild() (maybe a ChannelEventQueue?) and an nsPerformance object. The nsPerformance object is being referred to directly only by its wrapper, which seems weird to me. I don't know about the ChannelEventQueue, but the nsDocument and nsPerformance object are being held entrained (via a very long path) by the orphan object elements, so there's your cycle. I'm not really sure what is going wrong here. Maybe something should have cleared the channel to shim edge? My information here is coming from conservative heap scanning and allocation stacks, along with the cycle collector log. I should be able to rerun the analysis while the browser is running, which would let me poke around with a debugger.
Reporter | ||
Comment 2•9 years ago
|
||
jdm, do you have any ideas about what might be going wrong here? I'm guessing something akin to bug 1157468 where cleanup isn't being done in some condition.
Reporter | ||
Updated•9 years ago
|
Flags: needinfo?(josh)
Reporter | ||
Updated•9 years ago
|
Comment 3•9 years ago
|
||
So ObjectInterfaceRequestorShim is used as the notification callbacks on the channel nsObjectLoadingContent::OpenChannel opens. It's also used as the channel's streamlistener. Normally necko is responsible for nulling out both the streamlistener and the callbacks after OnStopRequest. And in particular, HttpBaseChannel::ReleaseListeners nulls them both out and should be called OnStopRequest. It also nulls out mProgressSink, which is another pointer to the same object. And if the channel is being held by something other than the stack, it would seem like AsyncOpen succeeded, so we should have gotten OnStopRequest. I guess we don't know which member of the channel is pointing to the shim?
Comment 4•9 years ago
|
||
HttpBaseChannel::ReleaseListeners() is not used on the child process. One thing the channel doesn't throw away is the load group reference. There are several places we remove the channel (self) from the load group, never readd (AFAIK) and never release the load group after that.
Comment 6•9 years ago
|
||
The loadgroup should not be an issue. But if we don't clean up mProgressSink/mCallbacks/mListener on the child, that should cause all sorts of leaks...
Comment 7•9 years ago
|
||
Sorry, I don't think I'll be able to provide any useful information here before I leave on vacation.
Flags: needinfo?(josh)
Jason, this sounds kinda bad. The leaks cause really bad cycle collector performance, which can lead to frequent, long pauses. Is this something that could be fixed easily? Could you find someone to work on it?
Flags: needinfo?(jduell.mcbugs)
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → valentin.gosu
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(jduell.mcbugs)
Updated•9 years ago
|
Whiteboard: [MemShrink] → [MemShrink:P1]
Updated•9 years ago
|
Updated•9 years ago
|
Assignee | ||
Comment 10•9 years ago
|
||
Releasing the listeners in OnStopRequest seems to ameliorate the issue - the entries for washingtonpost.com disappear from about:memory after a GC,CC,Minimize. That's assuming I'm testing this correctly. Can someone check with the try build? https://treeherder.mozilla.org/#/jobs?repo=try&revision=7681f9e4f7ab
Updated•9 years ago
|
Attachment #8622127 -
Flags: review?(jduell.mcbugs) → review+
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 13•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/02eb7fe4cdd1
Keywords: checkin-needed
Comment 14•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/02eb7fe4cdd1
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Reporter | ||
Comment 15•9 years ago
|
||
Valentin, could you get approval for landing this on Aurora? People are running e10s there and it would be nice to get it fixed.
Flags: needinfo?(valentin.gosu)
Assignee | ||
Comment 16•9 years ago
|
||
Comment on attachment 8622127 [details] [diff] [review] Listeners are not released in OnStopRequest in e10s mode Approval Request Comment [Feature/regressing bug #]: e10s/HttpChannelChild [User impact if declined]: Rather large and long lasting memory leaks in the content process [Describe test coverage new/current, TreeHerder]: All green on TreeHerder. Has been on m-c for a few days. [Risks and why]: Low risk. Just clears some refPtrs in OnStopRequest. [String/UUID change made/needed]: None.
Flags: needinfo?(valentin.gosu)
Attachment #8622127 -
Flags: approval-mozilla-aurora?
Updated•9 years ago
|
status-firefox40:
--- → affected
Comment 17•9 years ago
|
||
Comment on attachment 8622127 [details] [diff] [review] Listeners are not released in OnStopRequest in e10s mode We are going to merge aurora in beta next Monday. e10s is going to be disabled in beta. So, not taking it as it could have potential regressions.
Attachment #8622127 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
Reporter | ||
Updated•9 years ago
|
Blocks: GhostWindows
You need to log in
before you can comment on or make changes to this bug.
Description
•