Closed Bug 1628249 Opened 5 years ago Closed 4 years ago

Delay setting nsFrameLoader's mPendingBrowsingContext to null a bit

Categories

(Core :: DOM: Navigation, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla79
Fission Milestone M6a
Tracking Status
firefox79 --- fixed

People

(Reporter: peterv, Assigned: peterv)

References

Details

Attachments

(1 file)

With session history in the parent enabled we end up access a frameloader's browsing context after nsFrameLoader::DestroyDocShell. The culprit is session store code trying to get to the session history object from the frameloader's browsing context (https://searchfox.org/mozilla-central/rev/a707541ff423ade0d81cef6488e6ecfa09273886/browser/components/sessionstore/SessionStore.jsm#1192) This ends up triggering an assertion in nsFrameLoader::GetExtantBrowsingContext because mPendingBrowsingContext has been set to null. Nika suggested moving the nulling of mPendingBrowsingContext to nsFrameLoader::DestroyComplete instead of doing it in nsFrameLoader::DestroyDocShell.

Assignee: nobody → peterv
Status: NEW → ASSIGNED

P3 M6

Peter says he can't land this fix yet because it causes a leak in a test that intentionally crashes a content process.

Fission Milestone: --- → M6
Priority: -- → P3
Blocks: 1570255

I think I finally mostly figured out why this causes a window to leak until shutdown on browser/components/sessionstore/test/browser_crashedTabs.js. It looks like we end up with a remote frameloader on which nsFrameLoader::DestroyComplete is called. Then, a little bit later, nsFrameLoaderOwner::ChangeRemotenessCommon is called, and we end up in nsFrameLoader::DestroyDocShell. nsFrameLoader::DestroyDocShell calls Detach on mPendingBrowsingContext if mWillChangeProcess is false and mPendingBrowsingContext->EverAttached() is true, which they are for the particular case we're looking at. Because my patch nulls out mPendingBrowsingContext in nsFrameLoader::DestroyComplete, by the time we get back to nsFrameLoader::DestroyDocShell we don't detach mPendingBrowsingContext, and so it sticks around in the BCG etc.

One solution that I tried which does work is to only null out mPendingBrowsingContext in nsFrameLoader::DestroyComplete if the frameloader is not remote. But it feels wrong, because then neither nsFrameLoader::DestroyComplete nor nsFrameLoader::DestroyDocShell will null it for remote frameloaders (I suppose I could null it in DestroyDocShell for remote frame loaders and in DestroyComplete for non-remote o_O).

It was a bit surprising to me that we call DestroyDocShell and DestroyComplete for both remote and non-remote frameloaders, but in the opposite order. Anyway, Nika, maybe you have an idea?

Flags: needinfo?(nika)

When BrowserParent is destroyed, it seems like it always call DestroyComplete, which I think is why the DestroyComplete method exists in the first place (to split it from the main "destroying this browser has started" part). It seems we never end up calling DestroyDocShell in the subprocess crash case :-/

I'm not sure if there's a good way to reconcile these which isn't super messy, at least not a simple one, so perhaps let's change the assertion which was failing, and make it instead return nullptr when we have already detached mPendingBrowsingContext, and have it return nullptr in this case, avoiding the issues caused by changing destruction order for now?

Flags: needinfo?(nika)
No longer blocks: 1570255

Because this bug's Severity has not been changed from the default since it was filed, and it's Priority is P3 (Backlog,) indicating it has been triaged, the bug's Severity is being updated to S3 (normal.)

Severity: normal → S3

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:peterv, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(peterv)

M6a because Peter needs to determine whether this is still a problem.

He couldn't land the current patch because it caused a leak?

Fission Milestone: M6 → M6a

Peter is going to update the patch to address comment 4.

Attachment #9139098 - Attachment description: Bug 1628249 - Delay setting nsFrameLoader's mPendingBrowsingContext to null a bit. r?nika! → Bug 1628249 - Return null from nsFrameLoader::GetExtantBrowsingContext if we're in the middle of destroying the nsFrameLoader. r?nika!
Pushed by pvanderbeken@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/165af6024bdc Return null from nsFrameLoader::GetExtantBrowsingContext if we're in the middle of destroying the nsFrameLoader. r=nika
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79
Flags: needinfo?(peterv)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: