Closed Bug 1633471 Opened 5 years ago Closed 4 years ago

Crash in [@ IPCError-browser | RecvUpdateDocumentPrincipal Trying to reuse WindowGlobalParent but the principal of the new document]

Categories

(Core :: DOM: Navigation, defect, P1)

Unspecified
All
defect

Tracking

()

VERIFIED FIXED
mozilla78
Fission Milestone M5b
Tracking Status
firefox-esr68 --- unaffected
firefox75 --- unaffected
firefox76 --- unaffected
firefox77 --- verified
firefox78 --- verified

People

(Reporter: gsvelto, Assigned: annyG)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(3 files)

This bug is for crash report bp-6ef347da-94d8-49f8-bbe3-49b200200420.

Top 10 frames of crashing thread:

0 mozglue.dll replace_realloc memory/replace/phc/PHC.cpp:1265
1  @0x96e377 
2 xul.dll static mozilla::dom::FeaturePolicyUtils::ForEachFeature dom/security/featurepolicy/FeaturePolicyUtils.cpp:92
3 xul.dll mozilla::dom::FeaturePolicy::InheritPolicy dom/security/featurepolicy/FeaturePolicy.cpp:42
4 xul.dll mozilla::dom::Document::InitFeaturePolicy dom/base/Document.cpp:3460
5 xul.dll mozilla::dom::Document::StartDocumentLoad dom/base/Document.cpp:3173
6 xul.dll nsHTMLDocument::StartDocumentLoad dom/html/nsHTMLDocument.cpp:463
7 xul.dll nsContentDLF::CreateInstance layout/build/nsContentDLF.cpp
8 xul.dll nsDocShell::NewContentViewerObj docshell/base/nsDocShell.cpp:7787
9 xul.dll nsDocShell::CreateContentViewer docshell/base/nsDocShell.cpp:7518

This looks like a regression that started with buildid 20200416094846. I've attached a full stack trace for the main process.

Something landed in bug 1594529 2020-04-15.

Flags: needinfo?(agakhokidze)
Priority: -- → P3
Regressed by: 1594529
Has Regression Range: --- → yes

hmm, is this Fission only? I think not.

[Tracking Requested - why for this release]:
crashes are bad

Priority: P3 → P1
Assignee: nobody → agakhokidze
Flags: needinfo?(agakhokidze)

The reason for this crash is the code I added in my patch here https://searchfox.org/mozilla-central/rev/7fd1c1c34923ece7ad8c822bee062dd0491d64dc/dom/ipc/WindowGlobalParent.cpp#260-262 [1]

When I’m stepping through this crash in a debugger, the assertion here also fails https://searchfox.org/mozilla-central/rev/7fd1c1c34923ece7ad8c822bee062dd0491d64dc/dom/base/nsGlobalWindowOuter.cpp#2187 [2]

This usually happens when we are trying to print certain webpages (link omitted here). In the specific case I am debugging, inside of nsGlobalWindowOuter::SetNewDocument when we call nsGlobalWindowOuter::WouldReuseInnerWindow we end up returning the result of FastEqualsConsideringDomain and since both principals have set explicit domains we end up taking this path https://searchfox.org/mozilla-central/rev/7fd1c1c34923ece7ad8c822bee062dd0491d64dc/caps/BasePrincipal.h#367-368. Both principles subsume each other and WouldReuseInnerWindow returns true which leads to us reusing the inner window. However, later in the function SetNewDocument when we check if the principals are Equals to each other (whether in [1] or [2]) , the code takes a different path - it compares origins and returns false here https://searchfox.org/mozilla-central/rev/7fd1c1c34923ece7ad8c822bee062dd0491d64dc/caps/BasePrincipal.h#349-350

For a brief moment I thought maybe we should not use EqualsConsideringDomain, but I see here that Boris did this on purpose in Bug 1552541. I'm not sure what the right approach here is...

Should WouldReuseInnerWindow use both FastEqualsConsideringDomain and Equals and make it so we only reuse the inner window if both of those return true? I'm getting an assertion failure here[1] if I go that route.. Going to ask Nika to maybe shed some light on this.
[1] https://searchfox.org/mozilla-central/rev/7fd1c1c34923ece7ad8c822bee062dd0491d64dc/dom/base/Element.cpp#1134..

Flags: needinfo?(nika)

Tracking for Fission M5b because this is a crash and Anny is already investigating.

Severity: -- → critical
Fission Milestone: --- → M5b

We discussed this a bit on matrix yesterday, and talked about a few possible solutions here. The assertion failure which :annyG is running into in comment 4 is probably bug 1570804, which is a separate issue.

The reason for this problem is related to a small oversight in bug 1552541 around the interaction with printing. Usually, the principal for the first non-about:blank document in an iframe does not have document.domain set on it, as it is a fresh load, and has not had the chance to have its principal changed. However, in the print-preview case, our static-cloned document copies the principal of the document it is a clone of. This means that EqualsConsideringDomain can consider two cross-origin principals to be equal, as both will have set document.domain.

We don't want to be preserving the inner window in that situation. The easiest way to do this is probably to check Equals() && EqualsConsideringDomain(), like Anny was suggesting earlier, although that could certainly be optimized.

Flags: needinfo?(nika)
Summary: Crash in [@ IPCError-browser | RecvUpdateDocumentPrincipal Trying to reuse WindowGlobalParent but the principal of the new document] → Assertion failure: mDocumentPrincipal->Equals(aNewDocumentPrincipal)
Summary: Assertion failure: mDocumentPrincipal->Equals(aNewDocumentPrincipal) → Crash in [@ IPCError-browser | RecvUpdateDocumentPrincipal Trying to reuse WindowGlobalParent but the principal of the new document]
Attachment #9146501 - Attachment description: Bug 1633471 - Make principal check stricter when printing documents, r?nika! → Bug 1633471 - Do not reuse inner window when printing, r?nika!
Pushed by agakhokidze@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8197bf1a1fed Do not reuse inner window when printing, r=nika
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla78

The patch landed in nightly and beta is affected.
:annyG, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(agakhokidze)
Attached file beta uplift approval request (deleted) —

Approval Request Comment
[Feature/Bug causing the regression]: This is not a regression, but an existing issue that is exposed because of the stricter check introduced in bug 1594529.
[User impact if declined]: Sometimes when the user tries to print certain pages, the tab crashes.
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: Not sure, I did a manual test myself. But here are steps to reproduce just in case:
Go to one of the URLs in the related crash reports and try to print the page. We should not crash when printing.
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: it is low risk
[Why is the change risky/not risky?]:
The change is low risk because its a 3 line C++ change making a decision about when to reuse a window for a new document more strict.
[String changes made/needed]: none

Flags: needinfo?(agakhokidze)
Attachment #9147355 - Flags: approval-mozilla-beta?

Anny, you didn't attach attach a patch to your uplift request (example in another bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1636568#c4)

Flags: needinfo?(agakhokidze)

Sorry about that. You can use the same patch (it applies cleanly on top of mozilla-beta) - https://phabricator.services.mozilla.com/D74261

Flags: needinfo?(agakhokidze)
Comment on attachment 9147355 [details] beta uplift approval request Patch is minimal and add tests, , uplift approved for beta 6, thanks.
Attachment #9147355 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
QA Whiteboard: [qa-triaged]

Reproduced the initial issue with an old Beta build 77.0b5 using Windows 10.
Verified - Fixed in Beta 77.0b6, Beta 77.0b7, and latest Nightly 78.0a1 using Windows 10, Ubuntu 18.04, and Mac 10.15.

Status: RESOLVED → VERIFIED
QA Whiteboard: [qa-triaged]
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: