Closed Bug 1596826 Opened 5 years ago Closed 5 years ago

Crash in [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]]

Categories

(Core :: Widget: Cocoa, defect, P1)

Desktop
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla75
Tracking Status
firefox-esr68 --- unaffected
firefox70 --- unaffected
firefox71 --- unaffected
firefox72 + fixed
firefox73 + fixed
firefox74 + fixed
firefox75 --- fixed

People

(Reporter: marcia, Assigned: jrmuizel)

References

Details

(5 keywords)

Crash Data

This bug is for crash report bp-a5045bb6-f803-443a-a7ff-358f00191115.

Seen while looking at macOS specific crash stats: https://bit.ly/2O9qu0Y. Marking as sec sensitive since all the crashes have a possible UAF signature. The first crash occurred in 20191106132950 and all the crashes happen on 10.14.

No specific URLs that look helpful, and no comments.

Possible regression range based on build ID: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4d585c7edc7683e4b35eca6b18c9a646a1b8a78d&tochange=96b58f95ed7333672e6dba134d091015328d299b

Top 10 frames of crashing thread:

0 libobjc.A.dylib objc_msgLookupSuper2 
1 AppKit -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] 
2 AppKit -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] 
3 AppKit -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] 
4 AppKit -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] 
5 AppKit -[NSView buildLayerTree] 
6 AppKit NSViewBuildLayerTreeForDisplay 
7 AppKit -[NSWindow displayIfNeeded] 
8 AppKit __NSWindowGetDisplayCycleObserverForDisplay_block_invoke 
9 AppKit NSDisplayCycleObserverInvoke 

Bug 1592739 appears in that range.

Flags: needinfo?(mstange)
Crash Signature: [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] → [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]]

Adding :harry since browser_bootstrapped_custom_toolbar.js seems to trigger this intermittently and there is bug 1593001 that landed in the range given in comment 0, although it isn't obvious from a quick look how this would be triggering the crashes here.

Crash Signature: [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] → [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xf]
Flags: needinfo?(htwyford)
Group: core-security → layout-core-security
Crash Signature: [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xf] → [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xf] [@ NXMapRemove]

I would be surprised if bug 1593001 was causing this. That patch makes changes in nsTouchBarInputIcon and nsTouchBarUpdater. Both of those classes communicate between classes directly related to the Touch Bar and actual Cocoa layout code. Since bug 1593001 adds new null-checks to bail out of nsTouchBarInputIcon and nsTouchBarUpdater, it would cause fewer new calls to NSView, not more/different calls. That said, I'll continue to keep an eye on this.

Flags: needinfo?(htwyford)

How can we stop people filing new intermittents for this?

Flags: needinfo?(aryx.bugmail)

It has been added as bug summary + url to a document of bugs not suggested by Treeherder. Sheriffs and Treeherder don't have access to security bugs.

Flags: needinfo?(aryx.bugmail)
Crash Signature: [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xf] [@ NXMapRemove] → [@ -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xf] [@ NXMapRemove] …
Crash Signature: , void*) const + 0xf] [@ NXMapRemove] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xc] → , void*) const + 0xf] [@ NXMapRemove] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xc]

We have shipped our last beta for 71, as this is a sec bug, I am setting 71 as fix-optional in case a safe fix could be uplifted as a ridealong in a dot release.

Crash Signature: , void*) const + 0xf] [@ NXMapRemove] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xc] → , void*) const + 0xf] [@ NXMapRemove] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xc] [@ CA::Layer::getter(unsigned int, _CAValueType, void*) + 0x38]
Crash Signature: , void*) const + 0xf] [@ NXMapRemove] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xc] [@ CA::Layer::getter(unsigned int, _CAValueType, void*) + 0x38] → , void*) const + 0xf] [@ NXMapRemove] [@ CA::AttrList::get(unsigned int, _CAValueType, void*) const + 0xc] [@ CA::Layer::getter(unsigned int, _CAValueType, void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncesto…
Crash Signature: , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] → , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]]

This appears stalled, but isn't marked as such. Lets figure out if there's something we can do here.

Flags: needinfo?(gpascutto)

(In reply to Harry Twyford [:harry] from comment #6)

I would be surprised if bug 1593001 was causing this.

fwiw, I was looking through the crash reports and this one occurred on a MacBookPro10,1 and this one on a MacBookPro12,1, neither of which have Touch Bars. Touch Bar-related classes aren't loaded on Macs without Touch Bars, except in testing environments. I'm fairly confident the cause here is something other than bug 1593001.

This appears stalled, but isn't marked as such. Lets figure out if there's something we can do here.

This is needinfo mstange already for the link to Bug 1592739.

Flags: needinfo?(gpascutto)

(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #11)

It has been added as bug summary + url to a document of bugs not suggested by Treeherder. Sheriffs and Treeherder don't have access to security bugs.

This doesn't seem to have helped... Is there anything else we can do here?

Flags: needinfo?(aryx.bugmail)
Crash Signature: , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] → , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c]

Highlighted the bug in the latest sheriffing newsletter.

Crash Signature: , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c] → , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c]
Flags: needinfo?(aryx.bugmail)

I've started looking into this. I haven't gotten very far yet.

Assignee: nobody → mstange
Status: NEW → ASSIGNED
Flags: needinfo?(mstange)

The priority flag is not set for this bug.
:spohl, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(spohl.mozilla.bugs)
Flags: needinfo?(spohl.mozilla.bugs)
Priority: -- → P1

The crash happens when -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] makes the recursive call to the same method on one of its subviews.
This must mean that there's a bad pointer to a deleted NSView stuck somewhere in the view hierarchy. I don't know how that could happen yet.
I'm now looking into whether NSZombieEnabled might be able to catch such a use-after-free bug, so that we could maybe enable it on a try push an re-trigger tests until we hit this crash. But first I'll try to see if NSZombieEnabled works on a standalone test app.

I've pushed a try build with NSZombieEnabled in the hopes that it may help us catch this happening sooner: https://hg.mozilla.org/try/rev/a3381c1b32e059ef70c5eb5fcba26ad2a0361f58
I will retrigger mochitest-browser-chrome-e10s-5 a bunch of times in the hopes to get it to reproduce.

If I don't find a way to debug/fix this, we can back out bug 1592739 on Beta. I believe that that bug is the most likely candidate in the regression range, but I don't have an explanation for how it could have caused it.

One thing that might be happening is that NSViewBuildLayerTreeForDisplay runs at a different time than before.

-[NSWindow displayIfNeeded] only calls NSViewBuildLayerTreeForDisplay when bit 0x18 is set in its wFlags.
This bit is set by NSWindowSetNeedsBuildLayerTree(), which is called by the following functions:

-[NSView didChangeValueForKey:]
-[NSView setLayer:]
-[NSView _setSuperview:]
-[NSView _setSuperview:]
-[NSView setSubviews:]
-[NSView buildLayerTree]
NSViewSetIsOpenGLBased

It's possible that we're one of these at a different time than before, though I don't know why or how.

Looking at the test failure logs linked from https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-11-28&endday=2019-12-05&tree=trunk&bug=1596826 and https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-11-28&endday=2019-12-05&tree=trunk&bug=1595530 , it looks like most of the crashes happen when browser windows or panels / context menus are opened.

(In reply to Markus Stange [:mstange] from comment #28)

If I don't find a way to debug/fix this, we can back out bug 1592739 on Beta. I believe that that bug is the most likely candidate in the regression range, but I don't have an explanation for how it could have caused it.

Tracking for 72; if it's ok I'd like to land the backout by next Thursday if we don't have a fix, so there's still a week in beta to verify. ni?self for that.

I'm going to prepare a backout patch in bug 1592739.

Crash Signature: , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c] → , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c] [@ libobjc.A.dylib@0x601d ]
Crash Signature: , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c] [@ libobjc.A.dylib@0x601d ] → , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c] [@ libobjc.A.dylib@0x601d ] [@ look…
Crash Signature: , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c] [@ libobjc.A.dylib@0x601d ] [@ → , void*) + 0x38] [@ objc_msgSendSuper | -[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:]] [@ objc_msgSend + 0x16] [@ objc_msgSend + 0x17] [@ objc_msgSend + 0x1d] [@ objc_msgSendSuper + 0x2c] [@ libobjc.A.dylib@0x601d ] [@

The backout patch from bug comment 35 is going to need a bit of rebasing if we want to re-apply it to Beta73.

Markus, this is generating a lot of dupe bugs. What can we do for Beta73 to alleviate this?

Flags: needinfo?(mstange)

Sorry for the delay, again. :(
I have attached a fresh backout patch for 73 in bug 1592739. I'm now working on fixing this properly on central.

Flags: needinfo?(mstange)

Is there a plan or ETA to fix this in central/gecko 74? It's a bug for a noticeable frequency.

Flags: needinfo?(mstange)

So backing out the patch for bug 1592739 doesn't seem to have gotten rid of these crashes. Am I correct?

If I am, I could start using HookCase to try and diagnose these crashes. I'm very busy this week, but I could start next week.

AFAIK, the backout works, but we've only been landing it on mozilla-beta rather than mozilla-central, so each subsequent Gecko release remains affected after it merges there.

Crash Signature: lookUpImpOrForward + 0xb2] → lookUpImpOrForward + 0xb2] [@ objc_msgLookupSuper2 + 0x5d] [@ objc_msgLookup_fpret + 0x57]
Group: layout-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Crash Signature: lookUpImpOrForward + 0xb2] [@ objc_msgLookupSuper2 + 0x5d] [@ objc_msgLookup_fpret + 0x57] → lookUpImpOrForward + 0xb2] [@ objc_msgLookupSuper2 + 0x5d] [@ objc_msgLookup_fpret + 0x57]
Closed: 5 years ago
Flags: needinfo?(mstange)
Resolution: --- → FIXED
Target Milestone: --- → mozilla75
Assignee: mstange → jmuizelaar
Blocks: wr-mac
Blocks: wr-mac-nightly
No longer blocks: wr-mac
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.