Closed
Bug 1325834
Opened 8 years ago
Closed 8 years ago
Crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::a11y::PDocAccessibleChild::OnMessageReceived
Categories
(Core :: Disability Access APIs, defect)
Tracking
()
RESOLVED
FIXED
mozilla55
People
(Reporter: n.nethercote, Assigned: bugzilla)
References
Details
(Keywords: crash, Whiteboard: aes+)
Crash Data
Attachments
(4 files, 1 obsolete file)
(deleted),
patch
|
jimm
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
davidb
:
review+
|
Details | Diff | Splinter Review |
(deleted),
text/x-review-board-request
|
jimm
:
review+
|
Details |
(deleted),
patch
|
jimm
:
review+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-7aa74c38-d302-4440-8f7c-c6ea62161225.
=============================================================
135 occurrences of this crash on Nightly and Aurora over the past week, but from only 6 installations.
First Nightly occurrence was in 20161217030205, first Aurora occurrence was in 20161222004019, but the crashes don't show up in every day's build so the relevant code may have landed earlier than those dates.
I don't think this crash has shown up before so I suspect it's fallout from one of the a11y fixes that landed/uplifted recently.
tbsaunde, any ideas?
Flags: needinfo?(tbsaunde+mozbugs)
Comment 1•8 years ago
|
||
The fatal error message is "Error deserializing 'IAccessibleHolder'".
Crash Signature: [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::a11y::PDocAccessibleChild::OnMessageReceived] → [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::a11y::PDocAccessibleChild::OnMessageReceived]
[@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::Fata…
Updated•8 years ago
|
Whiteboard: aes?
Assignee | ||
Comment 2•8 years ago
|
||
Adding it to the pile.
Flags: needinfo?(tbsaunde+mozbugs)
Whiteboard: aes? → aes+
Reporter | ||
Updated•8 years ago
|
status-firefox52:
--- → affected
status-firefox53:
--- → affected
Comment 3•8 years ago
|
||
Trevor is back and agreed to take a look along with the other top e10s a11y stability issues.
Assignee: nobody → tbsaunde+mozbugs
Assignee | ||
Updated•8 years ago
|
Assignee: tbsaunde+mozbugs → aklotz
Comment 4•8 years ago
|
||
Comment 5•8 years ago
|
||
also, no 3rd party module correlations, just our shim dll. It's odd this shows up so high on the top crash list (#6 on aurora), I wonder if this is being pushed up by desktop helper apps like remote desktop that use our a11y apis?
(100.0% in signature vs 08.10% overall) moz_crash_reason = MOZ_CRASH()
(100.0% in signature vs 11.98% overall) Module "ia2marshal.dll" = true
(100.0% in signature vs 15.72% overall) accessibility = Active
--
(82.28% in signature vs 01.70% overall) Module "msiltcfg.dll" = true
(82.28% in signature vs 02.70% overall) Module "msi.dll" = true
(99.40% in signature vs 28.50% overall) Module "samlib.dll" = true
Assignee | ||
Updated•8 years ago
|
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Assignee | ||
Updated•8 years ago
|
Assignee | ||
Comment 6•8 years ago
|
||
I want to add some diagnostic asserts to COM proxy deserialization code so that we can narrow this down a bit.
Attachment #8825535 -
Flags: review?(jmathies)
Updated•8 years ago
|
Attachment #8825535 -
Flags: review?(jmathies) → review+
Assignee | ||
Comment 7•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/8da0d71a754c6dc949b65c50bfea0ed6eee1f8ee
Bug 1325834: Add MOZ_DIAGNOSTIC_ASSERTs to mscom proxy deserialization code; r=jimm
I had to back this out for Windows mn-e10s failures like https://treeherder.mozilla.org/logviewer.html#?job_id=67800524&repo=mozilla-inbound
https://hg.mozilla.org/integration/mozilla-inbound/rev/aa20cf1a138e
Flags: needinfo?(aklotz)
Also appears to have broken Windows talos jobs like https://treeherder.mozilla.org/logviewer.html#?job_id=67804127&repo=mozilla-inbound
Assignee | ||
Comment 10•8 years ago
|
||
Assignee | ||
Comment 11•8 years ago
|
||
Those failures are actually useful to this cause.
After a few try builds to properly massage the failures so that they are surfaced in our CI crash dumps, I see that CreateStreamOnHGlobal is failing with E_INVALIDARG. If that is the case, then we would never have sent a valid stream to begin with.
But I have no idea why that would be failing. The parameters are trivial!
Flags: needinfo?(aklotz)
Reporter | ||
Comment 12•8 years ago
|
||
Crashes with this signature:
> mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::FatalError | mozilla::a11y::PDocAccessibleChild::OnMessageReceived
are #6 topcrash in Nightly 20170113030227.
Assignee | ||
Comment 13•8 years ago
|
||
Looks like I was getting bad data. A few more try pushes later, I see that we're trying to serialize a null object. This works, but isn't really supposed to be happening.
I don't think that I can move further on this until bug 1329977 lands.
Status: NEW → ASSIGNED
Depends on: 1329977
Comment 14•8 years ago
|
||
It seems likely that some documents are created in content processes without
a DocAccessibleChild actor because there is no docshell or tabchild
associated with the document. However DocAccessible::DoInitialUpdate()
already calls functions that assume the document is associated with a
docshell. So hopefully trying to create the child actor there will mean it
is more successful.
Attachment #8827429 -
Flags: review?(dbolter)
Comment 15•8 years ago
|
||
Comment on attachment 8827429 [details] [diff] [review]
create the DocAccessibleChild in DocAccessible::DoInitialUpdate()
Review of attachment 8827429 [details] [diff] [review]:
-----------------------------------------------------------------
::: accessible/generic/DocAccessible.cpp
@@ +1466,5 @@
> + if (IPCAccessibilityActive()) {
> + nsIDocShell* docShell = mDocumentNode->GetDocShell();
> + if (RefPtr<dom::TabChild> tabChild = dom::TabChild::GetFrom(docShell)) {
> + DocAccessibleChild* ipcDoc = new DocAccessibleChild(this);
> + SetIPCDoc(ipcDoc);
Notes to self: there is a case in NotificationController::WillRefresh where we set an IPCDoc if there isn't one, but we don't tell the parent (we don't SendBindChildDoc).
Attachment #8827429 -
Flags: review?(dbolter) → review+
Comment 16•8 years ago
|
||
Pushed by tsaunders@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cbce75114440
create the DocAccessibleChild in DocAccessible::DoInitialUpdate() r=davidb
Comment 17•8 years ago
|
||
Pushed by tsaunders@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/148ee5df4dd3
fixup windows bustage landed on a CLOSED TREE
Comment 18•8 years ago
|
||
bugherder |
Updated•8 years ago
|
Comment 19•8 years ago
|
||
I'm not sure we've seen this crash since build 20170116030326 ?
Updated•8 years ago
|
Comment 20•8 years ago
|
||
(In reply to David Bolter [:davidb] from comment #19)
> I'm not sure we've seen this crash since build 20170116030326 ?
I see them, latest reports are on 20170130030205. so still a valid bug.
Assignee | ||
Comment 21•8 years ago
|
||
Comment hidden (mozreview-request) |
Comment 23•8 years ago
|
||
mozreview-review |
Comment on attachment 8841679 [details]
Bug 1325834: Add crash reporter annotations to report COM marshaling failure codes;
https://reviewboard.mozilla.org/r/115834/#review117442
Attachment #8841679 -
Flags: review?(jmathies) → review+
Comment 24•8 years ago
|
||
Pushed by aklotz@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/82b3706348ae
Add crash reporter annotations to report COM marshaling failure codes; r=jimm
Comment 25•8 years ago
|
||
bugherder |
Comment 26•8 years ago
|
||
Hi Aaron, where would I look for these annotations? Are they collected in a field on the crash report?
Flags: needinfo?(aklotz)
Comment 27•8 years ago
|
||
(In reply to David Bolter [:davidb] from comment #26)
> Hi Aaron, where would I look for these annotations? Are they collected in a
> field on the crash report?
Yes, in the metadata tab. For instance: bp-4c57c544-4220-43d4-883c-3ca452170310
Flags: needinfo?(aklotz)
Assignee | ||
Comment 28•8 years ago
|
||
Here are today's aggregations, with the HRESULT constant names substituted for the error codes:
CoGetInterfaceAndReleaseStream failures:
1 E_INVALIDARG 23 36.51 %
2 TYPE_E_CANTLOADLIBRARY 16 25.40 %
3 TYPE_E_LIBNOTREGISTERED 4 6.35 %
CoMarshalInterface failures:
1 E_INVALIDARG 43 68.25 %
Assignee | ||
Comment 29•8 years ago
|
||
Here's a breakdown by OS version:
Windows 7
=========
CoGetInterfaceAndReleaseStream failures:
1 E_INVALIDARG 23 53.49 %
2 TYPE_E_CANTLOADLIBRARY 8 18.60 %
CoMarshalInterface failures:
1 E_INVALIDARG 31 72.09 %
Windows 10
==========
CoGetInterfaceAndReleaseStream failures:
1 TYPE_E_CANTLOADLIBRARY 8 40.00 %
2 TYPE_E_LIBNOTREGISTERED 4 20.00 %
It is interesting to note that there are no E_INVALIDARG errors from CoGetInterfaceAndReleaseStream on Windows 10.
CoMarshalInterface failures:
1 E_INVALIDARG 12 60.00 %
Assignee | ||
Comment 30•8 years ago
|
||
(In reply to Aaron Klotz [:aklotz] from comment #29)
> It is interesting to note that there are no E_INVALIDARG errors from
> CoGetInterfaceAndReleaseStream on Windows 10.
I say this because the IStream implementation that we use for that use case is from SHCreateMemStream, which was rewritten for Windows 8 with a more robust implementation. I am wondering whether the Windows 7 implementation is causing us problems in that case. I might try using a different stream implementation to see if that helps at all.
Assignee | ||
Comment 31•8 years ago
|
||
I also did some additional cleanup for robustness in the hope that maybe it will help deal with some of these other errors.
Attachment #8847365 -
Flags: review?(jmathies)
Assignee | ||
Comment 32•8 years ago
|
||
This revision includes some more cleanup
Attachment #8847365 -
Attachment is obsolete: true
Attachment #8847365 -
Flags: review?(jmathies)
Attachment #8847380 -
Flags: review?(jmathies)
Comment 33•8 years ago
|
||
Comment on attachment 8847380 [details] [diff] [review]
Change mscom::ProxyStream to use CreateStreamOnHGlobal instead of SHCreateMemStream on Windows 7 (r2)
Review of attachment 8847380 [details] [diff] [review]:
-----------------------------------------------------------------
::: ipc/mscom/COMPtrHolder.h
@@ +105,5 @@
> int bufLen;
> const BYTE* buf = proxyStream.GetBuffer(bufLen);
> MOZ_ASSERT(buf || !bufLen);
> aMsg->WriteInt(bufLen);
> + if (bufLen) {
Curious, why add this check?
::: ipc/mscom/ProxyStream.cpp
@@ +99,5 @@
> + if (!stream) {
> + return nullptr;
> + }
> + } else {
> + HGLOBAL hglobal = ::GlobalAlloc(GMEM_MOVEABLE, aInitBufSize);
we have a helper for HGLOBALs that might be useful -
http://searchfox.org/mozilla-central/source/xpcom/base/nsWindowsHelpers.h#220
Attachment #8847380 -
Flags: review?(jmathies) → review+
Assignee | ||
Comment 34•8 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #33)
> Comment on attachment 8847380 [details] [diff] [review]
> Change mscom::ProxyStream to use CreateStreamOnHGlobal instead of
> SHCreateMemStream on Windows 7 (r2)
>
> Review of attachment 8847380 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Curious, why add this check?
>
ProxyStream is now more permissive toward marshaling null COM objects. Instead of returning a failure and crashing IPC, it will now output an empty buffer. In that case I wanted to skip the unnecessary call to WriteBytes. I do suspect that we may be seeing null objects being passed through a11y IPC. While this is not good, I think that crashing in IPC is the wrong place for those problems to be surfaced. If we're passing nulls, I'd rather that the failure be exposed in code that actually points to the real problem, i.e. when a11y actually tries to do something with that null.
> ::: ipc/mscom/ProxyStream.cpp
> @@ +99,5 @@
> > + if (!stream) {
> > + return nullptr;
> > + }
> > + } else {
> > + HGLOBAL hglobal = ::GlobalAlloc(GMEM_MOVEABLE, aInitBufSize);
>
> we have a helper for HGLOBALs that might be useful -
> http://searchfox.org/mozilla-central/source/xpcom/base/nsWindowsHelpers.h#220
I decided against using that because ownership of the HGLOBAL is immediately handed off to the stream in the call to CreateStreamFromHGlobal.
Assignee | ||
Comment 35•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/2989d8d208f26e8f699e313fba1e45c8fc993381
Bug 1325834: Make mscom::ProxyStream use CreateStreamOnHGlobal instead of SHCreateMemStream on Windows 7; r=jimm
Comment 36•8 years ago
|
||
bugherder |
Assignee | ||
Comment 37•8 years ago
|
||
This hasn't been seen for quite some time. Resolving.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Comment 38•8 years ago
|
||
Do we need to backport something to Aurora/Beta here still?
status-firefox54:
--- → affected
status-firefox55:
--- → fixed
status-firefox-esr52:
--- → disabled
Flags: needinfo?(aklotz)
Target Milestone: --- → mozilla55
Assignee | ||
Comment 39•8 years ago
|
||
It's disabled on those channels.
Comment 40•7 years ago
|
||
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
Updated•5 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•