Crash in [@ webrtc::videocapturemodule::VideoCaptureImpl::DeliverCapturedFrame]
Categories
(Core :: WebRTC: Audio/Video, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr102 | --- | unaffected |
firefox109 | --- | unaffected |
firefox110 | --- | wontfix |
firefox111 | --- | fixed |
People
(Reporter: mccr8, Assigned: pehrsons)
References
(Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(3 files, 1 obsolete file)
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
dmeehan
:
approval-mozilla-beta+
|
Details |
Crash report: https://crash-stats.mozilla.org/report/index/8ad89e28-c397-49ab-bff6-616b70230119
Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Top 10 frames of crashing thread:
0 XUL nsCOMPtr<nsIThreadObserver>::operator= xpcom/base/nsCOMPtr.h:663
0 XUL mozilla::ThreadEventQueue::PutEventInternal xpcom/threads/ThreadEventQueue.cpp:140
0 XUL mozilla::ThreadEventQueue::PutEvent xpcom/threads/ThreadEventQueue.cpp:73
0 XUL mozilla::ThreadEventTarget::Dispatch xpcom/threads/ThreadEventTarget.cpp:122
0 XUL nsThread::Dispatch xpcom/threads/nsThread.cpp:676
1 XUL nsIEventTarget::Dispatch dist/include/nsIEventTarget.h:42
1 XUL mozilla::camera::CamerasParent::RecvStartCapture const dom/media/systemservices/CamerasParent.cpp:970
1 XUL mozilla::media::LambdaRunnable<mozilla::camera::CamerasParent::RecvStartCapture dom/media/systemservices/MediaUtils.h:77
2 XUL nsThread::ProcessNextEvent xpcom/ds/nsTObserverArray.h:89
3 XUL NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:473
This signature is pretty bad, but I opened a half dozen of these crashes and they all had CamerasParent::RecvStartCapture on the stack and are all on Mac OS, so I'm going to guess that this is fallout from bug 1806605.
Reporter | ||
Updated•2 years ago
|
Assignee | ||
Comment 1•2 years ago
|
||
The guess is reasonable. And this is very similar to bug 1811226 in terms of how it crashes.
That said, I have not been able to gather from the crash reports how this could be happening, nor have I been able to reproduce anything close to this (with ASAN).
Assignee | ||
Comment 2•2 years ago
|
||
I let a high-res capture that was used in multiple tabs run for a while since it was showing significant latency. That triggered the system to run out of memory. In my case this was handled gracefully by MacOS so I have no evidence it could be the source of any problems, but it's the closest I've gotten to a lead.
Assignee | ||
Comment 3•2 years ago
|
||
S4 because this is only enabled in nightly and early beta.
Comment 4•2 years ago
|
||
Set release status flags based on info from the regressing bug 1806605
Comment 5•2 years ago
|
||
Wontfix 110 as there is a low volume of crashes and this is gated pre-release channels.
Assignee | ||
Comment 6•2 years ago
|
||
Here are recent crashes with VideoCaptureAvFoundation
on the stack.
There are two kinds of crashes in the callback:
This bug and bug 1811226 I'm guessing are symptoms of the race above involving the callback having happened.
It seems most likely that there is a race between the frame callback (on a frameQueue), and the dtor. The AVCaptureSession's stopRunning (which we already wait for) is claimed to stop capture synchronously. However, it doesn't seem unreasonable that there are still frames in flight to the frameQueue once the session has stopped, and hence there's a race.
Thanks to :farre for bouncing ideas with me here.
Assignee | ||
Comment 7•2 years ago
|
||
There's also alwaysDiscardsLateVideoFrames = NO
suggesting frames can be queued up here.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 8•2 years ago
|
||
Comment 10•2 years ago
|
||
Copying crash signatures from duplicate bugs.
Assignee | ||
Comment 11•2 years ago
|
||
Leaving open so we can evaluate the effectiveness of the patch before closing.
Comment 12•2 years ago
|
||
Comment 13•2 years ago
|
||
bugherder |
Assignee | ||
Comment 14•2 years ago
|
||
Assignee | ||
Comment 15•2 years ago
|
||
This is similar to what the old backend does when stopping. Given that we do not
fully understand the failure mode yet, this is a speculative fix.
Comment 16•2 years ago
|
||
Comment 17•2 years ago
|
||
bugherder |
Assignee | ||
Comment 18•2 years ago
|
||
This didn't have the desired effect per crash-stats.
Assignee | ||
Comment 19•2 years ago
|
||
Filed upstream: https://bugs.webrtc.org/14907
Assignee | ||
Comment 20•2 years ago
|
||
This fixes applyConstraints which re-uses the backend with new settings.
Without this patch stopCapture would unset the delegate but startCapture would
not set it again.
Comment 21•2 years ago
|
||
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Comment 22•2 years ago
|
||
bugherder |
Assignee | ||
Comment 23•2 years ago
|
||
Comment on attachment 9318229 [details]
Bug 1811641 - [libwebrtc] Set output delegate when starting instead of during init. r?#webrtc-reviewers!
Beta/Release Uplift Approval Request
- User impact if declined: Camera reconfigs on Mac (in early beta or earlier) will cause all video tracks coming from that camera to freeze.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce: I don't think this needs manual test from QE since it is enabled only to early beta. But if you disagree the STR is simple:
Open https://jsfiddle.net/pehrsons/0ehb87au/
Click start
Verify that video is flowing after reconfigs (camera must be visibly reset, otherwise one may have to adjust the initial requested width). - List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): It is moving the timing of a call on the camera backend. Threading or lifetimes are not affected.
- String changes made/needed:
- Is Android affected?: No
Comment 24•2 years ago
|
||
Thanks :phersons for the uplift request.
Do you still need the leave-open on this keyword or can this be resolved?
We can look at the uplift, but wondering if there was any other work planned for this bug?
Assignee | ||
Comment 25•2 years ago
|
||
We still have crashes occurring so we can continue trying to fix them here or open another bug if that simplifies something? Unfortunately we have no insight into these crashes other than through crash-stats, so they're tricky to debug.
Comment 26•2 years ago
|
||
(In reply to Andreas Pehrson [:pehrsons] from comment #25)
We still have crashes occurring so we can continue trying to fix them here or open another bug if that simplifies something? Unfortunately we have no insight into these crashes other than through crash-stats, so they're tricky to debug.
Thanks for the context, it gets tricky when we are tracking a fix across multiple releases but I see this is only impacts early beta for now - it won't impact 111 release.
I'll take the uplift so we can see in the beta population, though the volume of the crash is low
Comment 27•2 years ago
|
||
Comment on attachment 9318229 [details]
Bug 1811641 - [libwebrtc] Set output delegate when starting instead of during init. r?#webrtc-reviewers!
Approved for 111.0b3
Comment 28•2 years ago
|
||
bugherder uplift |
Reporter | ||
Comment 29•2 years ago
|
||
(In reply to Andreas Pehrson [:pehrsons] from comment #25)
We still have crashes occurring so we can continue trying to fix them here or open another bug if that simplifies something? Unfortunately we have no insight into these crashes other than through crash-stats, so they're tricky to debug.
If you fixed some of the crashes, I'd change the summary of this bug to describe a bit what you did fix, and then open a new bug for the remaining crashes, if you think there's something worth investigating. Keeping a particular bug about a specific set of patches makes it easier to understand which releases have which patches, and bugs are cheap.
Reporter | ||
Comment 30•2 years ago
|
||
...and also close this bug, I meant to add.
Updated•2 years ago
|
Assignee | ||
Comment 31•2 years ago
|
||
I agree, I'm just not sure we really fixed something. It does seem like we fixed this class of crashes in the frame callback though. Unfortunately the others seem unrelated.
Description
•