Closed Bug 1621500 Opened 5 years ago Closed 4 years ago

Crash in [@ webrtc::videocapturemodule::VideoCaptureImpl::DeliverCapturedFrame]

Categories

(Core :: WebRTC: Audio/Video, defect, P2)

74 Branch
Desktop
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla80
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- wontfix
firefox74 --- wontfix
firefox75 --- wontfix
firefox76 --- wontfix
firefox77 --- wontfix
firefox78 --- wontfix
firefox79 --- wontfix
firefox80 --- fixed

People

(Reporter: rory, Assigned: dminor)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug is for crash report bp-aa9589c9-b2ca-4f53-bf9d-0bd040200311.

Top 10 frames of crashing thread:

0 XUL webrtc::videocapturemodule::VideoCaptureImpl::DeliverCapturedFrame media/webrtc/trunk/webrtc/modules/video_capture/video_capture_impl.cc:129
1 XUL webrtc::videocapturemodule::VideoCaptureImpl::IncomingFrame media/webrtc/trunk/webrtc/modules/video_capture/video_capture_impl.cc:230
2 GeoServices -[GEOPDPictureItem copyWithZone:] 
3 CoreVideo CVPixelBuffer::removeAttachment 
4 CoreMedia bufQRemoveElementAtIndex 
5 libmozglue.dylib free memory/build/malloc_decls.h:54
6 XUL -[RTCVideoCaptureIosObjC captureOutput:didOutputSampleBuffer:fromConnection:] media/webrtc/trunk/webrtc/modules/video_capture/objc/rtc_video_capture_objc.mm:328
7 libobjc.A.dylib cache_getImp 
8 AVFoundation __49-[AVCaptureVideoDataOutput _render:sampleBuffer:]_block_invoke 
9 libdispatch.dylib _dispatch_call_block_and_release 

Component: Audio/Video → WebRTC

All of the crashes seem to be null pointers on OS X. VideoCaptureImpl::IncomingFrame is called directly from Objective-C code here [1]. It's possible this is a late callback, but looking at the code, it takes a lock when registering and removing data callbacks [2] but not when iterating on the set [3], which seems suspicious.

[1] https://searchfox.org/mozilla-central/rev/278046367dab878316f60f0bd7f740cf73f3c447/media/webrtc/trunk/webrtc/modules/video_capture/objc/rtc_video_capture_objc.mm#328
[2] https://searchfox.org/mozilla-central/rev/278046367dab878316f60f0bd7f740cf73f3c447/media/webrtc/trunk/webrtc/modules/video_capture/video_capture_impl.cc#105
[3] https://searchfox.org/mozilla-central/rev/278046367dab878316f60f0bd7f740cf73f3c447/media/webrtc/trunk/webrtc/modules/video_capture/video_capture_impl.cc#129

Assignee: nobody → dminor
Status: UNCONFIRMED → NEW
Component: WebRTC → WebRTC: Audio/Video
Ever confirmed: true
Priority: -- → P2

DeliverCaptureFrame is only called from IncomingFrame which takes the lock, so that can't be the problem.

We're seeing what looks like occasional late callbacks in
VideoCaptureImpl::DeliverCapturedFrame on OS X. When we call stopCapture, it in
turns calls directOutputToNil which should cause any newly capture frames to be
dropped. It is not clear from the existing code or the documentation what would
happen to any frames which are already enqueued. It looks like it is possible
for frames to be delivered on the old queue, which would explain late callbacks.

Marking this leave-open in case this isn't the problem underlying the crashes we're seeing.

Keywords: leave-open
Pushed by dminor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0bf2c6b5eed6
Null out _owner in RTCVideoCaptureIosObjC; r=ng

I'm landing a reverting change in Bug 1625288. I was trying to duplicate here what we do with the Android video capture code [1], but made the mistake of nulling out _owner in Stop, rather than adding a new method that is called when the corresponding C++ destructor is invoked. At this point, I'll wait until Firefox 77 to work on this again, to give us time to catch regressions.

[1] https://searchfox.org/mozilla-central/rev/b712398b7fae54ef377a558d6f16dede7a7f8530/dom/media/systemservices/android_video_capture/video_capture_android.cc#198

Attachment #9134419 - Attachment is obsolete: true

Looks like a somewhat rare nullptr crash on OS X.

Severity: normal → S3
Has STR: --- → no

See a couple of cases on 77 too, but not really recently.

No longer depends on: 1646904

I'm trying a different approach where we switch to using a serial queue (the newer upstream code does this) and then dispatch a task to that queue synchronously during shutdown. This should prevent late callbacks, which I think is the underlying cause of the crashes here.

Try run here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7aeea9ee0c0580038287a573ec9e08cbb6bc7e7a

The DISPATCH_QUEUE_PRIORITY_DEFAULT queue currently in use starts tasks as they
arrive rather than running tasks sequentially. By using a dedicated serial queue,
frames will be encoded and delivered in the order in which they are added to the
queue. It also makes it possible to post a task to the queue when video capture
is stopped, which can guarantee that all frames have been delivered before stop
completes. This should fix a crash we're seeing in DeliverCapturedFrame which
appears to be caused by racing between the main thread in Stop() and the capture
thread in captureOutput. The new queue is set to target the
DISPATCH_QUEUE_PRIORITY_DEFAULT, so tasks will still run there. This should give
us serial execution without the overhead of starting a new thread to manage our
queue.

The new upstream implementation creates a serial queue rather than using
DISPATCH_QUEUE_PRIORITY_DEFAULT. It also raises the priority of the queue above
default, but I think that should wait for a separate bug.

Pushed by dminor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3495f968530d
Use serial queue for video decoding on OS X; r=ng
Status: NEW → RESOLVED
Closed: 4 years ago
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: