Closed Bug 962719 Opened 11 years ago Closed 9 years ago

ASSERTION: Shouldn't have already notified of finish *and* have output!: '!streamHasOutput[i] || !stream->mNotifiedFinished'

Categories

(Core :: WebRTC: Audio/Video, defect, P4)

x86
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox43 --- fixed

People

(Reporter: florian, Assigned: karlt)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(12 files, 5 obsolete files)

(deleted), patch
padenot
: review+
Details | Diff | Splinter Review
(deleted), patch
padenot
: review+
Details | Diff | Splinter Review
(deleted), patch
padenot
: review+
Details | Diff | Splinter Review
(deleted), patch
padenot
: review+
Details | Diff | Splinter Review
(deleted), patch
roc
: review+
Details | Diff | Splinter Review
(deleted), patch
roc
: review+
Details | Diff | Splinter Review
(deleted), patch
padenot
: review+
Details | Diff | Splinter Review
(deleted), patch
padenot
: review+
Details | Diff | Splinter Review
(deleted), patch
padenot
: review+
Details | Diff | Splinter Review
(deleted), patch
roc
: review+
Details | Diff | Splinter Review
(deleted), patch
roc
: review+
Details | Diff | Splinter Review
(deleted), patch
roc
: review+
Details | Diff | Splinter Review
Very similar to bug 957832, but different steps to reproduce. Whenever I click the 'Stop Sharing' item I'm adding in bug 885796, I get this assertion twice in my terminal: ###!!! ASSERTION: Shouldn't have already notified of finish *and* have output!: '!streamHasOutput[i] || !stream->mNotifiedFinished', file srcdir/content/media/MediaStreamGraph.cpp, line 398 mozilla::MediaStreamGraphImpl::UpdateCurrentTime()+0x000005bc [NightlyDebug.app/Contents/MacOS/XUL +0x026b724c] mozilla::MediaStreamGraphImpl::RunThread()+0x0000012c [NightlyDebug.app/Contents/MacOS/XUL +0x026ba46c] mozilla::(anonymous namespace)::MediaStreamGraphInitThreadRunnable::Run()+0x00000041 [NightlyDebug.app/Contents/MacOS/XUL +0x026c87b1] nsThread::ProcessNextEvent(bool, bool*)+0x0000069a [NightlyDebug.app/Contents/MacOS/XUL +0x001225fa] NS_ProcessNextEvent(nsIThread*, bool)+0x000000a8 [NightlyDebug.app/Contents/MacOS/XUL +0x00023068] mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)+0x00000278 [NightlyDebug.app/Contents/MacOS/XUL +0x006d9618] MessageLoop::RunInternal()+0x00000076 [NightlyDebug.app/Contents/MacOS/XUL +0x0065a226] MessageLoop::RunHandler()+0x00000015 [NightlyDebug.app/Contents/MacOS/XUL +0x0065a135] MessageLoop::Run()+0x0000002d [NightlyDebug.app/Contents/MacOS/XUL +0x0065a0dd] nsThread::ThreadFunc(void*)+0x00000157 [NightlyDebug.app/Contents/MacOS/XUL +0x00120bf7] _pt_root+0x00000165 [NightlyDebug.app/Contents/MacOS/libnss3.dylib +0x0023b9b5] _pthread_start+0x00000147 [/usr/lib/system/libsystem_c.dylib +0x00014742] While working on tests for the WebRTC UI in bug 804611, I also noticed that when this assertion happens, the recording-device-events notification is fired twice (I expect it once) for no apparent reason; I'm assuming this is related. Unless I get reviews quickly, to reproduce this you'll need to apply first the patch from bug 958071 and then the patch from bug 885796.
Blocks: 804611
Bug 958071, bug 885796 and bug 804611 have all landed now. The tests added in bug 804611 contain workarounds for this bug. I was hoping to at least have someone comment here to say if this assertion looks serious, or like something that we can continue to ignore for a while.
roc: see comment 1 "I was hoping to at least have someone comment here to say if this assertion looks serious, or like something that we can continue to ignore for a while." (given that the bugfixes have landed)
Flags: needinfo?(roc)
The assertion was turned into a warning in bug 957359, so the tests should not require warnings. In fact it would be easier to debug if the tests didn't contain workarounds, so I can re-enable the assertion and notice if anything fails.
(In reply to Karl Tomlinson (:karlt) from comment #3) > In fact it would be easier to debug if the tests > didn't contain workarounds, The workaround is for the recording-device-events notification that's received twice when it's expected only once, as explained in comment 0. > so I can re-enable the assertion and notice if anything fails. In comment 0 I said "Whenever I click the 'Stop Sharing' item I'm adding in bug 885796, I get this assertion twice in my terminal". Any reason why this way to reproduce isn't usable to debug?
(In reply to Florian Quèze [:florian] [:flo] from comment #4) > In comment 0 I said "Whenever I click the 'Stop Sharing' item I'm adding in > bug 885796, I get this assertion twice in my terminal". Any reason why this > way to reproduce isn't usable to debug? I've just verified that these steps to reproduce still work with an up to date debug build: 1. Load https://apprtc.webrtc.org/ in a tab. 2. Click "Share selected devices" and wait for the image of your webcam to appear 3. Click the green camera icon near the URL bar. 4. Click on the down arrow near 'Continue Sharing' 5. Click 'Stop Sharing' At this point, this warning appears twice in my terminal: [Parent 28176] WARNING: Shouldn't have already notified of finish *and* have output!: '!streamHasOutput[i] || !stream->mNotifiedFinished', file /Users/florian/buildhg/mozilla-central/content/media/MediaStreamGraph.cpp, line 407 If you want an _automated_ way to trigger the warning/assertion, you can edit http://mxr.mozilla.org/mozilla-central/source/browser/base/content/test/general/get_user_media.html?force=1#15 and remove the "fake: true", then run: ./mach mochitest-browser browser/base/content/test/general/browser_get_user_media.js
Thanks for the STR. This try run re-enabled the assertion and ran the test added in bug 804611. https://tbpl.mozilla.org/php/getParsedLog.php?id=35511324&tree=Try It reported > TEST-KNOWN-FAIL | chrome://mochitests/content/browser/browser/base/content/test/general/browser_get_user_media.js | Got the 'recording-device-events' notification twice, likely because of bug 962719 but I don't see this assertion failure in the output.
(In reply to Karl Tomlinson (:karlt) from comment #6) > Thanks for the STR. > > This try run re-enabled the assertion and ran the test added in bug 804611. [...] I don't see this assertion failure in the output. See the last paragraph of comment 5. You NEED to remove "fake: true" from the test case for the assertion to be triggered. And you need to run the test locally; AFAIK the test slaves don't have webcams attached (hence the use of fake streams for the test).
If the recording-device-events is unexpectedly fired twice, even when the assertion does not fail, doesn't that imply that the extra notification is unrelated to this assertion failure bug?
(In reply to Karl Tomlinson (:karlt) from comment #8) > If the recording-device-events is unexpectedly fired twice, even when the > assertion does not fail, doesn't that imply that the extra notification is > unrelated to this assertion failure bug? No. My assumption is that there's a bug causing some code to be called twice when it shouldn't, and that the code being called twice triggers the assertion for real streams. Fake streams hide the assertion. After debugging you may conclude that this assumption wasn't correct; but without prior knowledge of this specific piece of code, this assumption seemed reasonable to me.
mStateComputedTime is about 20 - 30 ms ahead of mCurrentTime, as expected. RecomputeBlocking() uses mStateComputedTime and stream->mFinished to set stream->mBlocked. UpdateCurrentTime() sets mNotifiedFinished in the first iteration after mFinished and stream->mBlocked have been set. In that iteration and the next, UpdateCurrentTime() uses values in mBlocked for 10 ms intervals from mCurrentTime to calculate what length of time in that period the stream was blocked. The stream is not detected blocked during that period because mCurrentTime has not yet reached the value mStateComputedTime had when mBlocked was set. The assertion failing indicates that the finished notification (on the graph thread) is sent before the last output notifications. I would like to change the output notifications to be consistent with the period of stream processing, which is based around mStateComputedTime, not mCurrentTime.
Assignee: nobody → karlt
Blocks: 956574
No longer blocks: 957832
Blocks: 957359
Blocks: 967606
No longer blocks: 967606
Still relevant?
backlog: --- → webRTC+
Rank: 45
Flags: needinfo?(karlt)
Priority: -- → P4
Yes. Re-enabling the assertion shows this is still failing. https://treeherder.mozilla.org/#/jobs?repo=try&revision=021f4e756537 The cause is also involved in bug 956574 and bug 1053011.
Flags: needinfo?(karlt)
Attachment #8642961 - Flags: review?(padenot)
Attachment #8642962 - Flags: review?(padenot)
I saw this assertion failing intermittently on linux dom/media/webspeech/synth/test/test_speech_cancel.html at https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa93c908bf4e
Attachment #8642963 - Flags: review?(padenot)
Attached patch introduce mProcessedTime (deleted) — Splinter Review
Attachment #8642965 - Flags: review?(padenot)
Attached patch move video frame scheduling to a helper method (obsolete) (deleted) — Splinter Review
Attachment #8642967 - Flags: review?(roc)
Attachment #8642968 - Flags: review?(roc)
This does not affect the target time of the video frames, but may mean that more frames are displayed. Covering from mProcessedTime to mStateComputedTime ensures that no chunks are missed.
Attachment #8642971 - Flags: review?(roc)
This provides HAVE_CURRENT_DATA readyState. The "waiting" event is only for when "Playback has stopped because the next frame is not available, but the user agent expects that frame to become available in due course." "If the previous ready state was HAVE_FUTURE_DATA or more, and the new ready state is HAVE_CURRENT_DATA or less If the media element was potentially playing before its readyState attribute changed to a value lower than HAVE_FUTURE_DATA, and the element has not ended playback, and playback has not stopped due to errors, paused for user interaction, or paused for in-band content, the user agent must queue a task to fire a simple event named timeupdate at the element, and queue a task to fire a simple event named waiting at the element." Currently NotifiyBlockingChanged() is usually called before NotifyEvent(EVENT_FINISHED) which is not what HTMLMediaElement wants because that incorrectly generates a "waiting" event before "ended". However, in unusual circumstances, and more frequently with the changes here to make main thread state consistent with processing boundaries, EVENT_FINISHED can be delivered before NotifyBlockingChanged(). Finished can be determined immediately after mFinished is set, but blocked is not determined until RecomputeBlocking() has run. In that situation, we still need UpdateReadyStateInternal() to run and want NextFrameStatus() to return NEXT_FRAME_UNAVAILABLE.
Attachment #8642973 - Flags: review?(roc)
Notifications are now up to date with processing, and MediaStream::GetCurrentTime() now returns "the main-thread's view of how much data has been processed by this stream", as documented.
Attachment #8642974 - Flags: review?(padenot)
Finished can be determined immediately after mFinished is set, but blocked is not determined until RecomputeBlocking() has run.
Attachment #8642975 - Flags: review?(padenot)
Attachment #8642976 - Flags: review?(padenot)
browser_get_user_media.js was renamed to browser_devices_get_user_media.js in 1d0e1126f1da. It seems that fixing the cause of assertions here does not resolve the cause of timeout in those tests when removing fake: true from the getUserMedia options. https://treeherder.mozilla.org/#/jobs?repo=try&revision=2cce95815cd4
(In reply to Karl Tomlinson (ni?:karlt) from comment #25) > browser_get_user_media.js was renamed to browser_devices_get_user_media.js > in 1d0e1126f1da. It seems that fixing the cause of assertions here does not > resolve the cause of timeout in those tests when removing fake: true from > the getUserMedia options. As I noted in comment 7, removing fake: true will only help reproducing locally. Using fake devices is required for the test infrastructure.
Comment on attachment 8642971 [details] [diff] [review] provide video frames to the container as soon as available Review of attachment 8642971 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/MediaStreamGraph.cpp @@ +1156,5 @@ > + if (!chunk) > + return; > + > + if(chunk->mFrame != aStream->mLastPlayedVideoFrame) { > + OutputVideoFrame(aStream, &chunk->mFrame, start); I'm confused. Why are we outputting multiple frames here? All but the last one will be ignored, I think.
Attachment #8642971 - Flags: review?(roc) → review-
Attachment #8642962 - Flags: review?(padenot) → review+
Attachment #8642963 - Flags: review?(padenot) → review+
Attachment #8642965 - Flags: review?(padenot) → review+
Attachment #8642974 - Flags: review?(padenot) → review+
Attachment #8642975 - Flags: review?(padenot) → review+
Attachment #8642976 - Flags: review?(padenot) → review+
Attachment #8642961 - Flags: review?(padenot) → review+
Attachment #8642967 - Attachment is obsolete: true
Attachment #8642971 - Attachment is obsolete: true
This avoids needing to find another place to count frame IDs.
Attachment #8644737 - Flags: review?(roc)
Attached patch use VideoFrameContainer::NewFrameID() (obsolete) (deleted) — Splinter Review
Attachment #8644738 - Flags: review?(roc)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #28) > I'm confused. Why are we outputting multiple frames here? The goal is to output all frames available so that they can be composited according to TimeStamp targetTime (and if they are not composited, then frame dropped counts are available). > All but the last one will be ignored, I think. Ah, OK. SetCurrentImages() only hands off previous images to the ImageClient immediately if called InImageBridgeChildThread(). Otherwise, as when called on the thread here, the GetCurrentImages() call happens on the ImageBridgeChild thread after SetCurrentImageInternal() and races with subsequent SetCurrentImages() calls. https://dxr.mozilla.org/mozilla-central/rev/5cf4d2f7f2f2b3df2f1edd31b8bdce7882f3875c/gfx/layers/ImageContainer.cpp#312 https://dxr.mozilla.org/mozilla-central/rev/5cf4d2f7f2f2b3df2f1edd31b8bdce7882f3875c/gfx/layers/ipc/ImageBridgeChild.cpp#420-429 Keeping to a single SetCurrentImages() call per iteration should usually give the image bridge child time to collect.
Attachment #8644740 - Flags: review?(roc)
I guess overflow will usually take at least 200 days, but AFAIK there is no disadvantage to using unsigned.
Attachment #8644741 - Flags: review?(roc)
Comment on attachment 8644740 [details] [diff] [review] provide video frames to the container as soon as available Review of attachment 8644740 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/MediaStreamGraph.cpp @@ +1169,5 @@ > + } > + images.AppendElement(ImageContainer::NonOwningImage(image, targetTime)); > + > + if (!aStream->mNotifiedFinished) { > + aStream->mLastPlayedVideoFrame = *frame; The meaning of mLastPlayedVideoFrame is unclear now. Here we'll be setting it to the last frame in the queue, but up above we'll skip adding it to the queue, which seems wrong. @@ +1180,5 @@ > for (uint32_t i = 0; i < aStream->mVideoOutputs.Length(); ++i) { > VideoFrameContainer* output = aStream->mVideoOutputs[i]; > > + for (auto& image : images) { > + image.mFrameID = output->NewFrameID(); Setting new frame IDs for every image in the list isn't really right. The same image can appear in multiple SetCurrentFrames calls over time, and each time it should have the same frame ID. Maybe that's a bit hard to do though... setting a new frame ID should be safe but might mean useless work.
Attachment #8644740 - Flags: review?(roc) → review-
Comment on attachment 8644738 [details] [diff] [review] use VideoFrameContainer::NewFrameID() I mis-assumed that RenderVideoFrames() dispatched frames only once. Given it can dispatch the same frame repeatedly, we don't want a different frame id each time.
Attachment #8644738 - Attachment is obsolete: true
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #34) > > + > > + if (!aStream->mNotifiedFinished) { > > + aStream->mLastPlayedVideoFrame = *frame; > > The meaning of mLastPlayedVideoFrame is unclear now. Here we'll be setting > it to the last frame in the queue, but up above we'll skip adding it to the > queue, which seems wrong. I was mis-assuming that we only sent frames to the compositor once, and that the compositor would choose from all those that have been sent, and so this existing logic was re-used to avoid sending the same frame again. That assumption was not correct, but this logic is still useful to avoid assigning a new frame id to the same frame. > Setting new frame IDs for every image in the list isn't really right. The > same image can appear in multiple SetCurrentFrames calls over time, and each > time it should have the same frame ID. Maybe that's a bit hard to do > though... setting a new frame ID should be safe but might mean useless work. Each PlayVideo() call processes an adjacent but distinct interval so the mLastPlayedVideoFrame logic provides that the same frame is not added again with what would be a new frame id. With the other patches here, the notifications, time updates, and processing all use the same time interval. This interval is a little in advance of the current time so that the frames are ready in time, but now that all intervals align, we can't examine stream times or data prior to the interval and so can't go back as far as the current time. Trying to keep old stream data with up-to-date notifications by using a different time interval for AdvanceTimeVaryingValuesToCurrentTime() would require analysing blocking over two different intervals, one for notifications and one for AdvanceTimeVaryingValuesToCurrentTime(), which I'd like to avoid. However the frames that we need to keep from the previous iteration are still on the VideoFrameContainer, so this patch collects them from there.
Attachment #8646167 - Flags: review?(roc)
Attachment #8644740 - Attachment is obsolete: true
Comment on attachment 8646167 [details] [diff] [review] provide video frames to the container as soon as available v3 Review of attachment 8646167 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/MediaStreamGraph.cpp @@ +1190,5 @@ > + output->GetImageContainer()->GetCurrentImages(&previousImages); > + uint32_t j = previousImages.Length(); > + for ( ; j--; ) { > + if (previousImages[j].mTimeStamp < currentTimeStamp) > + break; If previousImage[j].mTimeStamp < currentTimeStamp the frame could still be relevant, if it's the most recent frame before currentTimeStamp and the first timestamp of the frames in the graph is > currentTimeStamp. Right?
Attachment #8646167 - Attachment description: provide video frames to the container as soon as available v3 → provide video frames to the container as soon as available v3
Attachment #8646167 - Flags: review?(roc) → review-
Depends on: 1194558
dom/canvas/test/test_capture.html noticed that multiple images are left in the container when no more frames are generated. Although I assume the test could be made to pass by returning the last image from AutoLockImage::GetImage(), it seems to be the responsibility of the producer to remove expired images, so I've added support for that. This version also avoids overlapping timestamps should there be large time skew. It takes the simple approach of dropping frames. I don't know whether it would be worth scheduling the frames anyway with the next frame's timestamp to support counting frame dropping. (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #37) > > + uint32_t j = previousImages.Length(); > > + for ( ; j--; ) { > > + if (previousImages[j].mTimeStamp < currentTimeStamp) > > + break; > > If previousImage[j].mTimeStamp < currentTimeStamp the frame could still be > relevant, if it's the most recent frame before currentTimeStamp and the > first timestamp of the frames in the graph is > currentTimeStamp. > > Right? Correct. The code was designed to keep the most recent frame before currentTimeStamp. I've added a comment to indicate that, and fixed the loop to leave j at 0 when all frames have timestamp > currentTimeStamp; The previous version of the patch used the start time of each chunk for newImages. If there were multiple tracks, that start time may have been before currentTimeStamp and so keeping the most recent old frame may have sometimes been unnecessary. However, that approach for start time could result in timestamps out of order with multiple tracks. This version fixes that. If the last track finishes, then it's last frame should be played until the point it finishes instead of replacing it earlier when a frame from another track starts earlier. Timestamps are now always > currentTimeStamp. https://treeherder.mozilla.org/#/jobs?repo=try&revision=3791153a14ce
Attachment #8647852 - Flags: review?(roc)
Attachment #8646167 - Attachment is obsolete: true
Attachment #8647852 - Attachment description: provide video frames to the container as soon as available v4 → provide video frames to the container as soon as available v4
Attachment #8647852 - Flags: review?(roc) → review+
With the workaround for bug 1181006, attachment 8386517 [details] notices a 20 ms reduction in Web Audio currentTime delay. It should be simple to test that currentTime aligns with 128 sample blocks, but currently that is only true when there is more than one node due to ExtraCurrentTime(), as tracked in bug 1053011.
Blocks: 1053011
Flags: needinfo?(roc) → in-testsuite?
Keywords: perf
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: