Closed
Bug 1406027
Opened 7 years ago
Closed 7 years ago
input-only AudioCallbackDriver overchurns MediaStreamGraph
Categories
(Core :: Audio/Video: MediaStreamGraph, defect, P1)
Tracking
()
VERIFIED
FIXED
mozilla58
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox56 | --- | wontfix |
firefox57 | --- | verified |
firefox58 | --- | verified |
People
(Reporter: pehrsons, Assigned: pehrsons)
References
Details
(Keywords: regression)
Attachments
(1 file, 1 obsolete file)
(deleted),
text/x-review-board-request
|
padenot
:
review+
ritu
:
approval-mozilla-beta+
|
Details |
STR
Load https://lab.silpaop.com/bug_audio/
Click Start/Stop
Speak
Click Start/Stop
Click Play
Expected: hear your own clear voice
Actualy: voice is distorted, and I think duration has doubled (i.e., for every recorded audio chunk we recorded another chunk of silence)
Assignee | ||
Comment 1•7 years ago
|
||
Interestingly it works fine here: https://jsfiddle.net/pehrsons/a9gugbut/8/
Comment 2•7 years ago
|
||
On this example , if we remove the "autoplay" attribute on the video node , the bug is present
Assignee | ||
Comment 3•7 years ago
|
||
I see -- so when the stream being recorded is not played in a media element at the same time we exhibit the bug. See https://jsfiddle.net/pehrsons/a9gugbut/10/
Assignee | ||
Comment 4•7 years ago
|
||
This is not in the recorder as such, but it appears to be MediaStreamGraph's resampling of input data.
I caught this in rr and in my case we are resampling from 32000 to 44100, and I have a case of 185 samples of sound and 71 samples of silence for a 256 sample iteration. Still digging for the root cause.
Assignee: nobody → apehrson
Status: NEW → ASSIGNED
Component: Audio/Video: Recording → Audio/Video: MediaStreamGraph
Assignee | ||
Comment 5•7 years ago
|
||
In a way bug 1296531 regressed this because it turned off direct listeners for audio with full duplex enabled.
But really it just exposes a latent issue in AudioCallbackDriver.
See this logs that are hopefully self-explanatory:
Input only:
> 2017-10-06 09:14:48.672980 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph Appending 320 frames
> 2017-10-06 09:14:48.673220 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph interval[135488; 135744] (frames: 132) (duration ticks: 256)
> 2017-10-06 09:14:48.674901 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph interval[135744; 136000] (frames: 177) (duration ticks: 256)
> 2017-10-06 09:14:48.675140 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph TrackUnionStream appending 71 null samples
> 2017-10-06 09:14:48.677960 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph interval[136000; 136256] (frames: 132) (duration ticks: 256)
> 2017-10-06 09:14:48.678054 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph TrackUnionStream appending 256 null samples
The null samples here are causing the distortions.
Input+Output:
> 2017-10-06 09:16:59.944354 UTC - [Unnamed thread 0x7f01367194c0]: E/MediaStreamGraph Appending 320 frames
> 2017-10-06 09:16:59.944458 UTC - [Unnamed thread 0x7f01367194c0]: E/MediaStreamGraph interval[83167; 83295] (frames: 132) (duration ticks: 128)
> 2017-10-06 09:16:59.946813 UTC - [Unnamed thread 0x7f01367194c0]: E/MediaStreamGraph interval[83295; 83423] (frames: 176) (duration ticks: 256)
> 2017-10-06 09:16:59.950029 UTC - [Unnamed thread 0x7f01367194c0]: E/MediaStreamGraph interval[83423; 83654] (frames: 133) (duration ticks: 128)
With output present we're fine. From the log it's clear that we don't unconditionally round up the MSG iteration duration.
This is because there is no output mixer writing to the scratch buffer to keep mBuffer.Available() at bay, [1], [2]. The same input+output log with mBuffer.Available() printed:
> 2017-10-06 09:33:49.529814 UTC - [Unnamed thread 0x7f80265ee6e0]: E/MediaStreamGraph Appending 320 frames
> 2017-10-06 09:33:49.529963 UTC - [Unnamed thread 0x7f80265ee6e0]: E/MediaStreamGraph interval[107103; 107231] (frames: 132) (duration ticks: 128) (available: 79)
> 2017-10-06 09:33:49.531827 UTC - [Unnamed thread 0x7f80265ee6e0]: E/MediaStreamGraph interval[107231; 107359] (frames: 177) (duration ticks: 256) (available: 128)
> 2017-10-06 09:33:49.535139 UTC - [Unnamed thread 0x7f80265ee6e0]: E/MediaStreamGraph interval[107359; 107590] (frames: 132) (duration ticks: 128) (available: 4)
Assignee | ||
Comment 6•7 years ago
|
||
Assignee | ||
Comment 7•7 years ago
|
||
I'm keeping this a P2 since we can work around the issue with direct listeners for now.
Rank: 15
Summary: gum-audio only recording is distorted → input-only AudioCallbackDriver overchurns MediaStreamGraph
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 10•7 years ago
|
||
I'm holding review on the test until I have data from try.
Test-only: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1dd00e4ba01f41176c1f1ccaff4e3d983b6761c9
With fix: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f5dd53e3b5c244f6361224d39a57a23decfe1a48
Comment 11•7 years ago
|
||
We just had a chat on irc about this. This needs to _not_ use an AudioContext in the mochitest, only gUM + MediaRecorder, and _then_ use an AudioContext to check the content (or use the duration, but I'd rather have the content checked).
An AudioContext always has an output (so an AudioMixer is always present), so it hides the bug.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 14•7 years ago
|
||
Assignee | ||
Comment 15•7 years ago
|
||
So while I had expected the test without the fix to fail on linux, it doesn't because the module-sine-source we use there for input has much less frequent callbacks -- i.e., MediaStreamGraph iterations are much longer (ca 8k frames compared to my microphone where it was ~130 frames), and add distortions per iteration.
We have plans to swap this source module for a monitor in bug 1406350. With the monitor I get even more frequent callbacks than with a microphone so reproducing there should be easy.
For now we'll land the fix with manual testing and I'll file a followup for the test that depends on bug 1406350.
Assignee | ||
Updated•7 years ago
|
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8915938 -
Attachment is obsolete: true
Attachment #8915938 -
Flags: review?(jib)
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Assignee | ||
Comment 18•7 years ago
|
||
I'm making this a P1 based on new findings. The issue fixed here seems to be the root cause of most our other sever audio distortion reports (see bug 1407088 comment 3 for instance).
Rank: 15 → 5
Priority: P2 → P1
Comment 19•7 years ago
|
||
mozreview-review |
Comment on attachment 8915939 [details]
Bug 1406027 - Always use the mixer so AudioCallbackDriver can track processed samples.
https://reviewboard.mozilla.org/r/187200/#review193144
Attachment #8915939 -
Flags: review?(padenot) → review+
Comment 20•7 years ago
|
||
Pushed by pehrsons@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/02be66f5ff0c
Always use the mixer so AudioCallbackDriver can track processed samples. r=padenot
Comment 21•7 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Assignee | ||
Updated•7 years ago
|
Blocks: 1319445
status-firefox56:
--- → wontfix
status-firefox57:
--- → affected
status-firefox-esr52:
--- → unaffected
Version: 58 Branch → 53 Branch
Assignee | ||
Comment 22•7 years ago
|
||
Comment on attachment 8915939 [details]
Bug 1406027 - Always use the mixer so AudioCallbackDriver can track processed samples.
Approval Request Comment
[Feature/Bug causing the regression]: bug 1319445
[User impact if declined]: A peer connection (webrtc) sending audio while same tab is *not* playing any MediaStream with audio, or WebAudio, to speakers will contain a lot of distortions, and build up a large delay, as observable by the receiver.
[Is this code covered by automated tests?]: No. Bug 1406910 will add a mochitest that tests this specific code path, as soon as we have a virtual input device that's more like a microphone (bug 1406350)
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: Yes, since there's no automated coverage yet. STR:
1 load https://jsfiddle.net/pehrsons/0cLfdk4u/
2 press "Start" and allow the capture
3 when the video appears we start recording audio and video for 3 seconds, say something
4 The recording is played automatically. Verify that you hear no audio distortions (Linux is very obvious, Mac not so much but still present, might depend on microphone so verify the broken state in 56 first)
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: The simplicity of the patch. It does make us take a new path for some cases inside a hot audio path, but there's very low computational overhead in this path. Just passing some integers around.
[String changes made/needed]: None
Attachment #8915939 -
Flags: approval-mozilla-beta?
Hi Jan-Ivar, Paul, is this a must fix for 57? This is an old regression (from 53) and I would rather not add any more risk to 57 by uplifting this. Can it ride the 58 train?
Flags: needinfo?(padenot)
Flags: needinfo?(jib)
Comment 24•7 years ago
|
||
As comment 18 mentions this is the root cause of several audio distortion reports. Audio distortions are the worst, as users strongly and negatively associate it with the browser. The length of time (some subset of) users have suffered this regression doesn't seem mitigating to me.
The low risk mentioned in comment 22 makes uplift appealing to me. If Paul agrees about the low risk, I'd say it's worth taking.
Flags: needinfo?(jib)
Comment 25•7 years ago
|
||
I've looked over this patch as Module Owner of webrtc. It's a very simple, low-risk patch.
The testcase above (doing any live recording) is seriously impacted (and that's the primary use of MediaRecorder, though in many cases if you're recording video (not just audio) a self-image display will be given, and that (even if muted) should bypass this issue. That would not be the case in audio-only recordings - typically they won't be attached to an element and playing while recording.
It does apparently also help some other cases of reported distortions/delay. Andreas: what other real-world usecases are fixed by this? I.e. please expand on comment 18 with specifics (beyond "seems to be").
Flags: needinfo?(apehrson)
Assignee | ||
Comment 26•7 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #25)
> I've looked over this patch as Module Owner of webrtc. It's a very simple,
> low-risk patch.
>
> The testcase above (doing any live recording) is seriously impacted (and
> that's the primary use of MediaRecorder, though in many cases if you're
> recording video (not just audio) a self-image display will be given, and
> that (even if muted) should bypass this issue. That would not be the case
> in audio-only recordings - typically they won't be attached to an element
> and playing while recording.
MediaRecorder uses a direct listener for audio in 57 so in most cases (like a getUserMedia stream straight to the recorder) it is not affected. It would however be affected with something as simple as `new MediaRecorder(new MediaStream(gUMStream.getTracks()));`.
> It does apparently also help some other cases of reported distortions/delay.
> Andreas: what other real-world usecases are fixed by this? I.e. please
> expand on comment 18 with specifics (beyond "seems to be").
PeerConnections are the ones affected since 53 because in 53 we disabled direct listeners when full duplex is enabled (turned on in 49 though perhaps it was backed out for a couple of versions, but irrelevant).
It was made worse by e10s-multi because of the lower chance of having a background tab in the same process play stream audio.
It was further made worse by MSG-per-document because now the same tab needs to play stream audio for this not to happen (bug 1330360, in 56).
The concrete use case for this in 57 is then sending microphone audio over an RTCPeerConnection without playing anything out. I could see this happen in either
1 A broadcasting mode, bug 1407088 is an example of this
2 When you're the first one to join a video conferencing service backed by a media server (SFU or MCU) so you start publishing audio immediately. However, you don't play audio locally until someone else joins and you start receiving their stream. I haven't tested this case in particular but at this point we have passed a big buffer of null data to the peer connection so I imagine a latency has built up. Perhaps webrtc.org can handle this, perhaps not. Jesup, do you know?
The fiddle in the STR in comment 22 reproduces this issue in RTCPeerConnection by recording remotely: https://jsfiddle.net/pehrsons/0cLfdk4u/
Flags: needinfo?(apehrson) → needinfo?(rjesup)
Comment on attachment 8915939 [details]
Bug 1406027 - Always use the mixer so AudioCallbackDriver can track processed samples.
Thanks Jan-Ivar, Andreas, Paul for the additional info. I agree that audio distortions, if widespread, is a severe problem and warrants an uplift, Beat57+
Attachment #8915939 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 29•7 years ago
|
||
bugherder uplift |
Updated•7 years ago
|
Flags: qe-verify+
Comment 31•7 years ago
|
||
I managed to reproduce the bug using the steps from comment 22 on an older version of Nightly (2017-10-05) on macOS 10.13 and Ubuntu 16.04 x64.
I retested everything using latest Nightly 58 and beta 57.0b12 on the same platforms and the bug is not reproducing anymore.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Updated•4 years ago
|
Flags: needinfo?(rjesup)
You need to log in
before you can comment on or make changes to this bug.
Description
•