Open Bug 1532193 Opened 6 years ago Updated 2 years ago

YouTube playback plays audio but shows as buffering

Categories

(Core :: Audio/Video: Playback, defect, P2)

ARM64
Windows 10
defect

Tracking

()

Tracking Status
firefox66 --- ?
firefox67 --- affected

People

(Reporter: ajones, Unassigned)

References

(Blocks 1 open bug)

Details

Occasionally when playing a YouTube video (often one that hasn't been watched recently) it will start playing the audio but show the buffering spinner. The video will eventually start playing but will have the wrong A/V sync.

Priority: -- → P2

Hi Anthony, did you find this on Nightly67 or Beta66?

Hi Brindusa, Tania, please drop a comment here if you have noticed this issue during the ARM64 testing as well. Thanks!

Flags: needinfo?(tmaity)
Flags: needinfo?(brindusa.tot)
Flags: needinfo?(ajones)

Just talked to Anthony, he was on Nightly 67.

Flags: needinfo?(ajones)

I have attempted to reproduce this issue on an aarch64 build of Nightly v67.0a1 from 2019-03-05 (on Lenovo Yoga, ARM/Qualcomm processor laptop) by heavy stressing the browser with many youtube videos playing at once or one-by-one, but no audio/video desynchronization could be observed after many attempts.
Furthermore, I thought that it might have been observed on the 32bit non-aarch64 build, where the videos would load a lot slower, but a desynchronization still could not be reproduced.

Anthony, can you give me more details on how you succeed in reproducing the audio/video desynchronization?

Flags: needinfo?(brindusa.tot) → needinfo?(ajones)
Flags: needinfo?(tmaity)
Assignee: nobody → cpearce

Steps to reproduce:

Expected results:

Each video starts flawlessly

Actual results:

The audio plays but only shows the first video frame. The issue is intermittent but usually happens in the first 3 or so videos.

Flags: needinfo?(ajones)

Summary of findings about the bug to date:

  • This issue effects MSE and src=url video; it repro's on ajones' device on YouTube videos, and also https://www.nzonscreen.com/title/shortland-street-first-episode-1992 which is a src=url video.
  • It is intermittent.
  • It appears to not repro at all on some Lenovo C630 devices, but given that it's intermittent, this is hard to be sure about.
  • The logging in VideoSink::RenderVideoFrames() shows that the video frames are faithfully being sync'd to the audio clock.
  • I added logging to AudioSink::PopFrames() to log (mStartTime + mLastGoodPosition).ToMicroseconds() and this shows that mStartTime + mLastGoodPosition diverges from the timestamp of the audio samples we're playing; on my x64 machine, the timestamp of the sample being played is always within a few ms, but on my ARM64 laptop it can end up being 7 seconds diverged.
  • When ajones attaches a USB headset, the bug does not repro. This suggests to me that there's a problem with the onboard sound hardware.
  • Edge does not have the bug.

I think the logging in AudioSink and the different behaviour when using a USB headset shows that we have a compat issue with the audio hardware.

(In reply to Chris Pearce [:cpearce (GMT+13)] from comment #5)

  • I added logging to AudioSink::PopFrames() to log (mStartTime + mLastGoodPosition).ToMicroseconds() and this shows that mStartTime + mLastGoodPosition diverges from the timestamp of the audio samples we're playing; on my x64 machine, the timestamp of the sample being played is always within a few ms, but on my ARM64 laptop it can end up being 7 seconds diverged.

It doesn't seem to wander, rather it is off by a fixed amount that varies from playback to playback. I've seen it being minutes off on some videos, playing the entire audio followed by the entire video.

Matthew is going to look into this.

Assignee: cpearce → kinetik

I cannot use the test page in comment 4 because it only works in New Zealand. NordVPN app is unusable because the TAP-Windows would not install on Lenovo Yoga C630 (probably because of its different architecture, arm64); Therefore, I cannot confirm this unless we have some other VPN app that I could use (free or Mozilla paid).

I will retire from this bug's investigation for now. If I can help with testing, please provide the necessary information and NI me.

@Anthony: It may be important to know which device you are using. Which device reproduces this issue?

Flags: needinfo?(ajones)

(In reply to Bodea Daniel [:danibodea] from comment #8)

I cannot use the test page in comment 4 because it only works in New Zealand. NordVPN app is unusable because the TAP-Windows would not install on Lenovo Yoga C630 (probably because of its different architecture, arm64); Therefore, I cannot confirm this unless we have some other VPN app that I could use (free or Mozilla paid).

I will retire from this bug's investigation for now. If I can help with testing, please provide the necessary information and NI me.

@Anthony: It may be important to know which device you are using. Which device reproduces this issue?

I'm using the Lenovo Yoga C630 WOS. This issue is reproducible on YouTube but the above is the most reliable repro I could find. This site uses src= video and is therefore not related to MSE.

Flags: needinfo?(ajones)

Confirming that this does seem to be a WASAPI issue around the behaviour of IAudioClock. Testing with the old WinMM backend (media.cubeb.backend = "winmm") seems to cause this issue to disappear, but the WinMM audio clock has a different issue as if it's running slowly or with poor granularity. Unsure if the WinMM behaviour is related to the WASAPI issue at this stage.

I had trouble reproducing this as easily as Anthony for quite some time, but now I seem to be able to reproduce it reliably with just two videos: in a fresh run of Nightly starting at https://www.nzonscreen.com/title/shortland-street-1992-f5c/series, first watching "Carmen after the truck crash", then as soon as it starts (and has no delay or broken A/V sync), clicking back and watching "Chris and Rachel's Wedding". No need to clear caches between runs.

libcubeb uses IAudioClock::GetPosition to fetch the current audio stream position to compute the playout latency. Per MSDN, IAudioClock::GetPosition may take "a few milliseconds" before it starts returning a non-zero position when used with a freshly started or reset stream. In this case, on a random stream (unsure of occurrence rate, but every second stream using the steps in the first paragraph of this comment), IAudioClock::GetPosition will continuously return a position of 0 for 6-7 seconds and then start returning a greater-than-zero position at the expected rate of updates but starting from 0 rather than the correct position in the stream, resulting in libcubeb calculating the playout latency as being 6-7 seconds. When this occurs, audio has actually started playing far before this 6-7s delay from IAudioClock::GetPosition resolves itself, so when GetPosition starts returning non-zero values, the position is wrong/offset - resulting in incorrect A/V sync.

Other than taking 6-7 seconds, there's no obvious pattern to the IAudioClock behaviour. All WASAPI calls complete in expected times, i.e. GetPosition nor other WASAPI calls are blocking for extended periods. IAudioClock::GetPosition never returns errors, the return value is always S_OK (also verified S_FALSE was not being returned, indicating a timeout fetching the position). In the non-broken streams, the delay between stream start and the first non-zero GetPosition is around 750ms (surprisingly long), but the positions are correct.

libcubeb previously did not pass a pointer to receive a pu64QPCPosition. Having added that, this value updates as expected (with a small offset from QueryPerformanceCounter) from the first IAudioClock::GetPosition call. As this is effectively a timestamp of when pu64Position was updated it's not directly useful to us.

Extending libcubeb to use IAudioClock2::GetDevicePosition reveals that the underlying device position is updating from stream start onward as expected (updates happen each time the device buffers are refilled via the event mechanism). Curiously, in the broken streams, the value returned by GetDevicePosition jumps backwards (by approximately ~7s in the case I observed) over two calls made within a few milliseconds before continuing to update from the new, lower value at the expected rate. Once the device position is >= the pre-backwards-jump value, IAudioClock::GetPosition seems to start returning non-zero values.

So we're possibly seeing a driver or hardware issue where the underlying audio device clock resets or rolls backwards, possibly then causing the code that guarantees IAudioClock::GetPosition returns monotonically increasing positions to reject position updates until the device position is >= the pre-rollback device position.

Using IAudioClock2::GetDevicePosition directly may be usable as a workaround for this issue - I'll need to investigate more to be certain.

When playing a broken stream, I also noticed that IAudioClock::GetPosition starts returning the correct position after the stream is stopped. In one example, the playout "delay" was 7.2s. The last GetPosition while playing returned 5.59s. Stream stopped (render thread shutdown). Next GetPosition returns 12.72s, jumping forward by nearly 7.2s.

Qualcomm believe this is fixed in a new version of their drivers. I am waiting to receive a copy of the updated driver to verify that.

Assignee: kinetik → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.