Closed Bug 761274 Opened 12 years ago Closed 12 years ago

Linux <audio> broken, corrupted (and <video> b/c of audio) regression

Categories

(Core :: Audio/Video, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla17
Tracking Status
firefox15 + verified
firefox16 + verified

People

(Reporter: shawnlandden, Assigned: kinetik)

References

()

Details

(Keywords: regression)

Attachments

(6 files, 1 obsolete file)

video goes fast/slow, sound corrupted. Images in video good. pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=64187d60fae7&tochange=36e938e51481 likely culprit: Bug 756944 - Import ALSA cubeb backend. r=doublec https://hg.mozilla.org/mozilla-central/rev/4fd1d6d84b07
Component: Untriaged → Video/Audio
Product: Firefox → Core
Blocks: 756944
Depends on: video
Keywords: regression
Can you please attach the output from alsa-info.sh (found here: http://git.alsa-project.org/?p=alsa-driver.git;a=blob_plain;f=utils/alsa-info.sh) to the bug? Can you confirm that creating and setting the boolean pref "media.use_cubeb" to false works around the problem you're seeing?
yes, i set that pref to false, and it fixed the problem. If I then set the pref to true, and reset the audio context (stop, then start--pause/play only doesn't do it) It goes back to the corrupted state
Attached file alsa-info.sh output (deleted) —
Thanks, I'll see if I can reproduce this in a VM running the same software.
In a VirtualBox VM, I can't reproduce this with Ubuntu 12.04 LTS x86. Installing the x64_64 version to test against now.
No luck with a fresh or updated x86_64 install, either.
You could try kvm's emulated intel ac97. Its not HDA audio, which which would be better (closer) however.
Cant confirm any problems with html5 audio/video on Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/16.0 Firefox/16.0a1 with either cubeb setting My alsa-info output http://www.alsa-project.org/db/?f=6b970b27fbd72c1ede6e98aaf4e7d7b1883e088d
I built this: https://github.com/kinetiknz/cubeb/ against both alsa and pulse backends. And test_tone worked in both cases. However ever with a recompile with system compiler and libraries, Firefox audio is broken if i turn cubeb audio on.
Thanks, that's useful to know. The major difference between test_tone and Firefox is the latency requested when creating the stream. Would you mind trying the same buffering value in Firefox? Create the integer pref "media.cubeb_latency_ms" and try setting it to 250.
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to scientes from comment #7) > You could try kvm's emulated intel ac97. Its not HDA audio, which which > would be better (closer) however. I tried this (and the ich6 device, which is HDA) with KVM and a Fedora 17 host, but was unable to reproduce--same result as with VirtualBox and a Windows 7 host. I checked the VirtualBox config and found it was using ac97; I tried it again with the virtualized HDA device but couldn't get any sound out of the VM at all.
> Create the integer pref "media.cubeb_latency_ms" and try setting it to 250. I tried setting this at 250, and between 10 and 1000, and no value made it work, and it is an integer type.
Thanks for testing that. Would you mind running PulseAudio in verbose mode, reproducing the playback problem in Firefox, and attaching the resulting PA output? Instructions here: https://wiki.ubuntu.com/PulseAudio/Log
I can't reproduce the problem after manually launching pulseaudio even if started just as "pulseaudio" from bash ....wierd
That is odd. When you can reproduce the problem, can you confirm that Firefox is still using PulseAudio? Open the PA volume control and check the applications tab for "ALSA plug-in [firefox]".
> When you can reproduce the problem, can you confirm that Firefox is still using PulseAudio? Yes. > Open the PA volume control and check the applications tab for "ALSA plug-in [firefox]". Yes, this is present.
I just reproduced it and it was not showing up in the pavucontrol, but still made the corrupting sounds. I turned cubeb off, and then it worked, when i turned cubeb back on, it still worked after a reload of the page, disconnecting from pulseaudio and then reconnecting.
Adding qawanted to see if we can reproduce. If not, we'll untrack for release.
I'm feeling its some sort of race condition at this point, even though initially it was 100%--could be related to the pulseaudio daemon running at nice -11 by default.
In my case pulseaudio daemon running at nice level -6, but I have same issue.
runetmember@gmail.com, would you mind posting the output of alsa-info.sh (see comment 1 for a link)?
Attached file Another alsa-info.sh report (deleted) —
Sure, attached. Only E-Mu 0204 is in use.
I'm having the problem no with youtube html5. when i hit play, it goes in and out of the "applications" sound prefs, like its getting kicked off the bus for some reason If i set media.cubeb_latency_ms to 115(ms) it works if I then set it to 114(ms) it goes corrupted (above symptoms) again. Not sure if I was testing with youtube before, but if I was, youtube resets to flash if you do a force-reset, (ctrl-r) but if you turn off flash, while it resets to flash, it will fallback to html5. That might explain the wierd behavior reported above. I then reproduced this with the url in the bug report (<audio> tag)
Thanks, that is useful info. Are you testing with a very recent nightly? The latency pref turned out to be broken until I fixed bug 765524 a few days ago (the result being that if the pref was present, it would force a latency of 20ms). I was going to get you to retest with that fix, and also with an instrumented build, but hadn't had time to prepare it yet. The disconnect/reconnects are probably caused by the underrun handling--when we call snd_pcm_recover it causes the ALSA PulseAudio plugin to disconnect and reconnect from the PA server. That a higher latency works around the problem adds weight to that theory.
Yes I am on the nightly channel, and am up to date http://hg.mozilla.org/mozilla-central/rev/10e019421e6b I was surprised that there was such a specific cutoff, I expected it to be more range.
There are instrumented builds here: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mgregan@mozilla.com-65a68f56e408/ They log to stderr, and will produce quite a lot of output. If you could run them while reproducing the problem for a short time and attach the resulting log, that would be much appreciated.
Just a friendly reminder: if you are able to download and reproduce this bug with the instrumented build I mentioned in comment 26, that would be greatly appreciated. Thanks in advance.
I reproduced the problem, then set up the latency, and playback was fine, then set latency down, and playback broke again. Lots of "underrun occurred" as was suspected stdout: Suppressed timeupdate during seeking: currentTime=0.000000, new time=0.000000 Suppressed timeupdate during seeking: currentTime=0.000000, new time=0.000000 nsBuiltinDecoder::SeekingStopped, next state=2 Suppressed timeupdate during seeking: currentTime=0.000000, new time=0.000000 nsBuiltinDecoder::SeekingStopped, next state=2 nsBuiltinDecoder::SeekingStopped, next state=2 Suppressed timeupdate during seeking: currentTime=0.000000, new time=0.000000 nsBuiltinDecoder::SeekingStopped, next state=2 nsBuiltinDecoder::SeekingStopped, next state=2 Suppressed timeupdate during seeking: currentTime=0.000000, new time=0.000000 nsBuiltinDecoder::SeekingStopped, next state=2
Thanks for the log! The buffer and period sizes reported during setup look correct and match what I see locally. Buffer size is 100ms as requested, with a period size of just under 25ms (4 periods per buffer). After the first wakeup where an entire buffer worth of data is requested, a second wakeup occurs immediately and is already reporting an underrun via the result of snd_pcm_avail_update. I'd expect to see the second wakeup after ~25ms, requesting an additional period (25ms) of data. It seems like something's going wrong at the alsa-pulse or PA level, but I'm not sure what yet.
Bug 770815 has a PA log with the bug reproduced, which gives me another clue to work on. See bug 770815 comment 9.
I can reproduce this locally now, by using a hacked version of the ALSA PulseAudio plugin to force the PA stream to have buffer sizes different to what we've requested via ALSA. I modified this chunk http://gitorious.org/alsa/alsa-plugins/blobs/50562a9d5d54270ab67f183013204d24cdbeff21/pulse/pcm_pulse.c#line801 so that every reference to io->buffer_size and io->period_size is treated as 2x the actual value. This results in the PA stream being configured with twice the latency we've requested, and approximates the same failure we're seeing here when PA increases the latency on our behalf. So far I've confirmed that the buffer and period sizes reported back to us via ALSA after stream setup remain the initial computed values e.g. if we request 100ms latency, the buffer size will be 4410 frames and is reported as that size once the stream is configured (that is, after PA setup has resulted in a different size being selected). When the stream is started, snd_pcm_avail_update requests no more than 4410 frames initially, and reports zero after writing the requested number of frames. Calling snd_pcm_avail_update again immediately after a write reports zero, but sleeping on the PCM's waitable via poll results in an immediate wakeup requesting more data--except now snd_pcm_avail_update returns -EPIPE (signalling an underrun), so we enter xrun processing. In short, ALSA ends up in a situation where the higher level API thinks the buffer/period sizes are what it's reporting to us, but the ALSA PulseAudio plugin is working with a different (larger) buffer. We don't hit this bug with the old sydneyaudio backend for two major reasons: the first is that we used to use a 500ms latency, and the second is that the ALSA refill is driven by blocking writes rather than using poll to wait for space. The former means that for the cases we've seen so far, PA would not have upsized the latency request behind ALSA's back. Even if it had, because of the latter case, the buffer refill seems to be serviced correctly by the blocking-write model that sydneyaudio uses (even with my hacked plugin they old SA backend works). I'll take this up on the ALSA/PA devel lists. I suspect we're not going to be able to work around this except by bumping the default latency requested up high enough to cover the problem configurations.
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
Note to self (and anyone trying to reproduce Linux audio issues in a VM): while looking through the provided PA log, I noticed that PA detects if it's running in a VM (pa_running_in_vm()) and disables tsched mode if so. tsched mode has much stricter requirements on timing and drivers, so it's where we're much more likely to encounter bugs and other behavioural issues. Also, I can reproduce this without a hacked ALSA PA plugin. pacmd list-sinks reports a latency range of 46-2000ms for my active sink (listed under the configured latency field). Setting media.cubeb_latency_ms to 45ms works, and 44ms fails per the bug description.
Keywords: qawanted
(In reply to Matthew Gregan [:kinetik] from comment #33) > Also, I can reproduce this without a hacked ALSA PA plugin. pacmd > list-sinks reports a latency range of 46-2000ms for my active sink (listed > under the configured latency field). Setting media.cubeb_latency_ms to 45ms > works, and 44ms fails per the bug description. This is somehow connected to the PA server increasing the minimum latency over time. When I restart the PA server (either with pulseaudio -k and auto respawn or by running it in non-daemon debug mode), pacmd list-sinks reports the latency as 0.5-2000ms and I can run the cubeb stream with any latency. In the case reported in bug 770815, the server is running with tsched disabled, so it's slightly different--it hasn't increased latency over time and just happens to run with a default latency higher than we're requesting.
Attached patch patch v0 (deleted) — Splinter Review
I haven't found a nice way to detect and work around this, and even if the ALSA PA plugin was fixed now we'd need to work around this for already deployed versions, so here's an ugly hack to do so now. Detect the ALSA PA plugin using the same logic we're already using in sydneyaudio, and round the latency up to 200ms if less is requested. Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=779af852bad4
Attachment #642725 - Flags: review?(chris.double)
scientes, runetmember, marb, and anyone else on this bug who was having problems, would you mind testing the builds here (http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mgregan@mozilla.com-779af852bad4/( and updating the bug to let me know if this work around is sufficient.
Yes, it is working for me.
Comment on attachment 642725 [details] [diff] [review] patch v0 Review of attachment 642725 [details] [diff] [review]: ----------------------------------------------------------------- Sorry about the delay, I missed this in my queue! ::: content/media/nsAudioStream.cpp @@ +884,2 @@ > { > return static_cast<nsBufferedAudioStream*>(aThis)->StateCallback(aState); Return not needed now that StateCallback's return type is void.
Attachment #642725 - Flags: review?(chris.double) → review+
Comment on attachment 642725 [details] [diff] [review] patch v0 Per discussion on the mailing list, a better workaround is to tell the PA ALSA plugin to ignore underruns. Patch coming up.
Attachment #642725 - Attachment is obsolete: true
Attached patch patch v1 (obsolete) (deleted) — Splinter Review
This contains a different workaround in cubeb_alsa.c The rest of the patch is the same as the last revision, but with the review comment addressed. Try push: https://tbpl.mozilla.org/?tree=Try&rev=e2a1b014a263
Attachment #647067 - Flags: review?(chris.double)
Attachment #647067 - Flags: review?(chris.double) → review+
Apologies for a second round of testing, but scientes, runetmember, marb, and anyone else on this bug who was having problems, would you mind testing the builds here https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mgregan@mozilla.com-e2a1b014a263/ and letting me know if this fix works for you?
Issue is still present for me with this build.
If we'd like to get a fix into FF15 still, we should be landing a final solution on mozilla-central as soon as possible. FF15 beta 3 (of 6) is going to build tomorrow.
(In reply to runetmember from comment #44) > Issue is still present for me with this build. Thanks for letting me know. It did work for me, but perhaps that workaround only helps within a small range of latency difference (I tested a PA server running with a minimum latency of 56ms while playing a stream configured for 30ms). Did the first attempted fix work for you? If so, I'll go ahead and land that one for now.
(In reply to Matthew Gregan [:kinetik] from comment #46) > Thanks for letting me know. It did work for me, but perhaps that workaround > only helps within a small range of latency difference (I tested a PA server > running with a minimum latency of 56ms while playing a stream configured for > 30ms). Also, would you mind attaching the output of "pacmd list-sinks"? That'll show me the minimum latency your PA is running with.
Comment on attachment 642725 [details] [diff] [review] patch v0 Per comment 45, for the sake of expediency I'll unobsolete the original fix and land it. It's simple and safe, and is closer to the behaviour we used to have. We're using the same PA detection code in the old audio backend already, and the other only change is to force the requested latency of the stream upwards.
Attachment #642725 - Attachment is obsolete: false
Comment on attachment 647067 [details] [diff] [review] patch v1 And marking this one obsolete since it's an incomplete fix.
Attachment #647067 - Attachment is obsolete: true
Comment on attachment 642725 [details] [diff] [review] patch v0 [Approval Request Comment] Bug caused by (feature/regressing bug #): 623444 User impact if declined: broken HTML 5 playback on Linux with particular configurations Testing completed (on m-c, etc.): locally Risk to taking this patch (and alternatives if risky): very low, uses quirk detection already in use in the old audio backend, plus increases audio stream latency to a value closer to the old backend String or UUID changes made by this patch: none
Attachment #642725 - Flags: approval-mozilla-beta?
Attachment #642725 - Flags: approval-mozilla-aurora?
Attached file latency test (deleted) —
(In reply to runetmember from comment #44) > Issue is still present for me with this build. If you could compile and run the attached test and attach the resulting output, that'd be very useful for working out why the second fix didn't work for you. Compile it with: % cc $(pkg-config alsa --cflags --libs) latency.c And then just run it: % ./a.out default node type is pulse 0.00: latency = 1, buffer_size = 96, period_size = 32 2.20: avail = 96, wrote = 96 [...] If it prints "xrun, giving up", try passing a larger latency via the command line, e.g. $ ./a.out 100 (also try 50, 200, 500) Thanks in advance!
I guess it's not so important now, but for me patch v1 was working.
(In reply to marb from comment #53) > I guess it's not so important now, but for me patch v1 was working. That's still useful to know, thank you. It's a better fix, and I plan to revisit that solution when I get some more time; hopefully I can find out why it doesn't work for runetmember.
Attached file pacmd list-sinks output (deleted) —
> Did the first attempted fix work for you? Link in Comment 38 is give 404 now. I guess I will able to check patch v0 in 15 beta 3 soon, right? > If you could compile and run the attached test and attach the resulting output, that'd be very useful for working out why the second fix didn't work for you. I would like to, but I'm not programmer. If you upload binary, I will check it. > Also, would you mind attaching the output of "pacmd list-sinks"? Attached.
(In reply to runetmember from comment #55) > > Did the first attempted fix work for you? > Link in Comment 38 is give 404 now. I guess I will able to check patch v0 in > 15 beta 3 soon, right? Sorry, that test build expired. I pushed a new one, it should be available in an hour or so: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mgregan@mozilla.com-adb9fbbd06cf/ I'm fairly confident this will work, but it's good to collect more data to confirm that. > I would like to, but I'm not programmer. If you upload binary, I will check > it. Thanks, while preparing a binary that I knew would work on your system I found what I suspect is the problem, which is that Ubuntu's ALSA config space is quite different to the system I was testing on, so the workaround wasn't being applied. Once I've fixed that, I'll let you know so you can test the new build.
> I pushed a new one, it should be available in an hour or so Ok, I download it, and will check later. This bug is reproducible for me not on every work session. Sometimes (like today) bug disappear after reboot. I will check test build when I again get bug reproducible.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment on attachment 642725 [details] [diff] [review] patch v0 [Triage Comment] Approving for Aurora 16, but holding off till the next beta for approval there. That'll allow for some bake time.
Attachment #642725 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Blocks: 779392
I'll continue working on the "patch v1"/handle_underrun workaround in bug 779392.
Matthew can you confirm if this is fixed on Aurora builds now?
I can confirm this is fixed in nightly builds, after removing workarounds discussed above or using a new profile. (original reporter)
(In reply to Lukas Blakk [:lsblakk] from comment #62) > Matthew can you confirm if this is fixed on Aurora builds now? Tested locally and the fix works as expected. How I tested: 1. Run the following command line to discover the PA server's minimum latency: % pacmd list-sinks | awk '/index:/ {p=$1=="*"}; p==1 && /latency/ {print $0}' current latency: 0.00 ms configured latency: 0.00 ms; range is 56.00 .. 2000.00 ms 2. Note that the current minimum is 56ms (this value depends on the server's state) 3. Set the hidden integer pref media.cubeb_latency_ms to a value less than 56ms (I used 20ms) (this isn't required if the minimum is higher than 100ms, our default) 4. Opened a test WebM file and confirmed that it plays back as expected with audio 5. Reset the pref to the default value
Comment on attachment 642725 [details] [diff] [review] patch v0 We'll take this on Beta then, please land today so this gets into Beta 4.
Attachment #642725 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Verified on: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0 (20120808131812)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: