Closed Bug 1133190 Opened 10 years ago Closed 10 years ago

Crashes in cubeb_stream_init

Categories

(Core :: Audio/Video, defect)

All
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
firefox36 + fixed
firefox37 + fixed
firefox38 + fixed

People

(Reporter: jimm, Assigned: padenot)

References

Details

(Keywords: crash)

Crash Data

Attachments

(9 files)

#1 top content crasher right now. Appears to be tied to playing video on youtube. https://crash-stats.mozilla.com/report/list?product=Firefox&range_value=3&range_unit=days&date=2015-02-14&signature=RtlpWaitOnCriticalSection+|+audioses.dll%400x3ec0&version=Firefox%3A38.0a1#tab-reports https://crash-stats.mozilla.com/report/index/7bd99c09-4bd8-48e3-9223-5f7f02150212 4 xul.dll `anonymous namespace'::SafeRelease<IAudioClient>(IAudioClient*) media/libcubeb/src/cubeb_wasapi.cpp 5 ntdll.dll RtlDeNormalizeProcessParams 6 xul.dll `anonymous namespace'::owned_critical_section::enter() media/libcubeb/src/cubeb_wasapi.cpp 7 xul.dll `anonymous namespace'::setup_wasapi_stream(cubeb_stream*) media/libcubeb/src/cubeb_wasapi.cpp 8 xul.dll `anonymous namespace'::wasapi_stream_init(cubeb*, cubeb_stream**, char const*, cubeb_stream_params, unsigned int, long (*)(cubeb_stream*, void*, void*, long), void (*)(cubeb_stream*, void*, cubeb_state), void*) media/libcubeb/src/cubeb_wasapi.cpp 9 xul.dll cubeb_stream_init media/libcubeb/src/cubeb.c 10 xul.dll mozilla::AudioStream::OpenCubeb(cubeb_stream_params&, mozilla::AudioStream::LatencyRequest) dom/media/AudioStream.cpp 11 xul.dll mozilla::AudioStream::Init(int, int, mozilla::dom::AudioChannel, mozilla::AudioStream::LatencyRequest) dom/media/AudioStream.cpp 12 xul.dll mozilla::AudioSink::InitializeAudioStream() dom/media/AudioSink.cpp 13 xul.dll mozilla::AudioSink::AudioLoop()
Crash Signature: [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] → [@ RtlpWaitOnCriticalSection | audioses.dll@0x3ec0] [@ RtlpWaitOnCriticalSection | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ RtlpWaitOnCriticalSection | ATL::CComObject<CAudioClient>::`vector deleting destructor…
Summary: crash in RtlpWaitOnCriticalSection | audioses.dll@0x3ec0 → Crashes in cubeb_stream_init
Crash Signature: destructor''(unsigned int)] → destructor''(unsigned int)] [@ RtlpWaitOnCriticalSection | EtwEventEnabled | `anonymous namespace''::owned_critical_section::enter() ]
[Tracking Requested - why for this release]: By far the #1 crash in 36.0b9
Crash Signature: destructor''(unsigned int)] [@ RtlpWaitOnCriticalSection | EtwEventEnabled | `anonymous namespace''::owned_critical_section::enter() ] → destructor''(unsigned int)] [@ RtlpWaitOnCriticalSection | EtwEventEnabled | `anonymous namespace''::owned_critical_section::enter() ] [@ zzz_AsmCodeRange_End | RtlpDeCommitFreeBlock | `anonymous namespace''::owned_critical_section::enter()] [@ zzz_As…
Bug 1133306 has some good data on this: Pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=c939ab0df211&tochange=c4500f0d25cd Suspect: Bug 1127213 and Bug 1125804 Also two sets of STR: (Bug 1133306 comment #0) 1. Start browser with newly created profile (e10s is blocked by IME detect) 2. Open https://www.youtube.com/watch?v=R6KJjPqlPz4&index=3&list=PL75_HhpYGJQ1Fzv9a46FlHfiy-fJusKBZ 3. Click seek bar within buffered range Actual Results: Crash immediately when I click seek bar within buffered (Bug 1133306 comment #3) 1. Start browser with newly created profile (e10s is blocked by IME detect) 2. Open https://www.youtube.com/watch?v=R6KJjPqlPz4&index=3&list=PL75_HhpYGJQ1Fzv9a46FlHfiy-fJusKBZ 3. Click a thumnail in the right-hand play lists(black background area). And repeat Step3 if need
Crash Signature: , cubeb_stat...] [@ RtlpNotOwnerCriticalSection | zzz_AsmCodeRange_End] → , cubeb_stat...] [@ RtlpNotOwnerCriticalSection | zzz_AsmCodeRange_End] [@ RtlpNotOwnerCriticalSection | zzz_AsmCodeRange_Begin ]
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #3) > Suspect: Bug 1127213 and Bug 1125804 Given that, Paul, could you please take a look? This right now is a blocker for shipping 36 or even doing an RC this upcoming week.
Flags: needinfo?(padenot)
I'm trying to repro (method 1) on an inbound debug build from a few days ago, no luck so far. Will try again on beta... and also repro #2 Turned off e10s on inbound - seeking caused a bunch of e10s errors sometimes: ###!!! [Parent][DispatchAsyncMessage] Error: (msgtype=0xFFFB,name=???) Value error: message was deserialized, but contained an illegal value [Parent 14704] ###!!! ASSERTION: Wrong size for this Shmem!: 'Error', file c:\mozilla\inbound2\ipc\glue\Shmem.cpp, line 459 ###!!! [Parent][DispatchAsyncMessage] Error: (msgtype=0xFFFC,name=???) Payload error: message could not be deserialized ###!!! [Parent][DispatchAsyncMessage] Error: (msgtype=0xFFFB,name=???) Value error: message was deserialized, but contained an illegal value
AH, my debug build was built on a base from the 6th or so, so the suspect landings were after that.
With a current m-i debug build, I also cannot get it to crash (e10s off). Note: my video driver is blacklisted for acceleration (Lenovo W520 2011 (INtel+nvidia GPUs) with the original Intel GFX driver)
Tried on Nightly 2/15 and current Beta on Win 8.1 with a DX9 video card (NVidia 7600GT), cannot reproduce there.
And nothing on the Lenovo W520, Win7 using beta either
Alice - can you get more info about your setup and post it here? The Troubleshooting info in particular, plus perhaps what audio system and drivers your windows machine uses. If this is related to video somehow, it might be because my systems have 2d acceleration disabled by default.
Flags: needinfo?(alice0775)
bp-5be5a987-0851-4c78-bf05-9c1d82150215 with HWA off and non-e10s. In this case, you might need step4. And Beta36.0b9 did not crash in my test. Aurora37.0 and Nightly38.0 crashes in my test. Steps 1. Start browser with newly created profile 2. Open https://www.youtube.com/watch?v=R6KJjPqlPz4&index=3&list=PL75_HhpYGJQ1Fzv9a46FlHfiy-fJusKBZ 3. Click seek bar within buffered range 4. And rapidly repeat step3 (should not pause the video and should not mute while testing)
Flags: needinfo?(alice0775)
(In reply to Alice0775 White from comment #11) > bp-5be5a987-0851-4c78-bf05-9c1d82150215 with HWA off and non-e10s. Err bp-b3f39f2c-2b31-4ba5-b0df-448552150215 with HWA off and non-e10s.
Attached file about_support.txt (deleted) —
(In reply to Randell Jesup [:jesup] from comment #10) > Alice - can you get more info about your setup and post it here? The > Troubleshooting info in particular, plus perhaps what audio system and > drivers your windows machine uses. About:support: attached. Realtek High Definition Audio 6.0.1.5864 Stereo Speaker > > If this is related to video somehow, it might be because my systems have 2d > acceleration disabled by default. I dint think so, because it crashes with disabled HWA.
This is connected to audio, nor video. cubeb is AFAIK an audio thing.
(In reply to Alice0775 White from comment #12) > (In reply to Alice0775 White from comment #11) > > bp-5be5a987-0851-4c78-bf05-9c1d82150215 with HWA off and non-e10s. > > Err > bp-b3f39f2c-2b31-4ba5-b0df-448552150215 with HWA off and non-e10s. I'm afraid that shows almost nothing in the stack It does not repro for me on Aurora either, on either system. One has default W520 audio (using USB headphones, note), the other has a Creative soundcard in it (old Creative SB Audigy 2 ZS)
"By far the #1 crash in 36.0b9" is interesting, given the above. Likely there's some timing and/or driver issue involved. Alice's inability to repro on b9 when she can repro on nightly/aurora would also point to timing
So, it is a super-common crasher on b9. One example: https://crash-stats.mozilla.com/report/index/5571a4d4-ba75-42ca-8a6e-73d7d2150213 Note the crash address, freed memory likely. Again implicating timing and locks (and this is likely the critical section (stream_reset_lock) itself that's freed somehow, or not initialized. The stm (cubeb_stream) struct is allocated with calloc(), so it shouldn't be garbage to start. Alice (or kairo): any chance of getting a MSVC backtrace/locals or dump so we can look at higher levels? ALso: can you try to repro with a debug build? There are lock assertions which might be firing in debug before it crashes. Thanks!!
Flags: needinfo?(alice0775)
Attached file log.txt, crash on Nightly (deleted) —
Flags: needinfo?(alice0775)
Attached file log2.txt, crash on Aurora (deleted) —
Crashing is slightly difficult on the Debug build of Nightly and Aurora.
Attachment #8564779 - Attachment description: log.txt → log.txt, crash on Nightly
Attachment #8564780 - Attachment description: log2.txt → log2.txt, crash on Aurora
Attached file log3.txt, crash on Nightly (deleted) —
Since this only just came up in the recent beta, it must be a regression from bug 1131768. Jesup, does this diff help you debug? https://hg.mozilla.org/releases/mozilla-beta/rev/69a8d311ddd6 Another option is to just roll back the entire series of audio-device changes given that MSE isn't happening in 36.
Blocks: 1131768
Crash Signature: , cubeb_stat...] [@ RtlpNotOwnerCriticalSection | zzz_AsmCodeRange_End] [@ RtlpNotOwnerCriticalSection | zzz_AsmCodeRange_Begin ] → , cubeb_stat...] [@ RtlpNotOwnerCriticalSection | zzz_AsmCodeRange_End] [@ RtlpNotOwnerCriticalSection | zzz_AsmCodeRange_Begin ] [@ RtlRaiseStatus | RtlRaiseStatus | devicetopologyp_IID_Lookup ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x348…
(In reply to David Major [:dmajor] (UTC+13) from comment #23) > Another option is to just roll back the entire series of audio-device > changes given that MSE isn't happening in 36. Yes, I think it's time to do this. There have been too many regressions already
Preparing a backout patch for beta and aurora now. If we decided we want it on aurora for MSE, we can backport it again after it has stabilized.
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
Flags: needinfo?(padenot)
Attached patch bug1133190_v0.patch (deleted) — Splinter Review
This patch applies cleanly to beta and aurora, and reverts the following changesets: https://hg.mozilla.org/releases/mozilla-beta/rev/20f7d44346da https://hg.mozilla.org/releases/mozilla-beta/rev/0411d20465b4 https://hg.mozilla.org/releases/mozilla-beta/rev/9579b9ab68ca https://hg.mozilla.org/releases/mozilla-beta/rev/69a8d311ddd6 https://hg.mozilla.org/releases/mozilla-beta/rev/69a8d311ddd6 Approval Request Comment [Feature/regressing bug #]: bug 698079 [User impact if declined]: crashes during media playback, possibly other undiscovered regressions [Describe test coverage new/current, TreeHerder]: n/a [Risks and why]: low, reverting to code that we have been shipping for some time [String/UUID change made/needed]: none
Attachment #8564804 - Flags: approval-mozilla-beta?
Attachment #8564804 - Flags: approval-mozilla-aurora?
Crash Signature: audioses.dll@0x348bf ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x3489f ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x32dcf ] → audioses.dll@0x348bf ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x3489f ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x32dcf ] [@ MD4Transform ] [@ RtlpNotOwnerCriticalSection | MD4Transform ]
From alice's logs: Nightly: [5916] WARNING: AudioStream::OpenCubeb() fe70140 failed to init cubeb: file c:\builds\moz2_slave\m-cen-w32-d-000000000000000000\build\src\dom\media\AudioStream.cpp, line 513 [5916] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file c:\builds\moz2_slave\m-cen-w32-d-000000000000000000\build\src\dom\media\AudioStream.cpp, line 393 Aurora: [724] WARNING: AudioStream::OpenCubeb() 16dda4e0 failed to init cubeb: file c:\builds\moz2_slave\m-aurora-w32-d-000000000000000\build\dom\media\AudioStream.cpp, line 513 [724] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file c:\builds\moz2_slave\m-aurora-w32-d-000000000000000\build\dom\media\AudioStream.cpp, line 393 [724] WARNING: Initializing AudioStream failed.: file c:\builds\moz2_slave\m-aurora-w32-d-000000000000000\build\dom\media\AudioSink.cpp, line 155 Note, this doesn't get us the local vars/etc of a debugger backtrace, but it may point to the event that laid the landmine
Crash Signature: audioses.dll@0x348bf ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x3489f ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x32dcf ] [@ MD4Transform ] [@ RtlpNotOwnerCriticalSection | MD4Transform ] → audioses.dll@0x348bf ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x3489f ] [@ RtlRaiseStatus | RtlRaiseStatus | audioses.dll@0x32dcf ] [@ MD4Transform ] [@ RtlpNotOwnerCriticalSection | MD4Transform ] [@ zzz_AsmCodeRange_Begin | RtlpWaitOnCr…
(In reply to Matthew Gregan [:kinetik] from comment #25) > Preparing a backout patch for beta and aurora now. If we decided we want it > on aurora for MSE, we can backport it again after it has stabilized. We are trying to ship MSE in 37, that is still the plan. And 37 is on Aurora now.
Comment on attachment 8564804 [details] [diff] [review] bug1133190_v0.patch As Kairo said, we want to ship MSE in 37. So, we want to fix the bug itself in 37.
Attachment #8564804 - Flags: approval-mozilla-beta?
Attachment #8564804 - Flags: approval-mozilla-beta+
Attachment #8564804 - Flags: approval-mozilla-aurora?
Attachment #8564804 - Flags: approval-mozilla-aurora-
So, Alice's log show that netwerk is failing at some cache operation, while instantiating nsBinaryStreams, MediaSource is failing at reading some data (possibly related), the MP4 reader is failing at reading data from cache, gfx is failing at allocating texture, and cubeb is failing at allocating a resampler. Overall, it does not look great.
The try push in comment 32 is based on current inbound, and includes various patches that fix various issues with the code that seem to cause the issues, here, but more importantly, enables more logging. Alice, you seem to be able to reproduce easily (I've all day with tried the same driver version you mention on a dell machine and on a lenovo machine, with various audio settings, no luck), would you mind grabbing the builds [0] from the try push and giving us logs ? Also, if possible, could you enable the MediaDecoder:5 NSPR log as well? [0]: https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/paul@paul.cx-f6aab427b0cb/
Flags: needinfo?(alice0775)
Attached file logtrybuild.zip (deleted) —
log set NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,MediaDecoder:5
Flags: needinfo?(alice0775)
Alice, do you confirm attachment 8565135 [details] shows a crash? I would also ideally need the stdout for this process, if possible, as my custom logging is not hooked to the NSPR logging system.
Flags: needinfo?(alice0775)
(In reply to Paul Adenot (:padenot) from comment #35) > Alice, do you confirm attachment 8565135 [details] shows a crash? Yes, the try OPT build crashed. > I would > also ideally need the stdout for this process, if possible, as my custom > logging is not hooked to the NSPR logging system. Is there a way to logging both stdout and NSPR log? The following is not enough? > set NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,MediaDecoder:5 > firefox.exe Which build do you want to logging? the try OPT build or try DEBUG build?
Flags: needinfo?(alice0775) → needinfo?(padenot)
(In reply to Alice0775 White from comment #36) > (In reply to Paul Adenot (:padenot) from comment #35) > > Alice, do you confirm attachment 8565135 [details] shows a crash? > > Yes, the try OPT build crashed. > > > I would > > also ideally need the stdout for this process, if possible, as my custom > > logging is not hooked to the NSPR logging system. > > Is there a way to logging both stdout and NSPR log? > The following is not enough? > > set NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,MediaDecoder:5 > > firefox.exe > > Which build do you want to logging? the try OPT build or try DEBUG build? The opt build. In fact, it should not matter, I enabled assertion and logging on both.
Flags: needinfo?(padenot)
Attached file log.txt (deleted) —
log.txt(Standard Output and Standard Error) crashed with try DEBUG build.
Attached file NSPR_LOG_FILE.zip (deleted) —
NSPR_LOG_FILE with try DEBUG build (logged at the same time as Comment 38)
Assignee: kinetik → padenot
(In reply to Sylvestre Ledru [:sylvestre] from comment #30) > Comment on attachment 8564804 [details] [diff] [review] > bug1133190_v0.patch > > As Kairo said, we want to ship MSE in 37. So, we want to fix the bug itself > in 37. I think it's a mistake to keep these changes on Aurora until they have stabilized on Nightly, but okay. Backout landed on beta: https://hg.mozilla.org/releases/mozilla-beta/rev/82339d98aa6a
Comment on attachment 8564804 [details] [diff] [review] bug1133190_v0.patch (In reply to Matthew Gregan [:kinetik] from comment #40) > I think it's a mistake to keep these changes on Aurora until they have > stabilized on Nightly, but okay. I agree with Matthew. Our goal is to stabilize Aurora. We should backout this patch and reland a proper patch when it is ready. I am reverting the previous Aurora decision (Sylvestre and I have already discussed this) and approving this backout for Aurora. Aurora+
Attachment #8564804 - Flags: approval-mozilla-aurora- → approval-mozilla-aurora+
Crash Signature: RtlpWaitOnCriticalSection | `anonymous namespace''::wasapi_stream_init(cubeb*, cubeb_stream**, char const*, cubeb_stream_params, unsigned int, long (*)(cubeb_stream*, void*, void*, long), void (*)(cubeb_stream*, void*, cubeb_st... ] → RtlpWaitOnCriticalSection | `anonymous namespace''::wasapi_stream_init(cubeb*, cubeb_stream**, char const*, cubeb_stream_params, unsigned int, long (*)(cubeb_stream*, void*, void*, long), void (*)(cubeb_stream*, void*, cubeb_st... ] [@ `anonymous namesp…
Attached patch patch (deleted) — Splinter Review
So, the issue is that when you have some combination of hardware/driver, and try to init too much AudioClients, you get back an 0x88890017 error very often. This is WASAPI telling us the mixer process does not have enough CPU to create another stream. Since we recreate streams when we seek, we end up with a bunch of streams. At the same time, it looks like (looking at the memory footprint of the audio mixer process in the task manager), that the system takes some time before un-registering a stream on the server side (maybe this depends on the driver, I can only see this on one machine, the others are fine), meaning we have a bunch of stream going on at the same time on the server side, and it's the server side that seem to decide whether we can create a new stream or not. Also we do the resampling on the client side, which can be costly (it depends on the resampling ratio, and most of the time, it's 44100 -> 48000 or the opposite, which is not a simple case). All the crashes we've seen in the past few days were caused by the auto_unlock mutex trying to re-lock on a deleted mutex, because we just went into the error handling path, and just wasapi_stream_destroy-ed the stream, and the number of crashes are hopefully explained by the above. I filed bug 1134258 and bug 1134263 as follow ups.
Attachment #8566072 - Flags: review?(kinetik)
Attachment #8566072 - Flags: review?(kinetik) → review+
(In reply to Matthew Gregan [:kinetik] from comment #42) > Backed landed on aurora: > https://hg.mozilla.org/releases/mozilla-aurora/rev/e62dd2ded2fc *Backout* landed on aurora: https://hg.mozilla.org/releases/mozilla-aurora/rev/e62dd2ded2fc
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Depends on: 1134977
No longer depends on: 1134977
Depends on: 1135878
Depends on: 1133381
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: