Closed Bug 620721 Opened 14 years ago Closed 13 years ago

Intermittent test_timeupdate_small_files.html | Test timed out. followed by shutdown timeout

Categories

(Core :: Audio/Video, defect)

Other Branch
All
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla8

People

(Reporter: philor, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

+++ This bug was initially created as a clone of Bug #581025 +++ http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1292951714.1292954402.2593.gz Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/12/21 09:15:14 s: talos-r3-fed64-026 58347 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | r11025_s16_c1.wav - shouldn't get timeupdate after ended --DOMWINDOW == 19 (0x6bd3218) [serial = 1866] [outer = 0x389e940] [url = http://mochi.test:8888/tests/content/media/test/test_standalone.html] --DOMWINDOW == 18 (0x8973e78) [serial = 1867] [outer = (nil)] [url = http://mochi.test:8888/tests/content/media/test/320x240.ogv] --DOMWINDOW == 17 (0x8d3db78) [serial = 1868] [outer = (nil)] [url = http://mochi.test:8888/tests/content/media/test/seek.webm] 58348 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_timeupdate_small_files.html | Test timed out. ... TEST-UNEXPECTED-FAIL | Shutdown | application timed out after 330 seconds with no output
(In reply to comment #18) > philringnalda%gmail.com > http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1294908298.1294911095.4865.gz > Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2011/01/13 00:44:58 Right before the timeout, the log says: 58534 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | 320x240.ogv - shouldn't get timeupdate after ended 58535 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | 320x240.ogv - shouldn't get timeupdate after ended 58536 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | 320x240.ogv - should see at least one timeupdate: 0.26600000262260437 58537 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | r11025_s16_c1.wav - shouldn't get timeupdate after ended 58538 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | r11025_s16_c1.wav - shouldn't get timeupdate after ended NEXT ERROR 58539 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_timeupdate_small_files.html | Test timed out. So it looks like in this case we didn't send the "ended" event in the WAV decoder for r11025_s16_c1.wav.
This is our old enemy, sa_stream_write() not returning from nsAudioStream::Drain(), causing the nsWaveDecoder::Stop() to be unable to Shutdown() the decoder thread on the main thread. (In reply to comment #58) > bill%wg9s.com > http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1295813336.1295816295.25498.gz > Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2011/01/23 12:08:56 Relevant thread stacks from that log: Thread 0 (crashed) 0 linux-gate.so + 0x424 eip = 0x0096f424 esp = 0xbf9779e0 ebp = 0xbf977a38 ebx = 0x0939f3c8 esi = 0x00000000 edi = 0x000000d5 eax = 0xfffffffc ecx = 0x00000080 edx = 0x000000d5 efl = 0x00000202 Found by: given as instruction pointer in context 1 libnspr4.so!PR_Wait [ptsynch.c : 614 + 0x14] eip = 0x00316b03 esp = 0xbf977a40 ebp = 0xbf977a68 Found by: previous frame's frame pointer 2 libxul.so!nsAutoMonitor::Wait [nsAutoLock.h : 346 + 0x14] eip = 0x010d4357 esp = 0xbf977a70 ebp = 0xbf977a88 ebx = 0x034ddb94 Found by: call frame info 3 libxul.so!nsEventQueue::GetEvent [nsEventQueue.cpp : 85 + 0x12] eip = 0x0274354a esp = 0xbf977a90 ebp = 0xbf977ac8 ebx = 0x034ddb94 Found by: call frame info 4 libxul.so!nsThread::nsChainedEventQueue::GetEvent [nsThread.h : 114 + 0x1b] eip = 0x0274783a esp = 0xbf977ad0 ebp = 0xbf977ae8 ebx = 0x034ddb94 Found by: call frame info 5 libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 616 + 0x5e] eip = 0x02745380 esp = 0xbf977af0 ebp = 0xbf977b88 ebx = 0x034ddb94 Found by: call frame info 6 libxul.so!NS_ProcessNextEvent_P [nsThreadUtils.cpp : 250 + 0x1f] eip = 0x026cc95c esp = 0xbf977b90 ebp = 0xbf977bc8 ebx = 0x034ddb94 esi = 0x0ce77b10 Found by: call frame info 7 libxul.so!nsThread::Shutdown [nsThread.cpp : 492 + 0x12] eip = 0x0274652c esp = 0xbf977bd0 ebp = 0xbf977c58 ebx = 0x034ddb94 esi = 0x0ce77b10 Found by: call frame info 8 libxul.so!nsWaveDecoder::Stop [nsWaveDecoder.cpp : 1388 + 0x1b] eip = 0x01be83a5 esp = 0xbf977c60 ebp = 0xbf977c68 ebx = 0x034ddb94 esi = 0x093a8ffc Found by: call frame info 9 libxul.so!nsRunnableMethodImpl<void (nsWaveDecoder::*)(), true>::Run [nsThreadUtils.h : 345 + 0x4e] eip = 0x01bea5a9 esp = 0xbf977c70 ebp = 0xbf977c78 ebx = 0x034ddb94 esi = 0x093a8ffc Found by: call frame info 10 libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 633 + 0x18] eip = 0x02745402 esp = 0xbf977c80 ebp = 0xbf977d18 ebx = 0x034ddb94 esi = 0x093a8ffc Found by: call frame info [...] Thread 19 0 linux-gate.so + 0x424 eip = 0x0096f424 esp = 0x9ff18d14 ebp = 0x9ff18d38 ebx = 0x9ff18d60 esi = 0x00000000 edi = 0x00c58ff4 eax = 0xfffffdfc ecx = 0x00000001 edx = 0xffffffff efl = 0x00000293 Found by: given as instruction pointer in context 1 libasound.so.2.0.0 + 0x4b835 eip = 0x055cc836 esp = 0x9ff18d40 ebp = 0x9ff18db8 Found by: previous frame's frame pointer 2 libasound.so.2.0.0 + 0x4ba0a eip = 0x055cca0b esp = 0x9ff18dc0 ebp = 0x9ff18dd8 Found by: previous frame's frame pointer 3 libasound.so.2.0.0 + 0x4bb4a eip = 0x055ccb4b esp = 0x9ff18de0 ebp = 0x9ff18e28 Found by: previous frame's frame pointer 4 libasound.so.2.0.0 + 0x901ea eip = 0x056111eb esp = 0x9ff18e30 ebp = 0x9ff18ea8 Found by: previous frame's frame pointer 5 libasound.so.2.0.0 + 0x4611b eip = 0x055c711c esp = 0x9ff18eb0 ebp = 0x9ff18ee8 Found by: previous frame's frame pointer 6 libxul.so!sa_stream_write [sydney_audio_alsa.c : 260 + 0x1a] eip = 0x01beb11d esp = 0x9ff18ef0 ebp = 0x9ff18f28 Found by: previous frame's frame pointer 7 libxul.so!nsAudioStreamLocal::Drain [nsAudioStream.cpp : 520 + 0x36] eip = 0x01b91c7e esp = 0x9ff18f30 ebp = 0x9ff18f58 ebx = 0x034ddb94 esi = 0x00002b10 edi = 0x01b91d4e Found by: call frame info 8 libxul.so!nsWaveStateMachine::Run [nsWaveDecoder.cpp : 791 + 0x1b] eip = 0x01be9b9c esp = 0x9ff18f60 ebp = 0x9ff19148 ebx = 0x034ddb94 esi = 0x0e573a00 edi = 0x01b91d4e Found by: call frame info 9 libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 633 + 0x18] eip = 0x02745402 esp = 0x9ff19150 ebp = 0x9ff191e8 ebx = 0x034ddb94 esi = 0x00000000 edi = 0x003d0f00 Found by: call frame info
(In reply to comment #190) > philor > http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1302226650.1302229743.31455.gz > Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2011/04/07 18:37:30 > > s: talos-r3-fed-023 Stack here is slightly different from that in comment 59 in that it's not a Drain call resulting in the block, it's a normal audio write: Thread 21 0 linux-gate.so + 0x424 eip = 0x00d9b424 esp = 0xa13fbcd4 ebp = 0xa13fbcf8 ebx = 0xa13fbd20 esi = 0x00000000 edi = 0x00c58ff4 eax = 0xfffffdfc ecx = 0x00000001 edx = 0xffffffff efl = 0x00000293 Found by: given as instruction pointer in context 1 libasound.so.2.0.0 + 0x4b835 eip = 0x055cc836 esp = 0xa13fbd00 ebp = 0xa13fbd78 Found by: previous frame's frame pointer 2 libasound.so.2.0.0 + 0x4ba0a eip = 0x055cca0b esp = 0xa13fbd80 ebp = 0xa13fbd98 Found by: previous frame's frame pointer 3 libasound.so.2.0.0 + 0x4bb4a eip = 0x055ccb4b esp = 0xa13fbda0 ebp = 0xa13fbde8 Found by: previous frame's frame pointer 4 libasound.so.2.0.0 + 0x901ea eip = 0x056111eb esp = 0xa13fbdf0 ebp = 0xa13fbe68 Found by: previous frame's frame pointer 5 libasound.so.2.0.0 + 0x4611b eip = 0x055c711c esp = 0xa13fbe70 ebp = 0xa13fbea8 Found by: previous frame's frame pointer 6 libxul.so!sa_stream_write [sydney_audio_alsa.c : 260 + 0x1a] eip = 0x01df02dd esp = 0xa13fbeb0 ebp = 0xa13fbee8 Found by: previous frame's frame pointer 7 libxul.so!nsAudioStreamLocal::Write [nsAudioStream.cpp : 471 + 0x28] eip = 0x01d993a5 esp = 0xa13fbef0 ebp = 0xa13fbf88 ebx = 0x035b755c esi = 0x00001000 edi = 0x0e371f38 Found by: call frame info 8 libxul.so!nsBuiltinDecoderStateMachine::PlayFromAudioQueue [nsBuiltinDecoderStateMachine.cpp : 645 + 0x5a] eip = 0x01d927da esp = 0xa13fbf90 ebp = 0xa13fc008 ebx = 0x035b755c esi = 0x00000800 edi = 0x0e371f38 Found by: call frame info 9 libxul.so!nsBuiltinDecoderStateMachine::AudioLoop [nsBuiltinDecoderStateMachine.cpp : 510 + 0x29] Thread 0 (crashed) 0 linux-gate.so + 0x424 eip = 0x00d9b424 esp = 0xbfc36ac0 ebp = 0xbfc36b18 ebx = 0x09723fd8 esi = 0x00000000 edi = 0x000000eb eax = 0xfffffffc ecx = 0x00000080 edx = 0x000000eb efl = 0x00000206 Found by: given as instruction pointer in context 1 libnspr4.so!PR_Wait [ptsynch.c : 614 + 0x14] eip = 0x00859b6b esp = 0xbfc36b20 ebp = 0xbfc36b48 Found by: previous frame's frame pointer 2 libxul.so!mozilla::Monitor::Wait [BlockingResourceBase.cpp : 346 + 0x14] eip = 0x02780f46 esp = 0xbfc36b50 ebp = 0xbfc36b88 ebx = 0x035b755c Found by: call frame info 3 libxul.so!mozilla::MonitorAutoEnter::Wait [Monitor.h : 228 + 0x13] eip = 0x011e8894 esp = 0xbfc36b90 ebp = 0xbfc36b98 ebx = 0x035b755c Found by: call frame info 4 libxul.so!nsEventQueue::GetEvent [nsEventQueue.cpp : 83 + 0x12] eip = 0x027f47a3 esp = 0xbfc36ba0 ebp = 0xbfc36bc8 ebx = 0x035b755c Found by: call frame info 5 libxul.so!nsThread::nsChainedEventQueue::GetEvent [nsThread.h : 109 + 0x1b] eip = 0x027f891e esp = 0xbfc36bd0 ebp = 0xbfc36be8 ebx = 0x035b755c Found by: call frame info 6 libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 601 + 0x5e] eip = 0x027f6636 esp = 0xbfc36bf0 ebp = 0xbfc36c88 ebx = 0x035b755c Found by: call frame info 7 libxul.so!NS_ProcessNextEvent_P [nsThreadUtils.cpp : 250 + 0x1f] eip = 0x0277f460 esp = 0xbfc36c90 ebp = 0xbfc36cc8 ebx = 0x035b755c esi = 0x0ad2cea0 Found by: call frame info 8 libxul.so!nsThread::Shutdown [nsThread.cpp : 481 + 0x12] eip = 0x027f7638 esp = 0xbfc36cd0 ebp = 0xbfc36d38 ebx = 0x035b755c esi = 0x0ad2cea0 Found by: call frame info 9 libxul.so!nsBuiltinDecoder::Stop [nsBuiltinDecoder.cpp : 133 + 0x1d] Thread 19 0 linux-gate.so + 0x424 eip = 0x00d9b424 esp = 0x9363dc54 ebp = 0x9363dc88 ebx = 0x1097c7a4 esi = 0x00000000 edi = 0x0000000b eax = 0xfffffe00 ecx = 0x00000080 edx = 0x00000002 efl = 0x00000202 Found by: given as instruction pointer in context 1 libnspr4.so!PR_Lock [ptsynch.c : 206 + 0xa] eip = 0x00858aae esp = 0x9363dc90 ebp = 0x9363dcb8 Found by: previous frame's frame pointer 2 libnspr4.so!PR_EnterMonitor [ptsynch.c : 561 + 0xd] eip = 0x008598b2 esp = 0x9363dcc0 ebp = 0x9363dce8 ebx = 0x0086e460 Found by: call frame info 3 libxul.so!mozilla::Monitor::Enter [BlockingResourceBase.cpp : 315 + 0xd] eip = 0x02781c8c esp = 0x9363dcf0 ebp = 0x9363dd28 ebx = 0x035b755c Found by: call frame info 4 libxul.so!mozilla::MonitorAutoEnter::MonitorAutoEnter [Monitor.h : 218 + 0xc] eip = 0x011e871e esp = 0x9363dd30 ebp = 0x9363dd48 ebx = 0x035b755c esi = 0x0f3de1a4 Found by: call frame info 5 libxul.so!nsBuiltinDecoderStateMachine::StopPlayback [nsBuiltinDecoderStateMachine.cpp : 693 + 0x14] eip = 0x01d8f8f0 esp = 0x9363dd50 ebp = 0x9363dda8 ebx = 0x035b755c esi = 0x0f3de1a4 Found by: call frame info 6 libxul.so!nsBuiltinDecoderStateMachine::AdvanceFrame [nsBuiltinDecoderStateMachine.cpp : 1518 + 0x12] eip = 0x01d90eed esp = 0x9363ddb0 ebp = 0x9363df28 ebx = 0x035b755c esi = 0x00000001 Found by: call frame info 7 libxul.so!nsBuiltinDecoderStateMachine::Run [nsBuiltinDecoderStateMachine.cpp : 1301 + 0xa] eip = 0x01d922b1 esp = 0x9363df30 ebp = 0x9363e158 ebx = 0x035b755c esi = 0x00000001 edi = 0x003d0f00 Found by: call frame info
A couple of things: 1. We don't use the write workaround from bug 573924 in drain, so it's possible to hit the write>avail hang there. 2. Bug 574924 comment 11 says "if the first write is smaller than avail, it's still possible to hang in snd_pcm_writei", so maybe we're hitting that case here. 3. It's possible that the workaround is needed in more cases than just resuming from a stream cork. I tried to reproduce this on my machine last week, but never hit it over 24 hours of running 5-10 copies of the test in parallel. I'm on Fedora 13 with a more recent PulseAudio that may include a fix for this particular bug. Without being able to reproduce the hang, the best I can do is come up with test patches for at least items 1 and 2 and land them to see if they help.
Logging reveals we hit item 2 with the following files, at least: http://mochi.test:8888/tests/content/media/test/r11025_s16_c1.wav (gdb) p avail $5 = 5513 (gdb) p nframes $6 = 2048 http://mochi.test:8888/tests/content/media/test/small-shot.ogg (gdb) p avail $8 = 5564 (gdb) p nframes $9 = 1024 I saw this in the normal (non-resume) case, but with the right load/timing this could happen after resuming a cork caused by network buffering. Working around this requires writing sufficient silence before the audio data to make an avail-sized write. The audio clock will need to be adjusted by the amount of silence written to retain A/V sync.
(In reply to comment #194) > I tried to reproduce this on my machine last week, but never hit it over 24 > hours of running 5-10 copies of the test in parallel. I'm on Fedora 13 with a > more recent PulseAudio that may include a fix for this particular bug. Without > being able to reproduce the hang, the best I can do is come up with test > patches for at least items 1 and 2 and land them to see if they help. This is the right approach to take here, please land whatever experimental patches you need here, to see if they help. You can also submit the patch to try and take lots of additional test runs to try to guess whether they help or not.
(In reply to comment #194) > 2. Bug 574924 comment 11 says "if the first write is smaller than avail, it's > still possible to hang in snd_pcm_writei", so maybe we're hitting that case > here. I ran 30-40 test runs on tryserver with a patch to fix this, and it seems to help, but bug 651023 is making it difficult to be sure, so I'll fix that first. Unfortunately, this approach breaks test_audiowrite.html and possibly real code using mozWriteAudio, so it needs more work.
OS: Linux → BeOS
Target Milestone: --- → mozilla5
Version: Trunk → Other Branch
(In reply to comment #233) > (In reply to comment #194) > > 2. Bug 574924 comment 11 says "if the first write is smaller than avail, it's > > still possible to hang in snd_pcm_writei", so maybe we're hitting that case > > here. > > I ran 30-40 test runs on tryserver with a patch to fix this, and it seems to > help, but bug 651023 is making it difficult to be sure, so I'll fix that first. > > Unfortunately, this approach breaks test_audiowrite.html and possibly real code > using mozWriteAudio, so it needs more work. If you have a patch here which you suspect will fix this, then I'd say please land it. We can always reopen this bug if it happens again.
Blocks: 648353
Can we please land the patches Matthew was talking about in comment 233? This bug is one of the issues that is blocking bug 648353 from landing...
It's not ready, as I said in comment 233 is breaks test_audiowrite.html and likely other things. I'm working on an improved fix that avoids that problem, so I'll post the patch soon.
(In reply to comment #236) > It's not ready, as I said in comment 233 is breaks test_audiowrite.html and > likely other things. I'm working on an improved fix that avoids that problem, > so I'll post the patch soon. OK, thanks!
Depends on: cubeb
Is bug 623444 going to be the fix here?
Fixed by Bug 675839.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: mozilla5 → mozilla8
No longer depends on: cubeb
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.