Closed Bug 832768 Opened 12 years ago Closed 8 years ago

Intermittent anything in any of test_buffered.html,test_bug465498.html,test_bug495145.html,test_chaining.html,test_progress.html,test_reactivate.html,test_replay_metadata.html

Categories

(Core :: Audio/Video: Playback, defect, P5)

x86
Linux
defect

Tracking

()

RESOLVED INVALID

People

(Reporter: philor, Unassigned)

References

Details

Attachments

(1 file)

https://tbpl.mozilla.org/php/getParsedLog.php?id=18975600&tree=Mozilla-Inbound
Rev3 Fedora 12x64 mozilla-inbound debug test mochitest-1 on 2013-01-20 05:42:40 PST for push 2106f6631cd6
slave: talos-r3-fed64-022

https://tbpl.mozilla.org/php/getParsedLog.php?id=18973324&tree=Mozilla-Inbound
Rev3 Fedora 12 mozilla-inbound debug test mochitest-1 on 2013-01-20 02:13:12 PST for push 88daef90f2ab
slave: talos-r3-fed-080

181681 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_buffered.html | Test timed out.
181745 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_bug465498.html | Test timed out.
181795 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_bug495145.html | Test timed out.
181948 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_chaining.html | Test timed out.
181949 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | 4 test timeouts, giving up.
181950 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | Skipping 211 remaining tests.

https://tbpl.mozilla.org/php/getParsedLog.php?id=18976273&tree=Mozilla-Inbound
Rev3 Fedora 12 mozilla-inbound debug test mochitest-1 on 2013-01-20 06:22:17 PST for push 56efd49e70dd
slave: talos-r3-fed-062

185612 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_playback.html | Test timed out.
185864 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_progress.html | Test timed out.
185881 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_reactivate.html | Test timed out.
185968 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_replay_metadata.html | Test timed out.
185969 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | 4 test timeouts, giving up.
185970 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | Skipping 163 remaining tests.
Meh.
Roc, this has gotten significantly flakier on Linux recently. Added in with the overall state of flakiness of the audio/video mochitests and relatively little progress being made on improving the situation, I'm beginning to think that disabling our way to green is the only practical way forward at this point. Can you please have someone look into the failures in this suite? I'm happy to provide a list of bugs if it would help.
Flags: needinfo?(roc)
+cc Anthony Jones, new media team manager. Unfortunately he's on PTO.

If we were to disable out way to green, which tests would you like disabled?
This would obviously be one (note that bug 864682 also tracks timeouts in this test). Beyond that, test_playback.html, test_bug493187.html, and test_playback_rate.html would make the first cut. After that, I'd have to see where we're at.

Thanks for CCing Anthony. It actually occurred to me right after sending that comment that roc probably wasn't the right guy for the needinfo? request based on the recent reorganizing.
Flags: needinfo?(roc)
I've been actively trying to fix test_bug493187 on Windows, but it's been very slow going. I've also been at a work week this week, and am on PTO next week, so I will be working on it again once I get back from PTO.

Go ahead and disable those tests, but please put a comment in there saying the date on which they were disabled.
Disabled.
https://hg.mozilla.org/integration/mozilla-inbound/rev/a2c911586bf0

And for record, man is content/media/test/Makefile.in a depressing sight. Between the huge number of disabled tests and the fact that tests are listed in nearly random order...
Whiteboard: [test disabled][leave open]
test_playback_rate.html and test_seek.html disabled on all platforms.
https://hg.mozilla.org/integration/mozilla-inbound/rev/81cae114022a
(In reply to TBPL Robot from comment #211)
> Standard8
> https://tbpl.mozilla.org/php/getParsedLog.php?id=41393717&tree=Mozilla-
> Central
> WINNT 6.2 mozilla-central debug test mochitest-1 on 2014-06-09 16:01:37
> revision: fc70d6d9a9b0
> slave: t-w864-ix-033
> 
> 5595 INFO TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_buffered.html | Test timed out.
> 5654 INFO TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_bug465498.html | Test timed out.
> 5678 INFO TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_bug495145.html | Test timed out.
> 5721 INFO TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_bug495300.html | Test timed out.
> 5722 INFO TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | 4 test
> timeouts, giving up.
> 5723 INFO TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | Skipping 379
> remaining tests.
> 5728 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) |
> /tests/content/media/test/test_bug495300.html finished in a non-clean
> fashion, probably because it didn't call SimpleTest.finish()
> Return code: 1

Although it happened on Windows, it might be related to bug 1020227 though. Leave a note to check if this is about libcubeb shutdown hang later.
https://tbpl.mozilla.org/php/getParsedLog.php?id=41731260&tree=Try&full=1#error0

5677 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_buffered.html | Test timed out.
5719 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_bug448534.html | Test timed out.
5809 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_bug465498.html | Test timed out.
5835 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_bug495145.html | Test timed out.
5836 INFO TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | 4 test timeouts, giving up.
5837 INFO TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | Skipping 374 remaining tests.
5842 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | /tests/content/media/test/test_bug495145.html finished in a non-clean fashion, probably because it didn't call SimpleTest.finish()
Return code: 1


[test_buffered.html]
21:33:56     INFO -  2014-06-14 04:33:56.576000 UTC - 0[230f140]: 2058c0c0 Trying load from src=blob:b28a29d8-04bc-4ab7-b35a-bf6d3fa7aace
21:33:56     INFO -  2014-06-14 04:33:56.576000 UTC - 0[230f140]: 2058c0c0 LoadResource document src=/tests/content/media/test/test_buffered.html
21:33:56     INFO -  2014-06-14 04:33:56.576000 UTC - 0[230f140]: 2058c0c0 Created decoder 6b48740 for type audio/mpeg
21:33:56     INFO -  2014-06-14 04:33:56.576000 UTC - 0[230f140]: Decoder=6b48740 MediaDecoderStateMachine ctor=9dc2c00 logging=0
21:33:56     INFO -  2014-06-14 04:33:56.576000 UTC - 0[230f140]: Decoder=6b48740 NotifyDownloadEnded, status=0
21:33:56     INFO -  2014-06-14 04:33:56.576000 UTC - 3108[fd3d860]: Decoder=6b48740 Decoding Media Headers
21:33:56     INFO -  2014-06-14 04:33:56.577000 UTC - 3108[fd3d860]: Decoder=6b48740 Decode metadata failed, shutting down decoder
21:33:56     INFO -  2014-06-14 04:33:56.577000 UTC - 3108[fd3d860]: Decoder=6b48740 Decode error, changed state to SHUTDOWN due to error

[test_bug448534.html]
21:38:57     INFO -  2014-06-14 04:38:57.299000 UTC - 0[230f140]: 1b68a5c0 Trying load from src=small-shot.mp3
21:38:57     INFO -  2014-06-14 04:38:57.299000 UTC - 0[230f140]: 1b68a5c0 LoadResource document src=/tests/content/media/test/test_bug448534.html
21:38:57     INFO -  2014-06-14 04:38:57.319000 UTC - 0[230f140]: 1b68a5c0 Created decoder 1166bf20 for type audio/mpeg
21:38:57     INFO -  2014-06-14 04:38:57.319000 UTC - 0[230f140]: ChannelMediaResource ctor, decoder=1166bf20 stream=15be5d08
21:38:57     INFO -  2014-06-14 04:38:57.319000 UTC - 0[230f140]: OpenChannel, mListener=122bd2c0 decoder=1166bf20 stream=15be5d08
21:38:57     INFO -  2014-06-14 04:38:57.319000 UTC - 0[230f140]: Decoder=1166bf20 MediaDecoderStateMachine ctor=11317000 logging=0
21:38:57     INFO -  2014-06-14 04:38:57.319000 UTC - 0[230f140]: OnStartRequest, decoder=1166bf20 stream=15be5d08
21:38:57     INFO -  2014-06-14 04:38:57.319000 UTC - 0[230f140]: OnStartRequest status=0 responseStatus=206 mOffset=0 mByteRange=0,0
21:38:57     INFO -  2014-06-14 04:38:57.319000 UTC - 2068[fd3be20]: Decoder=1166bf20 Decoding Media Headers
21:38:57     INFO -  2014-06-14 04:38:57.320000 UTC - 0[230f140]: Decoder=1166bf20 NotifyDownloadEnded, status=0
21:38:57     INFO -  2014-06-14 04:38:57.321000 UTC - 2068[fd3be20]: Decoder=1166bf20 Decode metadata failed, shutting down decoder
21:38:57     INFO -  2014-06-14 04:38:57.321000 UTC - 2068[fd3be20]: Decoder=1166bf20 Decode error, changed state to SHUTDOWN due to error


4 tests timed out because decoding metadata failed and they all failed in decoding MP3 files.
(In reply to TBPL Robot from comment #224)
> Tomcat
> https://tbpl.mozilla.org/php/getParsedLog.php?id=41870135&tree=Fx-Team
> WINNT 6.2 fx-team debug test mochitest-1 on 2014-06-17 05:55:18
> revision: 42250fd7fb3c
> slave: t-w864-ix-033
> 
> 5691 INFO TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_buffered.html | Test timed out.
> 5750 INFO TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_bug465498.html | Test timed out.
> 5774 INFO TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_bug495145.html | Test timed out.
> 5817 INFO TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_bug495300.html | Test timed out.
> 5818 INFO TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | 4 test
> timeouts, giving up.
> 5819 INFO TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | Skipping 371
> remaining tests.
> 5824 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) |
> /tests/content/media/test/test_bug495300.html finished in a non-clean
> fashion, probably because it didn't call SimpleTest.finish()
> Return code: 1

[Parent 2224] WARNING: NS_ENSURE_TRUE(SUCCEEDED(hr)) failed: file c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\content\media\directshow\DirectShowReader.cpp, line 175

Again, failing in reading metadata while decoding mp3 files.
(owl.mp3 in test_buffered.html and small-shot.mp3 in test_bug465498.html)
(In reply to TBPL Robot from comment #312)

17:24:30     INFO -  [Parent 256] WARNING: NS_ENSURE_TRUE(SUCCEEDED(hr)) failed: file c:/builds/moz2_slave/m-beta-w32-d-00000000000000000/build/content/media/directshow/DirectShowReader.cpp, line 175

Is DirectShowReader used to decode MP3 on windows? Could it be possibly related to bug 1056032?
Flags: needinfo?(cpearce)
DirectShowReader is used for MP3 decoding on all Windows versions we support.

It will be related to bug 1056032, in that the code in decodeAudioData will call through to the DirectShowReader on a thread without initializing MS COM, since they stopped calling it on a SharedThreadPool. There's a patch there to fix that.
Flags: needinfo?(cpearce)
DirectShowReader::ReadMetadata() is run by the MediaTaskQueue which employs SharedThreadPool. Doesn't that guarantee decoding tasks are always run in a thread with MS COM initialized?
The decodeAudioData code creates and runs the reader directly, on a raw nsIThreadPool. It doesn't use a MediaDecoderStateMachine, see MediaDecodeTask in MediaBufferDecoder.cpp. It was using a SharedThreadPool for a while, but the WebAudio guys stopped doing that when they had shutdown problems, I presume because they weren't using it correctly.
But test_buffered.html in this case still use MediaDecoderStateMachine which will use MediaTaskQueue for the decoding task. I wonder if DirectShowReader failed for some other reason.

I will add some logs for diagnostics.
Try: https://tbpl.mozilla.org/php/getParsedLog.php?id=46668578&tree=Try&full=1
File: https://hg.mozilla.org/try/file/37f729492739/content/media/directshow/DirectShowUtils.cpp

19:37:39     INFO -  17801 INFO [Parent 392] WARNING: CoCreateInstance failed, hr=80040154: file c:\builds\moz2_slave\try-w32-d-00000000000000000000\build\content\media\directshow\DirectShowUtils.cpp, line 199
19:37:39     INFO -  17802 INFO [Parent 392] WARNING: dmoWrapper Init failed, hr=80040154: file c:\builds\moz2_slave\try-w32-d-00000000000000000000\build\content\media\directshow\DirectShowUtils.cpp, line 248
19:37:39     INFO -  17803 INFO [Parent 392] WARNING: NS_ENSURE_TRUE(SUCCEEDED(hr)) failed: file c:\builds\moz2_slave\try-w32-d-00000000000000000000\build\content\media\directshow\DirectShowReader.cpp, line 175

Finally I could repro the issue after adding some logs.
hr == 0x80040154 == REGDB_E_CLASSNOTREG == Class not registered

Hi Chris,
Do you have any idea about this error? It looks like a result of MS COM not initialized. However, as comment 317 said, SharedThreadPool used by MediaTaskQueue should guarantee MS COM initialization.
Flags: needinfo?(cpearce)
(In reply to JW Wang [:jwwang] from comment #321)
> Try:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=46668578&tree=Try&full=1
> File:
> https://hg.mozilla.org/try/file/37f729492739/content/media/directshow/
> DirectShowUtils.cpp
> 
> 19:37:39     INFO -  17801 INFO [Parent 392] WARNING: CoCreateInstance
> failed, hr=80040154: file
> c:\builds\moz2_slave\try-w32-d-
> 00000000000000000000\build\content\media\directshow\DirectShowUtils.cpp,
> line 199
> 19:37:39     INFO -  17802 INFO [Parent 392] WARNING: dmoWrapper Init
> failed, hr=80040154: file
> c:\builds\moz2_slave\try-w32-d-
> 00000000000000000000\build\content\media\directshow\DirectShowUtils.cpp,
> line 248
> 19:37:39     INFO -  17803 INFO [Parent 392] WARNING:
> NS_ENSURE_TRUE(SUCCEEDED(hr)) failed: file
> c:\builds\moz2_slave\try-w32-d-
> 00000000000000000000\build\content\media\directshow\DirectShowReader.cpp,
> line 175
> 
> Finally I could repro the issue after adding some logs.
> hr == 0x80040154 == REGDB_E_CLASSNOTREG == Class not registered
> 
> Hi Chris,
> Do you have any idea about this error? 


This path triggers if we failed to create the MP3 decoder that ships with XP (which we'd expect to happen on non XP machines) and we also failed to instantiate the MP3 decoder that ships on later versions of Windows.

REGDB_E_CLASSNOTREG is a funny error. If the component was not registered, I'd expect it to *always* fail, i.e. be permaorange. Does this test always fail on the slave? Maybe that slave doesn't have the dshow mp3 decoder filter installed?

> It looks like a result of MS COM not
> initialized. However, as comment 317 said, SharedThreadPool used by
> MediaTaskQueue should guarantee MS COM initialization.

If MSCOM was not initialized, I'd expect us to fail a lot sooner, at the latest when we created the CLSID_MPEG1Splitter filter up on line 156.
Flags: needinfo?(cpearce)
Either that, or the machine's codec dll has been damaged.
It is interesting to find out:

Failure in comment 321: slave: t-w864-ix-033
Failure in comment 312: slave: t-w864-ix-033
Failure in comment 311: slave: t-w864-ix-033

Most failures come from the same machine.

As Chris said, it should be a problem of the machine t-w864-ix-033. Should we call Sheriff or someone to check if DLL is damaged or directshow not installed on t-w864-ix-033?
Include Sheriff for the machine error.
Flags: needinfo?(ryanvm)
Attached patch 832768_add_logs-v1.patch (deleted) — — Splinter Review
I still want to add some debugging logs which might come in handy in the future.
Assignee: nobody → jwwang
Status: NEW → ASSIGNED
Attachment #8478078 - Flags: review?(cpearce)
Attachment #8478078 - Flags: review?(cpearce) → review+
Please check in the patch 832768_add_logs-v1.patch. This patch only adds logs instead of fixing the problem. The logs will help debug such errors in the future. Please check comment 324 for the actual root cause.
Keywords: checkin-needed
(In reply to JW Wang [:jwwang] from comment #324)
> It is interesting to find out:
> 
> Failure in comment 321: slave: t-w864-ix-033
> Failure in comment 312: slave: t-w864-ix-033
> Failure in comment 311: slave: t-w864-ix-033

Oh my, good call. I'm going to disable the slave for diagnostics.
Flags: needinfo?(ryanvm)
(In reply to TBPL Robot from comment #332)
02:54:29     INFO -  [Parent 1004] WARNING: NS_ENSURE_TRUE(mVideo.mDecoder != nullptr) failed: file /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/fmp4/MP4Reader.cpp, line 412
02:54:29     INFO -  [Parent 1004] WARNING: Decoder=120fb33a0 ReadMetadata failed, res=80004005 HasValidMedia=0: file /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/MediaDecoderStateMachine.cpp, line 1943
02:54:29     INFO -  [Parent 1004] WARNING: Decoder=120fb33a0 Decode metadata failed, shutting down decoder: file /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/MediaDecoderStateMachine.cpp, line 1902
02:54:29     INFO -  [Parent 1004] WARNING: Decoder=120fb33a0 Decode error, changed state to SHUTDOWN due to error: file /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/MediaDecoderStateMachine.cpp, line 1875

Do we have limitation in the number of H264 decoder that can be created?
Flags: needinfo?(cpearce)
This is happening on OSX where MP4 has only recently been enabled.
jwwang: on Windows we limit ourselves to 8 hardware decoders, and then fallback to creating software decoders, but I don't think there's a limit to the number of software decoders we can instantiate, unless we run out of memory.

On Linux I believe we are only limited by memory constraints, as we use ffmpeg there.

jya: Can you answer jwwang's question in comment 340 in regards to MacOSX please?
Flags: needinfo?(cpearce) → needinfo?(jyavenard)
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #341)
> This is happening on OSX where MP4 has only recently been enabled.

It seems to happen on OSX 10.6 only, not 10.8. I wonder what difference makes this.
10.6 mp4 video playback was only made possible about 5 hours ago when this commit was pushed through
https://hg.mozilla.org/mozilla-central/rev/ac02a401a021

as such, I doubt any of those timeout issues have anything to do with the decoder, seeing that the timeout would have occurred with no decoder present !
Flags: needinfo?(jyavenard)
Embrace the dumping ground!
Summary: Intermittent timeout in 4 of test_buffered.html,test_bug465498.html,test_bug495145.html,test_chaining.html,test_playback.html,test_progress.html,test_reactivate.html,test_replay_metadata.html → Intermittent anything in any of test_buffered.html,test_bug465498.html,test_bug495145.html,test_chaining.html,test_playback.html,test_progress.html,test_reactivate.html,test_replay_metadata.html
Whiteboard: [test disabled][leave open]
Component: Audio/Video → Audio/Video: Playback
It is not easy to track individual issues when all test failures fall into this dump ground. I create bug 1220512 to track readyState issues.

Hi Ryan,
Do I need to remove "test_playback.html" from this bug title so it will match the title of bug 1220512 first? Is it possible to change the title of bug 1220512 to match all files in test_playback.html like " Intermittent test_playback.html | *.* checking readyState - got 4, expected 2"?
Flags: needinfo?(ryanvm)
(In reply to JW Wang [:jwwang] from comment #804)
> Do I need to remove "test_playback.html" from this bug title so it will
> match the title of bug 1220512 first?

Go for it.

> Is it possible to change the title of
> bug 1220512 to match all files in test_playback.html like " Intermittent
> test_playback.html | *.* checking readyState - got 4, expected 2"?

Treeherder looks at the filename first (not the content after the pipe) when generating matches, so you can make that change to the summary without any ill effects.
Flags: needinfo?(ryanvm)
Thanks. I will change the bug title.
Summary: Intermittent anything in any of test_buffered.html,test_bug465498.html,test_bug495145.html,test_chaining.html,test_playback.html,test_progress.html,test_reactivate.html,test_replay_metadata.html → Intermittent anything in any of test_buffered.html,test_bug465498.html,test_bug495145.html,test_chaining.html,test_progress.html,test_reactivate.html,test_replay_metadata.html
Priority: -- → P5
1 failure on trunk in the past week.
Assignee: jwwang → nobody
Status: ASSIGNED → NEW
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: