Closed
Bug 916135
Opened 11 years ago
Closed 11 years ago
Investigate webaudio mochitest timeouts on B2G
Categories
(Core :: Web Audio, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla29
People
(Reporter: emorley, Assigned: jwwang)
References
Details
Attachments
(3 files, 6 obsolete files)
(deleted),
patch
|
cajbir
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
padenot
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jwwang
:
review+
|
Details | Diff | Splinter Review |
The webaudio tests are currently extremely unreliable on B2G (see bug 820739 comment 787).
Disabling until the failures are resolved.
Reporter | ||
Comment 1•11 years ago
|
||
Whiteboard: [test disabled on B2G][leave open]
Comment 2•11 years ago
|
||
Comment 4•11 years ago
|
||
(In reply to comment #3)
> Bug 916773 may be at least part of the problem here.
Should we re-enable these tests now that bug 916773 is fixed?
Comment 5•11 years ago
|
||
How about enabling them on Try with 20+ retriggers and then we can decide.
Comment 7•11 years ago
|
||
Oh shoot, this is mochitest-3, not crashtest. :(
Comment 9•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #8)
> https://tbpl.mozilla.org/?tree=Try&rev=b1a2dc56e30b
lulz
Comment 10•11 years ago
|
||
Alright, I give up. jwwang, any chance you could take a look at this please?
Flags: needinfo?(jwwang)
Assignee | ||
Comment 12•11 years ago
|
||
command timed out: 1200 seconds without output, attempting to kill
Bug 820739 - Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" or "command timed out: 3600 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step
Bug 820811 - Intermittent Android "command timed out: 1200 seconds without output, attempting to kill" during "Install App on Device" step
Bug 873053 - Intermittent B2G xpcshell command timed out: 1200 seconds without output, attempting to kill from "AttributeError: 'NoneType' object has no attribute 'pid'"
Btw, what are the lines following the "command timed out: 1200" line in the light blue box?
I don't see them in the full log. Do they even matter?
Reporter | ||
Comment 13•11 years ago
|
||
(In reply to jwwang from comment #12)
> Btw, what are the lines following the "command timed out: 1200" line in the
> light blue box?
> I don't see them in the full log. Do they even matter?
Anything that's a link (and indented) in the TBPL blue box is a bug suggestion by TBPL, to help match up known intermittent failures with failures in automation. They are just bugzilla search results based on scraping the log for failure lines.
Assignee | ||
Comment 14•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=1932306fae09
20 successes in a row.
As content/media/test/* tests are disabled in my test, I wonder if there are some leftovers after running content/media/test which cause deadlock or something likewise in running content/media/webaudio/test.
Btw, test_audioBufferSourceNodeLazyLoopParam.html, test_audioBufferSourceNodeLoopStartEnd.html and test_audioBufferSourceNodeLoop.html are not found in my pushlog. Are they disabled elsewhere other than testing/mochitest/b2g.json?
Is it possible to change the order of tests? I would like to see how it plays out if we run content/media/webaudio/test first and then content/media/test.
Comment 15•11 years ago
|
||
(In reply to comment #14)
> https://tbpl.mozilla.org/?tree=Try&rev=1932306fae09
> 20 successes in a row.
>
> As content/media/test/* tests are disabled in my test, I wonder if there are
> some leftovers after running content/media/test which cause deadlock or
> something likewise in running content/media/webaudio/test.
That's possible.
> Btw, test_audioBufferSourceNodeLazyLoopParam.html,
> test_audioBufferSourceNodeLoopStartEnd.html and
> test_audioBufferSourceNodeLoop.html are not found in my pushlog. Are they
> disabled elsewhere other than testing/mochitest/b2g.json?
>
> Is it possible to change the order of tests? I would like to see how it plays
> out if we run content/media/webaudio/test first and then content/media/test.
You can try moving directories around. AFAIK we run the tests based on the alphabetical order of their names (sub-directories first.)
Assignee | ||
Comment 16•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=5a93c866e2d2
21 runs, each with 5 duplicates of content/media/webaudio/test (webaudio/test2~6).
2 of them timed out in convolverNode (is this a known issuue?), but none of them exhibit the same symptom as the previous command timed out.
I really suspect the command timed out is resulted from content/media/test. I will try to duplicate content/media/test and see if it will show the same results.
Comment 17•11 years ago
|
||
Looks like test_convolverNodeChannelCount.html and subsequent tests are timing out waiting for the online AudioContext, which makes that a different issue from bug 921197, but perhaps related to bug 920338 (which is also unexplained).
Assignee | ||
Comment 18•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=24ebdb331c6d
https://tbpl.mozilla.org/?tree=Try&rev=59dbdeddd772
where content/media/test/test_timeupdate_small_files.html and content/media/test/test_video_to_canvas.html are disabled.
https://tbpl.mozilla.org/?tree=Try&rev=71c827590484
https://tbpl.mozilla.org/?tree=Try&rev=67947612ee00
where content/media/test/test_timeupdate_small_files.html and content/media/test/test_video_to_canvas.html are enabled.
It looks like disabling the 2 files would help reduce the chance of command timeout right after test_preload_actions.html. However, test_preload_actions.html is run before the 2 files. I am confused about how files run later can affect the former one.
Btw, there is no timeout in webaudio test cases in my test. Once bug 919353 is landed, we will have stack trace to give a better clue.
Assignee | ||
Comment 19•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=e3e36c223293
66 successes in a row except 2 timeouts caused by app crashed.
There are several root causes here:
1. Only 1 omx.h264 codec available concurrently. Request for the 2nd one has to wait indefinitely until the 1st one is released. You might have to wait for a long time until GC is triggered since this is not like memory low condition.
2. Media elements not receiving loadedmetadata event in some test cases. The reason is unknown.
3. Implementation of OMXCodec::pause is broken. Calling OMXCodec::start later will return error and can't leave the paused state.
Fixes:
1. Setting the src attribute to an empty string to force releasing the underlying codec.
2. Setting the preload attribute to "auto".
3. Don't call OMXCodec::pause in OmxDecoder::Pause. However, this will break the fix for Bug 860760 and Bug 872462.
Updated•11 years ago
|
Priority: -- → P2
Assignee | ||
Comment 20•11 years ago
|
||
Attachment #8348555 -
Flags: review?(karlt)
Assignee | ||
Comment 21•11 years ago
|
||
Attachment #8348559 -
Flags: review?(karlt)
Assignee | ||
Comment 22•11 years ago
|
||
Attachment #8348562 -
Flags: review?(karlt)
Assignee | ||
Comment 23•11 years ago
|
||
TBPL results.
https://tbpl.mozilla.org/?tree=Try&rev=73d9daec231f
Comment 24•11 years ago
|
||
JW, karlt is away until 3rd of January, maybe you want to redirect those review requests.
Flags: needinfo?(jwwang)
Assignee | ||
Updated•11 years ago
|
Attachment #8348555 -
Flags: review?(karlt) → review?(paul)
Flags: needinfo?(jwwang)
Assignee | ||
Updated•11 years ago
|
Attachment #8348559 -
Flags: review?(karlt) → review?(paul)
Assignee | ||
Updated•11 years ago
|
Attachment #8348562 -
Flags: review?(karlt) → review?(paul)
Comment 25•11 years ago
|
||
Comment on attachment 8348555 [details] [diff] [review]
Fix timeout in acquiring omx codec and waiting for loadedmetadata event.
Review of attachment 8348555 [details] [diff] [review]:
-----------------------------------------------------------------
I'm not sure if removing event listener matters that much. Does that keep the element alive?
::: content/media/test/manifest.js
@@ +561,5 @@
> }
>
> +function removeNodeAndSource(n) {
> + if (n.parentNode) {
> + n.parentNode.removeChild(n);
n.remove()
@@ +563,5 @@
> +function removeNodeAndSource(n) {
> + if (n.parentNode) {
> + n.parentNode.removeChild(n);
> + }
> + // force release of underlying codec
I'd rather say "decoder" than codec, here.
@@ +567,5 @@
> + // force release of underlying codec
> + n.src = "";
> + while (n.firstChild) {
> + n.removeChild(n.firstChild);
> + }
I'm not sure why this is necessary. Even if the node has <source> children, they won't load until someone calls load() on the media element.
Also, if you remove a node, its children will get collected anyway (assuming we don't have a reference to the children, which should be fine).
::: content/media/test/reactivate_helper.html
@@ +21,5 @@
> var test = parent.gSmallTests[i];
> var elemType = /^audio/.test(test.type) ? "audio" : "video";
> // Associate these elements with the subframe's document
> var e = document.createElement(elemType);
> + e.setAttribute("preload", "auto");
e.preload = "auto";
I see you've used this pattern a lot in this patch, please fix all occurrences.
::: content/media/test/test_bug654550.html
@@ +17,5 @@
>
> /* Test for Bug 654550 */
>
> + // Parallel test must be disabled for media.video_stats.enabled is a global setting
> + // to prevent the setting from chaning unexpectedly in the middle of the test.
s/chaning/changing/
Comment 26•11 years ago
|
||
Comment on attachment 8348559 [details] [diff] [review]
Don't call OMXCodec::pause to avoid state error.
Review of attachment 8348559 [details] [diff] [review]:
-----------------------------------------------------------------
I'm not really knowledgeable on this code, can you have a look, Chris?
::: content/media/omx/OmxDecoder.cpp
@@ +461,5 @@
> }
> return false;
> }
>
> +static bool isInEmulator()
s/isInEmulator/IsInEmulator/
Attachment #8348559 -
Flags: review?(paul) → review?(chris.double)
Comment 27•11 years ago
|
||
Comment on attachment 8348562 [details] [diff] [review]
Enable tests in B2G.
Review of attachment 8348562 [details] [diff] [review]:
-----------------------------------------------------------------
Looking at the try push, r+.
Attachment #8348562 -
Flags: review?(paul) → review+
Comment 28•11 years ago
|
||
Comment on attachment 8348559 [details] [diff] [review]
Don't call OMXCodec::pause to avoid state error.
Review of attachment 8348559 [details] [diff] [review]:
-----------------------------------------------------------------
r+ with Paul Adenot's change.
Attachment #8348559 -
Flags: review?(chris.double) → review+
Assignee | ||
Comment 29•11 years ago
|
||
Comment on attachment 8348555 [details] [diff] [review]
Fix timeout in acquiring omx codec and waiting for loadedmetadata event.
Review of attachment 8348555 [details] [diff] [review]:
-----------------------------------------------------------------
::: content/media/test/manifest.js
@@ +561,5 @@
> }
>
> +function removeNodeAndSource(n) {
> + if (n.parentNode) {
> + n.parentNode.removeChild(n);
https://developer.mozilla.org/en-US/docs/Web/API/Node
I can't find the method in the DOM API. Where is it documented?
@@ +567,5 @@
> + // force release of underlying codec
> + n.src = "";
> + while (n.firstChild) {
> + n.removeChild(n.firstChild);
> + }
The idea is to force releasing underlying decoder. So we have to clean the "src" attribute and all <source> child nodes.
Since it is not the memory low case, we might wait indefinitely for garbage collection. Therefore, I try to empty the "src" attribute and all <source> child nodes to force releasing the decoder without waiting for garbage collection to kick in.
::: content/media/test/reactivate_helper.html
@@ +21,5 @@
> var test = parent.gSmallTests[i];
> var elemType = /^audio/.test(test.type) ? "audio" : "video";
> // Associate these elements with the subframe's document
> var e = document.createElement(elemType);
> + e.setAttribute("preload", "auto");
Will do.
Assignee | ||
Comment 30•11 years ago
|
||
Attachment #8348555 -
Attachment is obsolete: true
Attachment #8348555 -
Flags: review?(paul)
Attachment #8355972 -
Flags: review?(paul)
Assignee | ||
Comment 31•11 years ago
|
||
(In reply to Paul Adenot (:padenot) from comment #25)
> Comment on attachment 8348555 [details] [diff] [review]
> Fix timeout in acquiring omx codec and waiting for loadedmetadata event.
>
> Review of attachment 8348555 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I'm not sure if removing event listener matters that much. Does that keep
> the element alive?
Changing the "src" attribute will trigger events. We have to remove event listeners to prevent unexpected events that cause tests to fail.
Comment 32•11 years ago
|
||
Assign to JW since he is currently working on this fix
Assignee: nobody → jwwang
Comment 33•11 years ago
|
||
(In reply to JW Wang[:jwwang] from comment #29)
> Comment on attachment 8348555 [details] [diff] [review]
> Fix timeout in acquiring omx codec and waiting for loadedmetadata event.
>
> Review of attachment 8348555 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: content/media/test/manifest.js
> @@ +561,5 @@
> > }
> >
> > +function removeNodeAndSource(n) {
> > + if (n.parentNode) {
> > + n.parentNode.removeChild(n);
>
> https://developer.mozilla.org/en-US/docs/Web/API/Node
> I can't find the method in the DOM API. Where is it documented?
http://dxr.mozilla.org/mozilla-central/source/dom/interfaces/core/nsIDOMElement.idl#81
http://dom.spec.whatwg.org/#dfnReturnLink-0
Updated•11 years ago
|
Attachment #8355972 -
Flags: review?(paul) → review+
Assignee | ||
Comment 34•11 years ago
|
||
Attachment #8355972 -
Attachment is obsolete: true
Attachment #8356384 -
Flags: review+
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 35•11 years ago
|
||
Comment 36•11 years ago
|
||
Backed out for Android mochitest orange.
https://hg.mozilla.org/integration/b2g-inbound/rev/bcba9e0700e8
https://tbpl.mozilla.org/php/getParsedLog.php?id=32639777&tree=B2g-Inbound
Comment 37•11 years ago
|
||
Comment 38•11 years ago
|
||
And every other platform.
Assignee | ||
Comment 39•11 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #36)
> Backed out for Android mochitest orange.
> https://hg.mozilla.org/integration/b2g-inbound/rev/bcba9e0700e8
>
> https://tbpl.mozilla.org/php/getParsedLog.php?id=32639777&tree=B2g-Inbound
The failure is not about timeout. Should I also disable the tests to keep green?
Comment 40•11 years ago
|
||
It would be better if you understood why the test started failing before blindly disabling it. After all, tests are there to catch regressions :)
Assignee | ||
Comment 41•11 years ago
|
||
Attachment #8348562 -
Attachment is obsolete: true
Attachment #8357044 -
Flags: review+
Assignee | ||
Comment 42•11 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #40)
> It would be better if you understood why the test started failing before
> blindly disabling it. After all, tests are there to catch regressions :)
https://tbpl.mozilla.org/?tree=Try&rev=082d10686313
Disable test_invalid_reject.html and test_load_same_resource.html and file bugs for them. Since this patch enables more tests than what it disables, we can still benefit from catching more regressions. For the failed tests, I would like to deal with them in other bugs.
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 43•11 years ago
|
||
Comment on attachment 8357044 [details] [diff] [review]
Enable tests in B2G.
The test was failing across all platforms. This will only disable it on B2G.
Paul - Are you OK with disabling these tests on all platforms so the B2G tests can be re-enabled?
Attachment #8357044 -
Flags: review+ → review-
Flags: needinfo?(paul)
Updated•11 years ago
|
Keywords: checkin-needed
Comment 44•11 years ago
|
||
If JW promises to fix the two disabled tests, we can land this, yes.
Flags: needinfo?(paul)
Comment 45•11 years ago
|
||
Sorry, I apparently misread comment 43. This can't land as-is, as it break tests on all platforms, whereas it should only affect b2g, which shows that we are clearly missing something here.
Assignee | ||
Comment 47•11 years ago
|
||
Fix test failures caused by event handlers (onerror) not removed before calling removeNodeAndSource().
Attachment #8356384 -
Attachment is obsolete: true
Attachment #8361571 -
Flags: review?(padenot)
Assignee | ||
Comment 48•11 years ago
|
||
Attachment #8357044 -
Attachment is obsolete: true
Attachment #8361572 -
Flags: review?(padenot)
Assignee | ||
Comment 49•11 years ago
|
||
TBPL results:
https://tbpl.mozilla.org/?tree=Try&rev=4f6739947691
Reporter | ||
Comment 50•11 years ago
|
||
There was an intermittent failure on one of the retriggers:
https://tbpl.mozilla.org/php/getParsedLog.php?id=33141611&tree=Try
{
891 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_progress.html | Test timed out.
913 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_reactivate.html | Test timed out.
1058 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_replay_metadata.html | Test timed out.
1159 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek_out_of_range.html | Test timed out.
1160 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | 4 test timeouts, giving up.
}
Assignee | ||
Comment 51•11 years ago
|
||
I noticed that too. It was suspicious the 4 failures were temporally close to each other. There might be some other root causes for the timeouts. Given the failure was one in the hundred, I would like to check in the patches first to enable webaudio tests again and then trace other media test timeouts in other bugs so that we can make progress.
Comment 52•11 years ago
|
||
Comment on attachment 8361571 [details] [diff] [review]
Fix timeout in acquiring omx codec and waiting for loadedmetadata event.
Review of attachment 8361571 [details] [diff] [review]:
-----------------------------------------------------------------
r+ if you investigate the error.
Attachment #8361571 -
Flags: review?(paul) → review+
Updated•11 years ago
|
Attachment #8361572 -
Flags: review?(paul) → review+
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 53•11 years ago
|
||
https://hg.mozilla.org/integration/b2g-inbound/rev/4272e7b529de
https://hg.mozilla.org/integration/b2g-inbound/rev/3b168ee25f38
https://hg.mozilla.org/integration/b2g-inbound/rev/d5e526257ace
Keywords: checkin-needed
Comment 54•11 years ago
|
||
Backed out for B2G mochitest-2 failures.
https://hg.mozilla.org/integration/b2g-inbound/rev/1a95ee2442c8
https://tbpl.mozilla.org/php/getParsedLog.php?id=33344919&tree=B2g-Inbound
Assignee | ||
Comment 55•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=4f6739947691
It looks like test_mediarecorder_creation_fail.html didn't show up in my push log (Neither in mochitest-2 or 3 on B2G ICS Emulator Opt). Did I miss something to include the test?
Comment 56•11 years ago
|
||
It landed in bug 926684 on 10-Jan, so maybe your parent rev was just too old to include it?
Assignee | ||
Comment 57•11 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #54)
> Backed out for B2G mochitest-2 failures.
> https://hg.mozilla.org/integration/b2g-inbound/rev/1a95ee2442c8
>
> https://tbpl.mozilla.org/php/getParsedLog.php?id=33344919&tree=B2g-Inbound
The failure existed on b2g without my patch. It looks like SpecialPowers.setBoolPref is async and it fails to disable Ogg in time when recorder starts.
Since the whole content/media/test/* and content/media/webaudio/test/* are disabled on b2g, there are some new tests which fail on b2g on their own not caught.
I will rebase my patch, disable the test on b2g, and run tests on try again. I would like to check in this patch as soon as possible in case there are more test failures not caught on b2g. Or is it preferred to fix the failures before this patch can check in?
Comment 58•11 years ago
|
||
I see no problem with disabling individual failing tests (on B2G *only*) in order to get the rest re-enabled.
Assignee | ||
Comment 59•11 years ago
|
||
Disable content/media/test/test_mediarecorder_creation_fail.html which fails on B2G on its own without these patches. I will file another bug to fix the failure.
Attachment #8361572 -
Attachment is obsolete: true
Attachment #8363615 -
Flags: review+
Assignee | ||
Comment 60•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=7782e7b3b910
all green except one orange which is not related to media/webaudio.
Parent rev. = 8f4ecbf938cd on 1/21.
Hi Ryan, can you try to land the patch again? Thanks.
Keywords: checkin-needed
Comment 61•11 years ago
|
||
https://hg.mozilla.org/integration/b2g-inbound/rev/e4a5cc40929c
https://hg.mozilla.org/integration/b2g-inbound/rev/21c8ab0245dc
https://hg.mozilla.org/integration/b2g-inbound/rev/65a5026da078
Keywords: checkin-needed
Comment 62•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/e4a5cc40929c
https://hg.mozilla.org/mozilla-central/rev/21c8ab0245dc
https://hg.mozilla.org/mozilla-central/rev/65a5026da078
*cues up the Hallelujah chorus* :)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Comment 66•11 years ago
|
||
FYI - I filed a bunch of followups that I noticed from this bug in these bugs:
* bug 962871 - get these tests running on debug emulators
* bug 962874 - get these tests running on b2g desktop builds
* bug 962878 - fix test_mediarecord_4ch_audiocontext.html
* bug 962883 - fix test_mediarecorder_creation_fail.html
You need to log in
before you can comment on or make changes to this bug.
Description
•