Closed Bug 1670258 Opened 4 years ago Closed 4 years ago

Intermittent toolkit/components/pictureinpicture/tests/browser_mediaStreamVideos.js | Test timed out -

Categories

(Toolkit :: Video/Audio Controls, defect, P5)

defect

Tracking

()

RESOLVED FIXED
85 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox81 --- unaffected
firefox82 --- unaffected
firefox83 --- wontfix
firefox84 --- wontfix
firefox85 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: pehrsons, NeedInfo)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Attachments

(1 file)

Filed by: dluca [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=318174083&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/UBi1m1k2SAqP9FSIjL50OQ/runs/0/artifacts/public/logs/live_backing.log


INFO - TEST-START | toolkit/components/pictureinpicture/tests/browser_mediaStreamVideos.js
[task 2020-10-09T14:04:05.661Z] 14:04:05 INFO - TEST-INFO | started process screentopng
[task 2020-10-09T14:04:06.096Z] 14:04:06 INFO - TEST-INFO | screentopng: exit 0
[task 2020-10-09T14:04:06.096Z] 14:04:06 INFO - Buffered messages logged at 14:02:35
[task 2020-10-09T14:04:06.097Z] 14:04:06 INFO - Entering test bound test_mediaStreamVideos
[task 2020-10-09T14:04:06.097Z] 14:04:06 INFO - Waiting for videos to be ready
[task 2020-10-09T14:04:06.097Z] 14:04:06 INFO - Waiting for 'canplaythrough' for ''
[task 2020-10-09T14:04:06.097Z] 14:04:06 INFO - Buffered messages logged at 14:02:36
[task 2020-10-09T14:04:06.099Z] 14:04:06 INFO - Waiting for 'canplaythrough' for 'with-controls'
[task 2020-10-09T14:04:06.101Z] 14:04:06 INFO - Buffered messages finished
[task 2020-10-09T14:04:06.105Z] 14:04:06 INFO - TEST-UNEXPECTED-FAIL | toolkit/components/pictureinpicture/tests/browser_mediaStreamVideos.js | Test timed out -
[task 2020-10-09T14:04:06.105Z] 14:04:06 INFO - GECKO(2976) | MEMORY STAT | vsize 20975750MB | residentFast 1118MB
[task 2020-10-09T14:04:06.108Z] 14:04:06 INFO - TEST-OK | toolkit/components/pictureinpicture/tests/browser_mediaStreamVideos.js | took 90067ms
[task 2020-10-09T14:04:06.109Z] 14:04:06 INFO - Not taking screenshot here: see the one that was previously logged
[task 2020-10-09T14:04:06.110Z] 14:04:06 INFO - TEST-UNEXPECTED-FAIL | toolkit/components/pictureinpicture/tests/browser_mediaStreamVideos.js | Found a tab after previous test timed out: http://example.com/browser/toolkit/components/pictureinpicture/tests/test-media-stream.html -
[task 2020-10-09T14:04:06.111Z] 14:04:06 INFO - checking window state
[task 2020-10-09T14:04:06.112Z] 14:04:06 INFO - TEST-START | toolkit/components/pictureinpicture/tests/browser_mouseButtonVariation.js
[task 2020-10-09T14:04:06.171Z] 14:04:06 INFO - Not taking screenshot here: see the one that was previously logged
[task 2020-10-09T14:04:06.172Z] 14:04:06 INFO - Buffered messages logged at 14:04:05
[task 2020-10-09T14:04:06.173Z] 14:04:06 INFO - Entering test bound
[task 2020-10-09T14:04:06.174Z] 14:04:06 INFO - Buffered messages finished
[task 2020-10-09T14:04:06.175Z] 14:04:06 INFO - TEST-UNEXPECTED-FAIL | toolkit/components/pictureinpicture/tests/browser_mediaStreamVideos.js | Uncaught exception received from previously timed out test - AbortError: Actor 'SpecialPowers' destroyed before query 'Spawn' was resolved
[task 2020-10-09T14:04:06.583Z] 14:04:06 INFO - Waiting for videos to be ready
[task 2020-10-09T14:04:06.846Z] 14:04:06 INFO - Waiting for toggle to become visible
[task 2020-10-09T14:04:06.847Z] 14:04:06 INFO - Testing toggle mode 2 for stage hoverVideo in position right, has used: true
[task 2020-10-09T14:04:06.863Z] 14:04:06 INFO - TEST-PASS | toolkit/components/pictureinpicture/tests/browser_mouseButtonVariation.js | Expected #pictureInPictureToggleButton to be hidden. - true == true -
[task 2020-10-09T14:04:06.863Z] 14:04:06 INFO - TEST-PASS | toolkit/components/pictureinpicture/tests/browser_mouseButtonVariation.js | Expected .pip-expanded to be hidden. - true == true -
[task 2020-10-09T14:04:06.964Z] 14:04:06 INFO - TEST-PASS | toolkit/components/pictureinpicture/tests/browser_mouseButtonVariation.js | Toggle reached target opacity. - true == true -
[task 2020-10-09T14:04:06.972Z] 14:04:06 INFO - Clicking on toggle, and expecting a Picture-in-Picture window to open
[task 2020-10-09T14:04:07.131Z] 14:04:07 INFO - TEST-PASS | toolkit/components/pictureinpicture/tests/browser_mouseButtonVariation.js | A Picture-in-Picture window opened. -
[task 2020-10-09T14:04:07.359Z] 14:04:07 INFO - TEST-PASS | toolkit/components/pictureinpicture/tests/browser_mouseButtonVariation.js | Expected to get the right mouse events. - [] deepEqual [] -
[task 2020-10-09T14:04:07.379Z] 14:04:07 INFO - TEST-PASS | toolkit/components/pictureinpicture/tests/browser_mouseButtonVariation.js | Expected to get the right mouse events. - ["pointerdown","mousedown","pointerup","mouseup","click"] deepEqual ["pointerdown","mousedown","pointerup","mouseup","click"] -
[task 2020-10-09T14:04:07.415Z] 14:04:07 INFO - Leaving test bound
[task 2020-10-09T14:04:07.476Z] 14:04:07 INFO - GECKO(2976) | MEMORY STAT | vsize 20975737MB | residentFast 1102MB
[task 2020-10-09T14:04:07.476Z] 14:04:07 INFO - TEST-OK | toolkit/components/pictureinpicture/tests/browser_mouseButtonVariation.js | took 1509ms
[task 2020-10-09T14:04:07.552Z] 14:04:07 INFO - checking window state
[task 2020-10-09T14:04:07.610Z] 14:04:07 INFO - TEST-START | toolkit/components/pictureinpicture/tests/browser_noToggleOnAudio.js

This tested started to fail frequently on Linux x64 optimized with the AddressSanitizer checks enabled after bug 1653496 landed - there is likely a race in the test.

Reid, can you take a look at this?

Flags: needinfo?(reidshina6)
Keywords: regression
Regressed by: 1653496
Has Regression Range: --- → yes

Set release status flags based on info from the regressing bug 1653496

Bryce, are you able to have a look at what might be causing this?

We're checking readyState on some videos that get a source stream assigned, at https://searchfox.org/mozilla-central/rev/9c72508fcf2bba709a5b5b9eae9da35e0c707baa/toolkit/components/pictureinpicture/tests/browser_mediaStreamVideos.js#36-42, and if readyState < HAVE_ENOUGH_DATA, we wait for the canplaythrough event - but that event doesn't arrive:

[task 2020-10-12T23:47:30.011Z] 23:47:30     INFO - Buffered messages logged at 23:45:59
[task 2020-10-12T23:47:30.012Z] 23:47:30     INFO - Entering test bound test_mediaStreamVideos
[task 2020-10-12T23:47:30.012Z] 23:47:30     INFO - Waiting for videos to be ready
[task 2020-10-12T23:47:30.013Z] 23:47:30     INFO - Waiting for 'canplaythrough' for ''
[task 2020-10-12T23:47:30.013Z] 23:47:30     INFO - Waiting for 'canplaythrough' for 'with-controls'
[task 2020-10-12T23:47:30.014Z] 23:47:30     INFO - Buffered messages finished
[task 2020-10-12T23:47:30.014Z] 23:47:30     INFO - TEST-UNEXPECTED-FAIL | toolkit/components/pictureinpicture/tests/browser_mediaStreamVideos.js | Test timed out - 

and looking at the screenshot, it does appear that the video does not load the stream. I know it's asan so it's a little slower, but 90s should be enough to load this very small video, I think? Are we doing something wrong in terms of how we assign the source stream or something?

Flags: needinfo?(bvandyk)

Nothing is jumping out at me here. That the failure is ASan only is interesting. I'm going to hold the NI and see if I can get a repro on Linux under rr.

I'm able to repro a different error:

Uncaught exception - Toggle element .pip-wrapper should have eventually reached target opacity 0.8 - timed out after 100 tries.
 0:25.80 INFO Leaving test bound test_mediaStreamVideos
 0:25.81 GECKO(124682) MEMORY STAT | vsize 20975467MB | residentFast 1182MB
 0:25.81 TEST_END: Test OK. Subtests passed 3/4. Unexpected 1
 0:25.81 FAIL Found an unexpected tab at the end of test run: http://example.com/browser/toolkit/components/pictureinpicture/tests/test-media-stream.html - 

But that doesn't seem like the issue above. I suspect this is an issue of me manually running the tests and breaking the checks by touching stuff while tests run -- it doesn't happen in headless mode which suggests me meddling.


Looking at failure logs in orange factor they seem consistent with comment 6. We end up waiting for canplaythrough, sometimes on the first media element we create, sometimes on the second. Screenshots reflect this too. The screenshots also show the non-clone video at different frames, suggesting it is playing when we fail.

I'm trying with MOZ_CHAOSMODE=0 now to see if that helps repro.

No luck with MOZ_CHAOSMODE. I've been trying under rr with chaos and can get the test to timeout, but not in this location. It's not clear to me that the rr chaos mode timeouts are something worth chasing. It may just be chaos mode making various things slow enough that we eventually time out, as even in the timeout cases the tests do not seem to timeout at a single reliable point.

Debugging via try seems like the last avenue to explore.

These failures seem to have dropped off a cliff? Unsure why...

Clearing NI since this isn't happening anymore and I don't expect I'll be able to repro.

Flags: needinfo?(bvandyk)

Seems to have spiked again more recently.

Andreas, do you see anything around the capture stream here that could cause issues? I think some of the logic around ready state could simplified in the test, but looking at the failures it looks to me like elements with the captured and cloned sources are not becoming ready, so I wonder if the problem is in their source.

Flags: needinfo?(apehrson)

I suspect there might be races because it's a very short file and the way the test is written we have to wait for tracks to be signaled from the decoder to main before we can hook up the tested video elements. And those elements need a video frame to advance readyState, but that is an async step dependent on the decoder thread not reaching end of playback until they've been hooked up on main.

And since the captureStream is cloned we'll also never see future live tracks in the stream once the tracks from the first loop have ended.

I can improve this speculatively and we'll see. It should improve. If some deeper issue lingers we might need to look at better logs. Or use tracks from another source so we know they don't end under our feet, given that this test framework relies on HAVE_ENOUGH_DATA.


Having looked at this a bit I'm confounded as to why this test works so well normally. I think it's an artifact of gapless looping -- reaching the end of the resource with the loop attribute set means there's an implicit seek to the beginning, and on seek tracks in captureStreams must be recreated. With gapless looping that doesn't seem to happen. That is against spec, but hey, mozCaptureStream is still prefixed so it has its quirks.

I'll still see what I can do.

Assignee: nobody → apehrson
Flags: needinfo?(apehrson)
Pushed by pehrsons@gmail.com: https://hg.mozilla.org/integration/autoland/rev/4990382a7e53 Make PiP test browser_mediaStreamVideos.js more deterministic. r=mconley
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 85 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: