Closed Bug 1640496 Opened 4 years ago Closed 4 years ago

Youtube video starts to glitch; black screen seen until restarted

Categories

(GeckoView :: Media, defect, P1)

Unspecified
All

Tracking

(firefox78+ fixed, firefox79 fixed)

RESOLVED FIXED
mozilla79
Tracking Status
firefox78 + fixed
firefox79 --- fixed

People

(Reporter: jonalmeida, Assigned: gw)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Steps to reproduce

  1. Navigate to a youtube video; play the video.
  2. Scroll down the page to expand the comments (it seems to reproduce easier that way).

Alternative STR

  1. Open video in R-B/Fenix.
  2. Fullscreen the video.
  3. Press the system home button to enable PiP mode.
  4. Click on the PiP mode video to return to the browser.

Expected

  • The video should play without any hiccups.

Actual

GVE: 78.0.20200521093657; reproducible on Fenix and Reference Browser as well.
Device: Android 10, Pixel 4

I can reproduce this on Fenix Nightly on a Sony XZ1 Compact, this phone uses WebRender by default. The glitches I see is a few frames being repeated in a loop.

Kris, is this something that you can look into as fallout from WebRender?

Flags: needinfo?(ktaeleman)

@sotaro: Could you take a look at this?

Flags: needinfo?(ktaeleman) → needinfo?(sotaro.ikeda.g)
Severity: -- → S2
Priority: -- → P2

(In reply to Kris Taeleman (:ktaeleman) from comment #3)

@sotaro: Could you take a look at this?

OK, I take a look.

Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)

I confirmed that the problem could happen with Firefox preview and with gecko view example app on Pixel 3a.

Could we get some flags for the Gecko versions affected/targeted for fixes, so we know what versions this is affecting? PiP is one of our new features that we're highlighting and we'll want to track what versions this affects when we make release decisions for Fenix.

We're aiming tentatively for a 79 Gecko version release for Fenix right now, so we'd like to have this fixed for that.

Does this problem occur without WebRender as well?

Flags: needinfo?(ktaeleman)

I did not see the problem without WebRender, though I still do not know why the problem happens:(

I don't know much about this PiP feature - is there someone on your end who worked on it that could help investigate here?

Flags: needinfo?(liuche)

This happens without PiP, just doing full screen on a video and waiting will repro. The regression range is about two-three weeks max I'd say.

While trying to reproduce this on GVE, I ran to an issue that might be related to this, but not sure.
https://bugzilla.mozilla.org/show_bug.cgi?id=1643435 - Youtube video goes black when video controls appear on Android

Bisecting this has pointed to the problem being introduced by
https://bugzilla.mozilla.org/show_bug.cgi?id=1637953

I'll put a dependency on this issue for now and we can revisit when that's fixed.

Depends on: 1637953
Flags: needinfo?(ktaeleman)
Depends on: 1643435
No longer depends on: 1637953

STR without PiP as reported here: https://github.com/mozilla-mobile/fenix/issues/11017

Steps to reproduce

visit youtube video m.youtube.com/watch?v=n6QwnzbRUyA

Expected behavior
video moves

Actual behavior
sound goes on but video either freezes or loops like a gif of half a second
this started about a week ago

More STR from https://github.com/mozilla-mobile/fenix/issues/11054

Steps to reproduce

I can't get this to reproduce consistently.

  • Open a video in YouTube, wait maybe a minute
  • (if it doesn't reproduce, open a new video and try again)
    • I can't seem to make it reproduce when I'm trying to do so – it comes up in my day-to-day use when I'm watching YouTube in the browser. When I get it to reproduce accidentally, it's usually within the first 5 minutes of opening the browser on the 3rd-5th video I click. Note that I frequently use the double-tap to advance 10s feature of YouTube and this could be related

Expected behavior

Nothing: video plays

Actual behavior

The video will start looping a few milliseconds of content (audio is fine). Subsequent interactions with the browser will make it unusable (e.g. content is all black) until I swipe it closed.

Video: https://drive.google.com/file/d/1Wgt6mBdQmRtlT7vIoPdWiYU65ushguaU/view?usp=sharing

Device information

  • Android device: P2
  • Fenix version: Nightly 200529
    • I haven't seen this in the migrated Beta build in the play store
    • I've seen this in local Nightly builds from at least few days ago, if not a week or two
    • It reproduces in the Preview Nightly channel

Did some more digging and earliest report I see is using Fenix Nightly 05/25, which would be GV 78.0a1-20200520033931 from 05/20.
@sotaro: Going manually through the changes, do you think https://hg.mozilla.org/integration/autoland/rev/5ad0aa8b8d96 (Bug 1637873) could have something to do with this?

The Fenix team is going to cut a new version for release on Tuesday June 9th. I'll start preparing a change that turns off Webrender for release in case we can't get this fixed before that.

Flags: needinfo?(liuche) → needinfo?(sotaro.ikeda.g)
Priority: P2 → P1

I tried "./mach mozregression -n gve" to find a point of regression. But it did not work :( I created Bug 1644066 for it.

(In reply to Kris Taeleman (:ktaeleman) from comment #15)

Did some more digging and earliest report I see is using Fenix Nightly 05/25, which would be GV 78.0a1-20200520033931 from 05/20.
@sotaro: Going manually through the changes, do you think https://hg.mozilla.org/integration/autoland/rev/5ad0aa8b8d96 (Bug 1637873) could have something to do with this?

I looked into it. But it seems not related to this bug. It could change an order of calling mSurfTex->DecrementUse() at SurfaceTextureHost::DeallocateDeviceData() on compositor thread and RenderAndroidSurfaceTextureHostOGL::~RenderAndroidSurfaceTextureHostOGL(). But it should not affect to how GeckoSurfaceTexture is destroyed.

Since, GeckoSurfaceTexture is always destroyed on Render thread by calling GeckoSurfaceTexture.destroyUnused()

Flags: needinfo?(sotaro.ikeda.g)

I was able to reproduce the problem on Pixel 3a with GeckoView example app. But with today's m-c, I could not reproduce the problem any more with GeckoView example app.

Are ya'll still able to reproduce this issue? We can't...which is a bit odd.

Flags: needinfo?(padenot)
Flags: needinfo?(jonalmeida942)

Andrei - could we get a second look from QA to see if this issue can be reproduced?

Flags: needinfo?(andrei.vaida)

I make Fenix build with local gecko. With it, the problem still happned, though it did not happn with GeckoView example app anymore.

It seems that the problem did not happen with https://www.youtube.com/watch?v=wZZ7oFKsKzY. Or when picture caching was disabled by pref gfx.webrender.picture-caching=false.

During looking into the bug. I found Bug 1644325.

It is interesting that black webrender rendering happned even when RenderAndroidSurfaceTextureHostOGL does not exist. For RenderAndroidSurfaceTextureHostOGL, InvalidToWrExternalImage() is used for rendering.

The reset_state method being called after locking external textures
was resetting the cached state, but not actually forcing the GL
state to match.

This could result in cached state getting out of sync with GL. In
this case, we'd end up trying to composite to the external texture
FBO rather than the main framebuffer.

The attached patch above fixes the issue for me using the PiP STR, which was 100% reproducible for me.

It's possible there is a different bug causing the other symptoms mentioned here, but I'm fairly confident this will fix it in all of the reported STR cases.

Pending try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=28b9dd4ede3bb9ead359297fdbb66baaed32a493

Once reviewed, please feel free to land this on my behalf, to get it into m-c as quickly as possible.

Assignee: sotaro.ikeda.g → gwatson

Assign the bug to :gw. He created a fixing patch.

Pushed by gwatson@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3212a1536cbd Fix incorrect FBO state caching when updating external textures. r=sotaro

Thank you! We'll be making another dot release onto Firefox Beta cut from a June 22 branch that will be using GV 78. Can we get a beta uplift?

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79

Comment on attachment 9155186 [details]
Bug 1640496 - Fix incorrect FBO state caching when updating external textures.

Beta/Release Uplift Approval Request

  • User impact if declined: Major video rendering artifacts on GeckoView with Picture in Picture
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: If possible, it would be good to verify by following the Picture in Picture repro steps mentioned in the bug. However, if it's too difficult to manually test (e.g. requires specific hardware) it's still fine to take this low-risk patch, it's definitely a correct fix.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Small / simple patch, easy to back out, fixes a clear logic bug in the code.
  • String changes made/needed:
Attachment #9155186 - Flags: approval-mozilla-beta?
Flags: qe-verify+
QA Whiteboard: [qa-triaged]

I was able to repro, yes. I'll check with a build that has the fix.

Flags: needinfo?(padenot)

When are the GV Beta spins, and will this be in the next GV78 Beta? Is the spin once a week on Tuesday, or twice a week?

Verified on GV 79.0a1-20200610043607 that I can no longer reproduce this bug.

Flags: needinfo?(jonalmeida942)

I can still reproduce the issue on Firefox Preview Nightly 200611 GV: 79.0a1-20200609092134 using a OnePlus 6T (Android 9).

Flags: needinfo?(andrei.vaida)

(In reply to Laurentiu Apahidean from comment #36)

I can still reproduce the issue on Firefox Preview Nightly 200611 GV: 79.0a1-20200609092134 using a OnePlus 6T (Android 9).

This fix wasn't in that build.

Flags: needinfo?(laurentiu.apahidean)

Comment on attachment 9155186 [details]
Bug 1640496 - Fix incorrect FBO state caching when updating external textures.

approved for 78.0b7

Attachment #9155186 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

(In reply to Chenxia Liu [:liuche] from comment #34)

When are the GV Beta spins, and will this be in the next GV78 Beta? Is the spin once a week on Tuesday, or twice a week?

3 times a week, on Monday, Wednesday and Friday, except during RC week. So the next build will start around 02:00 UTC on Monday.

I retested the issue on Firefox Preview Nightly 200619 (Build #2171067) and Firefox Preview 5.2.0-beta.2 using a OnePlus 6T (Android 9) and the issue no longer occurs.

Flags: needinfo?(laurentiu.apahidean)

Moving some media bugs to the new GeckoView::Media component.

Component: General → Media
Flags: qe-verify+

Removing regressionwindow-wanted keyword because this bug has been resolved.

Removing regressionwindow-wanted keyword because this bug has been resolved.

Removing regressionwindow-wanted keyword because this bug has been resolved.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: