Black Flash When Navigating Away From GV in Browser Fragment to other Fragment
Categories
(Core :: Graphics, defect, P1)
Tracking
()
People
(Reporter: ekager, Assigned: snorp)
References
Details
Reproducible in Fenix, RB, AC samples-browser
STR:
- (?) Have EngineView hosted in fragment layout
- Navigate to other fragment from Browser Fragment
Expected:
No black flash
Actual:
Black flash when navigating away
See slowed down gif of Fenix flash and more Fenix context
I think this may have to do with how the SurfaceView lifecycle is handled in fragments vs in activities maybe?
Comment 1•5 years ago
|
||
P1
snorp is investigating. He can reproduce in Fenix, but not in other apps.
Reporter | ||
Comment 2•5 years ago
|
||
"Other apps" as in RB and samples-browser too? I see it in those as well when clicking into the tabs fragment but it's fast.
Updated•5 years ago
|
Comment 3•5 years ago
|
||
Fenix issue: https://github.com/mozilla-mobile/fenix/issues/1901
Assignee | ||
Comment 4•5 years ago
|
||
OK, I think I've figured out what's going on here. Currently, when the GeckView
loses the Surface
, we pause the compositor but also call GeckoSession.setActive(false)
. This in turn sets the DocShell inactive. I think this is causing us to also throw away layers. When the Surface
comes back later, we synchronously resume the compositor, but it doesn't have anything to draw, resulting in a flash of blank content. I verified that removing the setActive(false)
calls in GeckoView
mitigated the problem.
Jeff, am I on the right track here? I didn't print the layer tree or anything. This is with traditional layers, not WR. If my guess is correct, do you think we could hold on to the layer tree for a while? Maybe only evict if we need to due to memory pressure?
Comment 5•5 years ago
|
||
What your saying sounds plausable. I'm ni'd Jamie and Kris perhaps they can provide more useful input.
Comment 6•5 years ago
|
||
Yes, that sounds like another issue I'm looking into with WR enabled where when in about:support (or other parent process pages), switching away and back to the app results in a black screen.
I have to force a paint when PresShell becomes active again, else we end up with a black page until we scroll.
Bug 1581868 - Black page sometimes when restoring geckoview_example/fenix with webrender enabled
I'll take a closer look at this too in that context too as even with the repaint I still see a black flash when switching back to the app.
Comment 7•5 years ago
|
||
Moving to the graphics component as this sounds
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 8•5 years ago
|
||
https://searchfox.org/mozilla-central/rev/923eec8d2fb8078ebc7a33a9e1ce73eac01f7446/gfx/webrender_bindings/src/bindings.rs#1400
-> Looks like this was encountered before
Assignee | ||
Comment 9•5 years ago
|
||
(In reply to Kris Taeleman (:kris/kris) from comment #8)
https://searchfox.org/mozilla-central/rev/923eec8d2fb8078ebc7a33a9e1ce73eac01f7446/gfx/webrender_bindings/src/bindings.rs#1400
-> Looks like this was encountered before
Well it's not like white is much better :)
Comment 10•5 years ago
|
||
Maybe I'm misunderstanding, but this bug seems to be about a black flash when navigating away from the browser, to the settings page or tab switcher. Comment 4 seems to be about resuming, ie switching back to the browser, away from the tab drawer or settings page.
I'm not convinced the black flash when navigating away is graphics' fault. We set everything inactive in GeckoView.Display.onSurfaceDestroyed(), which I presume happens when the SurfaceView is destroyed by navigating away from the browser fragment. No rendering can occur after this point. I'm not sure what we could do, when we don't have a surface to render in to. Isn't the problem that there is a delay between the SurfaceView being destroyed and the TabsFragment being created and shown? So the android UI displays black during this time.
FWIW I can no longer reproduce the behaviour in the screen recording on Fenix. However, I can easily still do so in Samples Browser. In the original fenix fix, as well as taking a screenshot and setting swipeRefresh?.background
to display it, also set engineView.asView().visibility = View.GONE
. Because the screenshot code wasn't yet implemented in webrender, this was removed, but we continued to set the visibility to GONE
. I guess this means we get a white flash rather than a black flash, which is a big improvement, but better still would be if the SurfaceView wasn't destroyed until the next fragment is ready?
Updated•5 years ago
|
Comment 11•5 years ago
|
||
There was a small bandaid added on the Fenix side that switched the flash color to be black or white, to match the Light/Dark theme on Fenix.
Jamie, do you think the problem is on the Fenix side? If so, we can investigate that further, but the repro steps have changed slightly (and I've updated the comment 0 to clear up confusion).
Comment 12•5 years ago
|
||
I think it might be, yes. I'm certainly not 100% sure, as the interaction between Fenix/Android, geckoview, and graphics code here is complicated and I definitely don't understand it fully. But it seems to me that graphics code can't do anything after GeckoSession.onSurfaceDestroyed()
(as we no longer have a target to render in to), and that the problem is that is being called too early, before the tab switcher / settings menu fragment is visible.
Perhaps bug 1594860 will fix this longer-term, however, by not using a SurfaceView.
Comment 13•5 years ago
|
||
It looks like https://bugzilla.mozilla.org/show_bug.cgi?id=1547737 should have fixed this. Unfortunately, we are now running into this bug on webrender devices:
When a captureThumbnail
is requested while the engine is still doing an initial page load (engineView is still completely white) the captureThumbnail
never goes inside its block. For example, in this code snippet, "captureThumbnail" never gets printed
This leads to the home button press being ignored in Fenix. A demo GIF of this behavior can be seen here
engineView.captureThumbnail { bitmap ->
Log.d("Sawyer", "captureThumbnail")
}
Any ideas what's going on here? Should I reopen https://bugzilla.mozilla.org/show_bug.cgi?id=1547737 or is this a different issue?
Comment 14•5 years ago
|
||
That sounds like a bug in the webrender implementation of RecvRequestScreenPixels
. I believe it works by reading the pixels from the next composite, and I guess because the page is loading, there isn't a composite scheduled.
Please file a new bug about that.
I really don't think we want to reintroduce Fenix bug #1901 as a workaround for this flash, however. Reading pixels from the GPU is unavoidably slow. Delaying opening a new screen until that completes will be laggy. Do we understand why this flash occurs in the first place? Why is the surface destroyed before the new fragment is visible?
Comment 15•5 years ago
|
||
You can see more of my investigation here: https://github.com/mozilla-mobile/fenix/issues/1901#issuecomment-575263206. Basically the destroying of the SurfaceView is handled by the Navigation Library (a fragment gets destroyed while the transition is happening). We cannot force the fragment or view to stick around longer as its not our code (you can see I tried to do this in one of those comments).
I agree screenshotting is not an ideal solution, but it seems to be the only one that actually works until we move away from SurfaceView which is known by Google to have this bug.
Comment 16•5 years ago
|
||
Ah, great work figuring that out. This is an unfortunate situation :(
Okay, I can fix the RecvRequestScreenPixels
during page load issue. That needs fixed in any case.
Updated•5 years ago
|
Comment 18•5 years ago
|
||
Are you sure you only see this behaviour with webrender enabled? I can reproduce with webrender enabled or disabled (after applying your pull-request to my local fenix build).
Adding some logging to geckoview and graphics code, not even the geckoview functions are being called while the page is being loaded, so the request is being lost between Fenix and GeckoView. Looking at the android-components source I would guess that it is because of this - firstContentfulPaint
will be false during page load.
I'll try to build fenix with a local a-c with that statement removed, and see if it works.
Comment 19•5 years ago
|
||
Ah, that seems like a good lead! Sorry for taking us down the wrong path here 😅
Assignee | ||
Updated•5 years ago
|
Comment 20•4 years ago
|
||
Moved back to Github issue https://github.com/mozilla-mobile/fenix/issues/1901
Description
•