Recycle IOSurfaces + framebuffers
Categories
(Core :: Graphics: WebRender, enhancement, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox73 | --- | fixed |
People
(Reporter: mstange, Assigned: mstange)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 1 obsolete file)
In the WebRender + OS compositor configuration, during fast scrolling we destroy and allocate a lot of surfaces.
Here's a profile: https://perfht.ml/2JFbXJl
Of the non-idle time, the renderer thread time breaks down as follows:
29% glFlush
15% CA commit
7.8% IOSurface creation
7.1% Binding IOSurface to new texture
6.7% Framebuffer destruction
6.3% Framebuffer creation
3.2% glClientWaitSync
So that's 32% from too much GPU work (probably mostly due to bug 1591529), 15% because we have too many CALayers, and 28% from IOSurface/texture/framebuffer/renderbuffer creation and destruction.
Recycling those resources should allow us to cut down those 28% a lot.
Updated•5 years ago
|
Comment 1•5 years ago
|
||
Definitely cache framebuffers, ideally immutably!
FWIW in WebGL we cache some validation at the framebuffer level, and additionally for new framebuffers GL has to evaluate for framebuffer completeness before use, which is likely to add overhead.
Assignee | ||
Comment 2•5 years ago
|
||
Depends on D54857
Updated•5 years ago
|
Assignee | ||
Comment 3•5 years ago
|
||
Depends on D54858
Comment 4•5 years ago
|
||
This is the opposite of caching framebuffers immutably. :)
Why do you want to add depth/stencil planes after creation?
Assignee | ||
Comment 5•5 years ago
|
||
Oh right! I had forgotten about your comment about immutability. I can definitely change it to do that.
The API I have allows calling GetFramebufferForSurface
for the same IOSurface at different times with different values for aNeedsDepthBuffer
. I don't want to create a depth buffer unnecessarily if somebody calls it with aNeedsDepthBuffer == false
, and it seemed wasteful to destroy the existing framebuffer if another call with aNeedsDepthBuffer == true
comes in later for the same surface. But in practice this is a really unlikely scenario, so I could just as well recreate the framebuffers. Basically, CompositorOGL doesn't need a depth buffer but WebRender does, and they both use the SurfacePool. However, we don't really want to mix CompositorOGL and WebRender in the same instance of Firefox anyway. I think at the moment we can still get into a mixed state because we don't use WebRender for WebExtension arrow panels, but I'm not sure about that.
Assignee | ||
Comment 6•5 years ago
|
||
Depends on D54857
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 7•5 years ago
|
||
I've hit a snag with my threading model, exposed by assertions in debug builds on try:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=281024646&repo=try&lineNumber=8692
The SurfacePool is shared between all windows, but some windows paint on the main thread (context menus, tooltips) and some on the compositor or renderer thread (regular browser windows). Eviction is done in OnEndFrame. So a call to OnEndFrame on the main thread can evict entries that were created on the compositor / renderer thread. This means that we destroy framebuffers and drop references to GLContexts . But GLContext inherits from the non-thread-safe SupportsWeakPtr, which has asserts that make sure that GLContext is only used on a single thread. These things are at odds with one another.
I can see three ways of dealing with this:
- Keep the current architecture and allow using GLContext and MozFramebuffer from multiple threads (but only one thread at a time, with the caller being responsible for proper locking).
- Keep the current architecture, but put mechanisms in place that ensure that GL resources are only touched on the thread that they were created on.
- Have two thread pools: one for "main-thread windows" and one for "compositor/renderer thread windows".
I'm going to explore option 2 for a bit. Jeff, do you have opinions on this?
Assignee | ||
Comment 8•5 years ago
|
||
Option 3 works better. I have a solution that replaces SurfacePool::Get() with SurfacePool::Create(size_t aPoolSizeLimit). I'm giving main thread rendering and non-WebRender one pool per window, with a limit of 0, i.e. "discard everything", which preserves the current (pre-patch) behavior. And WebRender gets one shared SurfacePool in its RenderThread whose lifetime it manages.
Comment 10•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/1b4c3b317568
https://hg.mozilla.org/mozilla-central/rev/f97710c9392e
Updated•5 years ago
|
Description
•