Open Bug 1768984 Opened 2 years ago Updated 2 years ago

Very slow plane splitting and poor batching on https://themaninblue.com/experiment/slashGlobe/

Categories

(Core :: Graphics: WebRender, defect, P3)

Firefox 102
x86_64
Linux
defect

Tracking

()

REOPENED
108 Branch
Tracking Status
firefox102 --- affected
firefox108 --- affected

People

(Reporter: gregp, Assigned: nical)

References

(Blocks 3 open bugs)

Details

Attachments

(6 files)

Attached image nightly.png (deleted) —

Steps to reproduce:

  1. Navigate to https://themaninblue.com/experiment/slashGlobe/
  2. Move cursor around the globe

Actual results: Lag

Expected results: Less lag

https://share.firefox.dev/3stUxFV

Looks like we are spending half the RenderBackend time inserting into the BSP tree for plane splitting. gw, is that normal or excessive?

Blocks: wr-perf
Severity: -- → S3
Flags: needinfo?(gwatson)
Priority: -- → P3

No, that's not normal - seems like a bug worthy of investigation when someone has time.

Flags: needinfo?(gwatson)

This is the plane-splitting code allocating memory at each insertion. I had a similar profile on the test case of bug 1493359.

Summary: https://themaninblue.com/experiment/slashGlobe/ is very slow → Very slow plane splitting on https://themaninblue.com/experiment/slashGlobe/
Blocks: 1782834
Blocks: 1795885

I have a work in progress rewrite of plane_split's BSP tree which yields pretty large speedups for the plane splitter (~3x) on this test case and the one in bug 1493359, which leads to ~25% improvement to frame building time on my laptop for these test cases. It won't get a solid 60fps on mid-range laptop but plane splitting goes down to about 10% of the profile and further improvements will probably have to be found elsewhere.

Assignee: nobody → nical.bugzilla

New plane_split version with much faster BSP tree.
The main source of churn is plane_split not being generic over the unit tag anymore.

Currently splane splitters are stored in the built scene and reallocated with every new scene.
This patch moves the responsibility of storing/recycling the plane splitters to the frame builder so that they can be reused in more cases. The scene builder only needs to track splitter indices, it doesn't use the splitter objects themselves.

Alone this patch does not make a large difference because the current version of the plane_split crate reallocates everything each frame. The next version of plane_split does a much better job of recycling allocations, and applying this patch on top of it makes a large difference.

Depends on D160362

Depends on D161584

Pushed by nsilva@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/296dca6ce0fa Plane-split 0.18 update. r=gfx-reviewers,gw https://hg.mozilla.org/integration/autoland/rev/ebdd5938e8ab Audit plane-split 0.18. r=bholley https://hg.mozilla.org/integration/autoland/rev/b18532ba7066 Vendor the plane-split update. r=gfx-reviewers,gw https://hg.mozilla.org/integration/autoland/rev/8d17749400ba Reuse plane splitters across scenes. r=gfx-reviewers,gw

Reopening as the changes bring a large perf boost on the plane-splitting part of the profile but we are still choking on thousands of draw calls with this test case.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: Very slow plane splitting on https://themaninblue.com/experiment/slashGlobe/ → Very slow plane splitting and poor batching on https://themaninblue.com/experiment/slashGlobe/

The large number of draw calls mostly comes from many split composite instances (the 3d transformed thingies), sampling a relatively low number of different textures but alternating between them frequently. The input texture that is changing is the child picture task of the primitive.

Attached image Dump of the render task graph (deleted) —

The render task graph is not a pretty sight. There is a very large number of child pictures depending on an even larger number of clip tasks. The main issue for batching is the number of child pictures though (second column) and how they end up interleaved in the drawing order.

Hopefully there's something we can do about nudging the packing of the child tasks so that they better match the batching order.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: