Open Bug 1424968 Opened 7 years ago Updated 2 years ago

Retrieving auPerDevPixel is one of the most expensive parts of ScrollingLayersHelper::BeginItem

Categories

(Core :: Graphics: WebRender, defect, P2)

Tracking

()

People

(Reporter: jrmuizel, Unassigned)

References

(Blocks 2 open bugs)

Details

In a profile of facebook-refresh.html, almost 60% of the time in ScrollingLayersHelper::BeginItem is spent retrieving auPerDevPixel. This is presumably because of the cache misses that occur when accessing the frame and style context.
Blocks: 1422039
I talked with mstange about this kind of problem, and we could probably track auPerDevPixel in the builder, only updating it when it changes.
To be clear, this is 60% of self time.
Track it in the nsDisplayListBuilder? Or the WebRenderCommandBuilder? #toomanybuilders

Can we assume the auPerDevPixel changes if and only if we are traversing a nsDisplayZoom item? Otherwise I'm not sure how we would know it changed without at least checking the prescontext, which will be almost as expensive.
Flags: needinfo?(mstange)
Blocks: 1426770
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #3)
> Can we assume the auPerDevPixel only changes if and only if we are
> traversing a nsDisplayZoom item?

Yes.
Thanks!
Flags: needinfo?(mstange)
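
A minimal sketch of the idea discussed above: the builder caches auPerDevPixel and only overrides it while the children of a zoom item are being processed. All names here (CommandBuilderSketch, AutoZoomScope) are hypothetical stand-ins for illustration, not the actual Gecko classes or API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a builder that caches the current
// app-units-per-dev-pixel value instead of reading it from the frame
// for every display item.
class CommandBuilderSketch {
 public:
  explicit CommandBuilderSketch(int32_t aRootAuPerDevPixel)
      : mAuPerDevPixel(aRootAuPerDevPixel) {}

  // Cheap accessor: no frame or pres context dereference.
  int32_t AuPerDevPixel() const { return mAuPerDevPixel; }

  // RAII helper (assumed design): overrides the cached value while the
  // children of a zoom item are processed and restores it afterwards.
  class AutoZoomScope {
   public:
    AutoZoomScope(CommandBuilderSketch& aBuilder, int32_t aChildAuPerDevPixel)
        : mBuilder(aBuilder), mSaved(aBuilder.mAuPerDevPixel) {
      mBuilder.mAuPerDevPixel = aChildAuPerDevPixel;
    }
    ~AutoZoomScope() { mBuilder.mAuPerDevPixel = mSaved; }

    AutoZoomScope(const AutoZoomScope&) = delete;
    AutoZoomScope& operator=(const AutoZoomScope&) = delete;

   private:
    CommandBuilderSketch& mBuilder;
    int32_t mSaved;
  };

 private:
  int32_t mAuPerDevPixel;
};
```

With this shape, only the code that handles a zoom item would need to open an AutoZoomScope; every other item would read the cached value from the builder.
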
For the record, Miko is undoing bug 1424637 in bug 1434243, so as part of this bug we should probably do a higher-level cleanup of how we get the auPerDevPixel. Maybe compute it in the WebRenderCommandBuilder and then pass it through all the CreateWebRenderCommands functions.
Blocks: stage-wr-next
No longer blocks: stage-wr-trains
FrameLayerBuilder computes this once per container item and uses that for all items within the container. That avoids needing any special handling for detecting zoom items at all.
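
As a rough illustration of the per-container approach described above (not the FrameLayerBuilder code itself), the value can be looked up once when entering a container and reused for every leaf item in it. ItemSketch, ExpensiveLookup, BuildCommands and gLookupCount are hypothetical names introduced only for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical display item: a container if it has children, else a leaf.
struct ItemSketch {
  int32_t frameAuPerDevPixel;        // what a frame lookup would return
  std::vector<ItemSketch> children;  // non-empty for container items
};

// Counts how often the (assumed expensive) frame lookup happens.
static int gLookupCount = 0;

// Stand-in for the cache-missing read through the frame.
int32_t ExpensiveLookup(const ItemSketch& aItem) {
  ++gLookupCount;
  return aItem.frameAuPerDevPixel;
}

// Looks up the value once per container and reuses it for all leaf
// items directly inside that container. A nested container (for
// example a zoom item) does its own single lookup, so no special zoom
// detection is needed.
void BuildCommands(const ItemSketch& aContainer, std::vector<int32_t>& aOut) {
  const int32_t auPerDevPixel = ExpensiveLookup(aContainer);
  for (const ItemSketch& child : aContainer.children) {
    if (!child.children.empty()) {
      BuildCommands(child, aOut);
    } else {
      aOut.push_back(auPerDevPixel);  // leaf: no per-item lookup
    }
  }
}
```

In this sketch the number of lookups scales with the number of containers rather than the number of items.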

I experimented with commenting this out and hardcoding 60. It makes BeginItem faster, but has little to no impact on actual performance afaict.

I believe the slowness is because we're dereferencing the nsIFrame, but since we also do that in every CreateWebRenderCommands function, fixing this just moves the time to the next location that touches the frame.

I tested hacking nsDisplayBackgroundColor to also not touch the frame (or style context), and then we get ~1.5% win on the first subtest of displaylist_mutate (the one with only colours). Looks to be around 20% faster for total WebRender commands creation, saving about 0.5ms.

If we could do this for a decent percentage of common items, it would reduce memory traffic a fair bit and would probably show small wins on real sites.

I also see that every item is checking nsIFrame::IsBackfaceHidden(), which is *not* what we check for the non-WR path (nsIFrame::In3DContextAndBackfaceHidden is cached on the item already!). That seems like it might be a bug. Dzmitry, is that something you're looking at with your 3d work?
Flags: needinfo?(dmalyshau)
No, Matt, I wasn't messing with Gecko code when working on WR's backface visibility.
Flags: needinfo?(dmalyshau)
No longer blocks: 1422039
Severity: normal → S3