Closed Bug 1485420 Opened 6 years ago Closed 6 years ago

Remove extra clips on items

Categories

(Core :: Graphics: WebRender, enhancement, P2)

63 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: jrmuizel, Assigned: aosmond)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 5 obsolete files)

This causes a bunch of display list bloat and work down the line for WebRender. For example, each background image has it's own clip chain instead of using the internal clip.
Assignee: nobody → aosmond
Blocks: 1484365
No longer blocks: 1484365
Blocks: 1485431
Distilled from the discussion between mstange and jrmuizel: Each WebRender display item already has a clip rect associated with it. Currently this may be duplicated in the clip chain. We should store the leaf of the clip chain in the display item instead, assuming its ASR matches the item, and it isn't a rounded rect; each display item will need to be updated to take this into consideration. Some items may already be using the clip rect, in which case it perform an intersection with the leaf. If there is no leaf, it should just use the bounds (like it does today). The clip chain itself should have excluded the leaf. If the ASR doesn't match or it is rounded, it should work as it does today. In practical terms this means we will change ClipManager::BeginClip to filter out the leaf from the clip chain if possible, and call a new method to get the leaf of the clip chain directly from the display item, yielding it if it would have been taken out of the clip chain already. Special attention will need to be paid to the container display items (e.g. nsDisplayTransform, Opaque, etc) because instead of a normal display item, they have a stacking context that will also need to take the clip chain leaf.
Comment 2 continued, reasoning behind this: 1) WebRender is better at dealing with inline clips, vs an associated clip chain 2) It will make the display lists smaller, because the clip chain shrunk 3) We won't have to iterate over as many clips later on as the clip chain is smaller
Attached patch WIP v1 (obsolete) (deleted) — Splinter Review
Still missing all of the container display items.
Comment on attachment 9003487 [details] [diff] [review] WIP v1 Review of attachment 9003487 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/layers/wr/ClipManager.h @@ +72,5 @@ > + static LayoutDeviceRect GetItemClipRect(nsDisplayItem* aItem, > + const LayoutDeviceRect& aBounds); > + > + static wr::LayoutRect GetItemClipRoundedRect(nsDisplayItem* aItem, > + const LayoutDeviceRect& aBounds) GetRoundedItempClipRect() avoids the reader thinking this is returning a rect with rounded corners. ::: layout/tables/nsTableFrame.cpp @@ +7550,5 @@ > LayoutDeviceRect borderRect = LayoutDeviceRect::FromUnknownRect(NSRectToRect(param->mBorderRect + aOffset, > param->mAppUnitsPerDevPixel)); > > wr::LayoutRect roundedRect = wr::ToRoundedLayoutRect(borderRect); > + wr::LayoutRect clipRect = layers::ClipManager::GetItemClipRoundedRect(aItem, borderRect); Instead of passing around aItem here and calling GetItemClip() multiple times we can probably just do it up front and pass around the clip rect.
Comment on attachment 9003487 [details] [diff] [review] WIP v1 Kats, if you have the time, it'd be nice to know if you have any opposition to this.
Flags: needinfo?(kats)
Do we have numbers that give us a sense of how much perf improvement we get from this? The change can be described in simple conceptual terms but I suspect this is going to increase the code complexity dramatically and I'd like to avoid that if the perf win is small. Leaving ni on me since I'd like to look at Andrew's patch in more detail as well.
Blocks: 1455921
(In reply to Kartikaya Gupta (email:kats@mozilla.com) (parental leave) from comment #7) > Do we have numbers that give us a sense of how much perf improvement we get > from this? The change can be described in simple conceptual terms but I > suspect this is going to increase the code complexity dramatically and I'd > like to avoid that if the perf win is small. Leaving ni on me since I'd like > to look at Andrew's patch in more detail as well. I should have something more complete in the next day or so. try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fdab7d65aeb2cbd5862e334c9aab6f254cfc8ab6 With a bit of extra code to disable this path on the container display items. Only a couple of reftest failures so hopefully not making a meaningful difference on the results. Going to my gmail inbox, my scene-1.0.ron file from the capture went from 4.1 MB to 1.7 MB. I'll collect some proper performance numbers once I fix the reftests.
Comment on attachment 9003487 [details] [diff] [review] WIP v1 Review of attachment 9003487 [details] [diff] [review]: ----------------------------------------------------------------- The way I would have done this patch is in ClipManager::BeginItem, compute the item clip rect as a Maybe<LayoutDeviceRect> and then push it onto a stack in the wr::DisplayListBuilder (the call sites for BeginItem can provide a handle to it). Then, instead of changing the call sites for every display item, I would do the clip computation inside the various wr::DisplayListBuilder::PushXXX functions using the topmost thing on the stack. If the topmost item is a Nothing then just use the rect passed in, otherwise intersect the provided rect with the topmost. That would make the change less brittle in that it's easier to enumerate all the PushXXX functions in wr::DisplayListBuilder than it is to enumerate all their call sites and make sure you don't miss anything. Also, for gecko display items that generate multiple WR display items, you're currently calling GetItemClipRect multiple times with the exact same args which is kind of inefficient. BeginItem already has the clip in question so it should cheap/free to do there.
But assuming the patch doesn't get much more complicated than this, it seems like it should be worth doing. The scene file reduction probably won't translate to an equivalent perf improvement but even so reducing the display list size that drastically seems worthwhile.
Flags: needinfo?(kats)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) (parental leave) from comment #9) > Comment on attachment 9003487 [details] [diff] [review] > WIP v1 > > Review of attachment 9003487 [details] [diff] [review]: > ----------------------------------------------------------------- > > The way I would have done this patch is in ClipManager::BeginItem, compute > the item clip rect as a Maybe<LayoutDeviceRect> and then push it onto a > stack in the wr::DisplayListBuilder (the call sites for BeginItem can > provide a handle to it). Then, instead of changing the call sites for every > display item, I would do the clip computation inside the various > wr::DisplayListBuilder::PushXXX functions using the topmost thing on the > stack. If the topmost item is a Nothing then just use the rect passed in, > otherwise intersect the provided rect with the topmost. > > That would make the change less brittle in that it's easier to enumerate all > the PushXXX functions in wr::DisplayListBuilder than it is to enumerate all > their call sites and make sure you don't miss anything. Also, for gecko > display items that generate multiple WR display items, you're currently > calling GetItemClipRect multiple times with the exact same args which is > kind of inefficient. BeginItem already has the clip in question so it should > cheap/free to do there. Yeah for the multiple calls, it was just the easiest way to get this working quickly :). I like your overall approach better, I'll see about rewriting it that way.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) (parental leave) from comment #9) > The way I would have done this patch is in ClipManager::BeginItem, compute > the item clip rect as a Maybe<LayoutDeviceRect> and then push it onto a > stack in the wr::DisplayListBuilder (the call sites for BeginItem can > provide a handle to it). Then, instead of changing the call sites for every > display item, I would do the clip computation inside the various > wr::DisplayListBuilder::PushXXX functions using the topmost thing on the > stack. If the topmost item is a Nothing then just use the rect passed in, > otherwise intersect the provided rect with the topmost. I really like this suggestion.
Attached patch WIP v2 (obsolete) (deleted) — Splinter Review
This should pass all the reftests. I'll begin rewriting it as kats suggested. try: https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=3d2e8cfa4d62456801a5d5e4f971b49f66c1d790
Attachment #9003487 - Attachment is obsolete: true
Attachment #9004206 - Attachment is obsolete: true
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fcd7e7c5b56687dd25a076aa1857004e3ea80819 This should be green on try for real this time. I should rejig if possible to make it work more like ClipManager::mItemClipStack does today (I had to add ClipManager::EndItem as an example, ideally I would not).
Priority: -- → P2
Attachment #9004624 - Attachment is obsolete: true
green try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ce78b2c328136b7757f0d55765ea777e44c094f I merged the leaf extraction with ItemClips rather than having it broken out separately.
Attachment #9004625 - Attachment is obsolete: true
Although the scene.rson file size can be cut in half, the actual binary representation of the display list appears to be reduced (caveat: on the few pages I tried) by around 6-7%. The rate of us merging in the clip rect leaf seems to be high, anywhere from 66% in general to 95% on gmail. The reuse of the clip chain on the clip stack (clips.HasSameInputs) appears to be reduced slightly, by perhaps a few percent from say 70%ish to 65%ish. There may be opportunities to improve this -- the leaf could be different, while the clip chain after stripping the leaf is the same, so we just need to regenerate the leaf rect calculation.
Attachment #9004839 - Flags: review?(mstange)
Attachment #9004840 - Flags: review?(mstange)
Comment on attachment 9004839 [details] [diff] [review] 0001-Bug-1485420-Part-1.-Expose-WebRender-bindings-to-mer.patch, v2 Review of attachment 9004839 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/webrender_bindings/WebRenderAPI.h @@ +540,5 @@ > std::unordered_map<layers::FrameMetrics::ViewID, wr::WrClipId> mScrollIds; > > + // Contains the current leaf of the clip chain to be merged with the > + // display item's clip rect when pushing an item. May be set to Nothing() if > + // there is clip rect to merge with. Should this say "if there is *no* clip rect to merge with"? ::: gfx/webrender_bindings/WebRenderTypes.h @@ +359,5 @@ > + r.origin.y = std::max(aRect.origin.y, aOther.origin.y); > + r.size.width = std::min(aRect.origin.x - r.origin.x + aRect.size.width, > + aOther.origin.x - r.origin.x + aOther.size.width); > + r.size.height = std::min(aRect.origin.y - r.origin.y + aRect.size.height, > + aOther.origin.y - r.origin.y + aOther.size.height); I think this would be clearer if the " - r.origin.x/y" was outside the std::min(). Then it'll be more obvious that we're taking the minimum of the right/bottom edge, and then convert the edge to the size.
Attachment #9004839 - Flags: review?(mstange) → review+
(In reply to Markus Stange [:mstange] from comment #20) > Comment on attachment 9004839 [details] [diff] [review] > 0001-Bug-1485420-Part-1.-Expose-WebRender-bindings-to-mer.patch, v2 > > Review of attachment 9004839 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: gfx/webrender_bindings/WebRenderAPI.h > @@ +540,5 @@ > > std::unordered_map<layers::FrameMetrics::ViewID, wr::WrClipId> mScrollIds; > > > > + // Contains the current leaf of the clip chain to be merged with the > > + // display item's clip rect when pushing an item. May be set to Nothing() if > > + // there is clip rect to merge with. > > Should this say "if there is *no* clip rect to merge with"? > Yes, fixed. > ::: gfx/webrender_bindings/WebRenderTypes.h > @@ +359,5 @@ > > + r.origin.y = std::max(aRect.origin.y, aOther.origin.y); > > + r.size.width = std::min(aRect.origin.x - r.origin.x + aRect.size.width, > > + aOther.origin.x - r.origin.x + aOther.size.width); > > + r.size.height = std::min(aRect.origin.y - r.origin.y + aRect.size.height, > > + aOther.origin.y - r.origin.y + aOther.size.height); > > I think this would be clearer if the " - r.origin.x/y" was outside the > std::min(). Then it'll be more obvious that we're taking the minimum of the > right/bottom edge, and then convert the edge to the size. I have done so, but for reference, I copied this from: https://searchfox.org/mozilla-central/rev/d4ef4e9747133aa2914aca2a15cf9df1e42a6aa0/gfx/2d/BaseRect.h#144
Attachment #9004839 - Attachment is obsolete: true
Attachment #9007784 - Flags: review+
Comment on attachment 9004840 [details] [diff] [review] 0002-Bug-1485420-Part-2.-Make-ClipManager-trim-clip-chain.patch, v2 Review of attachment 9004840 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/layers/wr/ClipManager.cpp @@ +158,5 @@ > } > > + // We may be able to reduce the number of display items (e.g. clips) by > + // combining the leaf of the clip chain with the clip rect of the display > + // item. Container display items are not currently supported because the clip To me, the phrasing "We may be able to" makes it sound like this is a theoretical future optimization, and not something that we're doing here. How about phrasing it as "In most cases we can combine the leaf of the clip chain with the clip rect of the display item. This reduces the number of clip items, which avoids some overhead further down the pipeline." @@ +193,5 @@ > + default: > + separateLeaf = true; > + break; > + } > + } I thought about how to write this in a prettier way but I couldn't come up with anything I liked. So what you have is fine.
Attachment #9004840 - Flags: review?(mstange) → review+
Pushed by aosmond@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/a3ea43b5e6d8 Part 1. Expose WebRender bindings to merge a clip rect with a display item. r=mstange https://hg.mozilla.org/integration/mozilla-inbound/rev/302dcd813942 Part 2. Make ClipManager trim clip chains which can be merged with the display item. r=mstange
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
a talos improvement: == Change summary for alert #15933 (as of Sat, 15 Sep 2018 01:52:35 GMT) == Improvements: 3% displaylist_mutate linux64-qr opt e10s stylo 7,790.08 -> 7,551.09 3% displaylist_mutate windows10-64-qr opt e10s stylo 6,867.71 -> 6,695.37 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=15933
Depends on: 1496726
Depends on: 1499015
Depends on: 1507373
No longer depends on: 1508203
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: