Closed
Bug 1220466
Opened 9 years ago
Closed 8 years ago
[APZ] 73% performance regression on IE mazesolver(40x40)
Categories
(Core :: Layout, defect)
Tracking
()
VERIFIED
FIXED
mozilla48
People
(Reporter: alice0775, Assigned: mattwoodrow)
References
(Depends on 1 open bug, Blocks 3 open bugs, )
Details
(Keywords: perf, regression)
Attachments
(3 files)
(deleted),
application/x-zip
|
Details | |
(deleted),
patch
|
mstange
:
review+
ritu
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
mstange
:
review+
|
Details | Diff | Splinter Review |
Build Identifier: https://hg.mozilla.org/mozilla-central/rev/c1a94d5365e48f86de6a7bb6a3c4c7d82b9d660e Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0 ID:20151031030207 Steps to reproduce: 1. Exstract attached zip 2. Open Default.html 3. Select 40x40 and Click [Start] Actual Results: With e10s + APZ : 116s With e10s + disabled APZ : 67s
Reporter | ||
Updated•9 years ago
|
Summary: [APZ] 73% performance regression on IE mazesolver(40x40) since Firefox 39 → [APZ] 73% performance regression on IE mazesolver(40x40)
Reporter | ||
Comment 1•9 years ago
|
||
Updated•9 years ago
|
Blocks: apz-desktop
Updated•9 years ago
|
Comment 2•8 years ago
|
||
Build ID 20160201030241 User Agent Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0 I have tested this on Win 7 x64 with Nightly 47.0a1 and this is are the results: First time running the test: With e10s + APZ = 246 sec With e10s + APZ disabled = 88 sec Second time running the test: With e10s + APZ = 121 sec With e10s + APZ disabled = 95 sec Third time running the test: With e10s + APZ = 107 sec With e10s + APZ disabled = 94 sec Fourth time running the test: With e10s + APZ = 123 sec With e10s + APZ disabled = 85 sec
Comment 3•8 years ago
|
||
I got a profile (from OS X, not Windows), at https://cleopatra.io/#report=2b6548491191610c8685ed33fe1484b9ecd69480 - it definitely shows the content thread is flat-out busy painting so maybe it can point to some worthwhile optimization targets.
Blocks: paint-fast
Component: Panning and Zooming → Layout
Comment 4•8 years ago
|
||
[Tracking Requested - why for this release]:
status-firefox46:
--- → affected
status-firefox47:
--- → affected
tracking-firefox46:
--- → ?
tracking-firefox47:
--- → ?
Comment 5•8 years ago
|
||
Milan, can you help coordinate a fix for this for 46 assuming APZ ships with 46? This is marked as a regression tracking 46, which means we won't be shipping 46 with this...
Comment 6•8 years ago
|
||
Is this a regression from the bigger display port or from the event region display items?
Assignee | ||
Comment 7•8 years ago
|
||
The single biggest stack in the content process ends up with nsLayoutUtils::GetDisplayPort with 6.3%. Total actual rasterization time is 0.9%. We're getting hammered on display list and layer builder here. APZ is probably adding more display items to process (thanks to display ports), and increasing the total work per item (since we have GetDisplayPort calls and event handling etc). We can probably do a bunch of small fixes to help reduce the pain here, or we could do something specific to really optimize this particular case. I doubt the latter is useful though, since this page is fairly unique and I don't think we care about winning on it specifically.
Comment 8•8 years ago
|
||
From my testing this regression is mostly from event regions. The regression persists if I disable APZ but enable event regions. The page isn't very large, it's just barely scrollable so having a page-sized displayport doesn't hurt us too much here.
Comment 9•8 years ago
|
||
Commenting out the hunk at [1] makes most of the regression go away. I'm guessing there's a lot of pseudo-stacking-context items on this page so this code is getting really hot. [1] http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsFrame.cpp?rev=7c7fe4da388c&mark=2557-2563#2557
Comment 10•8 years ago
|
||
With the 40x40 grid we end up creating ~8000 event regions display items per paint. That number goes down steadily as the maze finding progresses, ending at around ~7500. That's still a lot. There's a chunk of the display list which is just reams and reams of adjacent LayerEventRegions items that looks like this: LayerEventRegions p=0x163144680 f=0x1627304d0( class:marker) bounds(0,0,0,0) layerBounds(0,0,0,0) visible(0,0,0,0) componentAlpha(0,0,0,0) clip() scrollClip() ref=0x119e691d8 agr=0x119e691d8 (maybeHitRegion < (x=0, y=0, w=360, h=360); >) LayerEventRegions p=0x163144ab0 f=0x162730200( class:marker) bounds(0,0,0,0) layerBounds(0,0,0,0) visible(0,0,0,0) componentAlpha(0,0,0,0) clip() scrollClip() ref=0x119e691d8 agr=0x119e691d8 (maybeHitRegion < (x=0, y=0, w=360, h=360); >) LayerEventRegions p=0x16476fa88 f=0x162729d30( class:marker) bounds(0,0,0,0) layerBounds(0,0,0,0) visible(0,0,0,0) componentAlpha(0,0,0,0) clip() scrollClip() ref=0x119e691d8 agr=0x119e691d8 (maybeHitRegion < (x=0, y=0, w=360, h=360); >) LayerEventRegions p=0x16476feb8 f=0x162729880( class:marker) bounds(0,0,0,0) layerBounds(0,0,0,0) visible(0,0,0,0) componentAlpha(0,0,0,0) clip() scrollClip() ref=0x119e691d8 agr=0x119e691d8 (maybeHitRegion < (x=0, y=0, w=360, h=360); >) LayerEventRegions p=0x164770350 f=0x1627294c0( class:marker) bounds(0,0,0,0) layerBounds(0,0,0,0) visible(0,0,0,0) componentAlpha(0,0,0,0) clip() scrollClip() ref=0x119e691d8 agr=0x119e691d8 (maybeHitRegion < (x=0, y=0, w=360, h=360); >) LayerEventRegions p=0x164770780 f=0x162729010( class:marker) bounds(0,0,0,0) layerBounds(0,0,0,0) visible(0,0,0,0) componentAlpha(0,0,0,0) clip() scrollClip() ref=0x119e691d8 agr=0x119e691d8 (maybeHitRegion < (x=0, y=0, w=360, h=360); >) LayerEventRegions p=0x164770bb0 f=0x162728c50( class:marker) bounds(0,0,0,0) layerBounds(0,0,0,0) visible(0,0,0,0) componentAlpha(0,0,0,0) clip() scrollClip() ref=0x119e691d8 agr=0x119e691d8 (maybeHitRegion < (x=0, y=0, w=360, h=360); >) LayerEventRegions p=0x164771020 f=0x1627287a0( class:marker) bounds(0,0,0,0) layerBounds(0,0,0,0) visible(0,0,0,0) componentAlpha(0,0,0,0) clip() scrollClip() ref=0x119e691d8 agr=0x119e691d8 (maybeHitRegion < (x=0, y=0, w=360, h=360); >) We should maybe not create so many of the same thing...
Comment 11•8 years ago
|
||
Reaching the limit of my layout knowledge here. Matt/Markus - can you think of any conditions where we can combine some of these pseudo stacking context event regions instead of creating a new one for each frame?
Recent performance regression, tracking
Assignee | ||
Comment 13•8 years ago
|
||
My understanding is that we need a new event regions display item each time we might have a new layer, so that each item can be assigned to a single layer. pseudo stacking contexts only exist at the layout level, they shouldn't be visible to FrameLayerBuilder, so I can't see why we need a new event regions item for each one. It seems like this code can be conditioned with buildingForChild.IsAnimatedGeometryRoot(), just like the non-pseudo stacking context path. Adding ni? for Markus and tn, since it's been too long since I thought about this code.
Flags: needinfo?(mstange)
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(tnikkel)
Comment 14•8 years ago
|
||
Hmm, wouldn't we need to make sure we get a new layer event regions for anything that could build a container layer? So transform/opacity create pseudo stacking contexts so they have been getting a new layer event regions item, but won't after that change.
Flags: needinfo?(tnikkel)
Comment 15•8 years ago
|
||
Talked to matt on irc. transform/opacity create full stacking contexts, so they'd still get a layer event regions item. We could likely avoid even more layer event regions items if we just created them for things that created container layers, instead of all stacking contexts, but that can be something for another bug.
Assignee | ||
Comment 16•8 years ago
|
||
Assignee | ||
Comment 17•8 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #15) > We could likely avoid even more layer event regions items if we just created > them for things that created container layers, instead of all stacking > contexts, but that can be something for another bug. That seems like a good idea, assuming this bug goes well. Can we also skip creating new event region items for inactive container layers? It seems like as soon as we enter an inactive container, we can just block all new event region items for descendants. Unfortunately we don't really know when that happens at display list building time, but I think we could (bug 1205129?).
Comment 18•8 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #17) > Can we also skip creating new event region items for inactive container > layers? Currently AddFrame assumes the frames it adds are in the same coordinate system as the display items underlying frame. FrameLayerBuilder handles the task of transform the event regions when pulling them out of inactive container layers. So we'd need a solution to that. Either make AddFrame more complicated and handle transforms, or skip this optimization for transforms.
Assignee | ||
Comment 19•8 years ago
|
||
Assignee: nobody → matt.woodrow
Attachment #8728181 -
Flags: review?(mstange)
Assignee | ||
Comment 20•8 years ago
|
||
Comment on attachment 8728152 [details] [diff] [review] less-event-regions These patches take my local run time from 29s to 16s.
Attachment #8728152 -
Flags: review?(mstange)
Comment 21•8 years ago
|
||
That sounds like a nice win - should be comparable to the non-APZ score \o/
Comment 22•8 years ago
|
||
Comment on attachment 8728152 [details] [diff] [review] less-event-regions Review of attachment 8728152 [details] [diff] [review]: ----------------------------------------------------------------- How does this ensure that we build a new event regions item when we enter a transform?
Attachment #8728152 -
Flags: review?(mstange) → review+
Comment 23•8 years ago
|
||
Comment on attachment 8728181 [details] [diff] [review] Avoid work for inactive layers Review of attachment 8728181 [details] [diff] [review]: ----------------------------------------------------------------- ::: layout/base/FrameLayerBuilder.cpp @@ +3984,5 @@ > // This is the case for scrollbar thumbs, for example. In that case the > // clip we care about is the overflow:hidden clip on the scrollbar. > mPaintedLayerDataTree.AddingOwnLayer(animatedGeometryRoot->mParentAGR, > clipPtr, > + IsInInactiveLayer() ? nullptr : uniformColorPtr); Please do this in the initialization of the uniformColorPtr variable, so that you don't have to repeat it.
Attachment #8728181 -
Flags: review?(mstange) → review+
Assignee | ||
Comment 24•8 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #22) > Comment on attachment 8728152 [details] [diff] [review] > less-event-regions > > Review of attachment 8728152 [details] [diff] [review]: > ----------------------------------------------------------------- > > How does this ensure that we build a new event regions item when we enter a > transform? Transforms force a real stacking context, so we take the BuildDisplayListForStackingContext branch above. That function has an unconditional creation of a new event regions item.
Comment 26•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/09b94960086e https://hg.mozilla.org/integration/mozilla-inbound/rev/1b7d730f11bf
Comment 27•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/09b94960086e https://hg.mozilla.org/mozilla-central/rev/1b7d730f11bf
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox48:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Comment 28•8 years ago
|
||
I ran the 40x40 maze on OS X March 10 nightly which has the above fixes. I ran it 5 times each in three different configurations: Config no-e10s e10s+APZ e10s no-APZ Samples 25 23 21 24 23 22 24 25 21 24 24 24 25 24 21 -------------------------------------------- Average 24.4s 23.8s 21.6s -------------------------------------------- So APZ is better than no-e10s, but still ~10% worse than e10s without APZ. This is hugely better than the 70% regression so I'm verifying this as fixed. We should uplift this fix to 47 and 46.
Status: RESOLVED → VERIFIED
Updated•8 years ago
|
Whiteboard: [Dupe me]
Comment 29•8 years ago
|
||
Excellent! Is the remainder of this regression tracked somewhere else?
Flags: needinfo?(bugmail.mozilla)
Comment 30•8 years ago
|
||
Bug 1220459 covers the general regression in this benchmark. I can file another one for the 10% regression that comes with APZ assuming e10s is enabled.
Flags: needinfo?(bugmail.mozilla)
Comment 31•8 years ago
|
||
Great! What I care about is that this is tracked *somewhere* so we don't ship APZ, or any other features for that matter, with known regressions we didn't intend to ship with, so please ensure this gets tracked, either as part of the existing bug for this this benchmark, or separately.
Can you request uplift to aurora and beta as suggested in comment 28? Thanks!
Flags: needinfo?(matt.woodrow)
Though we may want to stick with only uplifting to 47, since I don't think we are actually planning to enable apz with 46.
Assignee | ||
Comment 34•8 years ago
|
||
Comment on attachment 8728152 [details] [diff] [review] less-event-regions Approval Request Comment [Feature/regressing bug #]: APZ [User impact if declined]: Poor performance on mazesolver (and other websites with similarly complex dynamic HTML changes). [Describe test coverage new/current, TreeHerder]: Manually tested. [Risks and why]: Low risk, just reduces unnecessary work. [String/UUID change made/needed]: None
Flags: needinfo?(matt.woodrow)
Attachment #8728152 -
Flags: approval-mozilla-aurora?
Comment on attachment 8728152 [details] [diff] [review] less-event-regions Perf improvement that was verified, Aurora47+
Attachment #8728152 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 36•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/fbfbe7b03e51
You need to log in
before you can comment on or make changes to this bug.
Description
•