Closed Bug 1253860 Opened 8 years ago Closed 8 years ago

Add a lightweight mechanism to update APZ scroll offset without a full repaint

Categories

(Core :: Panning and Zooming, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox46 --- wontfix
firefox47 --- wontfix
firefox48 --- fixed

People

(Reporter: kats, Assigned: kats)

References

(Depends on 1 open bug)

Details

(Keywords: perf, Whiteboard: [gfx-noted])

Attachments

(5 files, 2 obsolete files)

If APZ is enabled and content does a scrollBy(0, 1) we currently schedule a paint which will do a display list build before it decides there's actually nothing that needs to be done. Then it sends a layer transaction over to the compositor which updates the APZ scroll offset. Bug 1187432 added some code to skip this unnecessary step for APZ scrolls and we can extend it for non-apz scrolls, provided we add another message to get the scroll offset over to APZ. I have this working locally although there ate still some edge cases to work out. It seems to help a lot, on my local Linux opt build the tp5o_scroll went down from 7.something to 2.something. There are still some cases where we need to schedule a paint but it should help in a lot of main-thread scrolling scenarios.

Also it basically is what we want in terms of APZ support for keyboard scrolling, since it's unlikely we will route those events directly to APZ the way we do for wheel events (although I'm not yet ruling that out).

Anyhow, filling this bug to track the issue, I'll put up patches once it's more baked.
Attached patch Part 1 - WIP (obsolete) (deleted) — — Splinter Review
Machinery to update APZ with a new scroll offset. The APZ-side code is basically stuff lifted from NotifyLayersUpdated assuming aLayerMetrics just has an updated scroll offset.
Attached patch Part 2 - WIP (obsolete) (deleted) — — Splinter Review
This uses the machinery on the layout side to send scroll updates to APZ instead of scheduling a full paint (which involves doing a display list build). This basically gives us APZ-ish behaviour for all sorts of main-thread scrolling mechanisms including keyboard, JS scrollTo, etc. APZ-ish because it still triggers from the main-thread, but it avoids a paint where possible.

I had to do some shenanigans to suppress the scrollbar from also triggering a paint, but really it's just a more general version of what Matt had already added in bug 1187432.

Seems to work well locally, and massively improves our tp5o_scroll score (again, locally). I think the case where APZ has a scrollinfo layer will have higher latency, but should still work. In that case, layout will send this message to APZ, and APZ will send a repaint request back to the main thread, and that will schedule a paint at [1]. As far as I can see this would be the only regression from this patch, and we can fix that if we set a flag on the scrollframe indicating that it will generate a scrollinfo layer, and just do a full repaint (i.e. skip using this machinery) in that scenario.

Try push with everything at [2].

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/util/APZCCallbackHelper.cpp?rev=427fce68f67a#132
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=048d222536a0
Also I've been working on this on top of bug 1192910; without that we'll still schedule a bunch of unnecessary paints.
Depends on: 1192910
With these patches, is it possible for a main thread scroll to cause checkerboarding?
I don't think so because any scroll that would induce checkerboarding should also cause the displayport to move, and that would fall back to the regular paint mechanism. If we are already in a checkerboarded state due to APZ then additional main thread scrolling might prolong it, but I'm not sure that's worth worrying about.
Also the first paragraph of comment 2 actually is for the first patch; the second patch is exclusively to deal with the scrollbar repaints. I should maybe split the first patch into two parts.
There were some test failures in the try push in comment 2. They all seemed to be caused by an unexpected consequence of the new message. Basically, the new UpdateScrollOffset message would update the APZC as one would expect. However, we could later get another call to NotifyLayersUpdated with "stale" metrics, which would clobber the new state set by UpdateScrollOffset. The call to NLU seems to come from updates in other layer trees which trigger a full hit testing tree update, but use the old layer tree for the any layers id that wasn't just updated.

This condition is relatively easy to detect, and short-circuiting out of NLU when it happens fixes the test failures (and is a nice optimization to boot).

There was another error that I was seeing in the tests (although it wasn't actually causing failures) which was triggered by us sending the UpdateScrollOffset message to the compositor before the compositor had actually built an APZC for that scroll id. I hacked around it but I'll fix it up properly and take care of the scrollinfo case I described in comment 2 as well.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=74879d6034bc
New try push with what I hope are final patches:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=de58c3f164df

If that's green I'll put them up for review.
I discussed with avi the implications of this change on the scrolling tests. The main conclusions we came to were that:

1) Initially the tests were intended to represent user-observed scrolling behaviour, but also tested layout performance, because the former was dependent on the latter.

2) With APZ user-observed scrolling behaviour is no longer as dependent on layout performance.

3) It's still useful to catch regressions in layout performance.

4) Therefore, we should continue to use these tests to monitor layout performance. We can do this by turning on the paint-skipping pref for the tests, which will fall back to the repainting codepath.

5) This will change the semantics of the test from being primarily representative of user-observed scrolling behaviour to being primarily representative of layout performance, which doesn't necessarily translate to observed user behaviour.

6) (Of particular importance to me) this means that the tp5o_scroll regression should not really block e10s because it doesn't represent the actual behaviour observed by the user (which will be strictly better than it was before, even though the test will continue to show a regression).
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> We can do this by turning on the paint-skipping pref for the
> tests, which will fall back to the repainting codepath.

As botond pointed out, I really mean turning OFF the pref, since it's on by default. i.e. apz.paint_skipping.enabled -> false for the test.
Blocks: 1255129
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> 1) Initially the tests were intended to represent user-observed scrolling
> behaviour, but also tested layout performance, because the former was
> dependent on the latter.
> 
> 2) With APZ user-observed scrolling behaviour is no longer as dependent on
> layout performance.

The results were and still are theoretically also dependent on CT perf, since ultimately the CT throttles the MT (once it queues 2 frames), but since they run in parallel, it would only affect the results if the CT takes longer to iterate than the MT - which, by hunches of kats, mstange and myself - is typically not the case (i.e. CT can iterate faster than MT).

As for no longer dependent on layout in e10s - it's "only" much less frequently dependent on layout perf - we still need MT layout whenever we need a new tile, which could mean that for some cases it could be a different kind of bad in e10s.

E.g. (by kats) if layout takes 20ms then in non e10s we'd iterate consistently at 30fps, but in e10s might be able to do 60fps - except when we need a new tile every few frames - which would be a visible missed frame.

It's a subjective thing which is better, but it is different.


> 6) (Of particular importance to me) this means that the tp5o_scroll
> regression should not really block e10s because it doesn't represent the
> actual behaviour observed by the user (which will be strictly better than it
> was before, even though the test will continue to show a regression).

Overall, they would now have different semantics WRT user experience, and (much?) harder to compare/evaluate, but we should still try to if it shows meaningful diff.
Try push is looking good, except for the many still-pending windows jobs.
Attachment #8727217 - Attachment is obsolete: true
Comment on attachment 8728676 [details]
MozReview Request: Bug 1253860 - Stop APZC from reprocessing stale metrics on unrelated layer tree updates. r?botond

https://reviewboard.mozilla.org/r/39023/#review35709
Attachment #8728676 - Flags: review?(botond) → review+
Comment on attachment 8728677 [details]
MozReview Request: Bug 1253860 - Add machinery to update APZ's scroll offset without a main-thread paint. r?botond

https://reviewboard.mozilla.org/r/39025/#review35715

::: gfx/layers/apz/src/AsyncPanZoomController.h:194
(Diff revision 1)
> +  void UpdateScrollOffset(uint32_t aScrollGeneration,

Can we call this NotifyScrollOffsetUpdated for symmetry?

::: gfx/layers/apz/src/AsyncPanZoomController.cpp:3491
(Diff revision 1)
> +  RequestContentRepaint();

I guess we need this to compute new display port margins, but the expectation is that most of the time it would not actually cause a repaint?
Attachment #8728677 - Flags: review?(botond) → review+
Comment on attachment 8728678 [details]
MozReview Request: Bug 1253860 - Add a flag on scroll frames indicating if they have an APZ counterpart. r?mstange

https://reviewboard.mozilla.org/r/39027/#review35733
Attachment #8728678 - Flags: review?(mstange) → review+
Attachment #8728679 - Flags: review?(mstange) → review+
Comment on attachment 8728679 [details]
MozReview Request: Bug 1253860 - Skip paints for main thread scrolls if we can ask APZ to handle the scrolling for us. r?mstange

https://reviewboard.mozilla.org/r/39029/#review35737

::: layout/generic/nsGfxScrollFrame.cpp:2703
(Diff revision 1)
> -        presContext->PresShell()->ScheduleImageVisibilityUpdate();
> +        nsIWidget* widget = presContext->GetNearestWidget();
> +        LayerManager* manager = widget ? widget->GetLayerManager() : nullptr;
> +        ShadowLayerForwarder* forwarder = manager ? manager->AsShadowForwarder() : nullptr;

Reaching this far into gfx/layers from layout feels dirty, but that's probably the least of our worries right now.
Comment on attachment 8728680 [details]
MozReview Request: Bug 1253860 - Don't update the scrollbar unless we're actually painting. r?mstange

https://reviewboard.mozilla.org/r/39031/#review35739

::: layout/xul/nsIScrollbarMediator.h:87
(Diff revision 1)
> +  virtual bool SuppressScrollbarRepaints() const = 0;

Maybe call this ShouldSuppressScrollbarRepaints? The current name sounds like a command, and not like an information getter.
Attachment #8728680 - Flags: review?(mstange) → review+
(In reply to Botond Ballo [:botond] from comment #19)
> > +  void UpdateScrollOffset(uint32_t aScrollGeneration,
> 
> Can we call this NotifyScrollOffsetUpdated for symmetry?

I renamed it to NotifyScrollUpdated, just because it was a little shorter. Good idea on the symmetry though.

> ::: gfx/layers/apz/src/AsyncPanZoomController.cpp:3491
> (Diff revision 1)
> > +  RequestContentRepaint();
> 
> I guess we need this to compute new display port margins, but the
> expectation is that most of the time it would not actually cause a repaint?

Yup, exactly.

(In reply to Markus Stange [:mstange] from comment #21)
> > +        LayerManager* manager = widget ? widget->GetLayerManager() : nullptr;
> > +        ShadowLayerForwarder* forwarder = manager ? manager->AsShadowForwarder() : nullptr;
> 
> Reaching this far into gfx/layers from layout feels dirty, but that's
> probably the least of our worries right now.

Yeah I wasn't too happy with that either. I was thinking about moving this goop into nsLayoutUtils but nothing there touches the ShadowLayerForward either so it felt just as dirty...

(In reply to Markus Stange [:mstange] from comment #22)
> > +  virtual bool SuppressScrollbarRepaints() const = 0;
> 
> Maybe call this ShouldSuppressScrollbarRepaints? The current name sounds
> like a command, and not like an information getter.

Done.
Does this depend on "enabling APZ"? if yes, and in case e10s will ship without APZ initially, can this somehow be enabled on its own without inducing APZ perf regressions elsewhere?
Yes, this depends on enabling APZ. It requires the APZ machinery, so you can't do it without APZ.
Depends on: 1255705
Depends on: 1255856
Depends on: 1256727
Given the number of regressions that this bug has introduced I don't think it's safe to uplift, even though it provides a nice win in some conditions.
Depends on: 1256323
Depends on: 1273142
Depends on: 1280344
Depends on: 1327095
Depends on: 1355372
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: