Closed Bug 1571758 Opened 5 years ago Closed 3 years ago

(un)smooth scroll bug

Status: VERIFIED FIXED

Milestone: 98 Branch

Tracking Flags: firefox98 (tracking: ---, status: verified)

People: Reporter: 1.1.1998, Assigned: hiro

References: Blocks 3 open bugs, Regressed 1 open bug

Attachments (11 files, 5 obsolete files):

bug.html 5 years ago Long (deleted), text/html		Details
Push a new SampledAPZState result after sampling an animation 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), patch		Details \| Diff \| Splinter Review
Dump scroll linked effects 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), patch		Details \| Diff \| Splinter Review
Bug 1571758 - Split out ScrollGeneration into a new header. r?botond 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Skip generating 0-value ScrollGeneration in the New() method and add Reset() method to generate 0-value ScrollGeneration. r?botond 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Introduce scroll generation in APZC and SampledAPZCState and inform it to the scrollable frame on the main-thread via RepaintRequest. r?botond 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Factor out GetAsyncScrollDeltaForSampling. r?botond 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Add an optional index argument to some APZC methods to be able to get an arbirary sampled metrics. r?botond 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Inform apz scroll generation to WebRender's ScrollFrame from the main-thread. r?botond! 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Drop scroll method from SpatialNode. r?gw 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Inform multiple sampled scroll offsets to WR and pick the most appropriate one in WR. r?botond! 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Wrench reftests for scroll offset generation. r?botond! 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
Bug 1571758 - Rename GetCurrentAsyncTransformWithOverscroll. r?botond 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
WIP: Bug 1571758 - Define ScrollGeneration's operator<< out of line 3 years ago Botond Ballo [:botond] (deleted), text/x-phabricator-request		Details
Bug 1571758 - Add a mochitest for scroll linked effects. r?botond 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details
WIP: Bug 1571758 - Apply ClampAndAlignWithPixels in FrameMetrics::SetVisualScrollOffset. r?botond 3 years ago Hiroyuki Ikezoe (:hiro) (deleted), text/x-phabricator-request		Details

Long

Reporter

Description

5 years ago

Attached file bug.html (deleted) — Details

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.143 YaBrowser/19.7.2.455 Yowser/2.5 Safari/537.36

Steps to reproduce:

I scroll the content of a page like the attached bug.html.

Actual results:

The element with position: absolute jumps to its new position.

Expected results:

The element with position: absolute moves in sync with the scroll, without jumping.

Long

Reporter

Updated

5 years ago

Component: Untriaged → DOM: CSS Object Model

Product: Firefox → Core

Emilio Cobos Álvarez (:emilio)

Comment 1

5 years ago

I also see flickering (though less) on other browsers. In general I don't think you're guaranteed getting async scroll events before paint, though Botond can confirm.

Why not use position: sticky for this?

Component: DOM: CSS Object Model → Layout: Scrolling and Overflow

Botond Ballo [:botond]

Comment 2

5 years ago

(In reply to Emilio Cobos Álvarez (:emilio) from comment #1)

> In general I don't think you're guaranteed getting async scroll events before paint, though Botond can confirm.

That's correct.

The page is using what we call a scroll-linked effect - please see this MDN page for more info about those.

Long

Reporter

Comment 3

5 years ago

(In reply to Emilio Cobos Álvarez (:emilio) from comment #1)

> I also see flickering (though less) on other browsers. In general I don't think you're guaranteed getting async scroll events before paint, though Botond can confirm.
>
> Why not use position: sticky for this?

1) In Chromium browsers I don't see it (maybe my eyes are lying to me).
2) This example is enough to demonstrate the bug, and my task needs compatibility with old browsers (for now); sticky doesn't work everywhere.

Emilio Cobos Álvarez (:emilio)

Comment 4

5 years ago

(In reply to Long from comment #3)

> 1) In Chromium browsers I don't see it (maybe my eyes are lying to me).

I definitely see some flickering in Chrome when I move back and forth close to the top.

> 2) This example is enough to demonstrate the bug, and my task needs compatibility with old browsers (for now); sticky doesn't work everywhere.

There are various ways to support older browsers while using (faster, more correct, and more power-efficient) scrolling on newer browsers.

For example, if you change .abs to be position: sticky, you can use the following as a fallback:

<script>
window.addEventListener("load", function() {
  if (window.CSS && window.CSS.supports && window.CSS.supports("position: sticky"))
    return; //Nothing to do, position: sticky does it for us.
  let abs = document.querySelector(".abs");
  abs.style.position = "absolute";
  abs.parentNode.addEventListener("scroll", function() {
    abs.style.top = this.scrollTop + "px";
  }, false);
}, false);
</script>

Long

Reporter

Comment 5

5 years ago

(In reply to Emilio Cobos Álvarez (:emilio) from comment #4)

> (In reply to Long from comment #3)
>
> > 1) In Chromium browsers I don't see it (maybe my eyes are lying to me).
>
> I definitely see some flickering in Chrome when I move back and forth close to the top.

I tested in Chrome version 76. Maybe my GPU or CPU helps it paint correctly.

> > 2) This example is enough to demonstrate the bug, and my task needs compatibility with old browsers (for now); sticky doesn't work everywhere.
>
> There are various ways to support older browsers while using (faster, more correct, and more power-efficient) scrolling on newer browsers.
>
> For example, if you change .abs to be position: sticky, you can use the following as a fallback:
> <script>
> window.addEventListener("load", function() {
>   if (window.CSS && window.CSS.supports && window.CSS.supports("position: sticky"))
>     return; //Nothing to do, position: sticky does it for us.
>   let abs = document.querySelector(".abs");
>   abs.style.position = "absolute";
>   abs.parentNode.addEventListener("scroll", function() {
>     abs.style.top = this.scrollTop + "px";
>   }, false);
> }, false);
> </script>

The task consists of fixing the position of a <thead> in a <table>.

Long

Reporter

Comment 6

5 years ago

(In reply to Botond Ballo [:botond] from comment #2)

> (In reply to Emilio Cobos Álvarez (:emilio) from comment #1)
>
> > In general I don't think you're guaranteed getting async scroll events before paint, though Botond can confirm.
>
> That's correct.
>
> The page is using what we call a scroll-linked effect - please see this MDN page for more info about those.

Thanks for the article.

Emilio Cobos Álvarez (:emilio)

Comment 7

5 years ago

You can test for sticky support of a <thead> in a <table>, though I agree it's more elaborate (you need to create a scrollable table with a sticky header, then read out the position of the header).

I'm not sure whether we can do better at this... I suspect the fact that with devtools open this flickers way less means that paint-skipping may be involved somehow?

Botond (sorry for the continuous ni? storm), in this case paint-skipping seems to be doing the wrong thing, doesn't it? Or maybe I'm completely off and there's another reason for this.

Flags: needinfo?(botond)

Botond Ballo [:botond]

Comment 8

5 years ago

(In reply to Emilio Cobos Álvarez (:emilio) from comment #7)

> I suspect the fact that with devtools open this flickers way less means that paint-skipping may be involved somehow?

Paint skipping shouldn't make a difference here, because while we may not schedule a paint in response to the scroll itself, we still should in response to the style.top change.

In principle, with the APZ frame delay in place, scroll-linked effects should remain in sync if we can paint the page within the frame budget.

The next step here would be to investigate whether, on this page, we (a) can't paint the page within the frame budget, or (b) there's something else going on that's causing the effect to be out of sync in spite of the frame delay and being able to paint within budget.

Bug 1367770 tracks investigating cases like this, and the page here makes a good test case.

Blocks: 1367770

Flags: needinfo?(botond)

Long

Reporter

Updated

5 years ago

OS: Unspecified → All

Hardware: Unspecified → All

Version: unspecified → 68 Branch

BugBot [:suhaib / :marco/ :calixte]

Comment 9

5 years ago

The priority flag is not set for this bug.
:TYLin, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(aethanyc)

Ting-Yu Lin [:TYLin] (UTC-8)

Updated

5 years ago

Flags: needinfo?(aethanyc)

Priority: -- → P3

Long

Reporter

Updated

5 years ago

Version: 68 Branch → unspecified

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 10

3 years ago

I tracked down this issue (since it's somewhat related to bug 1692708). As far as I can tell, the one-frame delay isn't working as expected; in fact it's a two-frame delay. That's because we invoke SampleCompositedAsyncTransform before sampling an animation, whereas a RepaintRequest is based on the metrics after the animation sampling, so the scrollTop value on the main-thread is actually one frame ahead.

For example, given that there's a queued sampled offset (0, 100) after this AdvanceToNextSample call,

1) SampleCompositedAsyncTransform adds a new offset, say (0, 110)
2) sampling an animation produces the latest scroll offset (i.e. the APZC's metric), say (0, 120)
3) APZCTreeManager::SampleForWebRender uses the (0, 100) offset for the composition
4) in the meantime the (0, 120) offset is delivered to the main-thread and used there, and the updated position of the position:absolute element is reflected to the compositor
5) in the next APZCTreeManager::SampleForWebRender the (0, 110) offset is used, but the position:absolute element's position is based on (0, 120)

That's what I've observed. Moving the SampleCompositedAsyncTransform call after sampling an animation fixes this issue, at least on my local Linux box, though as of now I am not sure it's the right thing to do.
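
The sequence above can be reproduced with a toy model. The sketch below is only an illustration under simplifying assumptions: a single scroller, an animation that advances 10px per frame, and a main thread that always paints within one frame. The simulate function and its variables are hypothetical names and do not correspond to Gecko code.

```javascript
// Toy model of the sampling order described above; not Gecko code.
// Returns, for each frame after the first, how far the main-thread
// (display list) offset is ahead of the composited offset.
function simulate(sampleBeforeAnimation, frameCount) {
  let metric = 110;             // the APZC's latest scroll offset
  const queue = [100];          // sampled offsets waiting to be composited
  let displayListOffset = null; // offset the main thread last painted with
  const gaps = [];

  for (let frame = 0; frame < frameCount; frame++) {
    // Stand-in for AdvanceToNextSample: drop the sample used last frame.
    if (queue.length > 1) {
      queue.shift();
    }
    if (sampleBeforeAnimation) {
      queue.push(metric);       // current order: sample, then animate
      metric += 10;
    } else {
      metric += 10;             // proposed order: animate, then sample
      queue.push(metric);
    }
    const composited = queue[0];
    if (displayListOffset !== null) {
      gaps.push(displayListOffset - composited);
    }
    // The post-animation offset reaches the main thread (RepaintRequest)
    // and is used for the display list composited in the next frame.
    displayListOffset = metric;
  }
  return gaps;
}

console.log(simulate(true, 3));  // sampling before the animation
console.log(simulate(false, 3)); // sampling after the animation
```

In this model the first call reports a constant 10px gap between the display list and the composited offset, and the second call reports no gap, which matches the observation that moving the sampling after the animation hides the problem.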

Hiroyuki Ikezoe (:hiro)

Assignee

Updated

3 years ago

Status: UNCONFIRMED → NEW

Component: Layout: Scrolling and Overflow → Panning and Zooming

Ever confirmed: true

Hiroyuki Ikezoe (:hiro)

Assignee

Updated

3 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1616593

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 11

3 years ago

Attached patch Push a new SampledAPZState result after sampling an animation (deleted) — Details — Splinter Review

This is what I described in comment 10. With this change, all test cases blocking bug 1367770 look better, except the one in bug 1665900. I don't know why bug 1665900 looks worse. My wild guess is that painting takes more than 16ms in that case, but I'm not sure yet. I'd need some debugging metrics to tell more easily what's going on there.

Hiroyuki Ikezoe (:hiro)

Assignee

Updated

3 years ago

Attachment #9248348 - Attachment is patch: true

Attachment #9248348 - Attachment mime type: application/octet-stream → text/plain

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 12

3 years ago

Attached patch Dump scroll linked effects (deleted) — Details — Splinter Review

I managed to add an (ugly) hack to dump the positions of scroll-linked-effect elements and WebRender's scroll frame metrics. This hack is not perfect, since the target scroll frame is not the linked one; it's probably (I guess) the ASR.

Here is a typical two-frame dump, produced with a modified version of the test case in comment 0.

--- start --- SystemTime { tv_sec: 1637134429, tv_nsec: 522116809 }
scroll linked effects: Some(SpatialNodeIndex(9))
	 element pos:            Box2D((0.0, 4176.0), (1114.0, 4226.0))
	 viewport:               Box2D((0.0, 0.0), (1114.4, 884.1))
	 scrollable_size:        0.0x7596.9995
	 scroll offset:          (0.0, -4156.317)
	 external_scroll_offset: (0.0, 4176.0)
--- end   --- SystemTime { tv_sec: 1637134429, tv_nsec: 522116809 }

--- start --- SystemTime { tv_sec: 1637134429, tv_nsec: 538789320 }
scroll linked effects: Some(SpatialNodeIndex(9))
	 element pos:            Box2D((0.0, 4176.0), (1114.0, 4226.0))
	 viewport:               Box2D((0.0, 0.0), (1114.4, 884.1))
	 scrollable_size:        0.0x7596.9995
	 scroll offset:          (0.0, -4176.0)
	 external_scroll_offset: (0.0, 4176.0)
--- end   --- SystemTime { tv_sec: 1637134429, tv_nsec: 538789320 }

element pos represents the abs-pos element's position in the frame, and scroll offset is a value changed by APZ. external_scroll_offset is a value set by the display list; it's roughly the scroll offset at the time NotifyLayersUpdated gets called.

In the first frame, the abs-pos element's position is slightly below the scroll top position at that moment. In the second frame, its position is the same as the scroll top.
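
As a worked check of the two dumped frames, the snippet below computes where the element's top edge would land relative to the top of the scrolled viewport. It is only a sketch: it copies the vertical values from the dump and assumes that the dumped "scroll offset" is simply the negated composited scroll position, which the dump itself does not state. The onScreenTop helper is a hypothetical name.

```javascript
// Vertical components copied from the two dumped frames above.
const frames = [
  { elementTop: 4176.0, scrollOffset: -4156.317, externalScrollOffset: 4176.0 },
  { elementTop: 4176.0, scrollOffset: -4176.0, externalScrollOffset: 4176.0 },
];

// Assumption: the composited scroll position is the negated "scroll offset",
// so adding it to the element's top gives the distance from the viewport top.
function onScreenTop(frame) {
  return frame.elementTop + frame.scrollOffset;
}

for (const frame of frames) {
  console.log(onScreenTop(frame).toFixed(3));
}
```

Under that assumption the element sits roughly 19.7px below the scroll top in the first frame and exactly at the scroll top in the second.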

With the patch I posted in comment 11, the issue seen in the first frame is mostly eliminated, but I don't think it is a fundamental fix.

A problem I can see in the dump is that element pos and external_scroll_offset are basically based on the latest metrics at the time NotifyLayersUpdated gets called (from the perspective of APZ), whereas the scroll offset is a value sampled earlier.

Though I haven't figured out what a proper solution is here, we should fix this inconsistency (along with the patch in comment 11).

Assignee: nobody → hikezoe.birchill

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 13

3 years ago

What I am currently thinking is that we will need a queue of SampledAPZState in WebRender (in ScrollFrame, I assume), just as AsyncPanZoomController has. We would put a value representing a generation in each SampledAPZState, and on the WebRender renderer thread we would pick the SampledAPZState whose generation matches. Is it worth a shot?

Botond Ballo [:botond]

Comment 14

3 years ago

(In reply to Hiroyuki Ikezoe (:hiro) from comment #13)

> What I am currently thinking is that we will need a queue of SampledAPZState in WebRender (in ScrollFrame, I assume), just as AsyncPanZoomController has. We would put a value representing a generation in each SampledAPZState, and on the WebRender renderer thread we would pick the SampledAPZState whose generation matches. Is it worth a shot?

That approach sounds like it should fix the problem. We would want to have some kind of limit in place, such that if the generation from the main thread is "too old", we give up trying to stay in sync and just use a newer APZ offset (otherwise, if the main thread takes a long time to paint we are back to seeming unresponsive / getting "sync" scrolling).

That said, I'm a bit curious what makes this approach necessary. I think the original idea behind the frame delay was that if the main thread consistently paints things within the frame budget, then the display lists and corresponding samples (that would have the same generation with this approach) would just naturally align. I'm curious why this doesn't happen -- does a mis-alignment get introduced because sometimes one of the steps involved takes longer than one frame? Or sometimes a step happens twice in one frame?

Botond Ballo [:botond]

Comment 15

3 years ago

Basically, I'm wondering if the described fix might be papering over a bug introduced in bug 1630781 (or not properly fixed by bug 1630781), which may cause other symptoms, and which might have a less invasive fix.

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 16

3 years ago

(In reply to Botond Ballo [:botond] from comment #14)

> That said, I'm a bit curious what makes this approach necessary. I think the original idea behind the frame delay was that if the main thread consistently paints things within the frame budget, then the display lists and corresponding samples (that would have the same generation with this approach) would just naturally align. I'm curious why this doesn't happen -- does a mis-alignment get introduced because sometimes one of the steps involved takes longer than one frame? Or sometimes a step happens twice in one frame?

No, what happens is the opposite case. A round trip of a scroll offset (initiated by an APZC, reported to the main-thread, and finally received by WebRender as a scroll offset) happens within a single vsync tick. ("Vsync tick" doesn't precisely describe the situation, because there are a couple of different threads running.)

For example, an APZC has two sampled offsets, say 10px and 20px, and the APZC notifies the main-thread of the 20px in a RepaintRequest. The main-thread then gets the 20px and updates the scroll position in question, so an abs-pos element uses 20px. A display item with 20px is then sent to WebRender, and WebRender renders the scroller at 10px but the abs-pos element at 20px.
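
The mismatch in this example, and the generation-based matching proposed in comment 13, can be sketched as follows. This is a simplified illustration, not the patch itself: the object shapes and the pickSample helper are hypothetical, and falling back to the oldest sample when no generation matches is an assumption of the sketch.

```javascript
// Two offsets sampled by the APZC, oldest first (values from the example).
const sampledOffsets = [
  { generation: 1, offset: 10 },
  { generation: 2, offset: 20 },
];

// The display list from the main thread was built with the 20px offset.
const displayList = { generation: 2, absPosTop: 20 };

// Hypothetical helper: prefer the sample the main thread painted with.
function pickSample(samples, mainThreadGeneration) {
  const match = samples.find((s) => s.generation === mainThreadGeneration);
  // Assumption: with no matching generation, use the oldest sample.
  return match !== undefined ? match : samples[0];
}

// Without generations the compositor takes the oldest sample (10px),
// so the abs-pos element is drawn 10px away from the scroller.
const gapWithoutGeneration = displayList.absPosTop - sampledOffsets[0].offset;

// With generations the scroller and the element agree.
const matched = pickSample(sampledOffsets, displayList.generation);
const gapWithGeneration = displayList.absPosTop - matched.offset;

console.log(gapWithoutGeneration, gapWithGeneration);
```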

Botond Ballo [:botond]

Comment 17

3 years ago

Ah, interesting, so it's kind of "two steps in one frame (vsync)". (I wonder if maybe that wasn't the case yet when bug 1630781 was fixed, and then was introduced by a subsequent performance optimization or something like that.)

Anyways, if I'm understanding correctly that it's actually the main thread position that's ahead of the (composited) async scroll position, then this sounds like pretty good news, in that fixing this will not only put scroll-linked effects into sync, but also increase the responsiveness of async scrolling in general!

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 18

3 years ago

FWIW here is a try run with the recent changes I did locally;
https://treeherder.mozilla.org/jobs?repo=try&revision=8ca74557303118556ad153562722aaeaf0a8e81b

A couple of relatively big issues remain:

1) test_wheel_scroll.html fails due to some hit testing issues caused by the changes
2) no automated tests have been written
3) scroll thumb positions (and fixed/sticky positioned elements on mobile) might be slightly mis-positioned because I haven't changed those transforms, e.g. this transform for scroll thumbs.

I was naively thinking we don't need to change the eForHitTesting cases, but given that there's at least one hit testing issue (1), we might need to. I think 3) can be deferred to a follow-up.

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 19

3 years ago

(In reply to Hiroyuki Ikezoe (:hiro) from comment #18)

> test_wheel_scroll.html fails due to some hit testing issues caused by the changes

I addressed this test failure locally with a very ugly hack. To be honest, I don't understand exactly how the failure was caused; with the original changes, the WebRender hit tester sometimes fails to return eApzAwareListeners for some reason. Anyway, I found what triggers the test failure. In current m-c, SpatialNode::set_scroll_origin has a calculation like this:

  let new_offset = normalized_offset - scrolling.external_scroll_offset;

SpatialNode::set_scroll_origin is triggered by an APZ sampling and gets called in a frame transaction (in WebRender's terms, I think). normalized_offset is a scroll offset produced by an APZC; scrolling.external_scroll_offset is a scroll offset that came from a display item on the main-thread. My original change deferred this calculation as much as possible, i.e. the calculation was done every time we wanted the result, e.g. here. I still believe that deferring the calculation as much as possible is ideally the right thing to do, but given that it introduces an inconsistency with the hit tester, I am not going to defer it for now; I am going to do the calculation in the frame transaction, as the current code does. The test case in this bug still works this way.

Also note that I realized my changes don't fix cases where no APZ animation is involved, e.g. scroll thumb dragging. I hope those cases can be fixed relatively easily by bumping up the scroll generation I will introduce in this bug.

Hiroyuki Ikezoe (:hiro)

Assignee

Updated

3 years ago

Depends on: 1744842

Hiroyuki Ikezoe (:hiro)

Assignee

Updated

3 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1738838

Hiroyuki Ikezoe (:hiro)

Assignee

Updated

3 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1745119

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 20

3 years ago

Attached file Bug 1571758 - Split out ScrollGeneration into a new header. r?botond (deleted) — Details

We are going to use it in various places, APZC, SampledAPZState etc.

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 21

3 years ago

Attached file Bug 1571758 - Skip generating 0-value ScrollGeneration in the New() method and add Reset() method to generate 0-value ScrollGeneration. r?botond (obsolete) (deleted) — Details

Each APZC will have a ScrollGeneration and will use this 0th generation as a
special case where there's no active animation in the APZC.

Depends on D133437

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 22

3 years ago

Attached file Bug 1571758 - Introduce scroll generation in APZC and SampledAPZCState and inform it to the scrollable frame on the main-thread via RepaintRequest. r?botond (deleted) — Details

Depends on D133438

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 23

3 years ago

Attached file Bug 1571758 - Factor out GetAsyncScrollDeltaForSampling. r?botond (deleted) — Details

Depends on D133439

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 24

3 years ago

Attached file Bug 1571758 - Add an optional index argument to some APZC methods to be able to get an arbirary sampled metrics. r?botond (deleted) — Details

In a subsequent change, we'd like to use this async transform
calculation for each SampledAPZCState.

Depends on D133440

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 25

3 years ago

Attached file Bug 1571758 - Inform apz scroll generation to WebRender's ScrollFrame from the main-thread. r?botond! (deleted) — Details

Depends on D133441

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 26

3 years ago

Attached file Bug 1571758 - Drop scroll method from SpatialNode. r?gw (deleted) — Details

It's not used by Gecko at all. And in the next commit we are going to have
multiple scroll offsets in a ScrollFrameInfo and encapsulate
ScrollFrameInfo.scroll_offset method, so keeping this unused scroll function
would make the next change quite messy.

Depends on D133442

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 27

3 years ago

Attached file Bug 1571758 - Inform multiple sampled scroll offsets to WR and pick the most appropriate one in WR. r?botond! (deleted) — Details

This change mitigates the gap between the external_scroll_offset informed from
the main-thread and scroll_offset informed from APZ.

The change inside GetAsyncScrollDeltaForSampling (which is renamed to
GetSampledScrollOffsets in this change) is basically doing what
GetCurrentAsyncTransformWithOverscroll does but for each SampledAPZCState,
i.e. getting the async transform and applying overscroll transform if it's
necessary. Unfortunately I don't have any great idea to generalize them.
GetCurrentAsyncTransformWithOverscroll will be limited to eHitTesting in
a later commit.

Some wrench reftests for this change are in the next commit.

Depends on D133443

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 28

3 years ago

Attached file Bug 1571758 - Wrench reftests for scroll offset generation. r?botond! (deleted) — Details

Depends on D133444

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 29

3 years ago

Attached file Bug 1571758 - Rename GetCurrentAsyncTransformWithOverscroll. r?botond (obsolete) (deleted) — Details

Now the function gets called only for eHitTesting.

Depends on D133445

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 30

3 years ago

Now I realize that this approach will introduce jittery scrolling in some cases.

For example, suppose there were two sampled scroll offsets:

10px with generation 1
20px with generation 2

and on the main-thread we had used the 20px offset with generation 2, then rendered it. And in the next sampling we have

20px with generation 2
30px with generation 3

If the main-thread is busy in the meantime for some reason, then in the next rendering we re-use the 20px offset. In the rendering after that, we will use either 30px with generation 3 or the next sampled offset, depending on the main-thread work.
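
The jitter can be made concrete with the numbers above. The sketch below is illustrative only: the data layout and helper names are hypothetical, the third vsync (where the main thread has caught up to generation 4) is one of the two outcomes mentioned above and is chosen arbitrarily, and the fallback to the oldest sample is an assumption.

```javascript
// Each entry is one vsync: the queued APZ samples (oldest first) and the
// generation of the latest display list received from the main thread.
const vsyncs = [
  {
    samples: [{ generation: 1, offset: 10 }, { generation: 2, offset: 20 }],
    mainThreadGeneration: 2,
  },
  {
    // The main thread is busy, so its generation does not advance.
    samples: [{ generation: 2, offset: 20 }, { generation: 3, offset: 30 }],
    mainThreadGeneration: 2,
  },
  {
    samples: [{ generation: 3, offset: 30 }, { generation: 4, offset: 40 }],
    mainThreadGeneration: 4,
  },
];

// Generation matching, falling back to the oldest sample (an assumption).
function matchedOffset({ samples, mainThreadGeneration }) {
  const match = samples.find((s) => s.generation === mainThreadGeneration);
  return (match !== undefined ? match : samples[0]).offset;
}

// Plain one-frame delay: always composite the oldest queued sample.
function oldestOffset({ samples }) {
  return samples[0].offset;
}

// Per-frame scroll distance; uneven steps are perceived as jitter.
function steps(offsets) {
  return offsets.slice(1).map((offset, i) => offset - offsets[i]);
}

console.log(steps(vsyncs.map(matchedOffset)));
console.log(steps(vsyncs.map(oldestOffset)));
```

With generation matching the scroller moves 0px and then 20px, while the plain one-frame delay moves a steady 10px per frame.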

One way to mitigate this is to check for scroll-linked effects (we will have to check for scroll-linked animations too): if there's any scroll-linked effect, we use this new machinery, and if there's none, we don't.

Botond, what do you think? I am quite unsure how often this situation happens in the wild.

Flags: needinfo?(botond)

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 31

3 years ago

I am going to add a pref to be able to flip the behavior at least.

Botond Ballo [:botond]

Comment 32

3 years ago

(In reply to Botond Ballo [:botond] from comment #14)

> We would want to have some kind of limit in place, such that if the generation from the main thread is "too old", we give up trying to stay in sync and just use a newer APZ offset (otherwise, if the main thread takes a long time to paint we are back to seeming unresponsive / getting "sync" scrolling).

Now I understand this part: the limit is naturally in place due to this code which ensures we don't retain too many samples (never more than 2, I think).

(In reply to Hiroyuki Ikezoe (:hiro) from comment #30)

> Now I realize that this approach will introduce jittery scrolling in some cases.
>
> For example, suppose there were two sampled scroll offsets:
>
> 10px with generation 1
> 20px with generation 2
>
> and on the main-thread we had used the 20px offset with generation 2, then rendered it. And in the next sampling we have
>
> 20px with generation 2
> 30px with generation 3
>
> If the main-thread is busy in the meantime for some reason, then in the next rendering we re-use the 20px offset. In the rendering after that, we will use either 30px with generation 3 or the next sampled offset, depending on the main-thread work.

Yeah, good catch. I guess, if some paints complete before the end of the current vsync interval and others not until the next vsync interval, we will in fact get some jitter.

It wouldn't surprise me if this happens somewhat often.

> One way to mitigate this is to check for scroll-linked effects (we will have to check for scroll-linked animations too): if there's any scroll-linked effect, we use this new machinery, and if there's none, we don't.

I think this is a promising idea. In the case we have a scroll-linked effect, the jitter is probably preferable to having the effect be out of sync. But in the case we don't have a scroll-linked effect, it would be a regression compared to the current consistent one-frame delay.

In fact, if we do this, we could consider doing something else: using the most recent sample (offsets.last() instead of offsets.first()) in the case where we have no scroll-linked effect. That would effectively get rid of the one-frame delay altogether (which has been requested by users, for example in this bug) if there are no scroll-linked effects.

One caveat here is that our detection logic for "are there any scroll-linked effects on this page?" is incomplete (see bug 1276361). If we are going with this route, we may want to take the opportunity to improve that detection logic as well.

An alternative solution to the jitter that we could consider, is to keep a queue of main-thread samples as well, and if the latest main-thread sample is very fresh (latest vsync), use a sample corresponding to the previous vsync (and the APZ sample with the corresponding generation) instead. This would mean the one-frame delay will always be there, but the jitter would be avoided (even with scroll-linked effects).
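
The alternative described above can be sketched roughly as follows. This is only one reading of the idea: the sample shapes, the vsync field, and the chooseFrame helper are hypothetical, and the fallbacks (no previous main-thread sample, no APZ sample with a matching generation) are assumptions of the sketch.

```javascript
// mainThreadSamples and apzSamples are ordered oldest first.
function chooseFrame(mainThreadSamples, apzSamples, currentVsync) {
  const latest = mainThreadSamples[mainThreadSamples.length - 1];
  const previous = mainThreadSamples[mainThreadSamples.length - 2];
  // "Very fresh" means the sample arrived during the current vsync.
  const isFresh = latest.vsync === currentVsync;
  // Assumption: keep the latest sample if there is no previous one.
  const chosen = isFresh && previous !== undefined ? previous : latest;
  const apz = apzSamples.find((s) => s.generation === chosen.generation);
  // Assumption: fall back to the oldest APZ sample when nothing matches.
  return {
    generation: chosen.generation,
    offset: (apz !== undefined ? apz : apzSamples[0]).offset,
  };
}

const mainThreadSamples = [
  { generation: 2, vsync: 1 },
  { generation: 3, vsync: 2 },
];
const apzSamples = [
  { generation: 2, offset: 20 },
  { generation: 3, offset: 30 },
];

// The generation 3 paint finished within vsync 2, so the previous pair is used.
console.log(chooseFrame(mainThreadSamples, apzSamples, 2));
// One vsync later the same paint is no longer fresh and is used as is.
console.log(chooseFrame(mainThreadSamples, apzSamples, 3));
```

In this model the composited offset advances from 20px to 30px regardless of whether the paint finished early or late, so the one-frame delay stays constant.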

Flags: needinfo?(botond)

Hiroyuki Ikezoe (:hiro)

Assignee

Comment 33

3 years ago

(In reply to Botond Ballo [:botond] from comment #32)

> An alternative solution to the jitter that we could consider, is to keep a queue of main-thread samples as well, and if the latest main-thread sample is very fresh (latest vsync), use a sample corresponding to the previous vsync (and the APZ sample with the corresponding generation) instead. This would mean the one-frame delay will always be there, but the jitter would be avoided (even with scroll-linked effects).

Thanks for the alternative solution. It took me some time to understand what it means. As I understand it, it can be done without keeping a queue of main-thread samples, since we can tell the freshness just by comparing the last APZ sampled generation with the main-thread generation. And indeed, it fixes this bug pretty well, as I've confirmed locally. Hooray! I am going to drop the pref I added to flip this machinery.

Thank you!