Closed Bug 1326991 Opened 7 years ago Closed 5 years ago

[APZ] Moving content on BBC.com horribly lags during autoscrolling (low number of frames)

Categories

(Web Compatibility :: Desktop, defect, P5)

Firefox 49
x86_64
Windows 7
defect

Tracking

(platform-rel -)

RESOLVED WORKSFORME
Tracking Status
platform-rel --- -

People

(Reporter: arni2033, Unassigned, NeedInfo)

References

Details

(Keywords: regression, regressionwindow-wanted, Whiteboard: [gfx-noted][platform-rel-BBC][platform-rel-BBCNews])

>>>   My Info:   Win7_64, Nightly 49, 32bit, ID 20160526082509
1. Open http://www.bbc.com/future/story/20151123-how-dark-is-your-personality
2. Slowly autoscroll the page down, until content stops moving faster to the top side of the screen

AR:
 1st stage - the page scrolls to the bottom w/o lags; elements don't move relative to each other
 2nd stage - The page scrolls with MAJOR LAGS! Article content moves to the top side of the screen
             faster than the rest of the page (faster than it should according to scrollbar movement)
 3rd stage - the page scrolls to the bottom w/o lags; elements don't move relative to each other
ER:  
 2nd stage - the page should lag LESS, at least at the same level as Nightly 2015-06-01 or 2015-01-01

Note:
1) Word "lag" is used as "low number of frames per second". "Lags more" means "fewer frames"
2) I don't remember correctly which version was OK, probably even non-e10s.
   This should be tested in browser with only 1 tab opened, and with low averall RAM consumption on
   PC, so I never could properly check this. I assume this is regresion from bug 1209970, because it
   implemented some lags even for autoscrolling, which is not "async" in any way. I hate any "async"
   stuff btw, because it's too laggy.
3) In Note 2, "lag" is used in all possible meanings. "stuff" means any stuff in FF that is async or
   claims to be, or uses timeouts. I pretty much proved with my bug reports that all that stuff is
   laggy in FF. If you want to argue, read them first. UX is terrible, no thanks to UX team)
No longer blocks: 1277113
Component: Untriaged → Panning and Zooming
Product: Firefox → Core
If I autoscroll down this page at what should be a consistent speed, I can reproduce what appears to be janky scrolling partway through (what you're calling "2nd stage"). What's actually happening is that the page is resizing the <div class="hero-unit" ...> element down from 505px to 345.925px based on the scroll position. This resize is what causes the rest of the page to shift up on each scroll event, and makes it appear to scroll "faster" than the rest of the page. This behaviour is an intentional scroll-linked effect put in by the page authors, and it happens in all browsers (I checked Chrome and IE).

As for the jankiness, this is also expected because we are forced to do a full repaint on each scroll, because the page content is shifting on the scroll event. If I autoscroll down this page in Chrome I see even more severe jank - sometimes it doesn't repaint for a few seconds and then suddenly jumps. In IE the situation is better but I can still trigger some jank if I use up some of my CPU in background processes.

I'm going to move this to Tech Evangelism to see if we can reach out to the BBC and have them avoid constructing pages like this, as they behave poorly in all browsers. If not, then this is going to be a RESOLVED INVALID from the browser site.

Note that this may technically be a "regression" in Firefox, because of async scrolling, but there's no way to fix this without turning off async scrolling, which we're not going to do. Leaving the regression keyword, but dropping the regressionwindow-wanted.
Blocks: apz-desktop
Component: Panning and Zooming → Desktop
OS: Unspecified → Windows 7
Product: Core → Tech Evangelism
Hardware: Unspecified → x86_64
Whiteboard: [gfx-noted]
Version: Trunk → Firefox 49
Autoscroll does not use APZ yet, as far as I know. I think this bug occurs because we're now firing scroll events during the refresh driver tick, and we're also ticking the autoscroll animation from the refresh driver, and the ordering means that autoscroll adjusts the scroll position after the page processed its scroll event for the current paint.
I think I recall seeing a similar bug before, but I can't find it now.

I'm not sure how to fix this. We want the scroll event to fire before requestAnimationFrame callbacks, so that if pages call requestAnimationFrame while processing the scroll event, the callback gets called before we paint. On the other hand, we want to give autoscroll a way to adjust the scroll position, during the refresh driver tick, but before the scroll event fires. So either we need a magic requestAnimationFrameBeforeScrollEvent API that only autoscroll can call, or we need to move parts of autoscrolling lower into the platform. We'd have to do that for APZ autoscroll anyway. So that's probably what we should do.
(In reply to Markus Stange [:mstange] from comment #2)
> Autoscroll does not use APZ yet, as far as I know.

Correct.

> I think this bug occurs
> because we're now firing scroll events during the refresh driver tick, and
> we're also ticking the autoscroll animation from the refresh driver, and the
> ordering means that autoscroll adjusts the scroll position after the page
> processed its scroll event for the current paint.

Hm. This is a good point, but I'm not convinced it's the root cause of the bug. If I understand correctly, this problem would result in the scroll-linked effect lagging one frame behind the visual scroll position, because the SLE uses a scroll position just before it gets updated. However, other than being one frame behind, the total work done during a refresh driver tick should be pretty much the same (except maybe the first/last frame of the SLE animation). On the BBC website the main observed problem is jank resulting from the main-thread side of things taking too long, and I don't think fixing this ordering issue would change that. If the observed problem were that a SLE animation was one frame out of sync with a RAF animation, then I could see how this reordering issue might be the cause.

> I'm not sure how to fix this. We want the scroll event to fire before
> requestAnimationFrame callbacks, so that if pages call requestAnimationFrame
> while processing the scroll event, the callback gets called before we paint.
> On the other hand, we want to give autoscroll a way to adjust the scroll
> position, during the refresh driver tick, but before the scroll event fires.
> So either we need a magic requestAnimationFrameBeforeScrollEvent API that
> only autoscroll can call, or we need to move parts of autoscrolling lower
> into the platform. We'd have to do that for APZ autoscroll anyway. So that's
> probably what we should do.

Based on what I said above I don't think fixing the ordering issue will actually solve this bug, although it will help overall correctness and possibly fix other bugs. An APZ autoscroll will probably help somewhat - IMO the end result of doing that will be that the visual scroll position will advance smoothly, but the SLE animation will be lagging behind and still look bad.
The bug is that the lagging became too terrible, because of bug 1209970 & Co that implemented APZ bugs
even if I use non-apz way of scrolling. Please don't remove keyword "regressionwindow-wanted" if you haven't even made any investigation to find out what broke autoscrolling on the site beyond repair:
if I remember correctly, that site first worked fine, then it became laggy, then "terribly laggy".

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #1)
> Note that this may technically be a "regression" in Firefox,
Then don't drop keywords which will result in finding the exact bug where you (team) broke it.

> Note that this may technically be a "regression" in Firefox, because of async scrolling,
> but there's no way to fix this without turning off async scrolling, which we're not going to do.
This manifests in non-apz way of scrolling, so you've broken a real working thing while working on APZ "improvement". This bug was filed for regression in real working thing (autoscrolling)
So I just ran mozregression and got this:

 9:41.08 INFO: Last good revision: b52d27f7628fd1ba34395eed42f7e00988fb9790
 9:41.08 INFO: First bad revision: 3ee6220f83eb6da54b906a3d3375168d6c3c0db2
 9:41.08 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=b52d27f7628fd1ba34395eed42f7e00988fb9790&tochange=3ee6220f83eb6da54b906a3d3375168d6c3c0db2

Note that this points to a change that landed on 2016-07-04, which is *after* the buildid you reported in comment 0. So either I'm not seeing the same issue you're seeing, or there's some other problem here.
I think you're seing the same issue, "low frame rate"

I wrote this bug report in ~2016-11-30, after testing on some version later than 2016-05-26, which I don't/can't remember correctly, therefore only included version 2016-05-26. The lags became worse and worse through the time, i.e. were regressed several times. There was some point in time when they were better than in 2016-05-26.   (I also haven't expected you to actually waste your time to bisection, I thought it should be done by someone who doesn't often fix bugs and has more free time. Possibly, I could do that after ~3+ days, if I weren't busy with pending bugs. Will see)
Flags: needinfo?(arni2033)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #1)
> I'm going to move this to Tech Evangelism to see if we can reach out to the
> BBC and have them avoid constructing pages like this, as they behave poorly
> in all browsers. If not, then this is going to be a RESOLVED INVALID from
> the browser site.

The scroll-linked effect used on this page looks like the sort of effect that web developers will be able to express in the future using scroll-driven animations.

However, if this is a Tech Evangelism issue, we ought to have a suggestion for something the web author can do today. What is our suggestion? Is it to simply not use a scroll-linked effect like this for the time being?
(In reply to Botond Ballo [:botond] from comment #7)
> The scroll-linked effect used on this page looks like the sort of effect
> that web developers will be able to express in the future using
> scroll-driven animations.

Neat!

> However, if this is a Tech Evangelism issue, we ought to have a suggestion
> for something the web author can do today. What is our suggestion? Is it to
> simply not use a scroll-linked effect like this for the time being?

Yeah I guess for now there's not much they can do except avoid using animations like this. Maybe we should hold off on evangelizing until we have scroll-driven animations implemented. (Or if arni2033 finds a second regression range that points to this being a gecko bug somehow, we can look at it more - but his account was disabled recently so I'm not holding out hope on that happening).
platform-rel: --- → ?
Whiteboard: [gfx-noted] → [gfx-noted][platform-rel-BBC][platform-rel-BBCNews]
platform-rel: ? → -
Priority: -- → P5

This seems to be fixed.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
Product: Tech Evangelism → Web Compatibility
You need to log in before you can comment on or make changes to this bug.