Better handle hit-testing when inactive scrollframes intersect OOPIFs
Categories
(Core :: Panning and Zooming, enhancement)
Tracking
()
Tracking | Status | |
---|---|---|
firefox87 | --- | fixed |
People
(Reporter: kats, Assigned: tnikkel)
References
(Blocks 2 open bugs, Regressed 1 open bug)
Details
(Whiteboard: [2/5] patches bounced)
Attachments
(5 files, 2 obsolete files)
As discussed in https://bugzilla.mozilla.org/show_bug.cgi?id=1541589#c13 there are some hit-testing issues when inactive scrollframes are underneath OOPIFs. This happens because of the way we generate the hit info for the inactive scrollframe; it can end up on top of the OOPIF. The compositor then targets the inactive scrollframe for incoming events instead of the OOPIF.
Botond is fixing a subset of these cases in bug 1645433 but in discussion with Matt Woodrow they realized that there are other cases that won't be handled, such as when there is a positioned item inside the inactive scrollframe with a z-index that places it on top of the OOPIF. There is such a case in the mega-test here (click on the "scroller with split items" button), and I'll attach a standalone test case to this bug as well.
In terms of fixing it, we came up with a couple of potential solutions, which I wrote down in this comment.
Reporter | ||
Comment 1•4 years ago
|
||
Depends on D96075
Reporter | ||
Comment 2•4 years ago
|
||
Some more discussion on this topic today, with a variety of possible solutions to explore. The first few are all sub-cases of the second bullet point in bug 1541589 comment 13.
-
One is to ensure that hit-test info items inside the inactive scrollframe have the correct scrollId (that of the inactive scrollframe) with no other changes. In particular this would leave the ASRs untouched, the WebRenderScrollData untouched, and the APZC tree untouched. However, when APZ does a hit-test and gets a scrollId that it doesn't recognize, it would treat it the same as an unlayerized scrollframe and leave it in the input queue until a round-trip to the main thread layerized it and created the APZC instance. Implementation-wise, this would involve ensuring that hit-test info items inside an inactive scrollframe have an
mScrollTarget
explicitly set at this point rather than relying on getting the scrollId from the item's ASR. I thought about this approach a bit more as I was writing this down, and I realized that if the APZ tree doesn't have metrics or an APZC instance for the scrollId, it may not know where that scrollframe is within the tree of scrollframes, and may not be able to properly untransform the input event for main-thread dispatch, if e.g. the inactive scrollframe is inside a nested active scrollframe with an async transform. So that's also something to consider. -
Another is to ensure that all inactive scrollframes still create ASR instances, and that items inside the scrollframes use those ASRs. This would automatically ensure that hit-test info items inside the inactive scrollframe have the correct scrollId (i.e., the above bullet point) but would also cause WebRenderLayerScrollData instances corresponding to the ASRs get sent over as part of the WebRenderScrollData. This in turn would cause creation of the APZCs in the APZ tree. Input events targeted at these APZCs would not necessarily wait for a main-thread round-trip (unless there were e.g. APZ-aware listeners there) but would instead go straight to the APZC via the input queue. This would result in the APZC scrolling but immediately checkerboarding because of the lack of displayport. The main-thread version of the input event or the next repaint request would cause the displayport to get set and clear the checkerboard.
https://treeherder.mozilla.org/jobs?repo=try&group_state=expanded&revision=d2c349236e5e9c187eb84951104a6cd7e5d5ad6b is my most recent incomplete WIP for this approach, but it's failing some reftests and I need to investigate why.
- Another variant of the above scenario is that we set the
eRequiresTargetConfirmation
flag on the hit-test info items, so that the input events would wait in the InputQueue until the main-thread round-trip happened, which should eliminate or shorten the checkerboarding. This flag could be set either at display-list building time or tacked on to the result of the hit-test in the compositor. The latter is probably simpler.
Just for completeness, the first bullet point from bug 1541589 comment 13:
- Set a zero-margin displayport on every inactive scrollframe, thereby activating it. The end result of this should be the same as the second bullet above, but the implementation may be simpler as it takes advantage of existing abstractions rather than mucking around with ASRs directly.
Finally, a backup plan:
- Allow out-of-order event dispatch in these rare edge cases. Here, if the input events stamped with the wrong layers id by APZ, the parent process still dispatches it to that (wrong) content process. the content process can determine that the input event falls inside an OOPIF, and can then stamp the input event with the more-correct layers id of the OOPIF and send it back to the parent process to redispatch. This repeats until the input arrives at the correct OOPIF. Note that while a content process is processing an input event, other inputs may arrive and go to the correct target process (e.g. keyboard inputs that go directly to the currently-focused element, without needing to be stamped with a layers id by APZ), so this can result in out-of-order event delivery to an OOPIF. Still, the number cases in which this occurs may be low enough to make it ok.
Reporter | ||
Comment 3•4 years ago
|
||
New WIP: https://treeherder.mozilla.org/#/jobs?repo=try&revision=20682aa8e27981b4c2f2bc114a85c72e4129d07d
I'm expecting this to be green for correctness but I haven't really looked into the performance implications. This patchset basically activates all the scrollframes when WR is enabled, and then updates failing apz mochitests accordingly.
Updated•4 years ago
|
Reporter | ||
Comment 4•4 years ago
|
||
https://treeherder.mozilla.org/perfherder/compare?originalProject=mozilla-central&newProject=try&newRevision=b12972231592ab7eece8701984afe8db016bdcea&framework=1&selectedTimeRange=172800 is the Linux WR talos performance comparison. ~9% regression on tscrollx and tp5o_scroll but otherwise seems ok. So that's not terrible; I suspect the regression is mostly just from extra displaylist items and scroll data making IPC slower.
Updated•4 years ago
|
Assignee | ||
Comment 5•4 years ago
|
||
Looking at the regression it's basically every subtest that has regressed.
testing/talos/talos/tests/scroll/reader.htm from tscrollx is a simple page, the only things that have scrollable overflow are the root element and the scrollable div that presumably the test is scrolling. So I would expect both to be active for this test. So I would expect the patch to be a no-op, but yet we still see a large regression on that page.
Assignee | ||
Comment 6•4 years ago
|
||
I got other weird results like making the change only affect the content process, or only affect the non-content process, didn't move the numbers. And making the change apply to non-wr doesn't move the non-wr numbers at all. So I just disabled the change and pushed that to try and I still get the worse perf numbers. So it looks like the difference is coming purely from some difference when running talos on m-c vs on try. So I'm guessing there is no actual perf regression here, just tscroll/asap mode biting us again. For example bug 1645275 documents how tscroll bit us previously, although I can't tell if that is related or not.
So I will probably just get kats' tests and patches reviewed and land them and see if we get perf alerts.
Assignee | ||
Comment 7•4 years ago
|
||
If we don't have the WantAsyncScroll condition there, then some reftests
(e.g. clipPath-on-outflowElement-01a.html) will fail because even
non-scrollable overflow:hidden divs get activated.
Updated•4 years ago
|
Assignee | ||
Comment 8•4 years ago
|
||
Depends on D99983
Assignee | ||
Comment 9•4 years ago
|
||
Depends on D99984
Comment 10•4 years ago
|
||
Timothy, are you working on this scrollframe bug? Do you have an ETA for addressing the review comments from Phabricator?
Assignee | ||
Comment 11•4 years ago
|
||
Yeah, I'm working on it! Addressing the review comments really spiralled out into a much bigger patch stack. The patches are all written (approach approved by the eventually reviewer) they pass try now, but I'm working on a few test issues. Writing tests are always a nightmare and always takes at least 2x longer than debugging and writing a patch for the original issue it seems. I finally got the test working okay in bug 1689492 over the weekend (not linked here, but needed to so that my patch stack has a reasonable mental model that I can work with).
Comment 12•4 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #11)
The patches are all written (approach approved by the eventually reviewer) they pass try now, but I'm working on a few test issues. Writing tests are always a nightmare and always takes at least 2x longer than debugging and writing a patch for the original issue it seems. I finally got the test working okay in bug 1689492 over the weekend (not linked here, but needed to so that my patch stack has a reasonable mental model that I can work with).
Sounds good. I totally understand about tests taking longer than the patch itself. :)
I'm the EPM for the Fission project. If you get blocked or need anything, just let me know!
Assignee | ||
Comment 13•4 years ago
|
||
We want this for hit testing purposes.
Depends on D103859
Comment 14•4 years ago
|
||
Timothy, your patches (after botond's r+) can land without the new tests in bug 1689492, correct?
Adding new tests should follow right after but should not be a M6c blocker.
Assignee | ||
Comment 15•4 years ago
|
||
Bug 1689492 is already landed (it's a bug to land a test).
I'll land the patches in the dependant bug (bug 1690433) shortly.
Before flipping the pref here I need to modify a few tests slightly otherwise they will fail. (The tests were originally modified to expect the new behaviour always, whereas I'm only flipping the pref to enable the new behaviour with fission.) I'll change the tests more thoroughly after landing the pref flip.
Assignee | ||
Comment 16•4 years ago
|
||
Depends on D99985
Comment 17•4 years ago
|
||
Comment 18•4 years ago
|
||
Assignee | ||
Comment 19•4 years ago
|
||
Depends on D104176
Comment 20•4 years ago
|
||
Comment 21•4 years ago
|
||
Backed out 4 changesets (bug 1675547) for bc failures in browser_test_group_fission.js.
https://hg.mozilla.org/integration/autoland/rev/435e31de6c0ec8e3c1c61a430f9c6181a732ceae
Push with failures:
https://treeherder.mozilla.org/jobs?repo=autoland&revision=5064e7de4dcb3c75f80ee2fd7379c8aa724e0b77&selectedTaskRun=OYBHUfK5SC2h7T5c4q2c2g.0
Failure log:
https://treeherder.mozilla.org/logviewer?job_id=328968468&repo=autoland&lineNumber=3447
Comment 22•4 years ago
|
||
Comment 23•4 years ago
|
||
Also saw the following intermittent showing up with the backed out changes: https://treeherder.mozilla.org/logviewer?job_id=328969509&repo=autoland&lineNumber=12438
Updated•4 years ago
|
Comment 24•4 years ago
|
||
Backout merged: https://hg.mozilla.org/mozilla-central/rev/fa3b003cee3f
Updated•4 years ago
|
Assignee | ||
Comment 25•4 years ago
|
||
The failure in helper_check_dp_size.html is interesting. This only happens on the M-fis, so all scroll frames get a minimal display port at least. We call promiseMoveMouseAndScrollWheelOver in the test to activate an inner div. AsyncPanZoomController::HandleEndOfPan gets called (assuming from OnPanEnd). It calls FlushRepaints() on the overscroll handoff chain. This calls RequestContentRepaint on the outer div. The resulting UpdateSubFrame call does not clear the minimal display port property because we already have a display port (it happens to be minimal). Then PrepareForSetTargetAPZCNotification is called to activate the inner div, it sets a zero margin display port on all ancestors including the outer div, and hence removes the minimal display port property. The outer div ends up with a zero margin display port (but not minimal). This is the normal case. When the test fails these two things happen in reversed order: PrepareForSetTargetAPZCNotification removes the minimal display port property, and then UpdateSubFrame is called, which sets non-zero margins for the display port.
If we are in a situation where all scroll frames are not activated we avoid this entirely because apz doesn't know about the outer div scroll frame, so there is no RequestContentRepaint happening.
Assignee | ||
Comment 26•4 years ago
|
||
Possible actions to take:
- change SetZeroMarginDisplayPortOnAsyncScrollableAncestors to never clear a minimal display port property
- switch zero margin display ports to minimal display ports (I have a patch for this already), it would also fix bug 1691160
- divide all RequestContentRepaint calls into those that want to set a displayport and those that don't and send that in the repaint request and have the apz callback helper look at that field (I have a patch for this already)
- have UpdateSub/RootFrame in apz callback helper never replace a zero margin display port with a non-zero margin display port.
number 4 feels kind of hacky.
number 3 is appealing because it could potentially eliminate other sources of setting a full display port when we don't want. As we saw in bug 1687927, comment 8 avoiding setting a full display port is a nice and significant perf win.
number 2 I was planning on doing, but wanted to wait until fall out of this and related patches was clear.
number 1 seems like a nice, good, small idea to do, and doesn't preclude us from doing any of the other options or other potential fixes I haven't thought of on this list.
Assignee | ||
Comment 27•4 years ago
|
||
Bug 1691186 filed for number 1 above with patch.
Comment 28•4 years ago
|
||
Comment 29•4 years ago
|
||
Comment 30•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/10606ca6f128
https://hg.mozilla.org/mozilla-central/rev/213de2637a7a
https://hg.mozilla.org/mozilla-central/rev/aafff5935628
https://hg.mozilla.org/mozilla-central/rev/64ddd35a8fc0
Updated•4 years ago
|
Assignee | ||
Comment 31•4 years ago
|
||
Filed bug 1691878 to more better test this code in some existing tests, which was mentioned in review comments.
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Description
•