Closed Bug 1653796 Opened 4 years ago Closed 4 years ago

sample times for apz animations non-monotonic with webrender

Categories

(Core :: Graphics: WebRender, defect, P3)

defect

Tracking

()

RESOLVED FIXED
81 Branch
Tracking Status
firefox81 --- fixed

People

(Reporter: tnikkel, Assigned: kats)

References

Details

Attachments

(2 files)

With the patches for bug 1651332 (although the patches for that bug don't seem to create this problem this is how it is easier to observe the problem) we frequently get large non-monotonic jumps in the last sample time for apz animations, for example:

0x13db6f000 AsyncPanZoomController::StartAnimation mLastSampleTime 13.079496
0x13db6f000 AsyncPanZoomController::UpdateAnimation mLastSampleTime 11.261767
0x13db6f000 AsyncPanZoomController::UpdateAnimation mLastSampleTime 13.096266

(where the number output is the time between mLastSampleTime and the creation of that particular AsyncPanZoomController)

mLastSampleTime comes from two different sources, either GetFrameTime() or WebRenderBridgeParent::SetAPZSampleTime. The WebRenderBridgeParent::SetAPZSampleTime ones seem to be the problem.

WebRenderBridgeParent::SetAPZSampleTime sets the apz sample time to the last compose time plus the vsync interval. That seems suspicious if the last compose time is further in the past because we didn't need to compose.

kats, do you know about this? Where the problem might be?

Flags: needinfo?(kats)

I don't know offhand. The timestamp/frame scheduling stuff with WR is a bit complicated and I haven't looked at it for about a year, but I need to page it back in to deal with bug 1641070. I can look into this bug too, but can you clarify the STR a bit? Is it just (a) apply patches from bug 1651332, (b) run with allow_zooming=true, (c) zoom in some, and (d) click on scrollbar track to page-scroll? Or are there other steps involved?

Those are correct steps, in fact you don't even need step (c). It doesn't happen every time though.

Blocks: wr-apz
Severity: -- → S3
Priority: -- → P3

I'm not sure, but this might be annoying enough to block turning on the new scrollbar code again.

I'll start looking into this.

Assignee: nobody → kats
Flags: needinfo?(kats)

(In reply to Timothy Nikkel (:tnikkel) from comment #1)

WebRenderBridgeParent::SetAPZSampleTime sets the apz sample time to the last compose time plus the vsync interval. That seems suspicious if the last compose time is further in the past because we didn't need to compose.

Looking at the code, the SetAPZSampleTime call happens here in MaybeGenerateFrame, and there are two callers of MaybeGenerateFrame. One is CompositeToTarget which gets called from CompositorVsyncScheduler::Composite and sets the last compose time. The other is FlushFrameGeneration which also sets the last compose time. mLastComposeTime is poorly named for sure, but it seems like we do set it to a reasonably recent timestamp before we call SetAPZSampleTime.

Ah, the problem is that there can be a delay between the MaybeGenerateFrame bit (which sets the compose time) and actually generating the frame (which triggers the call to SampleForWebRender and updates animations). It seems to me like there's a spurious frame generation happening at the start of the animation and so that comes in with a stale timestamp which throws everything off.

I traced this a bit further into WR and it looks like this transaction gets sent from Gecko:

send_transaction with threaded=true, genframe=false, invalidate=false, low_priority=true
		SceneMsg::SetDisplayList
		ResourceUpdate::UpdateImage size(32x1474)
		ResourceUpdate::UpdateImage size(32x189)

which then ends up here with the has_built_scene condition true and WR ends up generating a frame even though Gecko didn't explicitly request one. This might be something to fix, but for now it's probably better to make the downstream code more robust so it doesn't do bad things in this scenario.

I see that Hiro added a ResetPreviousSampleTime method that looks like it's addressing something similar. Maybe I should piggyback on that.

The ResetPreviousSampleTime approach didn't work out, it wasn't getting called in all the right scenarios. I have another fix that is more localized and makes sense to me and fixes the problem.

Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/33cf94c2936b
Eliminate non-monotonic sample times at the start of APZ animations with WR. r=tnikkel

Oh yeah, I guess we shouldn't override the sample time if it is coming from a test setup.

Flags: needinfo?(kats)

This makes the sample time more "strongly typed" by using a variant-style
class instead of a raw TimeStamp. The sample time can come from different
sources (vsync, testing, Now()) and it is useful to save that information
as we pass around the sample time. It would have been helpful with some
of the debugging I did recently, and it also is needed for the next patch.

Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2066c853eb63
Introduce a SampleTime class. r=botond
https://hg.mozilla.org/integration/autoland/rev/3aecd9a7137e
Eliminate non-monotonic sample times at the start of APZ animations with WR. r=tnikkel
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 81 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: