Open Bug 1483099 Opened 6 years ago Updated 2 years ago

Reconsider CONTENT_FRAME_TIME metric

Categories

(Core :: Graphics, enhancement, P3)

63 Branch
enhancement

Tracking

()

Tracking Status
firefox63 --- affected

People

(Reporter: jrmuizel, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [gfx-noted])

The current implementation has some problems:
1. Moving time from the content thread to the compositor thread makes the metric worse but potentially improves the user experience.
2. If there's significant work that happens before paint starts the measured time goes down because we're waiting less. e.g. spending an extra 5ms in style improves the metric by 5ms as long as we don't go over the frame budget.
3. The time doesn't include GPU work. If our GPU work causes us to miss a frame CONTENT_FRAME_TIME won't get worse.

I'm not sure what a good solution to all of these problems is.
Blocks: 1481950
(In reply to Jeff Muizelaar [:jrmuizel] from comment #0)
> The current implementation has some problems:
> 1. Moving time from the content thread to the compositor thread makes the
> metric worse but potentially improves the user experience.

The approach of rounding down to an integer number of frames hopefully removes this (except in the cases where the extra compositor time causes us to miss a frame, which I think is the right behavior).

We can also see the reduction in main-thread time independently using CONTENT_PAINT_TIME.

A reduction in CONTENT_PAINT_TIME, with no change to the rounded-CONTENT_FRAME_TIME would be a good win.
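
As a rough illustration of the rounding idea, here is a minimal sketch. The helper name and its millisecond parameters are hypothetical, not existing Gecko code, and the 16 ms interval in the note below is only a convenient round number.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Gecko API): number of whole vsync intervals
// covered by a measured frame time. Both arguments are in milliseconds.
inline int WholeFramesElapsed(double frameTimeMs, double vsyncIntervalMs) {
  assert(vsyncIntervalMs > 0.0);
  // Rounding down discards any time spent inside the final interval.
  return static_cast<int>(std::floor(frameTimeMs / vsyncIntervalMs));
}
```

With a 16 ms interval, a sample of 20 ms and a sample of 23 ms both round to 1, so a shift of 3 ms between threads is invisible. A sample of 33 ms rounds to 2, so a shift that crosses a frame boundary still shows up.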


> 2. If there's significant work that happens before paint starts the measured
> time goes down because we're waiting less. e.g. spending an extra 5ms in
> style improves the metric by 5ms as long as we don't go over the frame
> budget.

It seems likely that this skew would be consistent across Gecko/WR and doesn't matter too much.

We might want to use the vsync timestamp as our start (or two frame intervals after that?) as that's based on when we want to have things ready.
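
To make the skew in point 2 concrete, here is a toy model comparing the two possible start points. Everything in it is an assumption for illustration: the frame begins at a vsync at time 0, pre-paint work such as style runs first, and the result is shown at the first vsync at or after the work completes. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Both measurements for one frame, in milliseconds.
struct FrameTimes {
  double fromPaintStartMs;  // start the clock when paint begins
  double fromVsyncMs;       // start the clock at the vsync timestamp
};

// Toy model (an assumption, not Gecko's real pipeline): vsync fires at t = 0,
// pre-paint work takes prePaintMs, paint plus composite takes paintMs, and
// the frame is presented at the first vsync at or after completion.
inline FrameTimes MeasureFrame(double prePaintMs, double paintMs,
                               double vsyncIntervalMs) {
  assert(vsyncIntervalMs > 0.0);
  const double paintStart = prePaintMs;
  const double done = paintStart + paintMs;
  const double presented =
      std::ceil(done / vsyncIntervalMs) * vsyncIntervalMs;
  return {presented - paintStart, presented};
}
```

With a 16 ms interval and 6 ms of paint, 2 ms of style gives 14 ms from paint start, while 7 ms of style gives 9 ms: the extra 5 ms of style "improves" the paint-start measurement by 5 ms. Measured from vsync, both cases report 16 ms.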

> 3. The time doesn't include GPU work. If our GPU work causes us to miss a
> frame CONTENT_FRAME_TIME won't get worse.

ID3D11Query has support to let us time the GPU work for a given frame. It's less obvious how to align that with our wall-clock based timestamps to figure out when we actually finished.

It's possible that just appending the time (assuming that the GPU work starts when we call Present) would give somewhat useful results.
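
A sketch of what "just appending the time" could look like, under the stated assumption that GPU work starts at the Present call. The function names are hypothetical, and the tick inputs stand in for values that would come from GPU timestamp queries (the ticks-per-second value corresponds to the frequency a D3D11 timestamp-disjoint query reports). No D3D11 calls are made here.

```cpp
#include <cassert>
#include <cstdint>

// Convert a pair of GPU timestamp ticks into milliseconds.
// ticksPerSecond stands in for the frequency reported by the GPU query.
inline double GpuTicksToMs(std::uint64_t startTicks, std::uint64_t endTicks,
                           std::uint64_t ticksPerSecond) {
  assert(ticksPerSecond > 0 && endTicks >= startTicks);
  return static_cast<double>(endTicks - startTicks) * 1000.0 /
         static_cast<double>(ticksPerSecond);
}

// Assumption: GPU work begins exactly when Present is called, so the
// estimated finish is the wall-clock Present time plus the GPU duration.
inline double EstimatedGpuFinishMs(double presentCallMs, double gpuDurationMs) {
  return presentCallMs + gpuDurationMs;
}

// True if the estimated finish lands after the frame deadline.
inline bool LikelyMissedDeadline(double presentCallMs, double gpuDurationMs,
                                 double deadlineMs) {
  return EstimatedGpuFinishMs(presentCallMs, gpuDurationMs) > deadlineMs;
}
```

If the GPU actually starts later than the Present call (for example because it is still busy with earlier work), this estimate would be optimistic, so it is a lower bound at best.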

It looks like GPUView [1] can show timelines, which might prove helpful here.


[1] https://docs.microsoft.com/en-us/windows/desktop/direct2d/profiling-directx-applications
(In reply to Jeff Muizelaar [:jrmuizel] from comment #0)
> The current implementation has some problems:
> 1. Moving time from the content thread to the compositor thread makes the
> metric worse but potentially improves the user experience.

I agree with Matt that the most sensible way to look at this data is rounded down
to the nearest integer, so it represents frame latency. What I wanted was to be
able to answer which implementation misses frames more often.

So I think it might make sense to round down each sample before computing averages
and comparing distributions. That would prevent extra time on the last frame
from biasing the results.
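
A small sketch of the difference between averaging raw samples and averaging per-sample rounded values. The function names are hypothetical, and samples are assumed to be expressed in units of vsync intervals (so 1.9 means 1.9 intervals).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean of the samples after rounding each one down to whole frames.
inline double MeanOfRoundedSamples(const std::vector<double>& samples) {
  if (samples.empty()) return 0.0;
  double sum = 0.0;
  for (double s : samples) {
    assert(s >= 0.0);
    sum += std::floor(s);  // drop the partial final frame of each sample
  }
  return sum / static_cast<double>(samples.size());
}

// Plain mean of the unrounded samples, for comparison.
inline double MeanOfRawSamples(const std::vector<double>& samples) {
  if (samples.empty()) return 0.0;
  double sum = 0.0;
  for (double s : samples) sum += s;
  return sum / static_cast<double>(samples.size());
}
```

For the samples 1.2, 1.9, 1.5 and 2.1, the rounded mean is 1.25 while the raw mean is about 1.675. Only the last sample crossed into a second frame, and the rounded mean reflects just that.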

> 2. If there's significant work that happens before paint starts the measured
> time goes down because we're waiting less. e.g. spending an extra 5ms in
> style improves the metric by 5ms as long as we don't go over the frame
> budget.

That is counter-intuitive, I agree. If we were to start the timing at the vsync
timestamp though, then that work could negatively impact this metric even
though graphics isn't at fault. So neither is truly independent.

It's probably the least counter-intuitive to start this on vsync though.

I also agree with Matt that non-gfx workload should remain comparable in
both configurations.

> 3. The time doesn't include GPU work. If our GPU work causes us to miss a
> frame CONTENT_FRAME_TIME won't get worse.
> 
> I'm not sure what a good solution to all of these problems is.

GPU work is a hard problem; we have a similar problem with OMTP.

We have two probes here, GFX_OMTP_PAINT_WAIT_TIME and gfx.omtp.paint_wait_ratio,
which record the frequency and length of times we block on finishing async
paints. This lets us measure a proxy of how often we fall behind in frame
budgets.

Would it be possible to do something similar with Present()? Or whatever API
call blocks us on GPU work still pending.
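
Here is a sketch of what such a pair of measurements might look like for a blocking present-style call. The class is hypothetical and only loosely modelled on the idea of recording how often and how long we block. It is not how the OMTP probes are implemented, and the caller is assumed to measure the wait itself.

```cpp
#include <cassert>

// Hypothetical accumulator: tracks how often a blocking call stalled
// and for how long in total. Not an existing Gecko class.
class PresentWaitStats {
 public:
  // waitMs: time the blocking call stalled for this frame, in ms
  // (0 if it returned without waiting).
  void RecordFrame(double waitMs) {
    assert(waitMs >= 0.0);
    ++mFrames;
    if (waitMs > 0.0) {
      ++mBlockedFrames;
      mTotalWaitMs += waitMs;
    }
  }

  // Fraction of frames on which we had to wait at all.
  double WaitRatio() const {
    return mFrames ? static_cast<double>(mBlockedFrames) / mFrames : 0.0;
  }

  // Total time spent waiting across all recorded frames.
  double TotalWaitMs() const { return mTotalWaitMs; }

 private:
  unsigned mFrames = 0;
  unsigned mBlockedFrames = 0;
  double mTotalWaitMs = 0.0;
};
```

Recording waits of 0, 4, 0 and 2 ms gives a ratio of 0.5 and a total of 6 ms, which would serve as a proxy for how often GPU work pushes us past the frame budget.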
Priority: -- → P3
Whiteboard: [gfx-noted]
Severity: normal → S3