Closed Bug 1755006 Opened 3 years ago Closed 3 years ago

Worker events can be starved by refresh driver

Categories

(Core :: Layout, defect)

defect

Tracking

()

RESOLVED FIXED
100 Branch
Tracking Status
relnote-firefox --- 100+
firefox-esr91 --- wontfix
firefox97 --- wontfix
firefox98 --- wontfix
firefox99 --- wontfix
firefox100 --- fixed

People

(Reporter: jrmuizel, Assigned: smaug)

References

(Blocks 1 open bug, Regressed 1 open bug, Regression)

Details

(Keywords: perf-alert, regression)

Attachments

(6 files)

Attached file test.html (deleted) —

In bug 1749365 we're seeing a laggy volume control on Twitch caused by what I believe is high latency in receiving events from a worker.

I've created a test case that shows this starvation quite dramatically compared to Safari and Chrome.

Blocks: 1749365
Regressed by: 1300658

Set release status flags based on info from the regressing bug 1300658

Has Regression Range: --- → yes

asuth, does this look like a worker-side issue to you?

Flags: needinfo?(bugmail)

The problem as I understand it here is that postMessage() from workers to their parent are explicitly dispatched with a normal priority via a ThrottledEventQueue and this is intentionally structured this way because of bug 1300659 where the concern was a worker being able to run faster than the main thread and spam it so much it couldn't make forward progress. Whereas vsync is dispatched as its own vsync priority class which is higher than Normal and also the MediumHigh workers use for other types of events.

The general options would seem to be:

  • The refresh driver needs to try and do something to scale to leave more slack in the system. This seems tricky because I think the granularity is largely a question of dropping frames or not.
  • Worker-to-parent event prioritization could be more dynamic using a token bucket or similar so that if a worker is only sending a small number of message per second, those could be re-targeted to be medium-high. But in the event of spamming of postMessage like in bug 1300659, a steady-state would be reached where the worker drops down to normal priority again.
    • Do note that there are potential event ordering issues, but that the use of ThrottledEventQueue helps mitigate this because it internally buffers dispatched events and so can be re-targeted to different underlying queues without too much trouble.

This ends up being more of a :smaug question, though I think?

Flags: needinfo?(bugmail) → needinfo?(bugs)
Assignee: nobody → bugs
Component: DOM: Events → Layout
Flags: needinfo?(bugs)

The tracing version of this test case makes it easier to determine the behaviour of other browsers.

A basic summary of the results are:
Firefox tries keep running at 60Hz
Chrome drops the frame rate down to 8.3Hz (120ms)
Safari doesn't throttle events from the worker at all

Attached file chrome.json (deleted) —
Attached file safari.json (deleted) —
Attached file firefox.json (deleted) —
Attachment #9266046 - Attachment description: WIP: Bug 1755006, reduce framerate if the main thread is busy handling also other tasks → Bug 1755006, reduce framerate if the main thread is busy handling also other tasks, r=mstange
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d3f40b4651f5 reduce framerate if the main thread is busy handling also other tasks, r=mstange

Backed out changeset d3f40b4651f5 (Bug 1755006) for causing mochitest failures on test_pointerlock-api.html.
Backout link
Push with failures - 4
Failure Log
dt3 Failure Log

Flags: needinfo?(bugs)
Depends on: 1701097
Depends on: 1357082
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0211ed711a1c reduce framerate if the main thread is busy handling also other tasks, r=mstange
Flags: needinfo?(bugs)
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 100 Branch
Status: RESOLVED → REOPENED
Flags: needinfo?(bugs)
Resolution: FIXED → ---
Target Milestone: 100 Branch → ---
Flags: needinfo?(bugs)

Hello, this(Bug 1655139) also turned into high frequency starting from this bug which got backed out. Can you please look at this too?

Flags: needinfo?(bugs)
Depends on: 1759533
Blocks: 1759581
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cd8d7e03fcb5 reduce framerate if the main thread is busy handling also other tasks, r=mstange
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 100 Branch
Regressions: 1760007
Keywords: perf-alert
Regressions: 1762090
Blocks: 1759533
No longer depends on: 1759533

Release Note Request (optional, but appreciated)
[Why is this notable]: Performance of the volume slider on Twitch is much better than before
[Affects Firefox for Android]:
[Suggested wording]: Improve the fairness between painting and handling other events. This noticeably improves the performance of the volume slider on Twitch.
[Links (documentation, blog post, etc)]: Before and after videos here: https://jrmuizel.github.io/twitch/volume.html

relnote-firefox: --- → ?
Regressions: 1768308

Clearing the leftover needinfo

Flags: needinfo?(smaug)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: