Closed Bug 1526094 Opened 6 years ago Closed 5 years ago

Validate WebRender performance in Release 67

Categories

(Data Science :: Experiment Collaboration, task, P1)

Points: 3

Tracking


RESOLVED FIXED
Product Support Area (PSA) Platform
Tracking Status
data-science-status --- Resolved

People

(Reporter: tdsmith, Assigned: tdsmith)


Details

Brief Description of the request: The WR team expects to turn on WR by default (for supported hardware) in release 67. We're interested in holding back a set of users in release 67 for a few weeks to validate the performance wins.
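
As a rough illustration of what a holdback like this looks like operationally, the sketch below shows the shape of a two-branch preference experiment: one branch keeps the new WebRender-on default, the other is held back on the old compositor for comparison. This is only a sketch; every identifier, pref name, ratio, and targeting string below is a placeholder of mine, and the actual recipe and targeting for this rollout are not specified in this bug.

```python
# Illustrative shape of a WebRender holdback as a two-branch preference experiment.
# All names here are placeholders; the real deployment recipe is not described in this bug.
webrender_holdback = {
    "slug": "pref-webrender-release-67-holdback",      # hypothetical experiment slug
    "preference": "gfx.webrender.example-holdback",    # placeholder pref name
    "branches": [
        # Enrolled clients in this branch keep the new 67 default: WebRender enabled.
        {"slug": "default-enabled", "value": True, "ratio": 1},
        # The held-back branch runs the old compositor for a few weeks so crash and
        # performance telemetry can be compared between the two populations.
        {"slug": "holdback-disabled", "value": False, "ratio": 1},
    ],
    # Restricted to hardware that qualifies for WebRender in release 67
    # (placeholder expression; the enrollment fraction comes from the power analysis).
    "targeting": "webrender_qualified_hardware",
}
```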

Any timelines for the request or how this fits into roadmaps: This should run during the 67 cycle.

Links to any assets (e.g. start of a PHD, BRD; any document that helps describe the project):

Assignee: nobody → tdsmith
Status: NEW → ASSIGNED

Note that the plan for release 67 is to do a gradual rollout; see bug 1541488. Not sure if that would impact this bug at all.

Corey, can I trouble you for an experiment design review?

The structure is essentially the same as the 66 experiment that you reviewed in Bug 1521626. The only difference is the context; WR was off by default in 66 and it will be on by default for eligible users in 67. We'd like to hold back a group in order to measure and document WR performance.

The population calculation for this study and the endpoints (described in Bug 1526041) are the same as for 66. Based on our experience with the deployment for the 66 experiment, I'll ask for the larger 5% sample that we ended up requiring in the last study.

Flags: needinfo?(cdowhygelund)

Tim, sure thing. I will review Monday/Tuesday of next week, if that works.

Experiment design review checklist:

What is the goal of the effort the experiment is supporting?

The use of WebRender as a rendering solution for Firefox. WebRender has many desirable qualities and has been validated by two previous experiments. This experiment adds further validation as the feature is rolled out in 67.

Is an experiment a useful next step towards this goal?

Yes, because it provides an estimate, ahead of full feature rollout, of whether WebRender is performant and stable relative to the existing Firefox rendering solution.

What is the hypothesis or research question? Are the consequences for the top-level goal clear if the hypothesis is confirmed or rejected?

  • To validate the previous results, found in two previous experiments on the Release and Beta channels, that WebRender is a stable and performant rendering solution.
  • Yes, the consequences are clear: the experiment will either validate or invalidate WebRender as having acceptable performance and stability for feature rollout.

Which measurements will be taken, and how do they support the hypothesis and goal? Are these measurements available in the targeted release channels? Has there been data steward review of the collection?

  • The measurements being taken are as follows:
    No more than a 5% increase in overall crash reports
    No more than a 5% increase in OOM crash reports
    No more than a 5% increase in shutdown crashes
    Telemetry probes:
    CANVAS_WEBGL_SUCCESS - no more than 5% regression in "True" value
    DEVICE_RESET_REASON - no more than 5% regression in number of submissions
    CHECKERBOARD_DURATION - no more than 5% regression in distribution
    CHECKERBOARD_PEAK - no more than 5% regression in distribution
    CHECKERBOARD_SEVERITY - no more than 5% regression in distribution
    CONTENT_LARGE_PAINT_PHASE_WEIGHT - no more than 5% regression in number of submissions
    CONTENT_PAINT_TIME - no more than 5% regression in distribution
    FX_PAGE_LOAD_MS - no more than 5% regression in distribution
    FX_TAB_CLICK_MS - no more than 5% regression in distribution
    COMPOSITE_TIME - no more than 10% regression in distribution
    CONTENT_FRAME_TIME - no more than 10% regression in distribution
    COMPOSITE_FRAME_ROUNDTRIP_TIME - expect to see an improvement here
  • These metrics measure rendering performance and stability, thereby supporting the hypothesis (see the regression-check sketch after this list).
  • These measurements are all expected to be available in the release channel (though it is too much effort to verify that individually for each probe).
  • At this time, there has not been a data steward review of the collection (such a review is optional).
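
For concreteness, the "no more than X% regression" criteria above can be read as a threshold on the relative change of a per-client summary statistic between the two branches. Below is a minimal sketch of that check, assuming per-client means for a single lower-is-better probe (e.g. CONTENT_PAINT_TIME) have already been pulled from telemetry; the data is synthetic and the function names are my own, not part of the actual analysis code.

```python
import numpy as np

def relative_change(enabled, holdback, statistic=np.median):
    """Relative change of a summary statistic in the WebRender-enabled branch
    versus the held-back (WebRender-disabled) branch. For lower-is-better
    probes such as CONTENT_PAINT_TIME, a positive value is a regression."""
    enabled = np.asarray(enabled, dtype=float)
    holdback = np.asarray(holdback, dtype=float)
    base = statistic(holdback)
    return (statistic(enabled) - base) / base

def regressed(enabled, holdback, threshold=0.05):
    """True if the enabled branch is worse than the holdback by more than the
    threshold (5% by default; 10% for the frame-time probes listed above)."""
    return relative_change(enabled, holdback) > threshold

# Hypothetical per-client means, standing in for one probe's telemetry pull.
rng = np.random.default_rng(0)
holdback_sample = rng.gamma(shape=2.0, scale=4.0, size=10_000)
enabled_sample = rng.gamma(shape=2.0, scale=3.8, size=10_000)
print(regressed(enabled_sample, holdback_sample))  # False here: a slight improvement
```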

Is the experiment design supported by an analysis plan? Is it adequate to answer the experimental questions?

Yes, the analysis plan follows the previous experiments. In addition, the plan increases the sample size relative to the earlier Release experiment, based on the deployment behavior and sample sizes observed there.

Is the requested sample size supported by a power analysis that includes the core product metrics?

Yes, it is the same as for the previous study; in addition, statistics acquired from that similar previous study are used to calculate the requisite sample size.
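
For reference, a back-of-the-envelope version of this kind of calculation for a single binary endpoint (such as the fraction of clients submitting a crash report) can be done with statsmodels. The baseline rate below is an illustrative placeholder, not a number from the actual analysis plan.

```python
# Sketch of a two-proportion sample-size calculation with statsmodels.
# The baseline crash-reporting rate and target power are illustrative assumptions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.02               # hypothetical baseline crash-reporting rate per client
regressed = baseline * 1.05   # the 5% relative increase the study should detect

effect_size = proportion_effectsize(regressed, baseline)  # Cohen's h
clients_per_branch = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,
    power=0.8,
    alternative="larger",
)
print(f"~{clients_per_branch:,.0f} clients per branch")
```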

If the experiment is deployed to channels other than release, is it acceptable that the results will not be representative of the release population?

Not applicable - experiment is deployed on release.

Flags: needinfo?(cdowhygelund)

:tdsmith All looks good. I am curious as to the mention of FX_PAGE_LOAD_MS versus FX_PAGE_LOAD_MS_2. Is this a typo or intended?

Unintended! That's old text; thanks for flagging it. The "real" list of endpoints is the table in https://bugzilla.mozilla.org/show_bug.cgi?id=1526041#c2.

Thanks for the review!

data-science-status: --- → Data Acquisition
Priority: P3 → P1

Experiment has concluded and I'm drafting the report. This was very similar to the 66 experiment (and results were similar).

data-science-status: Data Acquisition → Evaluation & interpretation
data-science-status: Evaluation & interpretation → Peer Review

Merged! Thanks, Saptarshi.

Status: ASSIGNED → RESOLVED
Product Support Area (PSA): --- → Platform
data-science-status: Peer Review → Resolved
Closed: 5 years ago
Resolution: --- → FIXED