Open Bug 1822630 Opened 1 year ago Updated 10 months ago

Distribute tests for WebGPU CTS better

Categories

(Core :: Graphics: WebGPU, defect, P1)

People

(Reporter: ErichDonGubler, Assigned: ErichDonGubler)

References

(Blocks 3 open bugs)

Details

Our current vendoring of gpuweb/cts is far from perfect WRT the tests we generate for CI. There are two related pain points we'd like to resolve with this bug:

  1. Chunked test files contain a fixed number of tests, evenly divided from the set generated by CTS upstream scripting. Each individual chunk requires its own set of metadata for the tests it contains. Together, these facts create a particularly painful problem: when we vendor in updates to the CTS, additions and removals of tests cause the following:

    1. Lines have to be moved between the chunk that new tests land in and every subsequent chunk. This generates a high volume of diff noise, making it difficult for a human to visually identify tests that have, in fact, been added or removed (rather than merely moved between chunks).
    2. Expectation metadata for tests must also be manually moved to match the generated changes, because of the above. This is tedious, because a high percentage of tests still fail, ATOW.
  2. :ErichDonGubler's initial stab at WebGPU CTS left an action item for improving the distribution of task times per :jmaher's request, and we'd like to honor that here. To wit, the WPT tests we generate for WebGPU CTS (viz., wpt* jobs in Taskcluster runs like these) need to:

    1. Stay within 50 minutes of execution time for optimized builds.
    2. Stay within 30 minutes of execution time for debug builds.
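
To illustrate the first pain point, here is a minimal Python sketch of fixed-size ("linear") chunking. It is not the actual vendoring script; the test names, the chunk size of 4, and the helpers `chunk_linear` and `moved_tests` are all hypothetical. It only shows how a single upstream addition can shift existing tests across chunk boundaries.

```python
def chunk_linear(tests, chunk_size):
    """Split an ordered list of tests into fixed-size chunks."""
    return [tests[i:i + chunk_size] for i in range(0, len(tests), chunk_size)]


def moved_tests(before, after):
    """Return tests present in both chunkings whose chunk index changed."""
    old_index = {t: i for i, chunk in enumerate(before) for t in chunk}
    new_index = {t: i for i, chunk in enumerate(after) for t in chunk}
    return sorted(
        t for t in old_index
        if t in new_index and old_index[t] != new_index[t]
    )


# Hypothetical test names, in the sorted order a generator might emit.
old_tests = [f"test_{n:02d}" for n in range(12)]
# Simulate upstream adding a single test near the front of the list.
new_tests = sorted(old_tests + ["test_01a"])

before = chunk_linear(old_tests, 4)
after = chunk_linear(new_tests, 4)

# Existing tests that changed chunks even though only one test was added.
print(moved_tests(before, after))
```

In this toy case one added test pushes one existing test out of every later chunk boundary, and each moved test would need its expectation metadata moved as well.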
Assignee: nobody → egubler
Assignee: egubler → nobody
Severity: -- → S3
Flags: needinfo?(jimb)
Assignee: nobody → egubler
Status: NEW → ASSIGNED

Discussed the way we might resolve this a bit with :jgilbert, :teoxoy, and :jimb, since it's become relevant with :jgilbert's work on bug 1831263. Our tentative direction is to use a hybrid manual-automatic WPT test chunking approach based on the upstream tree structure, instead of the “linear” chunking we do. The things we're thinking of getting with it:

  • Hand-pick a set of test paths that are relatively fast smoke-level tests, much like WebGL's “core” tasks in CI are currently structured. This is intended to allow folks to quickly triage whether or not they've broken something fundamental.
    • Maybe split the tests b/w API and shaders?
  • “auto”-chunk everything else into CTS test files based on, say, unique path components to a depth of 5, i.e., webgpu:api,operation,adapter,requestDevice,* and all other tests underneath it become their own WPT test file.
  • When a set of tests from the above is taking too long, we split it, and add task chunks as necessary. We feel that it should be much easier to understand and resolve longer running sets of tests when they're divided conceptually, so this seems tractable to plan on doing.
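
The “auto”-chunking idea above could be sketched roughly as follows. This is a hedged Python illustration, not the vendoring tool's implementation. It assumes test queries have the shape `suite:comma,separated,path:test:params`, counts the suite name as the first path component (so that depth 5 yields `webgpu:api,operation,adapter,requestDevice,*`, as in the example above), and uses illustrative test queries and hypothetical helper names (`group_key`, `auto_chunk`).

```python
from collections import defaultdict


def group_key(query, depth=5):
    """Return the wildcard prefix for `query`, truncated to `depth` components.

    Assumption: the suite name (e.g. "webgpu") counts as the first component.
    """
    suite, _, rest = query.partition(":")
    file_path = rest.split(":", 1)[0]
    components = [suite] + [c for c in file_path.split(",") if c]
    prefix = components[:depth]
    return f"{prefix[0]}:{','.join(prefix[1:])},*"


def auto_chunk(queries, depth=5):
    """Group test queries by their prefix; each group becomes one test file."""
    groups = defaultdict(list)
    for query in queries:
        groups[group_key(query, depth)].append(query)
    return dict(groups)


# Illustrative queries only; not an accurate listing of upstream tests.
queries = [
    "webgpu:api,operation,adapter,requestDevice:default:*",
    "webgpu:api,operation,adapter,requestDevice:limits:*",
    "webgpu:api,operation,adapter,requestAdapter:basic:*",
    "webgpu:api,validation,encoding,cmds,copyBufferToBuffer:sizes:*",
    "webgpu:api,validation,encoding,cmds,clearBuffer:sizes:*",
]

groups = auto_chunk(queries)
for key, members in groups.items():
    print(key, len(members))

# One possible way to split a group that runs too long: regroup only its
# members at a greater depth, leaving every other group untouched.
slow = groups["webgpu:api,validation,encoding,cmds,*"]
print(sorted(auto_chunk(slow, depth=6)))
```

Because each group is keyed by its path prefix rather than by its position in a list, adding or removing a test upstream only touches the group that owns that prefix. Regrouping at a greater depth is just one mechanical option for the split step; picking split points by hand would fit the same structure.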

When a set of tests from the above is taking too long, we split it. We feel that it should be much easier to understand and resolve longer running sets of tests when they're divided conceptually, so this seems tractable to plan on doing.

This sounds fantastic to me.

Removing from current work queue, since I haven't been actively working on this recently.

Assignee: egubler → nobody
Status: ASSIGNED → NEW
Priority: -- → P3
Blocks: 1834558
Blocks: webgpu-phase-2
No longer blocks: webgpu-v1-cts-blockers
Assignee: nobody → egubler
Blocks: webgpu-v1
No longer blocks: webgpu-phase-2
Priority: P3 → P1