reduce build-plain to either every 10th push or m-c tier-2 only
Categories
(Testing :: General, task, P3)
Tracking
(firefox77 fixed)
 | Tracking | Status |
---|---|---|
firefox77 | --- | fixed |
People
(Reporter: jmaher, Assigned: bc)
References
(Regressed 1 open bug)
Details
(Whiteboard: [ci-costs-2020:done])
Attachments
(2 files, 2 obsolete files)
We run Linux/Windows debug build-plain on every push to autoland. While these builds do not cost us many CPU hours, running them on every commit is unnecessary.
In the ~6 months of data we have in BigQuery, there are 272 revisions where Bp jobs fail (most fail on both Linux and Windows), and only 6 of those do not have another build failing at the same time. Looking at those 6 revisions, all the failures are intermittent (failure to download something).
The risk of reducing the frequency here is low, as we are not finding plain-only regressions.
Reporter
Comment 1•5 years ago
we should consider valgrind builds as well
Reporter
Comment 2•5 years ago
Valgrind build jobs found 3 regressions, all in January; those are the only ones in the 7 months of data I looked at.
Reporter
Comment 3•5 years ago
In addition, we should reduce the win/aarch64* builds and linux64/aarch64 builds to every 10th push, as we are not running tests for those on autoland.
The icing on the cake would be the opt builds (as we only test on shippable), once we move the few remaining tests that depend on regular opt builds over to shippable.
Assignee
Comment 4•5 years ago
jmaher: Can you clarify the meaning of "reduce ... to either every 10th push or m-c tier-2 only" for me?
We want every 10th push on autoland and want every push on mozilla-central but they should be tier 2 there?
Reporter
Comment 5•5 years ago
We should treat all the builds referenced here like fuzzing builds: forced SETA every 10th push. We might want to reconsider and treat them like some tier-2 perf tests, which run every 25th push, but we don't have that implemented yet. Changing to every 10th push would be a boost in the short term with no risk.
Assignee
Comment 6•5 years ago
I initially looked into this with the idea that it would be very much like the seta fuzzing changes I made earlier, but that requires simultaneous changes to treeherder to support using seta on the specific builds and is a pain to test. This morning I came to a different idea: I could just use a normal schedule without involving seta at all. The advantage is that no treeherder changes are required and testing is very easy. The question is whether we would want every 10th push on all trees (projects) or just autoland. I'll assume autoland and allow builds for every push on mozilla-central. I'll put up a phab in a bit to show what I'm talking about.
Reporter
Comment 7•5 years ago
Yeah, every build for m-c, beta, release, esr, try; this would only apply to autoland.
Assignee
Comment 8•5 years ago
PushIntervalStrategy is modeled on seta's approach to schedule tasks on
every Nth push. It is restricted to the autoland project.
Two strategies "push-interval-10" and "push-interval-25" are defined for
scheduling tasks for every 10th and every 25th push respectively.
Debugging output is available via the --verbose option to mach taskgraph optimized.
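For readers who want the scheduling rule in concrete form, here is a minimal sketch of the every-Nth-push decision described above. It is not the code from the patch: the class name `PushIntervalSketch`, the method `should_remove_task`, and the `STRATEGIES` mapping are hypothetical, and the sketch assumes a task is kept only when the push's `pushlog_id` is divisible by the interval. The real PushIntervalStrategy may apply further conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PushIntervalSketch:
    """Hypothetical stand-in for an every-Nth-push optimization strategy."""

    interval: int
    # The description above says the strategy is restricted to autoland.
    restricted_project: str = "autoland"

    def should_remove_task(self, project: str, pushlog_id: int) -> bool:
        """Return True if the task should be optimized away for this push."""
        if project != self.restricted_project:
            # Other projects (e.g. mozilla-central) keep the task on every push.
            return False
        # Assumption: the task runs only when pushlog_id is a multiple of N.
        return pushlog_id % self.interval != 0


# Hypothetical registry mirroring the two strategy names from the description.
STRATEGIES = {
    "push-interval-10": PushIntervalSketch(10),
    "push-interval-25": PushIntervalSketch(25),
}
```

Under these assumptions, `STRATEGIES["push-interval-10"].should_remove_task("autoland", 101)` removes the task, while the same call with `pushlog_id` 100, or with project `"mozilla-central"`, keeps it.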
Assignee
Comment 9•5 years ago
This patch uses the new push-interval-10 to schedule the linux, windows plain and aarch64
builds on autoland every 10th push.
Tested locally with a local checkout whose pushlog_id was not divisible
by 10 using parameters.yml downloaded from the Gecko Decision Task using
./mach taskgraph optimized --verbose --parameters /tmp/parameters.yml
parameters.yml from autoland showed the following optimizations.
0:56.13 PushIntervalStrategy: Removing task build-linux64-aarch64/opt interval 10
0:56.13 PushIntervalStrategy: Removing task build-linux64-plain/debug interval 10
0:56.13 PushIntervalStrategy: Removing task build-signing-win64-aarch64/opt interval 10
0:56.13 PushIntervalStrategy: Removing task build-win64-aarch64/debug interval 10
0:56.13 PushIntervalStrategy: Removing task build-win64-plain/debug interval 10
0:56.18 PushIntervalStrategy: Removing task valgrind-linux64-valgrind/opt interval 10
while parameters.yml from mozilla-central did not show any PushIntervalStrategy
optimizations.
Depends on D70181
Assignee
Comment 10•5 years ago
Feedback appreciated on the approach and the results before I formally ask for review.
Assignee
Comment 11•5 years ago
Assignee
Comment 12•5 years ago
Depends on D70182
Comment 13•5 years ago
Comment 14•5 years ago
Comment 15•5 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/8808cb9cbff2
https://hg.mozilla.org/mozilla-central/rev/2001c1f52aa0
Comment 16•4 years ago
Does this mean that an unfortunately-timed patch may make it to m-c only to find that a push-interval job breaks later? Or do we have some way of letting the interval jobs catch up before selecting merge candidates?
Reporter
Comment 17•4 years ago
Thanks for asking!
We already run most jobs every 10th push, so this is as safe as any other job. Running every 25th push is closer to what we view as tier-2: a regression will most likely be caught before the merge, but it could miss the timing window and slip through. All the builds in this bug have been adjusted to every 10th push, and those runs will be required to be green before merging to m-c.
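To make the timing argument concrete, here is a small sketch of how many further pushes can land before an every-Nth-push job next runs. The helper `pushes_until_next_run` is hypothetical and assumes the job runs whenever the push's `pushlog_id` is divisible by the interval; actual scheduling on autoland may differ.

```python
def pushes_until_next_run(pushlog_id: int, interval: int) -> int:
    """Further pushes needed before an every-Nth-push job runs again.

    Assumption: the job runs on pushes whose pushlog_id is a multiple of
    `interval`, so a push that is itself a multiple needs 0 further pushes.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Python's % always returns a non-negative result for a positive modulus.
    return (-pushlog_id) % interval


# Worst-case delay over a range of pushes for the two intervals discussed.
worst_case = {
    n: max(pushes_until_next_run(p, n) for p in range(1, 101))
    for n in (10, 25)
}
```

Under this assumption the worst case is `interval - 1` further pushes, so a regression can go undetected for up to 9 pushes at an interval of 10 and up to 24 pushes at an interval of 25.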