Closed
Bug 1166414
Opened 9 years ago
Closed 9 years ago
SETA does not retrigger for across the board build bustage
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 1173822
People
(Reporter: kmoir, Unassigned)
Details
Possible fixes:
1) make it easy for the sheriffs to "run all coalesced jobs" (via arbitrary build api, mozci/treeherder) when the build failures are fixed
2) inspect build results and, if a build failed, retrigger with coalescing disabled
IRC conversation from today:
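A minimal sketch of what fix 2 might look like as a scheduling policy. Nothing here is existing buildbot or SETA code: the function names `should_coalesce` and `plan`, and the result labels "success", "busted" and "exception", are assumptions made for illustration only. The idea is that coalescing is switched off for the push that follows a failed build, so the tests run in full there.

```python
# Hypothetical labels for failed build results (assumption, not a real API).
FAILED_RESULTS = {"busted", "exception"}


def should_coalesce(previous_build_result):
    """Allow coalescing only if the previous build did not fail."""
    return previous_build_result not in FAILED_RESULTS


def plan(build_results):
    """Return one coalescing decision per push, in push order.

    Each decision is based on the build result of the push before it;
    the first push is treated as following a successful build.
    """
    decisions = []
    previous = "success"
    for result in build_results:
        decisions.append(should_coalesce(previous))
        previous = result
    return decisions
```

For example, `plan(["success", "busted", "success"])` disables coalescing only for the third push, the one that comes right after the busted build.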
jmaher kmoir: RyanVM|sheriffduty brings up a valid concern with SETA; specifically if we force coalescing for 9 pushes, what happens if the 10th push is a DONTBUILD or we have a build failure
kmoir jmaher: hmm, my understanding would be that it wouldn't run another one to account for the failure.
kmoir jmaher: it's just at the level of scheduling, not inspecting results and then changing scheduling
jmaher kmoir: assuming we don't have a valid build for the 10th run, would the 11th run attempt it, and would we continue to attempt it until we had a valid build to schedule?
jmaher kmoir: sounds like we could get into a state where we don't run some jobs for 20 or 30 pushes
kmoir jmaher: I don't think so, but I could do some testing.
kmoir jmaher: you mean if there are multiple failures on each of the pushes where the job would run
jmaher kmoir: ok- thanks; is there anything I could do to help test this out
RyanVM|sheriffduty kmoir: yes, say the 10th push is a DONTBUILD
RyanVM|sheriffduty or across the board bustage
jmaher kmoir: yes- assuming the 10th push was a DONTBUILD and the 18-23rd pushes are full of build failures
jmaher mshal: I honestly have no idea
kmoir jmaher: is that a likely scenario?
RyanVM|sheriffduty kmoir: across the board bustage is
jmaher kmoir: we have a lot of build failures on the tree; dontbuild is not that often, but a couple times a day
jonasfj Note, I have no idea what I'm talking about
RyanVM|sheriffduty kmoir: it's what got me thinking about it today
RyanVM|sheriffduty kmoir: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=build
RyanVM|sheriffduty add another 20 pushes to get a feel for what today's been like
kmoir looks
kmoir jmaher RyanVM: so my thinking is that it wouldn't reschedule them the way it is implemented now, would have to do some testing to confirm
kmoir this was not a scenario I looked at when testing
jmaher kmoir: do you see the concern that RyanVM|sheriffduty has with build failures
kmoir jmaher: yes, definitely
jmaher kmoir: open to ideas and helping where possible to reduce concerns with that
kmoir jmaher: the only thing I can think of is to inspect the results of the last run and if there is a failure, do not coalesce, not sure how to implement this yet
jmaher kmoir: another option is to make it easy for the sheriffs to "run all coalesced jobs" (via arbitrary build api, mozci/treeherder) when the build failures are fixed
kmoir jmaher: okay I'll open a bug and we can discuss the way forward there
jmaher kmoir: cool
Comment 1•9 years ago
I think an easy fix here would be to have the sheriffs click a button and fill all the jobs for a given push; this would be useful when the builds are passing. The problem here is we would need this button to be pushed AFTER all the builds complete for the given revision.
Maybe there is a simple hack in buildbot.
Comment 2•9 years ago
Pretty sure that if the 10th build is DONTBUILD, we would test on the 11th. This is because we're not actually counting pushes; we're counting sendchanges from the builds to trigger tests. If we don't get to the point of running sendchange for whatever reason, then that push doesn't count towards the pending test count.
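The counting behaviour described in this comment can be sketched as below. This is only an illustrative model, not the actual buildbot or SETA implementation: the class `CoalescingScheduler`, the method `on_push`, the helper `simulate`, and the default threshold of 10 are all hypothetical. It assumes a coalesced test job runs on every Nth sendchange, and that a push which never reaches the sendchange step (DONTBUILD or a failed build) leaves the counter untouched.

```python
class CoalescingScheduler:
    """Hypothetical model: run a coalesced test job on every Nth sendchange."""

    def __init__(self, skip_count=10):
        self.skip_count = skip_count
        self.pending = 0  # sendchanges seen since the job last ran

    def on_push(self, build_ok):
        """Return True if the test job runs for this push.

        build_ok is False for DONTBUILD pushes and failed builds. Those
        never issue a sendchange, so they are not counted at all.
        """
        if not build_ok:
            return False
        self.pending += 1
        if self.pending >= self.skip_count:
            self.pending = 0
            return True
        return False


def simulate(pushes, skip_count=10):
    """Return the 1-based push numbers on which the test job runs."""
    scheduler = CoalescingScheduler(skip_count)
    return [n for n, ok in enumerate(pushes, start=1) if scheduler.on_push(ok)]
```

Under these assumptions, nine good pushes followed by a DONTBUILD and then another good push give `simulate([True] * 9 + [False, True]) == [11]`: the job is deferred to the 11th push rather than skipped for another full cycle.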
Comment 3•9 years ago
That's exactly what I was hoping to hear. Thanks catlee!
Comment 4•9 years ago
Should we close this bug then?
Comment 5•9 years ago
Would we get a sendchange if the entire push was busted across the board?
Reporter | ||
Comment 6•9 years ago
If a push was busted across the board and the build was broken (red), the tests aren't invoked so no sendchange is invoked. So wouldn't it invoke the tests on the next sendchange when the build was fixed?
Reporter | ||
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
Assignee | ||
Updated•7 years ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard