Closed
Bug 421895
Opened 17 years ago
Closed 14 years ago
Enable users to cancel Try Server builds
Categories
(Release Engineering :: General, defect, P3)
Release Engineering
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jwatt, Assigned: catlee)
References
Details
(Whiteboard: [tryserver][db])
Attachments
(2 files, 1 obsolete file)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review |
If I upload a patch and I see almost immediately that it doesn't work on Mac, I'd like to be able to cancel the other builds so I can try again without waiting hours for the Windows build to complete. This could really shorten the cycle between various attempts to get a patch working on other platforms. As well as a manual button to cancel your build, it would be nice to have an option to have all platforms cancel their builds automatically if even one of the platforms fails, since usually you're going to want to retry a new patch. Might then be good to have the automatically canceled builds turn a color other than red (so you can see which build was the actual failing build).
Updated•17 years ago
|
Assignee: server-ops → nobody
Component: Server Operations → Try Server
Product: mozilla.org → Webtools
QA Contact: justin → try-server
Updated•17 years ago
|
Priority: -- → P3
Updated•15 years ago
|
Component: Try Server → Release Engineering
Product: Webtools → mozilla.org
QA Contact: try-server → release
Target Milestone: Future → ---
Comment 3•15 years ago
|
||
I believe this is wanted for production-master as well. Futuring until we have the resources for this.
Component: Release Engineering → Release Engineering: Future
Comment 4•15 years ago
|
||
Mass move of bugs from Release Engineering:Future -> Release Engineering. See http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Updated•14 years ago
|
Priority: P4 → P5
Assignee | ||
Comment 5•14 years ago
|
||
It's possible to cancel *pending* builds using schedulerdb. Canceling active builds still requires talking directly to the buildbot master.
Whiteboard: [tryserver] → [tryserver][db]
Updated•14 years ago
|
Blocks: try_enhancements
Comment 6•14 years ago
|
||
For cancelling pending builds, it would be nice to be able to hg commit --close-branch hg push -f try
Updated•14 years ago
|
Assignee: nobody → lsblakk
Updated•14 years ago
|
Whiteboard: [tryserver][db] → [tryserver][db][triagefollowup]
Updated•14 years ago
|
Whiteboard: [tryserver][db][triagefollowup] → [tryserver][db]
Assignee | ||
Comment 7•14 years ago
|
||
Will do this as part of buildapi
Assignee: lsblakk → catlee
Depends on: 591351
Assignee | ||
Updated•14 years ago
|
Priority: P5 → P3
Assignee | ||
Comment 8•14 years ago
|
||
Let's start with this. The read-write stuff is almost there, pending actual testing in staging and read/write access to the database.
Attachment #488118 -
Flags: review?(nrthomas)
Comment 9•14 years ago
|
||
Comment on attachment 488118 [details] [diff] [review] read-only bits enabled Lots of great work here cleaning and expanding on what we have now. I mostly have questions and a few suggestions, but a bit much to r+ with changes. >diff --git a/buildapi/controllers/builds.py b/buildapi/controllers/builds.py >+log = logging.getLogger(__name__) >+access_log = logging.getLogger("buildapi.access") What was the strategy for using log vs access_log in the functions in this file ? require_auth might be better using access_log, and the log() in branch() is the odd one out for the r/o functions. >+ def require_auth(self): We just need ldap auth set up in apache to make this work ? When we get to r/w that is. >diff --git a/buildapi/lib/helpers.py b/buildapi/lib/helpers.py > def convert_master(m): >+ fqdn, port = addr_for_master(m) >+ pretty_name = '%s:%i' % (fqdn, port) I was using everything up to the first '.' before, which is useful for pending/running because it's shorter. Are you making that change intentionally ? >diff --git a/buildapi/model/builds.py b/buildapi/model/builds.py >+def requestFromRow(row): >+def buildFromRow(row): >+ request = requestFromRow(row) >+ build = { >+ 'buildid': row.buildid, >+ 'requests': [request], I had thought that including requests in a build blog was duplicating quite a lot of information, but perhaps it's just buildername and branch. >+def getRevision(branch, revision, starttime=None, endtime=None, limit=None): >+ # TODO: Look at changes table too >+ q = getBuildsQuery(branch) For unscheduled jobs ? >+def reprioritizeRequest(who, brid, priority, engine): >+ if r['claimed_at'] != 0: >+ log.info("Request %i is already running, nothing to do", brid) >+ return False >+ if r['complete'] != 0: >+ log.info("Request %i is complete, nothing to do", brid) >+ return False These are the messages end users would want to see. Can we pass them back in the response ? >+ q = br.update().where(and_(br.c.id == brid, br.c.complete == 0)) Surprised there isn't a claimed_at != 0 condition here too. It looks like a guard against the job getting picked up since the first query. >+ res = q.execute(bind=engine) >+ return True What happens if the execute fails, or stalls ? Is there a timeout ? Is there meant to be some handling of res here ? >+def cancelRequest(who, brid, engine): Similar comments as reprioritizeRequest. >+def cancelBuild(who, bid, engine): You could use the r/o connection here for smidgen more safety. >diff --git a/buildapi/tests/functional/test_builds.py b/buildapi/tests/functional/test_builds.py >+ def setUp(self): >+ self.engine = sqlalchemy.create_engine("sqlite:///:memory:") >+ sql = open(os.path.join(os.path.dirname(os.path.dirname(__file__)), "state.sql")).read().split(";") state.sql is another file to be committed ? >diff --git a/test.ini b/test.ini >+branches = branch1, branch2 >+ >+auth_override = >+ >+masters.aglon.name = aglon:/home/catlee/mozilla/buildapi/buildapi/tests/master >+masters.aglon.fqdn = aglon.localdomain >+masters.aglon.port = 5000 How will the prod values be handled for this ? Perhaps add a production.ini.
Attachment #488118 -
Flags: review?(nrthomas) → review-
Assignee | ||
Comment 10•14 years ago
|
||
(In reply to comment #9) > Comment on attachment 488118 [details] [diff] [review] > read-only bits enabled > > Lots of great work here cleaning and expanding on what we have now. I mostly > have questions and a few suggestions, but a bit much to r+ with changes. > > >diff --git a/buildapi/controllers/builds.py b/buildapi/controllers/builds.py > >+log = logging.getLogger(__name__) > >+access_log = logging.getLogger("buildapi.access") > > What was the strategy for using log vs access_log in the functions in this file > ? require_auth might be better using access_log, and the log() in branch() is > the odd one out for the r/o functions. Ok, I'll fix it up so that r/o functions don't do explicit logging (it's available elsewhere if we need it), and have require_auth use access_log. > > >+ def require_auth(self): > > We just need ldap auth set up in apache to make this work ? When we get to r/w > that is. The proxy on build.m.o is set up to do LDAP auth and forwards that along to the app on cruncher. > >diff --git a/buildapi/lib/helpers.py b/buildapi/lib/helpers.py > > def convert_master(m): > >+ fqdn, port = addr_for_master(m) > >+ pretty_name = '%s:%i' % (fqdn, port) > > I was using everything up to the first '.' before, which is useful for > pending/running because it's shorter. Are you making that change intentionally > ? Nope, fixed it now. > >diff --git a/buildapi/model/builds.py b/buildapi/model/builds.py > >+def requestFromRow(row): > >+def buildFromRow(row): > >+ request = requestFromRow(row) > >+ build = { > >+ 'buildid': row.buildid, > >+ 'requests': [request], > > I had thought that including requests in a build blog was duplicating quite a > lot of information, but perhaps it's just buildername and branch. I wanted to make it easy to go back and forth between builds and requests. Including the request information here seems like a good concession to make for convenience. > >+def getRevision(branch, revision, starttime=None, endtime=None, limit=None): > >+ # TODO: Look at changes table too > >+ q = getBuildsQuery(branch) > > For unscheduled jobs ? And for changes that were merged with others into a single sourcestamp. I've updated the comment. > >+def reprioritizeRequest(who, brid, priority, engine): > >+ if r['claimed_at'] != 0: > >+ log.info("Request %i is already running, nothing to do", brid) > >+ return False > >+ if r['complete'] != 0: > >+ log.info("Request %i is complete, nothing to do", brid) > >+ return False > > These are the messages end users would want to see. Can we pass them back in > the response ? This code is gone now with the message broker approach. > > >+ q = br.update().where(and_(br.c.id == brid, br.c.complete == 0)) > > Surprised there isn't a claimed_at != 0 condition here too. It looks like a > guard against the job getting picked up since the first query. gone. > > >+ res = q.execute(bind=engine) > >+ return True > > What happens if the execute fails, or stalls ? Is there a timeout ? Is there > meant to be some handling of res here ? gone. > > >+def cancelRequest(who, brid, engine): > Similar comments as reprioritizeRequest. gone. > > >+def cancelBuild(who, bid, engine): > > You could use the r/o connection here for smidgen more safety. gone. > >diff --git a/buildapi/tests/functional/test_builds.py b/buildapi/tests/functional/test_builds.py > >+ def setUp(self): > >+ self.engine = sqlalchemy.create_engine("sqlite:///:memory:") > >+ sql = open(os.path.join(os.path.dirname(os.path.dirname(__file__)), "state.sql")).read().split(";") > > state.sql is another file to be committed ? Yes, it's in my repo...forget if I stripped that out of the last patch for brevity or if I forgot to hg add it. > > >diff --git a/test.ini b/test.ini > >+branches = branch1, branch2 > >+ > >+auth_override = > >+ > >+masters.aglon.name = aglon:/home/catlee/mozilla/buildapi/buildapi/tests/master > >+masters.aglon.fqdn = aglon.localdomain > >+masters.aglon.port = 5000 > > How will the prod values be handled for this ? Perhaps add a production.ini. Yup!
Assignee | ||
Comment 11•14 years ago
|
||
I think I've addressed your comments from the previous patch. The read-write parts have been gutted and replaced by calls out to a message broker. scripts/buildapi-agent.py is responsible for acting on the job requests to actually cancel builds, reprioritize, etc. Some implementation notes are here: http://hg.mozilla.org/users/catlee_mozilla.com/buildapi/file/fa37a00544bf/RESTAPI_NOTES.txt
Attachment #488118 -
Attachment is obsolete: true
Attachment #494194 -
Flags: feedback?(nrthomas)
Assignee | ||
Updated•14 years ago
|
Attachment #494194 -
Flags: feedback?(salbiz)
Assignee | ||
Updated•14 years ago
|
Attachment #494194 -
Flags: feedback?(bear)
Assignee | ||
Comment 12•14 years ago
|
||
It's still missing some bits, but I think it's good enough to start having people poke at.
Attachment #495564 -
Flags: review?
Comment 13•14 years ago
|
||
Comment on attachment 494194 [details] [diff] [review] read-only bits + a bit of read-write Is it still useful to look at this patch given the later one attached and the changes we discussed last week ?
Assignee | ||
Updated•14 years ago
|
Attachment #494194 -
Flags: feedback?(salbiz)
Attachment #494194 -
Flags: feedback?(nrthomas)
Attachment #494194 -
Flags: feedback?(bear)
Assignee | ||
Updated•14 years ago
|
Attachment #495564 -
Flags: review?
Assignee | ||
Comment 14•14 years ago
|
||
self-serve now allows you to do this. https://build.mozilla.org/buildapi/self-serve/try
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Product: mozilla.org → Release Engineering
You need to log in
before you can comment on or make changes to this bug.
Description
•