Closed
Bug 1017551
Opened 11 years ago
Closed 9 years ago
Add Nagios alert for pending jobs in scheduler DB with submit time > N hours ago
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 978956
People
(Reporter: emorley, Unassigned)
References
Details
(Keywords: sheriffing-P1, Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2531] )
In bug 1012633 it took some time for us to realise that a backlog was building up. The normal metric of "are there more than N total queued jobs" missed it because, while most jobs were being picked up by a machine and run fine, some specific ones were not.
To avoid missing this in future, we should add a Nagios alert (or fix/tweak the current rule if we already have something similar) that checks for pending (i.e. scheduled but not started) jobs in the scheduler DB that have a submit time > N hours ago.
Up for suggestions as to what we set N to - and we may need to vary N depending on whether the job is scheduled on a main tree ({mozilla-central, mozilla-inbound, b2g-inbound, fx-team, mozilla-aurora, mozilla-beta, mozilla-release, mozilla-b2g*, ...}) or a lower priority tree ({try, cedar, ash, ...}).
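A rough sketch of the check proposed above, written as the decision logic of a Nagios-style plugin. This is illustrative only: the `PendingJob` record, the two threshold values (4 and 12 hours) and the function names are hypothetical, and the scheduler DB query that would produce the job list is deliberately left out since its schema is not described here. Only the exit codes (0 = OK, 2 = CRITICAL) follow the standard Nagios plugin convention.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2

# Main trees named in the description; "mozilla-b2g*" is handled by prefix below.
MAIN_TREES = {
    "mozilla-central", "mozilla-inbound", "b2g-inbound", "fx-team",
    "mozilla-aurora", "mozilla-beta", "mozilla-release",
}

# Hypothetical values for N (hours); the bug leaves the real numbers open.
MAIN_TREE_MAX_HOURS = 4
LOW_PRIORITY_MAX_HOURS = 12


class PendingJob(NamedTuple):
    """Hypothetical shape of one scheduled-but-not-started job."""
    job_id: int
    tree: str
    submitted_at: float  # Unix timestamp of the submit time


def max_age_hours(tree: str) -> int:
    """Return the allowed pending age for a tree (stricter for main trees)."""
    if tree in MAIN_TREES or tree.startswith("mozilla-b2g"):
        return MAIN_TREE_MAX_HOURS
    return LOW_PRIORITY_MAX_HOURS


def stale_jobs(jobs: Iterable[PendingJob], now: float) -> List[PendingJob]:
    """Pending jobs whose submit time is more than N hours before `now`."""
    return [
        job for job in jobs
        if (now - job.submitted_at) / 3600 > max_age_hours(job.tree)
    ]


def check(jobs: Iterable[PendingJob], now: float) -> Tuple[int, str]:
    """Return (exit_code, status_line) in the form Nagios expects from a plugin."""
    stale = stale_jobs(jobs, now)
    if not stale:
        return OK, "OK: no pending jobs older than their threshold"
    oldest = max((now - job.submitted_at) / 3600 for job in stale)
    return CRITICAL, (
        f"CRITICAL: {len(stale)} pending job(s) over threshold, "
        f"oldest pending for {oldest:.1f}h"
    )
```

A real plugin would fetch the pending jobs from the scheduler DB, call `check(jobs, time.time())`, print the status line and exit with the returned code. Keying the threshold on the tree means a job stuck on try does not page anyone as early as one stuck on mozilla-inbound.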
Updated (Reporter)•11 years ago
Keywords: sheriffing-P1
Comment 1•11 years ago
Similar/dupe of bug 978956?
Comment 2 (Reporter)•11 years ago
Yeah, though that bug states it's just for tests; happy for you to dupe either way and/or morph.
Comment 3•11 years ago
Given the total lack of activity in that bug, I don't really care at this point.
Comment 4•11 years ago
Updated•10 years ago
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2531]
Updated•9 years ago
Component: Tools → Buildduty
QA Contact: hwine → bugspam.Callek
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard