Closed Bug 1453656 Opened 7 years ago Closed 4 years ago

Lower the TakenUntil timeout

Categories

(Taskcluster :: Services, enhancement, P5)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jonasfj, Unassigned)

Lowering the takenUntil timeout in the queue will increase the reclaimWork frequency, bringing us the following benefits:
 + faster cancellation (in correctly implemented workers)
 + faster detection of dead workers (and the retries that follow)
 + better tracking of worker state in the queue
The downsides are:
 - less tolerance of temporary network failures
 - more load on servers
I suggest we do this after a possible migration to postgres.
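To make the trade-off above concrete, here is a rough sketch that estimates the three quantities involved for a given timeout. It is a toy model, not the queue's actual behaviour: the function name, the `reclaim_fraction` parameter (the share of the timeout a worker lets pass before reclaiming), and the example values of 1200 and 300 seconds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimTradeoff:
    reclaims_per_task_hour: float  # reclaim requests per running task per hour (server load)
    worst_case_detection_s: float  # longest a dead worker can go unnoticed
    retry_window_s: float          # time left to retry a failed reclaim before takenUntil


def claim_tradeoff(claim_timeout_s: float, reclaim_fraction: float) -> ClaimTradeoff:
    """Estimate load, dead-worker detection delay, and network-failure tolerance.

    Assumption: a worker reclaims once `reclaim_fraction` of the timeout has
    elapsed, and every reclaim pushes takenUntil out by `claim_timeout_s`.
    """
    if claim_timeout_s <= 0:
        raise ValueError("claim_timeout_s must be positive")
    if not 0 < reclaim_fraction < 1:
        raise ValueError("reclaim_fraction must be strictly between 0 and 1")

    reclaim_interval_s = claim_timeout_s * reclaim_fraction
    return ClaimTradeoff(
        reclaims_per_task_hour=3600 / reclaim_interval_s,
        # A worker that dies right after reclaiming holds the task for a full timeout.
        worst_case_detection_s=claim_timeout_s,
        retry_window_s=claim_timeout_s - reclaim_interval_s,
    )


if __name__ == "__main__":
    for timeout_s in (1200, 300):  # illustrative values only
        print(timeout_s, claim_tradeoff(timeout_s, reclaim_fraction=0.75))
```

Under these assumptions, going from 1200 to 300 seconds cuts worst-case dead-worker detection from 20 minutes to 5, while quadrupling reclaim traffic (4 to 16 requests per task-hour) and shrinking the retry window from 300 to 75 seconds.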
Depends on: 1436478
QA Contact: jhford
Component: Queue → Services

This is still something we should discuss doing. Perhaps we should make this configurable at a deployment level so that we can experiment.

Priority: P4 → P5
QA Contact: jhford

This should be a deployment-configurable value, so it can be tuned up and down to balance the factors Jonas pointed out.
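As a minimal sketch of what "deployment-configurable" could look like, the snippet below reads the timeout from an environment variable and falls back to a default. The variable name `QUEUE_CLAIM_TIMEOUT_SECONDS`, the default, and the lower bound are hypothetical and chosen only for illustration; they are not taken from the queue's actual configuration.

```python
import os
from typing import Mapping

# Hypothetical names and values, for illustration only.
CLAIM_TIMEOUT_ENV_VAR = "QUEUE_CLAIM_TIMEOUT_SECONDS"
DEFAULT_CLAIM_TIMEOUT_S = 1200
MIN_CLAIM_TIMEOUT_S = 60  # guard against values too small for workers to reclaim in time


def load_claim_timeout(env: Mapping[str, str] = os.environ) -> int:
    """Return the claim timeout in seconds, taken from `env` or the default."""
    raw = env.get(CLAIM_TIMEOUT_ENV_VAR)
    if raw is None or raw.strip() == "":
        return DEFAULT_CLAIM_TIMEOUT_S
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(
            f"{CLAIM_TIMEOUT_ENV_VAR} must be an integer number of seconds, got {raw!r}"
        ) from None
    if value < MIN_CLAIM_TIMEOUT_S:
        raise ValueError(
            f"{CLAIM_TIMEOUT_ENV_VAR} must be at least {MIN_CLAIM_TIMEOUT_S}, got {value}"
        )
    return value
```

Rejecting bad values at startup, rather than silently clamping them, keeps an experiment with a shorter timeout from quietly running with a different value than the operator intended.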

We have postgres support now, so there's nothing preventing us from doing this.

No longer depends on: 1436478
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED