Closed Bug 679465 Opened 13 years ago Closed 13 years ago

talos-addon-master1.amotest.scl1.mozilla.com experiences random timeouts

Categories

(mozilla.org Graveyard :: Server Operations, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: anodelman, Assigned: bkero)

talos-addon-master1.amotest.scl1.mozilla.com has been giving random timeouts. The VM itself is under minimal load and has plenty of free space and CPU. I fear that this VM is suffering from starvation in its VM cluster. Is there a way to give the VM higher priority for cycles so that it is more reliable?
Blocks: 599169
A more likely explanation is that it's suffering from the network issues in bug 677348. Once we get that resolved, let's see how your machine does.
Tossing this one to bkero since he's our ganeti master.
Assignee: server-ops → bkero
The host this has been running on was recently rebuilt to have more drives (and thus more IOPS for VMs). Currently talos-addon-master1 is the only VM running on it, and the loadavg is 0.80 0.59 0.51, which is quite low for the box. I suspect that zandr is correct, as I've experienced erratic network timeouts on all the hosts in scl1 that I've attempted to connect to.
Alice, have the timeouts ceased, or are they still occurring?
The machine isn't under heavy use right now, so I can't say if it is resolved or not.
If the problem is not occurring, then at the least this ticket is resolved. Lots of things have changed to improve both network and IO performance in scl1. Open a new bug if problems recur.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard