Closed Bug 655437 Opened 14 years ago Closed 13 years ago

decommission talos-r3-leopard-007

Categories

(Infrastructure & Operations Graveyard :: Servicedesk, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Assigned: hlangi)

Details

(Whiteboard: [slaveduty][badslave?])

https://build.mozilla.org/buildapi/recent/talos-r3-leopard-007/600 It reads like a demon's resume. Is "Device not configured" Mac-speak for read-only filesystem or something? In any case, it's burned a bit over 500 jobs since 1pm yesterday, please stop it before it burns again.
Depends on: 655440
Wups, forgot a handy log like http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1304736190.1304736196.28699.gz&fulltext=1
clobber build tools failed
=== Output ===
rm -rf tools in dir /Users/cltbld/talos-slave/test/. (timeout 1200 secs)
...
rm: tools/.hg: Device not configured
rm: tools/.hgignore: Device not configured
rm: tools/.hgtags: Device not configured
rm: tools/.pylintrc: Device not configured
rm: tools/breakpad: Device not configured
rm: tools/buildbot-helpers: Device not configured
rm: tools/buildfarm: Device not configured
rm: tools/cdmaker: Device not configured
rm: tools/clobberer: Device not configured
rm: tools/lib: Device not configured
rm: tools/MANIFEST.in: Device not configured
rm: tools/release: Device not configured
rm: tools/scripts: Device not configured
rm: tools/setup.py: Device not configured
rm: tools/stage: Device not configured
rm: tools/sut_tools: Device not configured
rm: tools: Directory not empty
etc., until being unable to save the downloaded build is finally fatal and it can move on to burning another job.
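For what it's worth, "Device not configured" is, as far as I know, the message macOS and the BSDs print for errno ENXIO, which is a different error from a read-only filesystem (EROFS). A minimal sketch of how a log like the one above could be scanned for this symptom follows. The function names, the regular expression, and the threshold are all hypothetical and are not part of buildbot, slavealloc, or tinderbox.

```python
import re

# Matches rm error lines such as "rm: tools/.hg: Device not configured".
# The pattern is an assumption based on the log excerpt in this bug,
# not an existing buildbot or tinderbox feature.
RM_ERROR = re.compile(r"rm: (\S+): Device not configured")


def paths_with_device_errors(log_text):
    """Return the paths rm failed to remove with 'Device not configured'."""
    return RM_ERROR.findall(log_text)


def looks_like_failing_disk(log_text, threshold=3):
    """Heuristic (hypothetical threshold): several such errors in one
    clobber step suggest a storage fault rather than a one-off glitch."""
    return len(paths_with_device_errors(log_text)) >= threshold


sample = (
    "rm -rf tools in dir /Users/cltbld/talos-slave/test/. (timeout 1200 secs) "
    "rm: tools/.hg: Device not configured "
    "rm: tools/.hgignore: Device not configured "
    "rm: tools/.hgtags: Device not configured "
    "rm: tools: Directory not empty"
)

print(paths_with_device_errors(sample))
print(looks_like_failing_disk(sample))
```

The "Directory not empty" line is deliberately not counted, since it is only a consequence of the earlier failures.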
I've disabled talos-r3-leopard-007 in slavealloc, so it won't get any more jobs.
Severity: blocker → normal
Whiteboard: [slaveduty][badslave?]
Disabling didn't work because it wasn't rebooting. I also can't SSH. However, I graceful'd it on its current master (using the new handy-dandy buildslave-page link in slavealloc), and it seems to have disconnected. This will need a crash cart at scl1 - I'd like to know what this was!
Assignee: nobody → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
QA Contact: release → zandr
Summary: Stop talos-r3-leopard-007 before it burns another 500 jobs → talos-r3-leopard-007 - burning jobs with "Device not configured"
Assignee: server-ops-releng → zandr
colo-trip: --- → scl1
This looks like a drive failure. The Finder was frozen with 'Sat 10:31AM' on the clock. Rebooting gives me the flashing "?" folder. Pulling it out and bringing it back to MV.
colo-trip: scl1 → ---
Assignee: zandr → mlarrain
What is the next step for this?
Either decommissioning or hardware repair.
Giving machine to desktop to send out for repair.
Status: NEW → ASSIGNED
Assignee: mlarrain → hlangi
Component: Server Operations: RelEng → Server Operations: Desktop Issues
QA Contact: zandr → tfairfield
I got this back from wefixmacs. The logic board needs to be replaced, which is not cost-effective.
Assignee: hlangi → server-ops-releng
Status: ASSIGNED → NEW
Component: Server Operations: Desktop Issues → Server Operations: RelEng
QA Contact: tfairfield → zandr
Summary: talos-r3-leopard-007 - burning jobs with "Device not configured" → DNR talos-r3-leopard-007
Assignee: server-ops-releng → mlarrain
Summary: DNR talos-r3-leopard-007 → decommission talos-r3-leopard-007
Assignee: mlarrain → jwatkins
Assignee: jwatkins → mlarrain
Machine has been removed from Nagios, DHCP, DNS and inventory. Assigning to Desktop to finish off.
Assignee: mlarrain → hlangi
Status: NEW → ASSIGNED
Component: Server Operations: RelEng → Server Operations: Desktop Issues
QA Contact: zandr → tfairfield
Closing this out.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Blocks: 717621
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard