Closed
Bug 838425
(tegra-108)
Opened 12 years ago
Closed 10 years ago
tegra-108 problem tracking
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: Callek, Unassigned)
References
()
Details
(Whiteboard: [buildduty][buildslaves][capacity])
No jobs taken on this device for > a week (< 3 weeks)
Reporter
Comment 1•12 years ago
(mass change: filter on tegraCallek02reboot2013)
I just rebooted this device, hoping that many of the ones I'm rebooting tonight come back automatically. I'll check back tomorrow to see whether it did; if not, I'll triage the next step manually on a per-device basis.
---
Command I used (with a manual patch to the fabric script to allow this command):
(fabric)[jwood@dev-master01 fabric]$ python manage_foopies.py -j15 -f devices.json `for i in 021 032 036 039 046 048 061 064 066 067 071 074 079 081 082 083 084 088 093 104 106 108 115 116 118 129 152 154 164 168 169 174 179 182 184 187 189 200 207 217 223 228 234 248 255 264 270 277 285 290 294 295 297 298 300 302 304 305 306 307 308 309 310 311 312 314 315 316 319 320 321 322 323 324 325 326 328 329 330 331 332 333 335 336 337 338 339 340 341 342 343 345 346 347 348 349 350 354 355 356 358 359 360 361 362 363 364 365 367 368 369; do echo '-D' tegra-$i; done` reboot_tegra
The command reboots the devices one at a time from the foopy each device is connected to, with one ssh connection per foopy.
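For anyone repeating this, the shell loop above only expands a list of device numbers into repeated `-D tegra-NNN` arguments. A minimal Python sketch of the same expansion follows. The helper names `device_args` and `build_command` are hypothetical, and the flags are copied from the command in this comment rather than from any documentation of `manage_foopies.py`.

```python
def device_args(numbers, prefix="tegra-"):
    """Expand device numbers into ['-D', 'tegra-021', '-D', 'tegra-032', ...]."""
    args = []
    for n in numbers:
        # Accept "021" or 21; always emit a zero-padded three-digit suffix.
        args.extend(["-D", f"{prefix}{int(n):03d}"])
    return args


def build_command(numbers, jobs=15, devices_file="devices.json",
                  action="reboot_tegra"):
    """Assemble the argv list in the same order as the command quoted above."""
    return ["python", "manage_foopies.py", f"-j{jobs}", "-f", devices_file,
            *device_args(numbers), action]


if __name__ == "__main__":
    # Print the command for a few of the devices from the list above.
    print(" ".join(build_command(["021", "032", "108"])))
```

Building an argv list rather than a single string avoids shell quoting problems if the result is passed on to `subprocess.run`.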
Reporter
Comment 2•12 years ago
had to cycle clientproxy to bring this back
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Reporter
Updated•11 years ago
Assignee
Updated•11 years ago
Product: mozilla.org → Release Engineering
Updated•11 years ago
Comment 3•11 years ago
Back in production.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Reporter
Comment 4•11 years ago
Saturday, October 26, 2013 10:35:46 AM
Comment 5•11 years ago
flashed and reimaged
Comment 6•11 years ago
Back in production
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 7•11 years ago
2014-01-16 10:15:35 tegra-108 p online active OFFLINE :: error.flg [Automation Error: Unable to connect to device after 5 attempts]
pdu reboot didn't help
Comment 8•11 years ago
SD card swapped & reimaged/flashed.
Reporter
Comment 9•11 years ago
Taking jobs again
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 10•11 years ago
Power cycled, waited a day.
SD card reformat was successful:
$>exec newfs_msdos -F 32 /dev/block/vold/179:9
newfs_msdos: warning, /dev/block/vold/179:9 is not a character device
newfs_msdos: Skipping mount checks
/dev/block/vold/179:9: 31100416 sectors in 485944 FAT32 clusters (32768 bytes/cluster)
bps=512 spc=64 res=32 nft=2 mid=0xf0 spt=16 hds=4 hid=0 bsec=31108096 bspf=3797 rdcl=2 infs=1 bkbs=2
return code [0]
$>exec rebt
$>^]
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 11•11 years ago
Looks like there were some errors while trying to write to the SD card. Maybe a lockfile issue.
From watcher.log:
04/07/2014 09:25:04: INFO: INFO: attempting to ping device
04/07/2014 09:25:04: DEBUG: calling [ping -c 5 tegra-108]
04/07/2014 09:25:08: INFO: Connecting to: tegra-108
2014-04-07 09:30:01 -- *** ERROR *** failed to aquire lockfile
04/07/2014 09:30:10: INFO: INFO: attempting to create file /mnt/sdcard/writetest
2014-04-07 09:35:01 -- *** ERROR *** failed to aquire lockfile
2014-04-07 09:40:01 -- *** ERROR *** failed to aquire lockfile
2014-04-07 09:45:01 -- *** ERROR *** failed to aquire lockfile
2014-04-07 09:50:01 -- *** ERROR *** failed to aquire lockfile
2014-04-07 09:55:01 -- *** ERROR *** failed to aquire lockfile
2014-04-07 10:00:01 -- *** ERROR *** failed to aquire lockfile
2014-04-07 10:05:01 -- *** ERROR *** failed to aquire lockfile
2014-04-07 10:10:01 -- *** ERROR *** failed to aquire lockfile
04/07/2014 10:10:21: INFO: /builds/tegra-108/error.flg
04/07/2014 10:10:51: INFO: verifyDevice: failing to check SD card
reconnecting socket
Automation Error: error pushing file: Automation Error: Timeout in command push /mnt/sdcard/writetest 14687
Remote Device Error: unable to write to sdcard
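To gauge how long the lockfile was held, the excerpt above can be scanned for the repeated error line. This is a rough sketch that assumes the line format shown in this comment, including the log's own spelling "aquire"; the function names are hypothetical and it tolerates a leading editor line number if one is present.

```python
import re
from datetime import datetime

# Matches the lockfile error as it appears in the excerpt (spelling as logged).
LOCK_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) -- \*\*\* ERROR \*\*\* "
    r"failed to aquire lockfile"
)


def lockfile_errors(lines):
    """Return the timestamps of every lockfile error found in the lines."""
    stamps = []
    for line in lines:
        match = LOCK_RE.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return stamps


def summarize(lines):
    """Return (error count, time between first and last error or None)."""
    stamps = lockfile_errors(lines)
    if not stamps:
        return 0, None
    return len(stamps), max(stamps) - min(stamps)


if __name__ == "__main__":
    # Assumes a local copy of the log named watcher.log.
    with open("watcher.log") as handle:
        count, span = summarize(handle)
    print(f"{count} lockfile errors over {span}")
```

Run against the excerpt above, this would report nine errors spanning 40 minutes (09:30:01 to 10:10:01), which overlaps the failed write to /mnt/sdcard/writetest.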
Updated•10 years ago
QA Contact: armenzg → bugspam.Callek
Updated•10 years ago
Comment 13•10 years ago
replaced SD card, flashed and reimaged tegra
[vle@admin1a.private.scl3 ~]$ telnet tegra-108.tegra.releng.scl3.mozilla.com 20701
Trying 10.26.85.84...
Connected to tegra-108.tegra.releng.scl3.mozilla.com.
Escape character is '^]'.
$>^]q
telnet> q
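The telnet session above only confirms that something is listening on port 20701 and answers with a prompt. A scripted version of the connection half of that check might look like the sketch below. The function name `port_open` is hypothetical, and a successful connect says nothing about whether the device can actually take jobs.

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Host and port taken from the telnet session in this comment.
    host = "tegra-108.tegra.releng.scl3.mozilla.com"
    print(host, "reachable" if port_open(host, 20701) else "unreachable")
```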
Comment 14•10 years ago
And then it did exactly one job, before it got tired and had to lie down.
Comment 15•10 years ago
Oh, that's the new normal.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Reporter
Comment 16•10 years ago
foopy117 is going away, and these tegras are attached to it.
They are being decommissioned in Bug 1043938, and I just disabled all of them in slavealloc and marked them as decom with a comment pointing at Bug 1043938.
Updated•6 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard