Closed
Bug 740440
(tegra-228)
Opened 13 years ago
Closed 10 years ago
tegra-228 problem tracking
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: coop, Unassigned)
Details
(Whiteboard: [mobile][capacity][buildduty])
tegra-228 has been offline for 7 days according to last-build-per-slave.html, and has indeterminate status after coming back from bug 731339.
Updated•13 years ago
Assignee: nobody → bear
Comment 1•13 years ago
This was brought back up.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment 2•12 years ago
Had a great long run, but now it's broken - five reds in a row according to https://secure.pub.build.mozilla.org/buildapi/recent/tegra-228
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: tegra-228 problem tracking → [disable me] tegra-228 problem tracking
Updated•12 years ago
Assignee: bear → nobody
Comment 3•12 years ago
Comment 4•12 years ago
Clubbed senseless after 22.
Summary: [disable me] tegra-228 problem tracking → tegra-228 problem tracking
Comment 5•12 years ago
DCOps reimaged, start_cp run
Status: REOPENED → RESOLVED
Closed: 13 years ago → 12 years ago
Resolution: --- → FIXED
Comment 6•12 years ago
Failing every run, please re-disable.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 7•12 years ago
I'm resolving again since I no longer see the "failing every run" symptoms. We can reopen and send it over to recovery if need be, though.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 8•12 years ago
No jobs taken on this device for >= 7 weeks
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 9•12 years ago
(mass change: filter on tegraCallek02reboot2013)
I just rebooted this device, hoping that many of the ones I'm doing tonight come back automatically. I'll check back in tomorrow to see if it did; if it has not, I'll triage the next step manually on a per-device basis.
---
Command I used (with a manual patch to the fabric script to allow this command):
(fabric)[jwood@dev-master01 fabric]$ python manage_foopies.py -j15 -f devices.json `for i in 021 032 036 039 046 048 061 064 066 067 071 074 079 081 082 083 084 088 093 104 106 108 115 116 118 129 152 154 164 168 169 174 179 182 184 187 189 200 207 217 223 228 234 248 255 264 270 277 285 290 294 295 297 298 300 302 304 305 306 307 308 309 310 311 312 314 315 316 319 320 321 322 323 324 325 326 328 329 330 331 332 333 335 336 337 338 339 340 341 342 343 345 346 347 348 349 350 354 355 356 358 359 360 361 362 363 364 365 367 368 369; do echo '-D' tegra-$i; done` reboot_tegra
The command reboots the devices one at a time from the foopy each device is connected to, with one ssh connection per foopy.
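As a rough illustration of what the shell loop above expands to, here is a minimal Python sketch that builds the same argument list. Only the flags visible in the command (`-j15`, `-f devices.json`, `-D`, `reboot_tegra`) come from this bug; the helper names, the device-to-foopy mapping, and the foopy names are hypothetical, and the real layout of `devices.json` is not shown here.

```python
from collections import defaultdict


def build_reboot_command(device_numbers, jobs=15, devices_file="devices.json"):
    """Build the manage_foopies.py argument list for a set of tegra numbers.

    Mirrors the shell loop in the comment: one "-D tegra-NNN" pair per device,
    followed by the reboot_tegra action.
    """
    cmd = ["python", "manage_foopies.py", f"-j{jobs}", "-f", devices_file]
    for number in device_numbers:
        # Device names in this bug are zero-padded to three digits (tegra-021).
        cmd += ["-D", f"tegra-{int(number):03d}"]
    cmd.append("reboot_tegra")
    return cmd


def group_by_foopy(device_to_foopy):
    """Group devices by foopy so each foopy can work through its list serially.

    ASSUMPTION: device_to_foopy is a plain {device: foopy} dict; this is a
    stand-in for whatever devices.json actually contains.
    """
    groups = defaultdict(list)
    for device, foopy in device_to_foopy.items():
        groups[foopy].append(device)
    return {foopy: sorted(devices) for foopy, devices in groups.items()}
```

For example, `build_reboot_command([21, 228])` yields the arguments for `tegra-021` and `tegra-228`, and `group_by_foopy` shows how the "one ssh connection per foopy" batching could be organised.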
Comment 10•12 years ago
Had to cycle clientproxy to bring this back.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 11•12 years ago
Showing this error on its last few runs:
https://tbpl.mozilla.org/php/getParsedLog.php?id=23257749&tree=Mozilla-Inbound
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 13•11 years ago
Successfully recovered.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
Updated•11 years ago (Assignee)
Product: mozilla.org → Release Engineering
Comment 14•11 years ago
Hitting Bug 722166 on every run. Disabled in slavealloc.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 16•11 years ago (Reporter)
Re-enabled in slavealloc after work in bug 922822.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 17•11 years ago
Power cycled, waited a day.
error.flg: Remote Device Error: Unable to properly remove /mnt/sdcard/tests
SD card reformat successful:
$>exec newfs_msdos -F 32 /dev/block/vold/179:9
newfs_msdos: warning, /dev/block/vold/179:9 is not a character device
newfs_msdos: Skipping mount checks
/dev/block/vold/179:9: 15110464 sectors in 236101 FAT32 clusters (32768 bytes/cluster)
bps=512 spc=64 res=32 nft=2 mid=0xf0 spt=16 hds=4 hid=0 bsec=15114240 bspf=1845 rdcl=2 infs=1 bkbs=2
return code [0]
$>exec rebt
$>^]
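As a sanity check on the reformat output above, the reported cluster count can be reproduced from the parameter line. The sketch below assumes the usual reading of the `newfs_msdos` fields (`bps` bytes per sector, `spc` sectors per cluster, `res` reserved sectors, `nft` number of FATs, `bsec` total sectors, `bspf` sectors per FAT) and the standard FAT32 layout; the function names are illustrative only.

```python
# Parameter line copied from the newfs_msdos output in this comment.
NEWFS_PARAMS = (
    "bps=512 spc=64 res=32 nft=2 mid=0xf0 spt=16 hds=4 hid=0 "
    "bsec=15114240 bspf=1845 rdcl=2 infs=1 bkbs=2"
)


def parse_params(line):
    """Turn "key=value" pairs into a dict of ints (base 0 handles 0xf0)."""
    return {key: int(value, 0) for key, value in (pair.split("=") for pair in line.split())}


def fat32_layout(params):
    """Derive cluster size and count, assuming the standard FAT32 layout:
    data area = total sectors - reserved sectors - (number of FATs * sectors per FAT).
    """
    overhead = params["res"] + params["nft"] * params["bspf"]
    clusters = (params["bsec"] - overhead) // params["spc"]
    data_sectors = clusters * params["spc"]
    return {
        "bytes_per_cluster": params["bps"] * params["spc"],
        "clusters": clusters,
        "data_sectors": data_sectors,
        "data_bytes": data_sectors * params["bps"],
    }


layout = fat32_layout(parse_params(NEWFS_PARAMS))
print(layout)
```

Under those assumptions the result matches the summary line: 236101 clusters of 32768 bytes spanning 15110464 sectors, which is consistent with the format having completed as reported.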
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 18•11 years ago
Hasn't made it back to production since the reformat:
check.sh:
2014-04-09 13:45:04,705 tegra-228 p online active OFFLINE :: error.flg [Automation Error: Unable to connect to device after 5 attempts]
watcher.log:
2026 04/09/2014 13:25:03: DEBUG: calling [ping -c 5 tegra-228]
2027 04/09/2014 13:25:07: INFO: Connecting to: tegra-228
2028 04/09/2014 13:25:07: INFO: INFO: Unable to connect to device after 1 try
2029 04/09/2014 13:25:07: INFO: We're going to sleep for 90 seconds
2030 04/09/2014 13:26:37: INFO: Connecting to: tegra-228
2031 04/09/2014 13:26:37: INFO: INFO: Unable to connect to device after 2 try
2032 04/09/2014 13:26:37: INFO: We're going to sleep for 90 seconds
2033 04/09/2014 13:28:08: INFO: Connecting to: tegra-228
2034 04/09/2014 13:28:08: INFO: INFO: Unable to connect to device after 3 try
2035 04/09/2014 13:28:08: INFO: We're going to sleep for 90 seconds
2036 04/09/2014 13:29:38: INFO: Connecting to: tegra-228
2037 04/09/2014 13:29:38: INFO: INFO: Unable to connect to device after 4 try
2038 04/09/2014 13:29:38: INFO: We're going to sleep for 90 seconds
2039 2014-04-09 13:30:01 -- *** ERROR *** failed to aquire lockfile
2040 04/09/2014 13:31:08: INFO: Connecting to: tegra-228
2041 04/09/2014 13:31:08: INFO: /builds/tegra-228/error.flg
2042 04/09/2014 13:31:38: INFO: verifyDevice: failing to telnet
2043 reconnecting socket
2044 reconnecting socket
2045 reconnecting socket
2046 reconnecting socket
2047 reconnecting socket
2048 Automation Error: Unable to connect to device after 5 attempts
Not sure what to do. This sounds like a network issue, not an SD card or image problem. Callek, what do we do here?
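For readers following the watcher.log excerpt, the pattern it shows is five connection attempts with a 90-second sleep after each failed attempt except the last, then an automation error. A minimal sketch of that loop follows; it is inferred from the log lines only, not taken from the actual watcher code, and `try_connect`, `sleep`, and `log` are hypothetical injection points.

```python
import time


def connect_with_retries(try_connect, attempts=5, delay=90, sleep=time.sleep, log=print):
    """Call try_connect() up to `attempts` times, sleeping `delay` seconds
    between failures. Returns True on the first success, False otherwise.

    ASSUMPTION: the attempt count and delay are read off the log excerpt;
    the real tool may configure them differently.
    """
    for attempt in range(1, attempts + 1):
        log("Connecting (attempt %d)" % attempt)
        if try_connect():
            return True
        if attempt < attempts:
            log("Unable to connect to device after %d try" % attempt)
            log("We're going to sleep for %d seconds" % delay)
            sleep(delay)
    log("Automation Error: Unable to connect to device after %d attempts" % attempts)
    return False
```

With the defaults, a device that never answers costs four sleeps of 90 seconds (about six minutes) before the error is raised, which lines up with the 13:25 to 13:31 window in the log.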
Flags: needinfo?(bugspam.Callek)
Updated•11 years ago
QA Contact: armenzg → bugspam.Callek
Comment 19•10 years ago (Reporter)
Still showing the symptoms from comment #18 even after PDU reboot.
Depends on: 1017337
Comment 20•10 years ago
Replaced the SD card, then flashed and reimaged the tegra.
[vle@admin1a.private.scl3 ~]$ telnet tegra-228.tegra.releng.scl3.mozilla.com 20701
Trying 10.26.85.178...
Connected to tegra-228.tegra.releng.scl3.mozilla.com.
Escape character is '^]'.
$>^]q
telnet> q
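The manual telnet check above (connect to port 20701 and look for the `$>` prompt) could be scripted roughly as follows. This is a sketch under the assumption that seeing the prompt is a sufficient health signal; the function name is made up, and only the port number and the prompt string come from the session shown here.

```python
import socket


def agent_prompt_seen(host, port=20701, timeout=10.0, prompt=b"$>"):
    """Return True if a TCP connection succeeds and the prompt bytes arrive
    before the timeout; return False on refusal, timeout, or early close.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            received = b""
            while prompt not in received:
                chunk = conn.recv(1024)
                if not chunk:
                    # Peer closed the connection without sending the prompt.
                    break
                received += chunk
            return prompt in received
    except OSError:
        # Covers connection refused, unreachable hosts, and socket timeouts.
        return False
```

A call such as `agent_prompt_seen("tegra-228.tegra.releng.scl3.mozilla.com")` would then stand in for the interactive telnet session. It only confirms that the port answers, not that the device is ready to take jobs.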
Comment 21•10 years ago
Still hasn't taken a job, apparently there's something else wrong with it.
Comment 22•10 years ago
My lack of patience was apparently what was wrong with it.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Comment 23•10 years ago
Hasn't taken a job for 13 days.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 24•10 years ago
Disabled in slavealloc to stop the pointless stream of reboots.
Comment 25•10 years ago
SD card formatted, tegra flashed and reimaged.
vle@vle-10516 ~ $ telnet tegra-228.tegra.releng.scl3.mozilla.com 20701
Trying 10.26.85.178...
Connected to tegra-228.tegra.releng.scl3.mozilla.com.
Escape character is '^]'.
$>^]
telnet> q
Comment 26•10 years ago
Reenabled.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
Flags: needinfo?(bugspam.Callek)
Updated•6 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard