Closed Bug 778808 (tegra-039) Opened 12 years ago Closed 11 years ago

tegra-039 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

ARM
Android
task
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: coop, Unassigned)

Details

(Whiteboard: [buildduty][buildslaves][capacity][badslave])

No description provided.
Depends on: 778812
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Had to manually reformat its sdcard today, pdu reboot, and clear the error flag. It's back in production though.
Wasn't doing too bad, for a tegra, before it killed itself.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Manually reformatted the sdcard today, pdu rebooted, and cleared the error flag. It's back in production and running builds.
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Please reimage and swap its sdcard. Before putting the current sdcard back in the pool, run a 2-pass verification over it. (If the card is good, I'd be inclined to say bad device.)
Status: RESOLVED → REOPENED
Depends on: 817995
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Not sure what to make of this, but in https://tbpl.mozilla.org/php/getParsedLog.php?id=18662219&tree=Profiling and https://tbpl.mozilla.org/php/getParsedLog.php?id=18666092&tree=Fx-Team it is timing out in the same bunch of media tests, and those timeouts are not a currently known intermittent.
And if you look at bug 786539, it's a Windows and Mac timeout, with some misstars of Android timing out in the test and then hanging for 2400 seconds, only on this tegra, post-reimage and card swap. Bug 832768? A Linux failure, with a misstar of this slave timing out in media tests. Bug 832283? Entirely this slave. Bug 824309? Entirely this slave. Bug 824307? Entirely this slave. Bug 798440? Entirely this slave. BURN THE WITCH.
Severity: normal → major
Status: RESOLVED → REOPENED
Priority: P3 → --
Resolution: FIXED → ---
Whiteboard: [buildduty][buildslaves][capacity] → [buildduty][buildslaves][capacity][badslave]
(fabric)[jwood@dev-master01 fabric]$ python manage_foopies.py -f devices.json -D tegra-039 stop_cp
...
[OK] Stopped clientproxy for tegra-039
(mass change: filter on tegraCallek02reboot2013)

I just rebooted this device, hoping that many of the ones I'm doing tonight come back automatically. I'll check back in tomorrow to see if it did. If it did not, I'll triage the next step manually on a per-device basis.

---
Command I used (with a manual patch to the fabric script to allow this command):

(fabric)[jwood@dev-master01 fabric]$ python manage_foopies.py -j15 -f devices.json `for i in 021 032 036 039 046 048 061 064 066 067 071 074 079 081 082 083 084 088 093 104 106 108 115 116 118 129 152 154 164 168 169 174 179 182 184 187 189 200 207 217 223 228 234 248 255 264 270 277 285 290 294 295 297 298 300 302 304 305 306 307 308 309 310 311 312 314 315 316 319 320 321 322 323 324 325 326 328 329 330 331 332 333 335 336 337 338 339 340 341 342 343 345 346 347 348 349 350 354 355 356 358 359 360 361 362 363 364 365 367 368 369; do echo '-D' tegra-$i; done` reboot_tegra

The command does the reboot one device at a time from the foopy each device is connected to, with one ssh connection per foopy.
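The batch command in the comment above builds one `-D tegra-NNN` pair per device with a shell loop. As a minimal sketch of the same argument construction in Python (the helper name `build_reboot_command` and its defaults are hypothetical; only the script name, the `-j`, `-f`, and `-D` flags, and the `reboot_tegra` action come from the comment), the command line could be assembled like this without running it:

```python
import shlex


def build_reboot_command(device_ids, jobs=15, devices_file="devices.json"):
    """Return the argv list for a batch reboot_tegra invocation.

    Hypothetical helper: it only assembles arguments and mirrors the
    flags quoted in the comment; it does not run manage_foopies.py.
    """
    argv = ["python", "manage_foopies.py", f"-j{jobs}", "-f", devices_file]
    for dev in device_ids:
        # Zero-pad to three digits so 39 (or "039") becomes tegra-039.
        argv += ["-D", f"tegra-{int(dev):03d}"]
    argv.append("reboot_tegra")
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command line for a few of the listed devices.
    print(shlex.join(build_reboot_command([21, 32, 36, 39])))
```

Building the list in one place keeps the device numbers reviewable before anything is sent to the foopies, which matters when over a hundred devices are rebooted in one pass.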
now taking jobs
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Or, yeah, instead of burning the witch, we could just let her babysit the children. Either way.
back in production
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Wednesday, October 16, 2013 5:47:47 PM
Status: RESOLVED → REOPENED
Depends on: 944498
Resolution: FIXED → ---
replaced SD and flashed/reimaged
Only took one job before going down; not sure what to do.
Depends on: 949447
sd card replaced and reimaged/flashed.
handled in last recovery on Dec 16.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard