Closed Bug 838431 (tegra-339) Opened 12 years ago Closed 11 years ago

tegra-339 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P3)

ARM
Android

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Callek, Unassigned)

References

()

Details

(Whiteboard: [buildduty][buildslaves][capacity])

No jobs taken on this device for > 3 weeks (< 6 weeks)
(mass change: filter on tegraCallek02reboot2013)
I just rebooted this device, hoping that many of the ones I'm doing tonight come back automatically. I'll check back tomorrow to see if it did; if it did not, I'll triage the next step manually on a per-device basis.
---
Command I used (with a manual patch to the fabric script to allow this command):
(fabric)[jwood@dev-master01 fabric]$ python manage_foopies.py -j15 -f devices.json `for i in 021 032 036 039 046 048 061 064 066 067 071 074 079 081 082 083 084 088 093 104 106 108 115 116 118 129 152 154 164 168 169 174 179 182 184 187 189 200 207 217 223 228 234 248 255 264 270 277 285 290 294 295 297 298 300 302 304 305 306 307 308 309 310 311 312 314 315 316 319 320 321 322 323 324 325 326 328 329 330 331 332 333 335 336 337 338 339 340 341 342 343 345 346 347 348 349 350 354 355 356 358 359 360 361 362 363 364 365 367 368 369; do echo '-D' tegra-$i; done` reboot_tegra
The command does the reboot one device at a time, from the foopy each device is connected to, with one ssh connection per foopy.
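For readers unfamiliar with the backtick loop in the command above, here is a minimal sketch of the same argument expansion in Python. It only builds the command line and does not run it. The helper name `build_reboot_command` is hypothetical and not part of the fabric scripts. The `-j`, `-f`, `-D` flags and the `reboot_tegra` action are taken as-is from the comment above.

```python
import shlex


def build_reboot_command(device_numbers, jobs=15, devices_file="devices.json"):
    """Return the manage_foopies.py argument list for the given tegra numbers.

    Hypothetical helper: it mirrors the shell loop in the comment, which emits
    one "-D tegra-NNN" pair per device before the reboot_tegra action.
    """
    args = ["python", "manage_foopies.py", f"-j{jobs}", "-f", devices_file]
    for number in device_numbers:
        # Device numbers are kept as strings so leading zeros (e.g. "021") survive.
        args += ["-D", f"tegra-{number}"]
    args.append("reboot_tegra")
    return args


# Short illustrative subset of the device list, not the full list from the comment.
command = build_reboot_command(["021", "032", "339"])
print(shlex.join(command))
```

Keeping the device numbers as strings avoids losing the zero padding that the device names rely on.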
Depends on: 838687
did a start/stop cp cycle
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
9 days, 19:33:25 since last job
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 858134
recovery didn't help, dunno what to do
2013-04-05 06:20:55 tegra-339 p online active OFFLINE :: error.flg [Automation Error: Unable to connect to device after 5 attempts]
Depends on: 865749
Sending this slave to recovery -->Automated message.
Depends on: 889567
Depends on: 892096
The device refuses to keep its SUTAgent up... Clint, is there anyone on your team (or perhaps you) we can/should hand this device to in order to investigate? (There are a few in the 3xx range that exhibit similar behavior.) Or do you suspect no human time can be devoted to this problem anytime soon, and we should just decommission them instead?
Flags: needinfo?(ctalbert)
Wow, there isn't anyone local that can look at these things. I can take a quick look at it next week if we pull it and get it to SF. If you're cool with me pulling it I'll be in MV tomorrow. I can't promise much though.
Flags: needinfo?(ctalbert)
I think if we want to ship it to get a non-manager's eyes on it, we can do that. I also think it may be about time we stop spending too many human resources investigating these devices (see e-mail). But yeah, feel free to pull it tomorrow (or we can have dcops pull it).
So it never got pulled because I couldn't find it in haxxor. Do we want to pull it next week (when I have the entire A*team in town)? Or do we want to just shut off the 3xx series machines that seem to be consistently unreliable?
Product: mozilla.org → Release Engineering
Do you still need me to do anything with this one, Callek?
Flags: needinfo?(bugspam.Callek)
(In reply to Clint Talbert ( :ctalbert ) from comment #10)
> Do you still need me to do anything with this one, Callek?
Nope, thanks for checking (sorry for delay in reply)
Flags: needinfo?(bugspam.Callek)
Back in production. Last 17 jobs are green.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Remote Device Error: unable to write to sdcard
Status: RESOLVED → REOPENED
Depends on: 928492
Resolution: FIXED → ---
The 16GB SD card has been swapped because the tegra wouldn't re-image.
Back in production.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard