Closed
Bug 768489
Opened 12 years ago
Closed 12 years ago
talos-r3-w7-038 isn't processing minidumps properly / is crashing extremely frequently during tests
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: billm)
References
Details
(Whiteboard: [buildduty])
Bug 768156, bug 767906 & bug 768483 lack proper stacks & are all from the same slave (talos-r3-w7-038).
Reporter
Comment 1•12 years ago
It would seem that slave isn't very well.
Of the 38 new [orange] crashes filed in the last 7 days:
https://bugzilla.mozilla.org/buglist.cgi?keywords=crash%2C%20;keywords_type=allwords;list_id=3531381;status_whiteboard_type=allwordssubstr;chfieldto=Now;query_format=advanced;chfield=[Bug%20creation];chfieldfrom=2012-06-19;status_whiteboard=[orange]
20 of them originated from this slave:
https://bugzilla.mozilla.org/buglist.cgi?keywords=crash%2C%20;keywords_type=allwords;list_id=3531378;status_whiteboard_type=allwordssubstr;chfieldto=Now;chfield=[Bug%20creation];query_format=advanced;chfieldfrom=2012-06-19;status_whiteboard=[orange];longdesc=talos-r3-w7-038;longdesc_type=allwordssubstr
Bill, seems like some of those JS crashes may not be real after all :-(
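As a rough illustration of the arithmetic above, here is a minimal sketch that tallies crash reports per slave and prints the share attributed to one machine. The helper name `crash_share_by_slave` and the `other-slaves` bucket are hypothetical; only the counts (38 crash bugs, 20 of them from talos-r3-w7-038) come from this comment.

```python
from collections import Counter


def crash_share_by_slave(crash_slaves):
    """Map each slave name to its fraction of the crash reports.

    crash_slaves is an iterable with one slave name per crash bug.
    """
    counts = Counter(crash_slaves)
    total = sum(counts.values())
    return {slave: n / total for slave, n in counts.items()}


# Counts taken from this comment: 38 crash bugs, 20 from talos-r3-w7-038.
# "other-slaves" is a hypothetical bucket standing in for the remaining 18.
reports = ["talos-r3-w7-038"] * 20 + ["other-slaves"] * 18
shares = crash_share_by_slave(reports)
print(f"{shares['talos-r3-w7-038']:.1%}")  # prints "52.6%"
```

A single machine accounting for roughly half of a week's crash bugs is the kind of skew that suggests a problem with the slave, not with the code under test.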
Reporter
Updated•12 years ago
Summary: talos-r3-w7-038 isn't processing minidumps properly → talos-r3-w7-038 isn't processing minidumps properly / is crashing extremely frequently during tests
Reporter
Comment 2•12 years ago
Reporter
Comment 3•12 years ago
Reporter
Comment 4•12 years ago
(More unfiled):
https://tbpl.mozilla.org/php/getParsedLog.php?id=12929167&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=12952071&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=12973085&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=12971305&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=12981379&tree=Fx-Team
Reporter
Comment 5•12 years ago
Please may we remove this slave from production?
Severity: major → critical
Assignee
Comment 6•12 years ago
Actually, we see users who have crashes like this and we can never solve them. Could we try to figure out what's wrong with this slave so we can help our users?
Reporter
Comment 7•12 years ago
Armen, please could you give billm access to the slave after taking it out of production? Thanks! :-)
Updated•12 years ago
Blocks: talos-r3-w7-038
Updated•12 years ago
Component: Release Engineering → Release Engineering: Machine Management
QA Contact: release → armenzg
Whiteboard: [buildduty]
Comment 8•12 years ago
I disabled the slave on slavealloc and added a note.
billm, would you be able to look into what is wrong with this slave?
Assignee
Comment 9•12 years ago
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #8)
> I disabled the slave on slavealloc and added a note.
>
> billm, would you be able to look into what is wrong with this slave?
Yes, I'd like to try. At the very least, it would be interesting to run a memory test. It may just be a hardware malfunction. Can you please send me a login?
Comment 10•12 years ago
I asked IT to get you access.
You should have the machine ready tomorrow morning.
Assignee: nobody → armenzg
Comment 12•12 years ago
Lowering the severity from critical to major, since the slave is now in a developer's hands to work with.
Severity: critical → major
Comment 13•12 years ago
Bill: any update on the status of this slave? Found anything? Are you done with it?
Assignee
Comment 14•12 years ago
I ran a memory test and didn't find anything. I tried running a few tests and none of them failed.
However, I don't think we can add this slave back to the pool. It will likely just cause more failures.
Comment 15•12 years ago
(In reply to Bill McCloskey (:billm) from comment #14)
> I ran a memory test and didn't find anything. I tried running a few tests
> and none of them failed.
>
> However, I don't think we can add this slave back to the pool. It will
> likely just cause more failures.
OK, thanks for trying. I'll get IT to re-educate the machine in bug 776924.
Comment 16•12 years ago
This slave has been fixed, at least in theory. If this recurs, please re-open and we'll just decommission this slave.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard