Closed
Bug 561235
Opened 15 years ago
Closed 12 years ago
Make Talos use mozcrash for minidump processing
Categories
(Testing :: Talos, defect, P4)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: catlee, Unassigned)
References
Details
(Whiteboard: [talos][buildfaster:p2][mozbase])
Attachments
(2 obsolete files)
We could save a lot of time and bandwidth if we only download symbols on crash.
This would require changes to talos, so it could be passed a URL to download and unpack.
Reporter | ||
Updated•15 years ago
|
Summary: Only download symbols on a crash → Talos - only download symbols on a crash
Comment 1•15 years ago
|
||
Is this still needed if the symbol files are 2.0MB?
Reporter | ||
Comment 2•15 years ago
|
||
Symbols for OSX are still 15MB. Not as critical to do right away, but would be nice to do soon.
Priority: -- → P4
Reporter | ||
Comment 3•15 years ago
|
||
We should make use of the cgi that's going to be set up to handle this.
Depends on: 561754
Reporter | ||
Updated•15 years ago
|
Comment 4•15 years ago
|
||
I'm going to write some Python code in bug 563745 to do this, so we can probably just copy and paste it to Talos for now.
Comment 5•15 years ago
|
||
I wrote a test script:
http://hg.mozilla.org/users/tmielczarek_mozilla.com/minidump-stackwalk-cgi/file/9c6291a68500/testsubmit.py
and wound up using that code almost verbatim in the unittest harnesses.
Comment 6•13 years ago
|
||
What would be required to make this happen? My understanding of this area is limited.
Comment 7•13 years ago
|
||
I think we should go with the approach I described in comment 4 and comment 5. I wrote a CGI that accepts a minidump + a URL to symbols, and produces a stack trace. This way, the slaves don't have to download anything, only the server does (and it can cache symbols so it only has to download them once).
Comment 8•13 years ago
|
||
(In reply to comment #7)
> I think we should go with the approach I described in comment 4 and comment
> 5. I wrote a CGI that accepts a minidump + a URL to symbols, and produces a
> stack trace. This way, the slaves don't have to download anything, only the
> server does (and it can cache symbol so it only has to download them once).
I agree that's a fine solution. What are the steps to roll this into production?
Reporter | ||
Comment 9•13 years ago
|
||
(In reply to comment #8)
> (In reply to comment #7)
> > I think we should go with the approach I described in comment 4 and comment
> > 5. I wrote a CGI that accepts a minidump + a URL to symbols, and produces a
> > stack trace. This way, the slaves don't have to download anything, only the
> > server does (and it can cache symbol so it only has to download them once).
>
> I agree that's a fine solution. What are the steps to roll this into
> production?
1) need a host to put said CGI on
2) test harnesses need to know how to interact with the CGI
3) change buildbot code to stop downloading/unpacking of symbols and instead pass symbol URL and CGI URL to test harnesses
4) ???
5) profit!
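For step 2, here is a minimal sketch of what a harness might send to the CGI: a multipart/form-data POST carrying the minidump plus the symbols URL. This is not the code from testsubmit.py or automation.py. The function name and the form field names ("symbols_url", "minidump") are assumptions for illustration only, and the real CGI may expect something different.

```python
import uuid


def build_stackwalk_request(minidump_bytes, symbols_url, boundary=None):
    """Return (headers, body) for a multipart/form-data POST.

    The field names "symbols_url" and "minidump" are hypothetical;
    check what the real CGI expects before using this.
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = [
        delimiter,
        b'Content-Disposition: form-data; name="symbols_url"',
        b"",
        symbols_url.encode("utf-8"),
        delimiter,
        b'Content-Disposition: form-data; name="minidump"; filename="crash.dmp"',
        b"Content-Type: application/octet-stream",
        b"",
        minidump_bytes,
        delimiter + b"--",  # closing delimiter ends the multipart body
        b"",
    ]
    body = b"\r\n".join(parts)
    headers = {
        "Content-Type": "multipart/form-data; boundary=" + boundary,
        "Content-Length": str(len(body)),
    }
    return headers, body
```

The returned headers and body could then be passed to whatever HTTP client the harness already uses, with the CGI URL supplied by buildbot as in step 3.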
Comment 10•13 years ago
|
||
2 is fixed for unittests, but not for Talos. Should be easy enough to port the automation.py code to Talos.
Reporter | ||
Comment 11•13 years ago
|
||
(In reply to comment #10)
> 2 is fixed for unittests, but not for Talos. Should be easy enough to port
> the automation.py code to Talos.
I wonder about failover behaviour too here. Should we retry, or specify an alternate server to talk to, or just accept that sometimes the CGI won't be available?
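To make the options in comment 11 concrete, here is a rough sketch that combines them: retry a few times, move on to an alternate server, and finally give up and return nothing. The function name and the `submit` callable are hypothetical. `submit` stands in for whatever code actually posts the minidump to a given server.

```python
def process_with_failover(submit, servers, attempts_per_server=2):
    """Try each server in order; return the stack trace text or None.

    `submit` is a caller-supplied callable (hypothetical) that takes a
    server URL and either returns the stack trace or raises OSError
    (urllib's URLError is a subclass of OSError).
    """
    for server in servers:
        for _ in range(attempts_per_server):
            try:
                return submit(server)
            except OSError:
                continue  # retry, then fall through to the next server
    # No server answered: accept that the CGI is unavailable this time.
    return None
```

Returning None rather than raising means a missing stack trace would not by itself turn the job red, which is one possible policy, not necessarily the right one.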
Comment 12•13 years ago
|
||
If we were to fix bug 642167, it might not be a big deal if the CGI doesn't respond.
Reporter | ||
Updated•13 years ago
|
Whiteboard: [talos] → [talos][buildfaster:p2]
Reporter | ||
Updated•13 years ago
|
Component: Release Engineering → Talos
Product: mozilla.org → Testing
QA Contact: release → talos
Summary: Talos - only download symbols on a crash → Talos - send minidumps to stackwalk cgi for processing
Version: other → unspecified
Comment 13•13 years ago
|
||
Porting the unittest code to Talos would just involve reusing the code here:
http://mxr.mozilla.org/mozilla-central/source/build/automationutils.py#103
Updated•13 years ago
|
Assignee: nobody → wlachance
Comment 14•13 years ago
|
||
So here's a first cut at making talos able to use the cgi server for parsing crashdumps. I copied over the code from automationutils.py into a separate "crashhandler.py" module inside Talos. The idea here is that we might eventually want to factor crash parsing out into MozBase, and it'll be easier to do that if we know that we're using virtually the same code in automationutils.py and talos.
The behaviour for choosing a minidump crash parser is slightly different between automationutils.py and talos, and I opted to go with the former's behaviour. If the user wants to do local minidump crash parsing, they'll need to set the MINIDUMP_STACKWALK environment variable to a path to a minidump crash parser. Before, talos would try to guess what platform the user was on and set the minidump parser to the appropriate file checked into talos inside the breakpad subdirectory.
I haven't really thought enough about this to know which approach is really "better", so I opted for that of automationutils.py because I guessed it was touched most recently. I may have made the wrong call.
Attachment #552246 -
Flags: review?(ted.mielczarek)
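For reference, the parser selection described in comment 14 (use MINIDUMP_STACKWALK if set, no platform guessing) could look roughly like the sketch below. This is not the code from the attachment. The function name is made up, and the extra check that the path is an executable file is an assumption.

```python
import os


def find_stackwalk_binary(environ=None):
    """Return the local minidump parser path, or None if unavailable.

    Only the MINIDUMP_STACKWALK variable is consulted; there is no
    fallback to a per-platform binary checked into the tree.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("MINIDUMP_STACKWALK")
    # Assumption: treat a missing or non-executable path as "not configured".
    if path and os.path.isfile(path) and os.access(path, os.X_OK):
        return path
    return None
```

A caller that gets None back would skip local parsing and either use the CGI or just report that a minidump was found.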
Comment 15•13 years ago
|
||
Comment on attachment 552246 [details] [diff] [review]
Add support to talos for use of CGI crashhandler
I'm not a Talos peer, so you'll probably want Alice to review this. (Also I wrote all the code you copied there, so it would be a bit inappropriate for me to review it!)
Just one note, you're using the poster lib here, you'll need to hg add poster.zip as well.
Attachment #552246 -
Flags: review?(ted.mielczarek) → review?(anodelman)
Comment 16•13 years ago
|
||
This adds poster.zip, required on systems without this package installed
Attachment #552246 -
Attachment is obsolete: true
Attachment #552372 -
Flags: review?(anodelman)
Attachment #552246 -
Flags: review?(anodelman)
Comment 17•13 years ago
|
||
Apparently we're planning to take a different approach to this due to load issues on the buildmaster (Bug 679759). I'll wait until that cooks, then probably adapt it to Talos. Can hold off on reviews until then.
Comment 18•13 years ago
|
||
Being sick and falling behind on reviews pays off!
Can you remove the review flag until you are ready to go? Otherwise it will keep showing up in my queue.
Comment 19•13 years ago
|
||
Comment on attachment 552372 [details] [diff] [review]
Add support to talos for use of CGI crashhandler (take 2)
Unassigning anode as reviewer
Attachment #552372 -
Flags: review?(anodelman)
Reporter | ||
Updated•13 years ago
|
Summary: Talos - send minidumps to stackwalk cgi for processing → Talos - download symbols on crash as required
Updated•13 years ago
|
Attachment #552372 -
Attachment is obsolete: true
Comment 20•13 years ago
|
||
Yeah, we should implement the same approach used in bug 679759 for Talos.
Reporter | ||
Comment 21•13 years ago
|
||
any progress here?
Comment 22•13 years ago
|
||
AFAIK, no one is actively working on this. If it is a high priority, we should probably find someone to take it.
Comment 23•13 years ago
|
||
(In reply to Jeff Hammel [:jhammel] from comment #22)
> AFAIK, no one is actively working on this. If it is a high priority, we
> should probably figure out someone.
If it's not super high priority (and I'm guessing it isn't if we managed to get this far without doing it), it would make a nice first bug for someone.
Comment 24•13 years ago
|
||
It would reduce the run time for each job that does not crash since the 1) download 2) unzip and 3) remove steps for symbols would not be executed.
In other words, it has value: it would slightly increase our capacity.
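A minimal sketch of that saving, assuming minidumps show up as *.dmp files in a known directory: symbols are fetched only when at least one dump exists. The function name and the `fetch_symbols` callable are hypothetical. `fetch_symbols` stands in for the download and unzip steps and returns the path to the unpacked symbols.

```python
import os


def symbols_if_crashed(dump_dir, fetch_symbols):
    """Return (dump_paths, symbols_path), fetching symbols only on a crash.

    `fetch_symbols` is a caller-supplied callable (hypothetical) that
    downloads and unpacks the symbols and returns their location.
    """
    dumps = sorted(
        os.path.join(dump_dir, name)
        for name in os.listdir(dump_dir)
        if name.endswith(".dmp")
    )
    if not dumps:
        # No crash: skip the download, unzip and cleanup steps entirely.
        return [], None
    return dumps, fetch_symbols()
```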
Comment 25•13 years ago
|
||
Any progress?
This is blocking some work in RelEng that would reduce load (see bug 561754 for details).
Comment 26•12 years ago
|
||
I am not currently working on this, so I'm removing myself to correct that impression.
Assignee: wlachance → nobody
Comment 27•12 years ago
|
||
Do we have a preferred solution for this?
Comment 28•12 years ago
|
||
Yeah, this is implemented in mozcrash. We should just use that.
Summary: Talos - download symbols on crash as required → Make Talos use mozcrash for minidump processing
Comment 29•12 years ago
|
||
Dup of bug 675688 now?
Updated•12 years ago
|
Whiteboard: [talos][buildfaster:p2] → [talos][buildfaster:p2][mozbase]
Comment 30•12 years ago
|
||
(In reply to Ted Mielczarek [:ted] from comment #29)
> Dup of bug 675688 now?
I think bug 675688 should be closed and this bug kept open for the talos side of things.
Comment 33•12 years ago
|
||
Fixed in bug 824984.