Closed
Bug 514593
Opened 15 years ago
Closed 15 years ago
SkipList Split up [@ js_Interpret] crash signatures
Categories
(Socorro :: General, task)
Socorro
General
Tracking
(Not tracked)
RESOLVED
FIXED
1.2
People
(Reporter: jruderman, Assigned: lars)
Details
(Whiteboard: [crashkill][crashkill-metrics][1.9.2-fixed])
js_Interpret is a gigantic function. As a single signature, js_Interpret is the #24 topcrash for Firefox 3.5.2, but this isn't really useful for getting bugs fixed.
I can think of two ways to "split it up" server-side, using the line number information that already exists:
* Make the line number be part of the signature, e.g. js_Interpret:4546.
* Look at the source code to find the most recent BEGIN_CASE line. For example, the section after BEGIN_CASE(JSOP_CALLPROP) might be given the fake signature js_Interpret_JSOP_CALLPROP.
The latter would be more stable against changes to the code, but would require the server to have a copy of the relevant version of jsinterp.cpp.
Reporter | ||
Updated•15 years ago
|
Summary: Split up js_Interpret crash signatures → Split up [@ js_Interpret] crash signatures
Comment 1•15 years ago
|
||
The latter sounds really complicated. The former sounds more workable, but is going to really segregate out your crashes. I suppose that might be ok, since crashes on different lines are unlikely to have the same root cause.
Bug 510505 might help here, if we could figure out an inexpensive way to get the current opcode into a skidmark.
Alternatively, we could maybe have BEGIN_CASE do some symbol trickery to give socorro something to chew on?
Comment 3•15 years ago
|
||
(In reply to comment #2)
> Alternatively, we could maybe have BEGIN_CASE do some symbol trickery to give
> socorro something to chew on?
I was thinking about this, but I'm unaware of any way to convince the compiler to output a different method name for certain blocks of code, aside from actually splitting them into separate methods and forcing them to be inlined. I also thought about trying to post-process the debug symbols, but that started to make my head hurt.
How does socorro deal with inlines? Those are both within the same function, nested, and it seems like we could do something similar here with a synthetic _begin_socorro_region_JSOP_FOO/_end_socorro_region_JSOP_FOO symbol pair.
Not sure how to get that done on Windows, but I think there are some directives in gcc that would help us.
Alternatively (heh) we could use the valgrind trick of special nop or jump-over-nop sequences to put sentinels in the code that we scan with the socorro tools.
Or we could figure out how to reuse the address-of-label stuff we use for token threading, but that only works on GCC too.
Comment 5•15 years ago
|
||
The Breakpad symbol dumping tools are generally inline-aware, although Linux stabs may still have some bugs. If you can point out a specific method that gets inlined (and a method calling it), I could show you what it looks like in the symbol files. The Breakpad symbols are just a dump of what the compiler's debug info tells us, so modulo post-processing (which would be a bit painful), we're limited to what the compiler produces.
The Breakpad symbol format basically looks like:
FUNC address size name
lineaddress size line file
The line data can specify a new file at any point inside the function. Post-processing to break it up would require recalculating the function size, and adding synthetic functions with the appropriate address+size.
Also, jimb wrote up some nice docs on the symbol file format if you care to know more:
http://code.google.com/p/google-breakpad/wiki/SymbolFiles
Reporter | ||
Comment 6•15 years ago
|
||
Bug 507723 changed where js_Interpret code lives. Now much of it is in jsops.cpp. See e.g. reed's crash:
http://crash-stats.mozilla.com/report/index/3e9f5932-2d9b-4fd5-aaad-8de102090907
Comment 7•15 years ago
|
||
(In reply to comment #5)
> The Breakpad symbol dumping tools are generally inline-aware, although Linux
> stabs may still have some bugs.
The Linux STABS dumper doesn't deal with inlined functions at all.
The new DWARF-on-Linux reader doesn't deal with them yet, but it's on the agenda.
Comment 8•15 years ago
|
||
I think the line number idea is good enough for a start. If we find that it is splitting things up too much, we can deal with that later.
Reporter | ||
Comment 9•15 years ago
|
||
Bug 519363 comment 1 lists the top five crash points in js_Interpret for 1.9.1.
Updated•15 years ago
|
Whiteboard: [crashkill][crashkill-metrics]
Comment 10•15 years ago
|
||
We really need to get on this; @js_Interpret is the Firefox 3.6 Beta 1 topcrash by an order of magnitude.
Comment 11•15 years ago
|
||
(In reply to comment #10)
> We really need to get on this; @js_Interpret is the Firefox 3.6 Beta 1 topcrash
> by an order of magnitude.
I looked at a sample of the "last" 500 of these in FF3.6b1 just now. I found that 95% were covered by bug 525028 and bug 519363. Both are already marked blocking 1.9.2 so we just need to get them landed.
Comment 12•15 years ago
|
||
If we want to make sure that we're not shipping new js_interp regressions, we need this before we ship. Marking this as blocking 1.9.2+ so we can argue about it. From what I'm hearing from dmandelin, we need this to know.
Flags: blocking1.9.2+
Updated•15 years ago
|
Summary: Split up [@ js_Interpret] crash signatures → SkipList Split up [@ js_Interpret] crash signatures
Comment 13•15 years ago
|
||
I forgot to mention earlier that another reason this is important is that the core-count/dll/add-on distribution analyses don't produce useful data if all js_Interpret crashes are grouped together, but might tell us some good stuff once they are split.
Comment 14•15 years ago
|
||
Another note: the "SkipList" in the title suggests that this is about adding js_Interpret to the SkipList, but that is not what we want, and won't help. Rather, we want the line number of the crash point added to the signature.
Assignee | ||
Comment 15•15 years ago
|
||
For example, in https://crash-stats.mozilla.com/report/index/12914933-360e-4d39-a05a-b7d422091123 the signature would be: "js_Interpret:4436" and in https://crash-stats.mozilla.com/report/index/47b10c2a-75ec-4192-99c2-3c6692091123 it would be "js_Interpret:3046"
confirm this for me and I'll get on it.
Comment 16•15 years ago
|
||
(In reply to comment #15)
> For example, in
> https://crash-stats.mozilla.com/report/index/12914933-360e-4d39-a05a-b7d422091123
> the signature would be: "js_Interpret:4436" and in
> https://crash-stats.mozilla.com/report/index/47b10c2a-75ec-4192-99c2-3c6692091123
> it would be "js_Interpret:3046"
>
> confirm this for me and I'll get on it.
That looks great.
Assignee | ||
Comment 17•15 years ago
|
||
ok, I've got this implemented and working on my dev system. I'm going to get it pushed on to staging tomorrow, but there is going to have to be some fancy dancing to do it. There is another change working its way through staging/production push, but it is stalled because of an yet unresolved problem. It is likely we're going have to back out that change to push this change through. I will keep this thread updated on progress...
Assignee | ||
Updated•15 years ago
|
Target Milestone: --- → 1.2
Updated•15 years ago
|
Assignee: nobody → lars
Does this add line numbers to all signatures? Or just to js_Interpret? I'm somewhat worried that adding it to all functions is going to make analysis across versions harder.
It would be great if we could just have a whitelist of functions to which the line number was included. And let that whitelist just consist of js_Interpret for now.
Though this is all IMHO, others might disagree.
This should just be to js_Interpret.
Assignee | ||
Comment 20•15 years ago
|
||
The processor uses a configurable parameter called 'signaturesWithLineNumbersRegEx' to identify signatures that must be combined with their associated line numbers. Currently, the default is just 'js_Interpret'. If you have others to add, it is just a configuration change.
This feature is on track to be added to the production instance of Socorro in our first coordinated release Thursday night.
We're sorry this has taken so long to get into production. As our group has increased in size and scope, we've had to move away from the cowboy software production model to a more coordinated system. This new structured release schedule will keep our group from stepping on our own proverbial toes.
Thanks Lars, you guys rock!
Assignee | ||
Comment 22•15 years ago
|
||
this is complete and in production. For an example, see this one:
http://crash-stats.mozilla.com/report/index/5a4d380b-7951-4191-86ca-6e0662091203
the new signatures should start showing up in trend reports in the next few hours (it's 2009-12-03 22:30pst now).
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•15 years ago
|
Whiteboard: [crashkill][crashkill-metrics] → [crashkill][crashkill-metrics][1.9.2-fixed]
Updated•13 years ago
|
Component: Socorro → General
Product: Webtools → Socorro
You need to log in
before you can comment on or make changes to this bug.
Description
•