Closed
Bug 866937
Opened 12 years ago
Closed 9 years ago
B2G crash stacks are missing symbol information
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: jrmuizel, Unassigned)
References
Details
(Whiteboard: [leave-open])
Attachments
(1 file)
(deleted), patch | ted: review+
If you look here you'll see we're missing symbols for libui and libc
Updated•12 years ago
Component: Release Engineering → Release Engineering: Automation (General)
QA Contact: catlee
Comment 1•12 years ago
Jeff, do you have an example crash and, hopefully, some details of the build the crash came from?
This came from a recent unagi nightly
http://symbols.mozilla.org/b2g/b2g-18.0-Android-20130429070204-arm-symbols.txt
and contains libui.so and libc.so
For kicks, both m-b2g18 and mozilla-b2g18_v1_0_1 are posting manifests with the exact same filename. They'll be overwriting each other, but we'll get the union of the symbol sets.
OS: Mac OS X → Gonk (Firefox OS)
Comment 2•12 years ago
I think the "here" where comment 0 meant you to look was at bug 818103, where we crash emulator test runs (or hang them, it's never been entirely clear) all day long, always like https://tbpl.mozilla.org/php/getParsedLog.php?id=22381622&tree=Mozilla-Central at libc.so + 0xdc04
Comment 3•12 years ago
The emulator images are prebuilt, AIUI, so we don't have symbols for them in the symbol packages that get uploaded with the builds. We only have symbols for Gecko. This basically boils down to bug 528231. :-/
Comment 4•12 years ago
That's not the whole story, because the call to the test harness has
--symbols-path <url to gecko symbols>
and what's missing is the emulator symbols. Those appear to be present inside the emulator package already, for example
b2g-distro/out/target/product/generic/symbols/system/lib/libc.so
Could we teach the test harness to take more than one --symbols-path?
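A minimal sketch of what a repeatable --symbols-path option could look like, assuming the harness parses its command line with Python's argparse. The function names and the `symbols_paths` attribute are hypothetical and are not taken from the actual harness.

```python
import argparse


def build_parser():
    """Build a hypothetical harness parser with a repeatable --symbols-path."""
    parser = argparse.ArgumentParser(description="hypothetical test harness CLI")
    parser.add_argument(
        "--symbols-path",
        dest="symbols_paths",   # hypothetical attribute name
        action="append",        # each occurrence is appended to a list
        default=[],
        metavar="PATH_OR_URL",
        help="directory or URL of Breakpad symbols; may be given more than once",
    )
    return parser


def parse_symbols_paths(argv):
    """Return every --symbols-path value, in the order given."""
    return build_parser().parse_args(argv).symbols_paths
```

With this shape, the Gecko symbols URL and an emulator symbols location could both be passed, and the harness would try each in turn when symbolicating a minidump.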
Comment 5•12 years ago
Apparently today is not a proofreading day, because ted said that already. I think the point about non-stripped binaries in b2g-distro/out/target/product/generic/symbols stands though, and the question that follows from that.
Comment 6•12 years ago
We would have to dump those into Breakpad format to make them usable, currently. If we can make that happen I'd be happy to teach the harness how to take an extra symbols path.
Comment 7•12 years ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #6)
> We would have to dump those into Breakpad format to make them usable,
> currently. If we can make that happen I'd be happy to teach the harness how
> to take an extra symbols path.
What's involved in dumping those symbols in breakpad format?
Comment 8•12 years ago
It's mostly just running dump_syms on every binary in $(PRODUCT_OUT)/symbols from the build. If you have a B2G build dir you can just "./build.sh buildsymbols" and grab the symbols.zip from gecko-objdir/dist.
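For illustration, the "run dump_syms on every binary" step might be sketched as below. This is not the buildsymbols implementation; it assumes a `dump_syms` executable is on PATH and prints a Breakpad symbol file to stdout, and that output is stored in the usual Breakpad layout of `<name>/<id>/<name>.sym` derived from the leading MODULE line. The function names are hypothetical.

```python
import os
import subprocess


def symbol_store_path(module_line):
    """Map a 'MODULE <os> <arch> <id> <name>' line to <name>/<id>/<name>.sym."""
    parts = module_line.strip().split(" ", 4)
    if len(parts) != 5 or parts[0] != "MODULE":
        raise ValueError("not a Breakpad MODULE line: %r" % module_line)
    _tag, _os_name, _arch, debug_id, debug_file = parts
    return os.path.join(debug_file, debug_id, debug_file + ".sym")


def dump_all(symbols_dir, out_dir, dump_syms="dump_syms"):
    """Run dump_syms on every file under symbols_dir; return written paths."""
    written = []
    for root, _dirs, files in os.walk(symbols_dir):
        for name in files:
            binary = os.path.join(root, name)
            result = subprocess.run(
                [dump_syms, binary],
                stdout=subprocess.PIPE,
                stderr=subprocess.DEVNULL,
            )
            # Skip files that dump_syms cannot handle (non-ELF files, etc.).
            if result.returncode != 0 or not result.stdout:
                continue
            text = result.stdout.decode("utf-8", errors="replace")
            first_line = text.split("\n", 1)[0]
            dest = os.path.join(out_dir, symbol_store_path(first_line))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with open(dest, "w") as handle:
                handle.write(text)
            written.append(dest)
    return written
```

A call such as `dump_all("out/target/product/generic/symbols", "emulator-symbols")` would then produce a directory that could be zipped and handed to the harness as a second symbols path.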
Comment 9•12 years ago
In order for this to work with the way that emulators are currently used in buildbot, we'd need to produce the buildsymbols in Jenkins, upload them to tooltool along with the emulator they came from, and download them in the relevant mozharness scripts.
I'm fine with that, but I think we may be getting full-stack emulator builds in buildbot soon-ish (bug 807792) which would make that work obsolete, so I'm tempted just to let this slide until that happens.
Comment 10•12 years ago
That doesn't sound like too much work. We already need to copy the emulator to tooltool manually, and I don't think copying one extra file at the same time will make things much more complicated than they already are.
I guess it depends on:
1) When will bug 807792 be finished? (catlee's last comment was that he hasn't had a chance to look at it yet)
2) Is this blocking bug 818103? If so, might be worth doing sooner rather than later.
Updated•11 years ago
Product: mozilla.org → Release Engineering
Comment 11•11 years ago
All of the debug B2G tests are failing now, and crashing in libc with no stack trace, e.g., https://tbpl.mozilla.org/php/getParsedLog.php?id=29445946&tree=Cedar&full=1
:ahal, do we know what to do to fix this?
Comment 12•11 years ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #8)
> It's mostly just running dump_syms on every binary in
> $(PRODUCT_OUT)/symbols from the build. If you have a B2G build dir you can
> just "./build.sh buildsymbols" and grab the symbols.zip from
> gecko-objdir/dist.
I don't know any more than what :ted mentioned in comment 6 and comment 8. It doesn't sound like too difficult a problem though, and if you want me to handle the multiple --symbols-path options once we have the binaries in breakpad format, I can look into it.
Comment 13•11 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #12)
> (In reply to Ted Mielczarek [:ted.mielczarek] from comment #8)
> > It's mostly just running dump_syms on every binary in
> > $(PRODUCT_OUT)/symbols from the build. If you have a B2G build dir you can
> > just "./build.sh buildsymbols" and grab the symbols.zip from
> > gecko-objdir/dist.
>
> I don't know any more than what :ted mentioned in comment 6 and comment 8.
> It doesn't sound too difficult of a problem though, and if you want me to
> handle the multiple --symbols-path options once we have the binaries in
> breakpad format I can look into it.
That would be great, thanks.
Comment 14•11 years ago
Comment 11 doesn't show "missing symbols", it shows a complete lack of a minidump. It's hitting a MOZ_ASSERT, which should be a totally safe reproducible crash. It doesn't look like we're catching it at all, which leads me to believe that Breakpad is not enabled for some reason.
From that log:
10:24:05 INFO - 10-21 17:24:04.690 45 45 F MOZ_Assert: Assertion failure: mState != hal::SWITCH_STATE_UNKNOWN, at ../../../gecko/dom/system/gonk/AudioChannelManager.h:52
10:24:05 ERROR - 10-21 17:24:04.690 45 45 F libc : Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)
10:24:05 ERROR - This usually indicates the B2G process has crashed
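As a rough aid for reading logs like this one, the sketch below pulls assertion failures and fatal signals out of logcat-style lines. The regular expressions are assumptions based only on the three lines quoted above, and `scan_log` is a hypothetical helper, not part of any harness.

```python
import re

# Patterns inferred from the quoted log lines; other log formats may differ.
ASSERT_RE = re.compile(
    r"Assertion failure: (?P<expr>.+?), at (?P<file>\S+):(?P<line>\d+)"
)
SIGNAL_RE = re.compile(r"Fatal signal (?P<signo>\d+) \((?P<name>\w+)\)")


def scan_log(lines):
    """Return ('assert', expr, file, line) and ('signal', signo, name) tuples."""
    events = []
    for line in lines:
        match = ASSERT_RE.search(line)
        if match:
            events.append(
                ("assert", match.group("expr"), match.group("file"),
                 int(match.group("line")))
            )
            continue
        match = SIGNAL_RE.search(line)
        if match:
            events.append(("signal", int(match.group("signo")), match.group("name")))
    return events
```

An assertion event immediately followed by a signal event, with no minidump reported afterwards, matches the pattern described in this comment: the process crashed but nothing caught the crash.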
Comment 15•11 years ago
I think breakpad is enabled [1], but it's possible that check_for_crashes isn't getting called anymore for some reason. I added some debugging info [2] to cedar which should help us figure out what's going on.
[1] https://github.com/mozilla/mozbase/blob/master/mozrunner/mozrunner/remote.py#L96
[2] https://hg.mozilla.org/projects/cedar/rev/18f3bc305b50
Comment 16•11 years ago
(In reply to Jonathan Griffin (:jgriffin) from comment #11)
> All of the debug B2G tests are failing now, and crashing in libc with no
> stack trace, e.g.,
> https://tbpl.mozilla.org/php/getParsedLog.php?id=29445946&tree=Cedar&full=1
>
> :ahal, do we know what to do to fix this?
This is bug 929139. I've asked dougt to review mchen's patch there in baku's absence.
Comment 17•11 years ago
I did some debugging on Cedar. First, we aren't checking for crashes in all the places we should be, so I fixed that. Though even with that fixed there still aren't any minidumps being found (ctrl-f for "checking for crashes"):
https://tbpl.mozilla.org/php/getParsedLog.php?id=29518599&tree=Cedar&full=1
In this case, it looks like the crash is happening before we restart the b2g process with the env variables in comment 15, so ted's assessment that the crashreporter isn't enabled is likely correct. For this particular instance we can either:
1) try to pass in an environment to the initial b2g process on emulator startup (not sure if this is possible)
2) enable it in the builds
3) live without stacks for startup crashes
Comment 18•11 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #17)
> I did some debugging on Cedar. First, we aren't checking for crashes in all
> the places we should be, so I fixed that. Though even with that fixed there
> still aren't any minidumps being found (ctrl-f for "checking for crashes"):
> https://tbpl.mozilla.org/php/getParsedLog.php?id=29518599&tree=Cedar&full=1
>
> In this case, it looks like the crash is happening before we restart the b2g
> process with the env variables in comment 15, so ted's assessment that the
> crashreporter isn't enabled is likely correct. For this particular instance
> we can either:
>
> 1) try to pass in an environment to the initial b2g process on emulator
> startup (not sure if this is possible)
> 2) enable it in the builds
> 3) live without stacks for startup crashes
Ouch! We can't do 1) without modifying the build, so 1) and 2) are effectively the same. Would there be any disadvantage to enabling the crash reporter in the build, for engineering builds?
Flags: needinfo?(ted)
Comment 19•11 years ago
ahal and I discussed this on IRC. bug 717538 enabled crash reporting by default for all non-debug builds. We left debug builds out because crash reporting interferes with debugging, so we didn't want to make developers' lives harder. I'd be fine with changing that default on B2G if it solves this problem and isn't a big inconvenience to developers. (You can still set MOZ_CRASHREPORTER_DISABLE=1 to turn it off.)
Flags: needinfo?(ted)
Comment 20•11 years ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #19)
> ahal and I discussed this on IRC. bug 717538 enabled crash reporting by
> default for all non-debug builds. We left debug builds out because crash
> reporting interferes with debugging, so we didn't want to make developers'
> lives harder. I'd be fine with changing that default on B2G if it solves
> this problem and isn't a big inconvenience to developers. (You can still set
> MOZ_CRASHREPORTER_DISABLE=1 to turn it off.)
I think that would be acceptable, as long as we document it well.
Comment 21•11 years ago
What's the status here? Is it possible to get symbols and stacks on debug builds?
Comment 22•11 years ago
(In reply to Gregor Wagner [:gwagner] from comment #21)
> What's the status here? Is it possible to get symbols and stacks on debug
> builds?
Is there a crash that this is holding up investigation for? It's hard to tell what is and isn't working at this point, but I believe we should be getting stacks on debug builds that happen during a test run. It's only if the emulator crashes before the test harness has a chance to manually turn crash reporting on that we wouldn't get symbols. Of course that's in theory.
But yes, we should enable it by default on debug builds either way. I'll figure out how to do that.
Comment 23•11 years ago
nsExceptionHandler seemed like the better place to fix this. Otherwise we'd be overriding an override.
Attachment #8339501 -
Flags: review?(ted)
Comment 24•11 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #22)
> (In reply to Gregor Wagner [:gwagner] from comment #21)
> > What's the status here? Is it possible to get symbols and stacks on debug
> > builds?
>
> Is there a crash that this is holding up investigation for? It's hard to
> tell what is and isn't working at this point, but I believe we should be
> getting stacks on debug builds that happen during a test run. It's only if
> the emulator crashes before the test harness has a chance to manually turn
> crash reporting on that we wouldn't get symbols. Of course that's in theory.
>
> But yes, we should enable it by default on debug builds either way. I'll
> figure out how to do that.
I am looking for example at the debug emulator with marionette:
https://tbpl.mozilla.org/php/getParsedLog.php?id=31127573&tree=Pine&full=1#error75
It would be nice to have a stack here.
Comment 25•11 years ago
Ah, it's possible that marionette isn't setting MOZ_CRASHREPORTER=1 anywhere. If that's the case, the above patch should fix it.
Comment 26•11 years ago
Comment on attachment 8339501 [details] [diff] [review]
Patch 1.0 - enable crashreporter by default on b2g debug builds
Review of attachment 8339501 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/crashreporter/nsExceptionHandler.cpp
@@ +807,5 @@
>
> +#if !defined(DEBUG) || defined(MOZ_WIDGET_GONK)
> + // In non-debug builds, enable the crash reporter by default, and allow
> + // disabling it with the MOZ_CRASHREPORTER_DISABLE environment variable.
> + // Also enable it by default in debug b2g builds as it is difficult to
Might want to say "gonk" instead of "b2g" here since that's what you're using.
Attachment #8339501 -
Flags: review?(ted) → review+
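The behavior discussed in comments 19 through 26 can be modeled as a small truth function. This is a Python sketch of the logic as described in this bug, not the C++ in nsExceptionHandler.cpp; it assumes that any non-empty value counts as "set" for both environment variables, and the function name is hypothetical.

```python
def crash_reporter_enabled(env, debug, gonk):
    """Model of the default described in this bug (assumptions noted above).

    env   -- mapping of environment variables
    debug -- True for a DEBUG build
    gonk  -- True when MOZ_WIDGET_GONK is defined
    """
    # Mirrors `#if !defined(DEBUG) || defined(MOZ_WIDGET_GONK)` from the patch:
    # on by default, opt out with MOZ_CRASHREPORTER_DISABLE.
    if not debug or gonk:
        return env.get("MOZ_CRASHREPORTER_DISABLE", "") == ""
    # Other debug builds stay opt-in via MOZ_CRASHREPORTER.
    return env.get("MOZ_CRASHREPORTER", "") != ""
```

Under this model, a debug Gonk build with an empty environment reports crashes, which is the case the patch is meant to cover for emulator startup crashes.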
Comment 27•11 years ago
This didn't do the trick here:
https://tbpl.mozilla.org/php/getParsedLog.php?id=31190112&tree=Pine&full=1
Comment 28•11 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/4231aceecfe0
Leaving the bug open because of comment 27 and because I'm still not clear which symbols are actually missing.
Whiteboard: [leave-open]
Comment 29•11 years ago
(In reply to Gregor Wagner [:gwagner] from comment #27)
> This didn't do the trick here:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=31190112&tree=Pine&full=1
Are you sure that's a crash? There's no segfault in the logcat anymore.
Comment 30•11 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #29)
> (In reply to Gregor Wagner [:gwagner] from comment #27)
> > This didn't do the trick here:
> > https://tbpl.mozilla.org/php/getParsedLog.php?id=31190112&tree=Pine&full=1
>
> Are you sure that's a crash? There's no segfault in the logcat anymore.
Grep for MOZ_CRASH. I don't know why it's not linked any more.
Comment 31•11 years ago
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Updated•7 years ago
Component: General Automation → General