Bug 823354
Opened 12 years ago
Closed 10 years ago
DMD reports on Fennec don't have stack traces
Categories: Core :: General, defect
Status: RESOLVED FIXED
People: Reporter: kats, Unassigned
Whiteboard: [MemShrink:P2]
Attachments: 2 files
I built and ran DMD on Fennec using the instructions on the wiki page at https://wiki.mozilla.org/Performance/MemShrink/DMD, but the file that was dumped out looked like the one at http://people.mozilla.com/~kgupta/bug/769761-dmd.txt (that is, no stack traces).
Comment 1•12 years ago
Interesting. Can you check whether you're building with -funwind-tables? That should be turned on as I read configure.in, but maybe it's not for some reason.
(Just do a clean build and look at one of the invocations of g++.)
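One way to do this check without reading the build output by hand is sketched below. It assumes the build output has been captured as a list of lines, one command per line, and that compiler invocations start with the compiler name (possibly a cross-compiler name or a full path); invocations wrapped by a tool such as ccache would be missed. The function name and the log format are illustrative assumptions, not part of the Mozilla build system.

```python
import shlex


def invocations_missing_flags(log_lines, required_flags, compilers=("g++", "gcc")):
    """Return (command line, missing flags) for compiler calls lacking a flag."""
    missing = []
    for line in log_lines:
        try:
            tokens = shlex.split(line)
        except ValueError:
            # Unbalanced quotes in a log line; fall back to naive splitting.
            tokens = line.split()
        if not tokens:
            continue
        # Accept "g++", a cross-compiler name such as
        # "arm-linux-androideabi-g++", or a full path to either.
        program = tokens[0].rsplit("/", 1)[-1]
        if not any(program == c or program.endswith("-" + c) for c in compilers):
            continue
        absent = [flag for flag in required_flags if flag not in tokens]
        if absent:
            missing.append((line.strip(), absent))
    return missing
```

Calling it with `["-funwind-tables"]` as `required_flags` returns only the compiler invocations that lack that flag; an empty result means every matched invocation passed it.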
Updated•12 years ago
Whiteboard: [MemShrink]
Comment 2•12 years ago (Reporter)
Yup, -funwind-tables is being passed to g++
Comment 3•12 years ago
If I was looking into this, I'd use gdb to look at what happens inside NS_StackWalk and then possibly step into _Unwind_Backtrace, assuming that's called.
Comment 5•12 years ago
Try adding --es env2 MOZ_LINKER_EXTRACT=1
Comment 6•12 years ago (Reporter)
I still get the same results with that.
Updated•12 years ago
Assignee: nobody → bugmail.mozilla
Whiteboard: [MemShrink] → [MemShrink:P2]
Comment 7•12 years ago
It should be possible to reuse the breakpad unwinding infrastructure that we're using for profiling to fix this (bug 779291)
Comment 8•12 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #7)
> It should be possible to reuse the breakpad unwinding infrastructure that
> we're using for profiling to fix this (bug 779291)
DMD just uses NS_StackWalk. Does NS_StackWalk need to be hooked up to the new unwinder?
Comment 9•12 years ago
(In reply to Nicholas Nethercote [:njn] from comment #8)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #7)
> > It should be possible to reuse the breakpad unwinding infrastructure that
> > we're using for profiling to fix this (bug 779291)
>
> DMD just uses NS_StackWalk. Does NS_StackWalk need to be hooked up to the
> new unwinder?
I would expect so.
Updated•10 years ago (Reporter)
Assignee: bugmail.mozilla → nobody
Comment 10•10 years ago
kats, are both -fno-omit-frame-pointer and -funwind-tables being used? I ask because I just discovered that using --enable-profiling fixed the problems we had with stack unwinding on Mac opt builds.
Comment 11•10 years ago (Reporter)
The instructions on the wiki page at https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD#Fennec_2 no longer seem sufficient to enable DMD. I built with --enable-dmd and started fennec with the right environment variables but didn't see the "DMD is enabled" output anywhere, and dumping memory reports doesn't dump any DMD reports. I'm not sure what changed, do I need to do something special to build the replace-malloc code in?
Comment 12•10 years ago
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #11)
> The instructions on the wiki page at
> https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD#Fennec_2 no
> longer seem sufficient to enable DMD. I built with --enable-dmd and started
> fennec with the right environment variables but didn't see the "DMD is
> enabled" output anywhere, and dumping memory reports doesn't dump any DMD
> reports. I'm not sure what changed, do I need to do something special to
> build the replace-malloc code in?
glandium tweaked things a bit... setting $DMD to 1 is no longer necessary at startup; just setting MOZ_REPLACE_MALLOC_LIB to libdmd should suffice. So I don't know what's wrong. glandium, any ideas?
Flags: needinfo?(mh+mozilla)
Comment 13•10 years ago
It could be a number of things, but without more details, I can't tell. Does logcat say something? Does it say something about libdmd if you run with MOZ_DEBUG_LINKER=1? Try to see if replace_init is called?
Flags: needinfo?(mh+mozilla)
Comment 14•10 years ago (Reporter)
As far as I can tell replace_malloc.c is getting compiled but the init() function in that file is never getting run. I added __android_log_print calls there that never show up in the logcat.
Comment 15•10 years ago (Reporter)
Wait that might not be right. I had linker debugging enabled as well and it might just have flooded the logcat so the print statement got dropped. When I disable linker debugging I see the log. Digging further...
Comment 16•10 years ago (Reporter)
Ah, what appears to be happening is that the MOZ_REPLACE_MALLOC_LIB env var is set at some point after the replace_malloc code is initialized. The env vars are set from Java in setupGeckoEnvironment at http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/GeckoThread.java?rev=b4628cb58bb8#112 before any libraries are loaded from Java, but I guess this is too late already.
Comment 17•10 years ago (Reporter)
Appears to be a catch-22. Calling putenv requires loading mozglue, but mozglue assumes that the environment variables are already in place when it is initialized. The only way to break this I can think of is to move the native putenv implementation (http://mxr.mozilla.org/mozilla-central/source/mozglue/android/nsGeckoUtils.cpp#17) into a separate library that gets loaded even before mozglue. sigh.
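The ordering problem can be illustrated with a toy model. This is only a sketch: `ReplaceMallocStub` is a hypothetical stand-in for a library that reads MOZ_REPLACE_MALLOC_LIB once, at initialization, and a plain dict stands in for the process environment. It is not how mozglue is implemented.

```python
class ReplaceMallocStub:
    """Hypothetical stand-in for code that reads its config once, at load time."""

    def __init__(self, environ):
        # The variable is read exactly once, during initialization.
        self.lib = environ.get("MOZ_REPLACE_MALLOC_LIB")

    def enabled(self):
        return self.lib is not None


def start(set_env_first):
    """Simulate startup with the variable set before or after initialization."""
    env = {}  # stand-in for the process environment
    if set_env_first:
        env["MOZ_REPLACE_MALLOC_LIB"] = "libdmd.so"
    stub = ReplaceMallocStub(env)
    if not set_env_first:
        # Setting the variable now has no effect: it was already read.
        env["MOZ_REPLACE_MALLOC_LIB"] = "libdmd.so"
    return stub.enabled()
```

In this model `start(True)` reports the replacement library as enabled and `start(False)` does not, which mirrors why the variable has to be in place before the library that reads it is loaded.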
Comment 18•10 years ago (Reporter)
The only other idea I had was to use the wrapper hook [1] that Android provides to set the environment variable before even starting up Fennec, but when I tried that it didn't seem to work. Not sure why; maybe it's just busted in the version of Android I have on my phone.
[1] see for example the latter half of https://staktrace.com/spout/entry.php?id=762 which describes how to use it in the context of using valgrind
Comment 19•10 years ago (Reporter)
Oh! Apparently I need to do the setprop stuff as root, not as a regular user. With that I can start Fennec with DMD. I don't have time at the moment but I'll update the wiki instructions with this later (ni to myself so I don't forget).
I pulled a DMD log (this is just with the default DMD option; the only env var I set was MOZ_REPLACE_MALLOC_LIB) and am attaching it. It has some symbol information but the format of the file is different from what I remember so I'll need to figure out if it contains all the info we want (or njn, maybe you can tell just by looking at it).
Flags: needinfo?(bugmail.mozilla)
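For reference, the setprop step from comment 19 can be sketched as follows. This assumes the Android wrapper hook is the `wrap.<package>` system property, that the package is named `org.mozilla.fennec` (local builds may use a different name), and that a wrapper script already exists on the device at a hypothetical path. What the wrapper script contains is not shown in this thread and is not covered here. The resulting command has to be run as root on the device.

```python
import shlex


def wrap_setprop_command(package, wrapper):
    """Build the argv that points Android's per-app wrap property at a wrapper."""
    if not package or not wrapper:
        raise ValueError("package and wrapper must be non-empty")
    return ["setprop", "wrap." + package, wrapper]


def as_shell_string(argv):
    """Quote an argv list so it can be pasted into a root shell on the device."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

For example, `as_shell_string(wrap_setprop_command("org.mozilla.fennec", "/data/local/tmp/dmd_wrapper.sh"))` produces a single `setprop` line, where the script path is a placeholder.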
Comment 20•10 years ago
Thanks, kats. The output file is now JSON and you pass it to $OBJDIR/dist/bin/dmd.py to get human-readable output. (The docs are now at https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD.)
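To check that a dumped file is well-formed before handing it to dmd.py, something like the sketch below may help. It assumes only that the file is valid JSON and makes no claims about the DMD schema; the function name is illustrative.

```python
import json


def summarize_json(path):
    """Map each top-level key of a JSON file to the type name of its value."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Not an object at the top level; report the root type only.
    return {"<root>": type(data).__name__}
```

A `json.JSONDecodeError` from this function would suggest the dump was truncated or is not in the JSON format at all.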
The output looks so-so. Here's one example stack trace:
> #01: replace_malloc[libdmd.so +0x272a]
> #02: malloc[libmozglue.so +0x24c3c]
It's uselessly short, and there are no source locations -- it looks like it needs to be passed through fix_linux_stack.py or a similar script, but I don't know if such a thing exists for Fennec.
Here's a representative one:
> #01: replace_malloc[libdmd.so +0x272a]
> #02: malloc[libmozglue.so +0x24c3c]
> #03: Java_org_mozilla_gecko_mozglue_DirectBufferAllocator_nativeAllocateDirectBuffer[libmozglue.so +0x32446]
> #04: dvmPlatformInvoke[libdvm.so +0x1e294]
> #05: _Z16dvmCallJNIMethodPKjP6JValuePK6MethodP6Thread[libdvm.so +0x4d414]
> #06: ???[libdvm.so +0x276a4]
> #07: _Z12dvmInterpretP6ThreadPK6MethodP6JValue[libdvm.so +0x2b580]
> #08: _Z14dvmCallMethodVP6ThreadPK6MethodP6ObjectbP6JValueSt9__va_list[libdvm.so +0x5fc34]
> #09: _Z13dvmCallMethodP6ThreadPK6MethodP6ObjectP6JValuez[libdvm.so +0x5fc5e]
> #10: ???[libdvm.so +0x547da]
> #11: __thread_entry[libc.so +0xe3dc]
> #12: pthread_create[libc.so +0xdac8]
This one is long enough to be useful, but again no source locations, and the function names are mangled.
So, things are better than they were when the bug was opened and stack traces were entirely empty, but the stack traces still aren't all that useful.
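The frame lines above follow a regular shape (`#NN: function[library +0xoffset]`), so they can be split into fields for further processing. The sketch below assumes exactly that shape, as seen in the traces quoted in this comment, and flags names starting with `_Z` as mangled, which is the Itanium C++ ABI prefix. It is not taken from dmd.py or fix_linux_stack.py.

```python
import re

# Matches lines such as "#02: malloc[libmozglue.so +0x24c3c]".
FRAME_RE = re.compile(r"#(\d+):\s+(.+?)\[(\S+)\s+\+(0x[0-9a-fA-F]+)\]\s*$")


def parse_frame(line):
    """Split one stack frame line into its fields, or return None."""
    line = line.strip()
    if line.startswith(">"):
        # Drop the quoting marker used in this thread.
        line = line[1:].strip()
    match = FRAME_RE.match(line)
    if match is None:
        return None
    index, function, library, offset = match.groups()
    return {
        "index": int(index),
        "function": function,
        "library": library,
        "offset": int(offset, 16),
        "mangled": function.startswith("_Z"),
    }
```

The library name and offset from each parsed frame are the two pieces a symbolication tool needs.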
Comment 21•10 years ago
fix_linux_stack.py should work, as long as it can feed the right .so files to addr2line, and as long as the addr2line in the PATH understands arm.
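As a rough illustration of the addr2line side of this, the sketch below builds a command line for one library and a set of offsets. It uses the standard binutils options `-C` (demangle), `-f` (print function names) and `-e` (select the file). The helper name is made up, the library path is a placeholder, and the `tool` argument is where an ARM-aware binary would go, assuming one is installed. Whether library-relative offsets resolve correctly depends on how the .so was built.

```python
def addr2line_command(library_path, offsets, tool="addr2line"):
    """Build an addr2line argv for offsets within a single library."""
    args = [tool, "-C", "-f", "-e", library_path]
    # addr2line accepts hexadecimal addresses as trailing arguments.
    args.extend(hex(offset) for offset in offsets)
    return args
```

The returned list can be passed to `subprocess.run` directly. For Android builds, `tool` would typically be a cross-toolchain binary such as `arm-linux-androideabi-addr2line`, if that is what the local toolchain provides.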
Comment 22•10 years ago (Reporter)
Running dmd.py from my $OBJDIR/dist/bin seemed to work in that it found the libraries and symbolicated stuff. It couldn't do things from libdvm.so and other android libraries, but it did libxul and libmozglue just fine. I've updated the instructions at https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD to describe the Android setup. I think that means there's nothing left to do in this bug.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(bugmail.mozilla)
Resolution: --- → FIXED
Comment 23•10 years ago
Thanks, kats!