Closed
Bug 802240
Opened 12 years ago
Closed 12 years ago
Figure out how to get Android library data from the APK to Breakpad for SPS profiling
Categories
(Core :: mozglue, defect)
Tracking
()
RESOLVED
FIXED
mozilla23
People
(Reporter: ted, Assigned: glandium)
References
(Depends on 1 open bug)
Details
Attachments
(2 files, 5 obsolete files)
(deleted),
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
(deleted),
application/x-bzip
|
Details |
In Julian's patch to make SPS use Breakpad for unwinding we're relying on Breakpad being able to load libraries so it can read symbols out of them. This only works if you have the libraries on disk. For the loading-from-the-APK case we'll need to figure out how to make the library loader and Breakpad meet in the middle to get the symbol data. If we used my "dump symbols from ARM unwind tables" patch and we fixed this then we could profile release Android builds, which would be really cool.
I think the answer here is going to be to expose an API to get the load addresses of the .ARM.exidx and .ARM.extab sections, and to also make sure that we can call directly into the Breakpad methods that convert that data into debug info.
Assignee | ||
Comment 1•12 years ago
|
||
dl_iterate_phdr can give your the Phdrs, where you can find the EXIDX program headers yourself.
Comment 2•12 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #0)
> I think the answer here is going to be to expose an API to get the load
> addresses of the .ARM.exidx and .ARM.extab sections
I'd prefer something a bit more general, which is that Mike's linker
gives us a pair of functions, to mmap and munmap files in the mapped
APK. Then we can hand those to breakpad and it can use them in
preference to the standard mmap/munmap, to read whatever it wants.
This makes it agnostic to whether we go with EXIDX or CFI or some
combination of both. It would be used to replace the mmap call in
LoadELF in
toolkit/crashreporter/google-breakpad/src/common/linux/dump_symbols.cc
The functionality can be very restricted compared to the standard
mmap:
* MAP_ANON, MAP_PRIVATE only
* mmap gets to choose where to put it
* zero offset in the file
Also, presumably it would need to identify the file by name rather
than by fd.
Comment 3•12 years ago
|
||
Oh, and no partial unmaps necessary. munmap would only have to handle
unmapping precisely the range that just got mapped (I assume -- unless
breakpad is doing wierd stuff w/ mappings.)
Reporter | ||
Comment 4•12 years ago
|
||
Breakpad simply maps the entire file, pokes at it, and then unmaps it.
Assignee | ||
Comment 5•12 years ago
|
||
Julian, can you test if this works for you?
You need to define the following functions where you'll be using them:
extern "C" {
MFBT_API size_t __dl_get_mappable_length(void *handle);
MFBT_API void *__dl_mmap(void *handle, void *addr, size_t length, off_t offset);
MFBT_API void __dl_munmap(void *handle, void *addr, size_t length);
}
handle is what you get from dlopen();
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → mh+mozilla
Comment 6•12 years ago
|
||
Random comment, of no great importance, while reading through the patch:
+void
+CustomElf::MappableMUnmap(void *addr, size_t length) const
+{
+ return mappable->munmap(addr, length);
+}
Inconsistency in return type (void vs implied-not-void?) Can this
return an error code instead of void, so it's possible to check it
didn't fail?
Assignee | ||
Comment 7•12 years ago
|
||
As found through irc debugging, this fails because of MappableDeflate::finalize freeing memory.
Comment 8•12 years ago
|
||
Contents of this patch:
* Mike's linker patch from comment 5
* Fixes to the above that Mike told me over irc
* My changes to the SPS and breakpad that hook up to the linker
changes. These center around the use of the type AltMapFns to
abstractify out how breakpad maps objects so as to read from them.
Basically, local_debug_info_symbolizer.cc creates one of these
AltMapFns objects that hook up to faulty.lib. It then hands it
off to ReadSymbolData, which uses that to do the mapping.
Comment 9•12 years ago
|
||
No functional changes. Is a minorly cleaned up version of
the previous rollup patch -- some debug printing removed,
and a couple of unrelated hunks deleted.
Attachment #733934 -
Attachment is obsolete: true
Comment 10•12 years ago
|
||
Comment on attachment 734841 [details] [diff] [review]
Rollup WIP patch, somewhat cleaned up
Ted, what's your view on the breakpad side changes here?
Specifically, the changes the following 3 files:
toolkit/crashreporter/google-breakpad/src/common/linux/dump_symbols.cc
toolkit/crashreporter/google-breakpad/src/common/linux/dump_symbols.h
tools/profiler/local_debug_info_symbolizer.cc
as summarised in comment 8. Would this approach be considered acceptable?
Attachment #734841 -
Flags: feedback?(ted)
Comment 11•12 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #5)
> Created attachment 729470 [details] [diff] [review]
> Expose an API to mmap the underlying file for a library loaded by faulty.lib
> Julian, can you test if this works for you?
This does indeed seem to work, when using the extra three fixes you
told me on irc. These are marked using //MH-IRC in the rollup patch.
Reporter | ||
Comment 12•12 years ago
|
||
Comment on attachment 734841 [details] [diff] [review]
Rollup WIP patch, somewhat cleaned up
Review of attachment 734841 [details] [diff] [review]:
-----------------------------------------------------------------
I think this would be much simpler if you just exposed ReadSymbolDataInternal as a public API, and did the mapping management in local_debug_info_symbolizer.
In fact, you could already just use that API like the unit tests do, since it's not static. (You just have to define a prototype for it because it's not in the header.)
Attachment #734841 -
Flags: feedback?(ted) → feedback-
Comment 13•12 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #12)
Good suggestion. That makes the patch about 25% shorter and almost
completely non-intrusive for Breakpad. Could you pls look over:
toolkit/crashreporter/google-breakpad/src/common/linux/dump_symbols.h
tools/profiler/local_debug_info_symbolizer.cc
Attachment #734841 -
Attachment is obsolete: true
Attachment #735400 -
Flags: feedback?(ted)
Reporter | ||
Comment 14•12 years ago
|
||
Comment on attachment 735400 [details] [diff] [review]
Rollup WIP patch, revised as per comment 12
Review of attachment 735400 [details] [diff] [review]:
-----------------------------------------------------------------
This looks a lot better, thanks. I'll land the dump_symbols.h change upstream for you.
Attachment #735400 -
Flags: feedback?(ted) → feedback+
Assignee | ||
Comment 15•12 years ago
|
||
Julian, can you test this? (totally untested locally, besides the fact that it builds)
Assignee | ||
Updated•12 years ago
|
Attachment #729470 -
Attachment is obsolete: true
Assignee | ||
Comment 16•12 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #15)
> Created attachment 736304 [details] [diff] [review]
> Expose an API to mmap the underlying file for a library loaded by faulty.lib
>
> Julian, can you test this? (totally untested locally, besides the fact that
> it builds)
Note, you shouldn't need the __attribute__((weak)) anymore (in fact, you can probably include ElfLoader.h instead of making your own declarations), and the api should now take care of system libraries too (no need for your fallback)
Assignee | ||
Comment 17•12 years ago
|
||
Attachment #736433 -
Flags: review?(nfroyd)
Assignee | ||
Updated•12 years ago
|
Attachment #736304 -
Attachment is obsolete: true
Assignee | ||
Comment 18•12 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #17)
> Created attachment 736433 [details] [diff] [review]
> Expose an API to mmap the underlying file for a library loaded by faulty.lib
Let's already get a review on this. It's kind of awful in that it sucks memory and the restrictions on GetMappableLength and friends are horrible, but it's only a quick hack to have something working. A proper patch will come in a followup.
Updated•12 years ago
|
Attachment #736433 -
Flags: review?(nfroyd) → review+
Comment 19•12 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #15)
> Julian, can you test this? (totally untested locally, besides the fact that
> it builds)
This appears to work OK in the case where it is only used to process
in-APK files.
When passed a non-fully-qualified, not-in-APK file name (eg
"libc.so"), it appears to fail at the __dl_get_mappable_length call --
even though the preceding dlopen call looks like it succeeds:
E/Profiler( 4282): 2013-04-11 20:46:18:
local_debug_info_symbolizer.cc:75: INFO: ReadSymbolData_APK: Unable to
get size for ELF file 'libc.so'
Comment 20•12 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #15)
> Julian, can you test this? (totally untested locally, besides the fact that
> it builds)
If I install the Quit Now extension, and start the profiler, then it
segfaults at exit. It segfault if I don't start the profiler:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 5781]
0xb00055f2 in ?? () from /home/sewardj/MOZ/JimDB/jimdb/lib/0288504044c06217/system/bin/linker
(gdb) where
#0 0xb00055f2 in ?? () from /home/sewardj/MOZ/JimDB/jimdb/lib/0288504044c06217/system/bin/linker
#1 0xb000593e in ?? () from /home/sewardj/MOZ/JimDB/jimdb/lib/0288504044c06217/system/bin/linker
#2 0x5b77158e in ElfLoader::DebuggerHelper::Remove (this=0x5b785258 <ElfLoader::Singleton+64>, map=0x1c3634c)
at /home/sewardj/MOZ/MC-11-04-2013/mozglue/linker/ElfLoader.cpp:770
#3 0x5b77186a in ElfLoader::Forget (this=0x5b785218 <ElfLoader::Singleton>, handle=<error reading variable: Cannot access memory at address 0xfffffffe>)
at /home/sewardj/MOZ/MC-11-04-2013/mozglue/linker/ElfLoader.cpp:416
#4 0x5b7728c2 in CustomElf::~CustomElf (this=0x1c36338, __in_chrg=<optimized out>) at /home/sewardj/MOZ/MC-11-04-2013/mozglue/linker/CustomElf.cpp:237
#5 0x5b772946 in CustomElf::~CustomElf (this=0x1c36338, __in_chrg=<optimized out>) at /home/sewardj/MOZ/MC-11-04-2013/mozglue/linker/CustomElf.cpp:238
#6 0x5b771ed0 in Release (this=<optimized out>) at /home/sewardj/MOZ/MC-11-04-2013/mozglue/linker/ElfLoader.h:230
#7 ReleaseDirectRef (this=<optimized out>) at /home/sewardj/MOZ/MC-11-04-2013/mozglue/linker/ElfLoader.h:153
#8 ElfLoader::~ElfLoader (this=0x5b785218 <ElfLoader::Singleton>, __in_chrg=<optimized out>) at /home/sewardj/MOZ/MC-11-04-2013/mozglue/linker/ElfLoader.cpp:444
#9 0x400b7f84 in __cxa_finalize () from /home/sewardj/MOZ/JimDB/jimdb/lib/0288504044c06217/system/lib/libc.so
#10 0x400b8320 in exit () from /home/sewardj/MOZ/JimDB/jimdb/lib/0288504044c06217/system/lib/libc.so
#11 0x400b8320 in exit () from /home/sewardj/MOZ/JimDB/jimdb/lib/0288504044c06217/system/lib/libc.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Comment 21•12 years ago
|
||
> If I install the Quit Now extension, and start the profiler, then it
> segfaults at exit. It segfault if I don't start the profiler:
^
Duh. That should be "doesn't segfault".
Assignee | ||
Comment 22•12 years ago
|
||
Add DEFINES += -DMOZ_DEBUG_LINKER to mozglue/linker/Makefile.in and attach a logcat of GeckoLinker near the crash.
Comment 23•12 years ago
|
||
(In reply to Julian Seward from comment #19)
> E/Profiler( 4282): 2013-04-11 20:46:18:
> local_debug_info_symbolizer.cc:75: INFO: ReadSymbolData_APK: Unable to
> get size for ELF file 'libc.so'
> [23:31] <glandium> sewardj: edit mozglue/linker/ElfLoader.cpp,
find /system/lib and add a / (/system/lib/)
- systemPath = "/system/lib";
+ systemPath = "/system/lib/";
This doesn't fix the problem. If an incorrect systemPath was the
problem, would the initial dlopen("libc.so") call have succeeded?
Comment 24•12 years ago
|
||
Note, this is also with systemPath = "/system/lib/"; as per
comment 23, and you can see the calls dlopen("libc.so") etc in
the log.
Assignee | ||
Comment 25•12 years ago
|
||
(In reply to Julian Seward from comment #23)
> (In reply to Julian Seward from comment #19)
> > E/Profiler( 4282): 2013-04-11 20:46:18:
> > local_debug_info_symbolizer.cc:75: INFO: ReadSymbolData_APK: Unable to
> > get size for ELF file 'libc.so'
>
> > [23:31] <glandium> sewardj: edit mozglue/linker/ElfLoader.cpp,
> find /system/lib and add a / (/system/lib/)
>
> - systemPath = "/system/lib";
> + systemPath = "/system/lib/";
>
> This doesn't fix the problem.
It does for me:
void * foo = dlopen("libc.so", RTLD_GLOBAL | RTLD_LAZY);
__android_log_print(ANDROID_LOG_ERROR, "GeckoLinker", "__dl_get_mappable_length returns %d", __dl_get_mappable_length(foo));
dlclose(foo);
--> E/GeckoLinker( 5644): __dl_get_mappable_length returns 286500
Assignee | ||
Comment 26•12 years ago
|
||
(In reply to Julian Seward from comment #24)
> Created attachment 736529 [details]
> Complete logcat (.txt.bz2) as per comment 22
>
> Note, this is also with systemPath = "/system/lib/"; as per
> comment 23, and you can see the calls dlopen("libc.so") etc in
> the log.
Ah, but the error is different this time: "Failed to mmap ELF file 'libc.so'" instead of "Unable to
get size for ELF file 'libc.so'"
About the crash, the log shows this:
E/GeckoLinker( 7034): Caught segmentation fault @0x7c3de0e8
That's within the libxul range, after libxul is unloaded. Sounds like the good old "run code after libxul is unloaded", probably triggered by the profiler being on. This doesn't sound related to this patch at all.
Assignee | ||
Comment 27•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Attachment #735400 -
Attachment is obsolete: true
Comment 28•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
You need to log in
before you can comment on or make changes to this bug.
Description
•