Closed Bug 1216681 Opened 9 years ago Closed 9 years ago

Locate symbols with fix_stack_using_bpsyms by UUID, not by file name

Categories

(Testing :: General, defect)

defect
Not set
normal

Tracking

(firefox47 fixed)

RESOLVED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: chmanchester, Assigned: chmanchester)

References

Details

Attachments

(2 files, 2 obsolete files)

fix_stack_using_bpsyms considers more than one file with the same name a fatal error, but we need to get symbols out of the gtest libxul, so we should identify symbol files by UUID instead of just the leaf of the file path. Ted describes how to achieve this in https://bugzilla.mozilla.org/show_bug.cgi?id=992983#c54
At least readelf gives different output. $ obj-dbg/toolkit/crashreporter/google-breakpad/src/tools/linux/dump_syms/dump_syms obj-dbg/dist/bin/libxul.so 2>/dev/null | grep MODULE MODULE Linux x86_64 A2C1FE9C6ED589D569E7C53C2D23F1620 libxul.so $ readelf -n obj-dbg/dist/bin/libxul.so | grep 'Build ID' Build ID: 9cfec1a2d56ed58969e7c53c2d23f1626cf74bc3 Breakpad reads a prefix of what it finds, and maybe applies some other formatting...
Oh, the byte order is swapped before formatting.
That works great, but mac testers don't have otool installed, so I guess I'll revive the standalone fileid tool.
Note that it won't work exactly right for Mac, you'll have to #include "common/mac/file_id.h" and use file_id.MachoIdentifier like: https://chromium.googlesource.com/breakpad/breakpad/+/master/src/common/mac/dump_syms.cc#282
I couldn't figure out how to build it in a way that lets us use std::string on the test machine, but just doing it with c strings looks ok so far: https://treeherder.mozilla.org/#/jobs?repo=try&revision=932cd0097e23&exclusion_profile=false
Attached file MozReview Request: [mq]: foo.patch (deleted) —
Bug 1216681 - Add a fileid utility to extract the breakpad GUID from object files for identification in fix_stack_using_bpsyms. r=ted
Attachment #8677887 - Flags: review?(ted)
It's a little messy, but it works.
Comment on attachment 8677887 [details] MozReview Request: [mq]: foo.patch https://reviewboard.mozilla.org/r/23037/#review21071 Two things: 1) What's the plan for Windows? Do we reliably have dumpbin on the test machines? 2) Either you're going to get bitrotted by bug 1069556 or I'm going to have to fix this patch if I get that landed soon. ::: testing/tools/fileid/mac_fileid.cpp:86 (Diff revision 1) > + if (object_files_count != 1) { This is going to break on Universal builds. You can call NXGetLocalArchInfo + NXFindBestFatArch like dump_syms.cc does: https://dxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google-breakpad/src/common/mac/dump_syms.cc#474 https://dxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google-breakpad/src/common/mac/dump_syms.cc#207 ::: testing/tools/fileid/moz.build:29 (Diff revision 1) > + Program('fileid') You can un-duplicate most of this between the two archs.
Attachment #8677887 - Flags: review?(ted)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #9) > Comment on attachment 8677887 [details] > MozReview Request: Bug 1216681 - Add a fileid utility to extract the > breakpad GUID from object files for identification in > fix_stack_using_bpsyms. r=ted > > https://reviewboard.mozilla.org/r/23037/#review21071 > > Two things: > 1) What's the plan for Windows? Do we reliably have dumpbin on the test > machines? I guess this isn't failing on windows, but not for the reason I thought (we're just not dumping the gtest libxul, but we'll want to). Will look into this. > 2) Either you're going to get bitrotted by bug 1069556 or I'm going to have > to fix this patch if I get that landed soon. I have no problem with getting bitrotted if that's close. > > ::: testing/tools/fileid/mac_fileid.cpp:86 > (Diff revision 1) > > + if (object_files_count != 1) { > > This is going to break on Universal builds. You can call NXGetLocalArchInfo > + NXFindBestFatArch like dump_syms.cc does: > https://dxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google- > breakpad/src/common/mac/dump_syms.cc#474 > https://dxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google- > breakpad/src/common/mac/dump_syms.cc#207 Will do. > > ::: testing/tools/fileid/moz.build:29 > (Diff revision 1) > > + Program('fileid') > > You can un-duplicate most of this between the two archs.
Ok, we might be somewhat stuck for windows. Dumpbin comes with Visual C++, so I doubt it's on the testers (although I'm testing that now), and the breakpad stuff for extracting guids on windows work with pdb files, which aren't present on the testers.
It's not that hard to write some straight C++ code to grab this info on Windows. I can give you some pointers.
OK it was slightly harder than I remembered but not terrible: https://gist.github.com/luser/d07ed10249090ec27dc9
Comment on attachment 8677887 [details] MozReview Request: [mq]: foo.patch Bug 1216681 - Add a fileid utility to extract the breakpad GUID from object files for identification in fix_stack_using_bpsyms. r=ted
Bug 1216681 - Add a windows version of fileid to extract a guid from windows binaries.
Bug 1216681 - Update mac_fileid.cpp for the breakpad update.
Comment 15 has the gist from comment 13 plugged in to the harness, but it needs a little more work and some testing. The stacks we get on windows don't have full paths, so we need to pass the app dir from each harness, and of course testing windows on try... Comment 14 has the patch for mac and linux with review comments addressed, which will be enough to unblock gtests if we leave getting symbols for the gtest xul on windows to a follow up, and comment 16 is rebased on the breakpad update for whenever that happens.
(In reply to Chris Manchester [:chmanchester] from comment #17) > Comment 15 has the gist from comment 13 plugged in to the harness, but it > needs a little more work and some testing. The stacks we get on windows > don't have full paths, so we need to pass the app dir from each harness, and > of course testing windows on try... Do we get full paths on Linux/Mac? If so, maybe we could just fix the Windows stack printing code to output full paths as well?
Comment on attachment 8677887 [details] MozReview Request: [mq]: foo.patch https://reviewboard.mozilla.org/r/23037/#review21263 ::: testing/tools/fileid/mac_fileid.cpp:35 (Diff revision 2) > + google_breakpad::scoped_array<uint8_t> contents; This is definitely not going to be fantastic in production, libxul is 100+MB in Mac builds. It especially sucks since we only really need to read a few bytes of the Mach-O headers, but I guess the API of FatReader is forcing you into this. I think the best possible solution here is to just call `NXGetLocalArchInfo` and pass its cputype/cpusubtype to `MachoIdentifier`. This could maybe fail in some weird edge-case ways, but I don't think we really care. We're only really running tests on 64-bit systems in production and we have 64/32 universal binaries, so this should work reliably for our purposes. ::: testing/tools/fileid/moz.build:27 (Diff revision 2) > + '../../../toolkit/crashreporter/google-breakpad/src', This would be nicer as an absolute path: ``` LOCAL_INCLUDES += [ '/toolkit/crashreporter/google-breakpad/src', ] ```
Attachment #8677887 - Flags: review?(ted)
https://reviewboard.mozilla.org/r/23727/#review21265 I think if you follow my suggestion in the previous review you can basically drop most of this patch (save the moz.build bit).
https://reviewboard.mozilla.org/r/23725/#review21267 ::: testing/tools/fileid/win_fileid.cpp:3 (Diff revision 1) > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ Given that I wrote this code it would be good to get someone else to give it an r+, even if it's just a rubber-stamp. Maybe jimm would be willing? ::: tools/rb/fix_stack_using_bpsyms.py:150 (Diff revision 1) > -def fixSymbols(line, symbolsDir): > +def fixSymbols(line, symbolsDir, appDir=None): Having to pass the appDir around is pretty awkward. If you can fix the stack printing code to print full paths that would be better. It looks like we fill the filename in here: https://dxr.mozilla.org/mozilla-central/rev/b41b92c09fcf94d077a54297aea1dc675b161a9d/mozglue/misc/StackWalk.cpp#804 That struct has some other members that might be more useful than ModuleName: https://msdn.microsoft.com/en-us/library/windows/desktop/ms680193%28v=vs.85%29.aspx LoadedImageName looks the most sensible. However, looking at that struct, I also note that it might already have the ID info we're looking for: PdbSig70 + PdbAge are the same things that fileid_win.cpp are pulling out of the PE headers, so maybe we could just stuff that into MozCodeAddressDetails instead? You'd probably have to add an extra member to the struct: https://dxr.mozilla.org/mozilla-central/rev/b41b92c09fcf94d077a54297aea1dc675b161a9d/mozglue/misc/StackWalk.h#85 and then fix the printing function to print it (at the end of the line?) if it's non-empty: https://dxr.mozilla.org/mozilla-central/rev/b41b92c09fcf94d077a54297aea1dc675b161a9d/mozglue/misc/StackWalk.cpp#1116
https://reviewboard.mozilla.org/r/23037/#review21263 Thank you for the review and pointers here. > This is definitely not going to be fantastic in production, libxul is 100+MB in Mac builds. It especially sucks since we only really need to read a few bytes of the Mach-O headers, but I guess the API of FatReader is forcing you into this. > > I think the best possible solution here is to just call `NXGetLocalArchInfo` and pass its cputype/cpusubtype to `MachoIdentifier`. This could maybe fail in some weird edge-case ways, but I don't think we really care. We're only really running tests on 64-bit systems in production and we have 64/32 universal binaries, so this should work reliably for our purposes. Agreed, this is not fantastic. I started with passing NXGetLocalArchInfo to MachoIdentifier, but it failed consistently on my macbook and on try because it returned something in the PowerPC range, but x86_64 was the match for the binary. I think putting OS_TEST in DEFINES and calling BreakpadGetArchInfoFromName with that would work, looking at it now symbolstore.py ends up doing that for non-universal binaries, which are the debug builds as far as I can tell, which is where we're having this problem. In practice this will just end up being "x86_64", but like you said this should work reliably for our purposes.
https://reviewboard.mozilla.org/r/23725/#review21267 > Given that I wrote this code it would be good to get someone else to give it an r+, even if it's just a rubber-stamp. Maybe jimm would be willing? I'll ping jimm when I get this working on try. > Having to pass the appDir around is pretty awkward. If you can fix the stack printing code to print full paths that would be better. > > It looks like we fill the filename in here: > https://dxr.mozilla.org/mozilla-central/rev/b41b92c09fcf94d077a54297aea1dc675b161a9d/mozglue/misc/StackWalk.cpp#804 > > That struct has some other members that might be more useful than ModuleName: > https://msdn.microsoft.com/en-us/library/windows/desktop/ms680193%28v=vs.85%29.aspx > LoadedImageName looks the most sensible. > > However, looking at that struct, I also note that it might already have the ID info we're looking for: PdbSig70 + PdbAge are the same things that fileid_win.cpp are pulling out of the PE headers, so maybe we could just stuff that into MozCodeAddressDetails instead? You'd probably have to add an extra member to the struct: > https://dxr.mozilla.org/mozilla-central/rev/b41b92c09fcf94d077a54297aea1dc675b161a9d/mozglue/misc/StackWalk.h#85 > > and then fix the printing function to print it (at the end of the line?) if it's non-empty: > https://dxr.mozilla.org/mozilla-central/rev/b41b92c09fcf94d077a54297aea1dc675b161a9d/mozglue/misc/StackWalk.cpp#1116 Ok, great. I'll look into this -- much better than passing the appDir down from every harness.
Comment on attachment 8677887 [details] MozReview Request: [mq]: foo.patch Bug 1216681 - Add a fileid utility to extract the breakpad GUID from object files for identification in fix_stack_using_bpsyms. r=ted
Attachment #8681040 - Attachment is obsolete: true
Attachment #8681041 - Attachment is obsolete: true
> > That struct has some other members that might be more useful than ModuleName: > > https://msdn.microsoft.com/en-us/library/windows/desktop/ms680193%28v=vs.85%29.aspx > > LoadedImageName looks the most sensible. > > > > However, looking at that struct, I also note that it might already have the ID info we're looking for: PdbSig70 + PdbAge are the same things that fileid_win.cpp are pulling out of the PE headers, so maybe we could just stuff that into MozCodeAddressDetails instead? You'd probably have to add an extra member to the struct: > > https://dxr.mozilla.org/mozilla-central/rev/b41b92c09fcf94d077a54297aea1dc675b161a9d/mozglue/misc/StackWalk.h#85 It looks like the guid is just garbage when the pdb file isn't present, but LoadedImageName has the full path.
Comment on attachment 8677887 [details] MozReview Request: [mq]: foo.patch Bug 1216681 - Add a fileid utility to extract the breakpad GUID from object files for identification in fix_stack_using_bpsyms. r=ted
Comment on attachment 8677887 [details] MozReview Request: [mq]: foo.patch Bug 1216681 - Add a fileid utility to extract the breakpad GUID from object files for identification in fix_stack_using_bpsyms. r=ted fix_stack_using_bpsyms.py locates a .sym file based on file name only, not uuid or full path, which causes a failure if a duplicate leaf file name is introduced. This patch introduces a small utility program on mac and linux to extract a breakpad guid from a shared library or executable to identify the correct symbol file when this ambiguity occurs. A subsequent commit implements this for windows.
Bug 1216681 - Add a windows version of fileid to extract a guid from windows binaries. r=jimm This patch introduces a small utility program to extract a guid from a shared library or executable on windows to identify the correct symbol file to read in fix_stack_using_bpsyms.py. In order for this to work correctly on windows, the library name provided by MozDescribeCodeAddress needs to be a full path, so the LoadedImageName field from the IMAGEHLP_MODULE64 structure is used here instead of the ModuleName field.
Attachment #8682814 - Flags: review?(ted)
Attachment #8682814 - Flags: review?(jmathies)
Comment on attachment 8682814 [details] MozReview Request: Bug 1216681 - Add a windows version of fileid to extract a guid from windows binaries. r=jimm,ted https://reviewboard.mozilla.org/r/24159/#review22309 I looked over the python script here but I'm just a python hacker so nothing to add. I'm sure ted can provide a more robust review for that. ::: mozglue/misc/StackWalk.cpp:804 (Diff revision 1) > - strncpy(aDetails->library, modInfo.ModuleName, > + strncpy(aDetails->library, modInfo.LoadedImageName, do you want the full path here? ::: testing/tools/fileid/win_fileid.cpp:31 (Diff revision 1) > + fprintf(stderr, "usage: fileid <file>\n"); lets add some output here describing what this tool does so if someone runs it without params, they get a simple explananation about what it's for.
Attachment #8682814 - Flags: review?(jmathies) → review+
https://reviewboard.mozilla.org/r/24159/#review22309 > do you want the full path here? Yes, that's the intention.
Comment on attachment 8677887 [details] MozReview Request: [mq]: foo.patch https://reviewboard.mozilla.org/r/23037/#review26023 ::: tools/rb/fix_stack_using_bpsyms.py:85 (Diff revision 5) > + # We should always be packaged with a "fileid" executable. Does this work reliably on local builds run from dist/bin?
Attachment #8677887 - Flags: review?(ted) → review+
https://reviewboard.mozilla.org/r/23037/#review26023 > Does this work reliably on local builds run from dist/bin? Yes... but I'll double check on os x.
It's likely these failures are fallout from the ec2 mesa upgrade, since retriggers showed the failures on pushes before yours. I closed the tree for all of the fallout from the mesa upgrade, but feel free to reland once it reopens.
Flags: needinfo?(cmanchester)
Depends on: 1237156
Comment on attachment 8682814 [details] MozReview Request: Bug 1216681 - Add a windows version of fileid to extract a guid from windows binaries. r=jimm,ted https://reviewboard.mozilla.org/r/24159/#review28999 This looks good. Sorry for letting it sit for so long. ::: testing/tools/fileid/moz.build:35 (Diff revision 1) > - Program('fileid') > +Program('fileid') nit: Can you move this up to the top of the moz.build file while you're here? I find that a bit more sensible.
Attachment #8682814 - Flags: review?(ted) → review+
Attachment #8677887 - Attachment description: MozReview Request: Bug 1216681 - Add a fileid utility to extract the breakpad GUID from object files for identification in fix_stack_using_bpsyms. r=ted → MozReview Request: [mq]: foo.patch
Comment on attachment 8677887 [details] MozReview Request: [mq]: foo.patch Review request updated; see interdiff: https://reviewboard.mozilla.org/r/23037/diff/5-6/
Comment on attachment 8682814 [details] MozReview Request: Bug 1216681 - Add a windows version of fileid to extract a guid from windows binaries. r=jimm,ted Review request updated; see interdiff: https://reviewboard.mozilla.org/r/24159/diff/1-2/
Attachment #8682814 - Attachment description: MozReview Request: Bug 1216681 - Add a windows version of fileid to extract a guid from windows binaries. r=jimm → MozReview Request: Bug 1216681 - Add a windows version of fileid to extract a guid from windows binaries. r=jimm,ted
Comment on attachment 8682814 [details] MozReview Request: Bug 1216681 - Add a windows version of fileid to extract a guid from windows binaries. r=jimm,ted Minor changes to get this working with the breakpad update.
Attachment #8682814 - Flags: review+ → review?(ted)
Attachment #8682814 - Flags: review?(ted) → review+
I guess the `NO_PGO = True` needs to be inside the Windows conditional.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: