Closed Bug 1448957 Opened 7 years ago Closed 7 years ago

Prefix signature isn't working with additional "static bool" prefix on a frame signature

Categories

(Socorro :: Processor, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mccr8, Assigned: willkg)

References

Details

Attachments

(1 file)

prefix_signature_re.txt contains this line: mozilla::SpinEventLoopUntil<T>, which worked fine as seen in bug 1411908, giving it a signature of [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::dom::quota::QuotaManager::ShutdownObserver::Observe ]. However, at some point recently, the signature for the SpinEventLoopUntil frame changed to "static bool mozilla::SpinEventLoopUntil<T>" as seen in bug 1448637, but for some reason the prefix regexp isn't matching it any more. I've seen this "static bool" added onto a few other places that I can't recall off the top of my head, so I don't know if that's the new part or what. But it seems like whatever the reason, the signature in bug 1448637 should have SpinEventLoopUntil as a prefix with the current prefix_signature_re.txt file.
A multi-paragraph iteration of the description because it's hard to parse: """ prefix_signature_re.txt contains this line: mozilla::SpinEventLoopUntil<T> which worked fine as seen in bug 1411908, giving it a signature of: [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::dom::quota::QuotaManager::ShutdownObserver::Observe ] However, at some point recently, the signature for the SpinEventLoopUntil frame changed to: static bool mozilla::SpinEventLoopUntil<T> as seen in bug 1448637, but for some reason the prefix regexp isn't matching it any more. I've seen this "static bool" added onto a few other places that I can't recall off the top of my head, so I don't know if that's the new part or what. But it seems like whatever the reason, the signature in bug 1448637 should have SpinEventLoopUntil as a prefix with the current prefix_signature_re.txt file. """ Here's an example crash from bug #1448637: https://crash-stats.mozilla.com/report/index/a9e71c13-73dd-4d48-82b2-8281f0180324 This seems puzzling to me. I don't know why "static bool" is showing up. Peter, Ted: Does it seem right for mdsw to get back "static bool mozilla::SpinEventLoopUntil<T>" from symbolication?
Flags: needinfo?(ted)
Flags: needinfo?(peterbe)
My best guess would be that a compiler update (bug 1408789) changed the signature here. It should be fine to just optionally match and remove the leading `static bool`. This does lead me to wonder whether other signatures were impacted, though. I don't have time to dig into this currently, I'll be on PTO for a week starting tomorrow. It should be possible to figure out when this signature changed by grepping for `mozilla::SpinEventLoopUntil` in the xul.sym file for individual nightly builds. Once the build with the change is found, we could take the xul.sym from the previous nightly and the xul.sym from the nightly with the change, grep out all the function names from each (`grep ^FUNC xul.sym | cut -d' ' -f5- | sort -u`) and diff the two lists to see what else has changed.
Flags: needinfo?(ted)
"arena_t::DallocSmall | static void arena_dalloc" is another signature affected by this.
(In reply to Andrew McCreight [:mccr8] from comment #4) > "arena_t::DallocSmall | static void arena_dalloc" is another signature > affected by this. Specifically, it is a crash signature that I came across while looking through my bug mail. Bug 1449241 and some other dupes.
Calixte: Can you help out with what Ted describes in Comment 3?
Flags: needinfo?(cdenizet)
I wrote a python script (https://github.com/calixteman/bazar/blob/master/syms.py) to get the xul.pdb symbols for different nightlies (in making a bisection) and I can confirm that "static bool mozilla::SpinEventLoopUntil" appeared in build 20180313100127. I can provide a list of functions which have 'static' in their signature in build 20180313100127 and not in the previous build (there are 27454 such functions).
Flags: needinfo?(cdenizet)
Bug 1423261 also landed in the regression time frame, and there is a patch in there that says "Fix the way we annotate crash reports". Is it possible that caused the change? Otherwise I guess it must be the compiler update as noted in Comment 3.
It's from the compiler update in bug 1424281.
It might be useful to see the mangled names before and after the compiler change. Just on the off chance that MS introduced some new mangling token but we're using old tools that somehow interpret it as "static".
(In reply to David Major [:dmajor] from comment #10) > It might be useful to see the mangled names before and after the compiler > change. Just on the off chance that MS introduced some new mangling token > but we're using old tools that somehow interpret it as "static". The Breakpad symbol dumping code uses DIA and just calls `get_undecoratedNameEx`: https://dxr.mozilla.org/mozilla-central/rev/c23c7481957f7b982cffc0ce1d25979c69ca2c2f/toolkit/crashreporter/google-breakpad/src/common/windows/pdb_source_line_writer.cc#961
So interestingly, I downloaded the last win32 nightly before the compiler update: https://archive.mozilla.org/pub/firefox/nightly/2018/03/2018-03-12-22-00-41-mozilla-central/firefox-61.0a1.en-US.win32.txt and the first nightly after the compiler update: https://archive.mozilla.org/pub/firefox/nightly/2018/03/2018-03-13-10-01-27-mozilla-central/firefox-61.0a1.en-US.win32.txt and then fetched the matching PDB files from the symbol server using symchk. I initially couldn't reproduce this problem until I realized that I was using an older version of the DIA SDK DLLs (which is what dump_syms uses to read PDB files). Once I changed things to use the version in my up-to-date VS2017 install I got the same results with `static bool` in the function names. Interestingly once I did that, dumping symbols from the xul.pdb of the build *before* the compiler update produced the same results! Whatever has changed here it's clearly a change in the DIA SDK and not the compiler itself. It looks like with the older DIA version the `get_undecoratedNameEx` call would fail on these symbols and we'd get the function names we were previously. With the new version it succeeds and we get the `static bool` stuff. If I had to guess I'd say that someone fixed a bug in DIA or some code that it uses and it just happens to change the function names we get.
Flags: needinfo?(peterbe)
Blocks: 1411908
Seems like symbolication is going to return "static bool" (and friends) going forward and that's ok. I see two ways to fix signatures: 1. Nix "static bool" (and friends) from the name, then proceed with signature generation as is 2. Keep "static bool" (and friends) in the names and fix signature generation prefix lists and all that to handle possibly matching on things like "mozilla::SpinEventLoopUntil<T>" and "static bool mozilla::SpinEventLoopUntil<T>" Is "static bool" important to keep around? I don't know if there are other types of mozilla::SpinEventLoopUntil<T> requiring us to differentiate between them or not in the signature. Ted, Andrew: Which would you prefer?
Flags: needinfo?(ted)
Flags: needinfo?(continuation)
I don't think it's important to keep the `static bool`, no. calixte: it'd be interesting to see the full list of changes so we could see if there are other patterns here we should consider.
Flags: needinfo?(ted)
Ok. I'll write some code to drop "static bool" and friends from the beginning of names the signature generation files are taken into account. For example: static bool mozilla::SpinEventLoopUntil<t> will get converted to mozilla::SpinEventLoopUntil<t> and then run through the the code that applies prefix and other signature list files. After that, I can reprocess affected crashes.
Assignee: nobody → willkg
Status: NEW → ASSIGNED
Priority: -- → P1
That sounds reasonable to me, thanks.
Flags: needinfo?(continuation)
I have some edge cases I'm not sure about: 1. bp-4f34920f-2319-4631-9527-0b31d0180411 static class JSObject* Reify Should signature generation drop "static class JSObject*" leaving just "Reify"? 2. bp-2bcbf0b4-8573-459c-9977-816f90180411 static HRESULT `anonymous namespace'::get_default_endpoint Should signature generation drop "static HRESULT"? 3. bp-e3e6d763-084c-4902-9722-cea290180411 static void std::panicking::rust_panic_with_hook This one is funny--if we drop "static void", then it skips a bunch of frames until it hits this one: static <NoType> core::result::unwrap_failed<webrender::device::ShaderError>(struct str*, union webrender::device::ShaderError) Should we drop "static <NoType>" there?
Calixte, Andrew, Ted: Thoughts? ^^^
Flags: needinfo?(ted)
Flags: needinfo?(continuation)
Flags: needinfo?(cdenizet)
I think it is okay to drop the return type. I can't think of many situations where we want to distinguish methods with the same name based on their return type. > This one is funny--if we drop "static void", then it skips a bunch of frames until it hits this one: Yeah, that's good! All of those frames are generic Rust code for runtime assertions, so we want them gone from signatures.
Flags: needinfo?(continuation)
Here are some example frames I'm seeing from yesterday: * static bool js::jit::TryAttachCallStub * static bool mozilla::SpinEventLoopUntil<T> * static class js::NativeObject* CallTraceHook<T> * static class nsIFrame* GetNearestFrameContainingPresShell * static class std::basic_string<T>& const std::basic_string<T>::_Reallocate_for<T> * static js::jit::EnterJitStatus EnterJit * static js::jit::JitExecStatus EnterBaseline * static nsresult nsDOMCSSDeclaration::ModifyDeclaration<T> * static struct smallvec::SmallVec<T>* hashglobe::hash_map::Entry<T>::or_insert_with<T> smallvec::SmallVec<T> * static void `anonymous namespace'::HangMonitorParent::UpdateMinidump * static void mozilla::dom::indexedDB::`anonymous namespace'::ConnectionPool::NoteIdleDatabase * static void proxy_Finalize I suspect pulling out specifiers will be straight forward. There's a variety of return types, so I'm not sure how to deal with those. Maintaining a list seems like a bad idea, but maybe it's "good enough"? I'm not sure I can reliably parse them out any other way. If anyone has ideas, I'm all ears. I'll mull over this some more over the weekend.
Flags: needinfo?(ted)
Flags: needinfo?(cdenizet)
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #20) > There's a variety of return types, so I'm not sure how to deal with those. > Maintaining a list seems like a bad idea, but maybe it's "good enough"? I'm > not sure I can reliably parse them out any other way. If option 1 from comment 15 is too difficult, maybe try option 2? I don't really care if the static return type is there, I just want the regexps to match. That might require some more up front work, but in the future people can just add new things to those lists with the static stuff built in already.
I think I got a good way to do option 2, so I'm going to go that route. Here's the crash from the description: you@processor:/app$ python -m socorro.signature a9e71c13-73dd-4d48-82b2-8281f0180324 Crash id: a9e71c13-73dd-4d48-82b2-8281f0180324 Original: shutdownhang | static bool mozilla::SpinEventLoopUntil<T> New: shutdownhang | static bool mozilla::SpinEventLoopUntil<T> | mozilla::dom::quota::QuotaManager::ShutdownObserver::Observe Same?: False Here are some other examples of signatures that changed: you@processor:/app$ socorro-cmd fetch_crashids --num=20 --signature-contains=static | python -m socorro.signature --different-only Crash id: 40277fb7-9d14-4e79-9a02-cdf900180413 Original: static void std::panicking::rust_panic_with_hook New: static union style::properties::animated_properties::AnimationValue geckoservo::glue::compose_animation_segment Same?: False Crash id: b769e6c1-cf8d-45f8-9db3-0f8270180413 Original: static void std::panicking::rust_panic_with_hook New: static union style::properties::animated_properties::AnimationValue geckoservo::glue::compose_animation_segment Same?: False Crash id: f87a131d-6c67-44be-a991-b3f970180413 Original: static void std::panicking::rust_panic_with_hook New: static void webrender::resource_cache::ResourceCache::update_resources Same?: False Crash id: 3b91662e-b123-4d42-bff9-ad61f0180413 Original: static void arena_dalloc New: static void arena_dalloc | mozilla::SchedulerGroup::Runnable::`scalar deleting destructor' Same?: False Crash id: f4231a7e-82f9-4f15-9ede-4fae00180413 Original: static void std::panicking::rust_panic_with_hook New: static void core::option::expect_failed | bool geckoservo::glue::Servo_Element_IsDisplayContents Same?: False Crash id: d184de1c-61c6-45a2-8b43-54bba0180413 Original: static void std::panicking::rust_panic_with_hook New: static void core::option::expect_failed | bool geckoservo::glue::Servo_Element_IsDisplayContents Same?: False Crash id: 838c2d5b-b89b-45de-aa08-4b88a0180413 Original: static void std::panicking::rust_panic_with_hook New: static void core::option::expect_failed | static struct style::style_resolver::MatchingResults style::style_resolver::StyleResolverForElement<T>::match_primary<T> Same?: False I'm concerned that the "static void" stuff adds significant length to the signatures such that they'll get truncated. I haven't seen an example of that, yet. I'm also concerned that the "static void" stuff only shows up in crashes from Windows users, so those crashes will get different signatures than OSX and Linux. Seems like if we could get signatures to match across platforms, that'd probably be a better solution. Is option 2 good enough for now? Is it worth opening up a bug for nixing specifiers and return types?
Flags: needinfo?(continuation)
Yeah, maybe just removing "static void" as a special case, too, would be ok. It does make the signatures really long. Signatures already vary a bit by platform due to inlining so I wouldn't worry too much about that.
Flags: needinfo?(continuation)
Commits pushed to master at https://github.com/mozilla-services/socorro https://github.com/mozilla-services/socorro/commit/2d3d49e92952ae43d50b12086564c7059b1e229c fix bug 1448957 - ignore specifiers and return types in signature generation This fixes signature generation to ignore specifiers and return types when matching against the signature list. https://github.com/mozilla-services/socorro/commit/b0910bfb01ad8a4cc68b89202bb534b3293a7e16 Merge pull request #4406 from willkg/1448957-static-bool-2 fix bug 1448957 - ignore specifiers and return types in signature generation
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
This is in prod now. I verified it with the crash from comment #1. Andrew: Do we want to reprocess crashes with the word "static" in the signature? If so, how far back to we want to go?
Flags: needinfo?(continuation)
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #26) > Andrew: Do we want to reprocess crashes with the word "static" in the > signature? If so, how far back to we want to go? This has been around long enough that I think it is fine to leave it as is.
Flags: needinfo?(continuation)
Awesome! Ted, Andrew: Thank you so much for working with me on this. I know bits, but there's a ton of stuff I don't know. I appreciate you turning around questions so quickly!
I'll send a message to the stability list about this change.
I don't know if it is practical, but something that would be handy is if the signatures in bugzilla that contain static bool could be changed to also include the non static bool versions. Doing them manually hopefully won't be too big of an issue as they come up, but it would make the transition a little smoother.
I have no way to do that. Maybe whoever writes up bugs based on crash report signatures and/or triages them has tools for that?
Oh, sorry, I saw your message about how this only affects the regexp matching. I think in that case it isn't easy to do so we can just deal with updating manually.
In case anyone is looking at this, we fixed function signature normalization to remove prefixes and return types in: https://github.com/mozilla-services/socorro/pull/4562 Part of that was undoing the changes we did in this bug. That landed in: https://github.com/mozilla-services/socorro/commit/928585729553ad811f2548b40270995927daeb84
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: