Closed
Bug 802106
Opened 12 years ago
Closed 12 years ago
Breakpad stackwalking optimization: optimize PostfixEvaluator
Categories
(Toolkit :: Crash Reporting, defect)
Toolkit
Crash Reporting
Tracking
()
RESOLVED
FIXED
mozilla21
People
(Reporter: ted, Assigned: jseward)
References
Details
Attachments
(4 files, 2 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review |
Julian did a callgrind profile of his SPS unwinder that uses Breakpad for unwinding, and it shows that we spend a huge portion of time in the PostfixEvaluator class:
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/postfix_evaluator.h
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/postfix_evaluator-inl.h
Specifically, the unwind rules in Breakpad symbol files are specified as postfix expressions translated from DWARF CFI, and every time we walk the stack from a callee frame to its caller we build a PostfixEvaluator, feed it the current register state, and evaluate one of these expressions to recover the caller register values. See:
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/stackwalker_amd64.cc#122
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/cfi_frame_info.cc#46
Most of the damage seems to be fallout from heavy use of the STL: lots of memcmp from comparing string keys and an awful lot of mallocing. I suspect we could get some pretty large wins if we could just switch from string keys to int keys by maintaining a string table.
Assignee | ||
Comment 1•12 years ago
|
||
This replaces the use of std::string keys in the dictionaries used for
evaluation, with a new type UniqueString*, basically an interned
string for which equality can be done via pointer equality.
It also changes the type of the evaluator stack element from
std::string to a new type StackElem, which is a tagged union of
ValueType and UniqueString*. This makes the evaluator a bit clearer
and avoids storing literals (numbers) in the stack as text; the latter
clearly is inefficient in that it causes arithmetic on literals to
require conversion of operands to/from strings.
This helps, although it doesn't give a huge speedup. In the SPS
integration it reduces the cost of StackwalkerAMD64::GetCallerFrame
from about 150k instructions down to 100k, better but still insanely
expensive. One background reason is there is still a lot of
std::string activity present because the expression strings are
tokenised and parsed using std::string.
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → jseward
Assignee | ||
Comment 2•12 years ago
|
||
Give "expressions" (the things evaluated by the postfix evaluator)
their own type (Module::Type) rather than merely being strings. Allow
the type to have special non-string representation for the special
cases "identifier + immediate" and "*(identifier + immediate)". These
special cases can represent all the Dwarf Cfi expressions we generate,
and so allow Cfi unwinding to happen without endlessly having to parse
strings. Also enhance the postfix evaluator to handle these special
cases. This speeds up StackwalkerAMD64::GetCallerFrame by very
roughly a factor of (150k insns/frame before, 46k/frame after).
Reporter | ||
Comment 3•12 years ago
|
||
This looks great! After reading your patch I can offer some more concrete suggestions. My original idea was that we'd be able to avoid making changes to PostfixEvaluator at all, which should make the patch a bit conceptually simpler. I thought that perhaps in CFIFrameInfo::FindCallerRegs:
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/cfi_frame_info.cc#73
we could somehow lazily init the PostfixEvaluator and only use it if the unwind rule required it.
Assignee | ||
Comment 4•12 years ago
|
||
Rehash of the UniqueString patch, to be applied on top of the patch in
comment 2. This patch replaces the used of std::string keys with a
new type, UniqueString*, which only ever has one copy of any string
and hence allows string equality to be done by pointer equality. In
combination with Ted's direct-from-the-dwarf-sections patches and the
Cfi rule re-representation patch in comment 2, this allows CFI
unwinding to happen with essentially no std::string allocation or
comparison. According to my rough measurements, this reduces the cost
of calls to StackwalkerAMD64::GetCallerFrame from about 46kI/frame to
about 26kI/frame.
This is not as aggressive as changing of keys directly to integers,
but it does sidestep the problem of various evaluators (particularly
stackwalker_x86.cc) stuffing entirely arbitrary string keys into the
evaluation dictionaries and expecting everything still to work.
This patch contains some "backwards compatibility" impedance matching,
so that uses of the plain postfix evaluator with plain strings, still
works right (although untested).
It also contains a cleanup change to postfix_evaluator-inl.h which
should possibly be factored out into its own patch: it changes the
type of the stack elements from strings to a union of string
(actually, of course, UniqueString*) and ValueType. This makes the
code clearer and avoids having to convert literal values back and
forth to strings during evaluation.
Attachment #672755 -
Attachment is obsolete: true
Assignee | ||
Comment 5•12 years ago
|
||
One final speedup patch for the unwinder, that applies on top of the
patch in comment 4. This changes the type of the dictionaries used in
evaluation from instantiations of std::map to a new type,
UniqueStringMap<ValueType>. This has two advantages:
* it allows precise measurement/profiling of dictionary usage stats
* it allows using a optimised map implementations
UniqueStringMap optimises the small-map case. From some testing it
appears that almost all maps (dictionaries) have 10 or fewer elements.
UniqueStringMap therefore stores (key,value) pairs in a fixed array of
10 elements and performs linear search for access. If this array
fills up (very rarely), then it allocates and uses a std::map to
handle the overflow.
This easily doubles the performance of
StackwalkerAMD64::GetCallerFrame, reducing the cost from about
26kI/frame to 11kI/frame. Profiling shows it removes almost all the
std::map activity during CFI unwinding.
Overall cost when testing SPS on x86_64-linux, for max-32 element
stack unwinds, shows a cost of about 670K insns/unwind. SPS can
sample at 1KHz on this machine (Core i5) at less than 40% cpu
utilisation now.
Assignee | ||
Comment 6•12 years ago
|
||
This applies on top of the patch in comment 5, and fixes a problem
with it. UniqueStringMap contains a raw pointer, and the default
assignment operator for it does not handle that correctly, leading to
read/write after free and double frees, on arm-android. If we had
used the default copy constructor the same problem would exist. This
patch fixes both cases.
Assignee | ||
Comment 7•12 years ago
|
||
Misc fixes needed to build/use native unwind on arm-android:
* don't call abi::__cxa_demangle since it doesn't exist
* ~MmapWrapper(): don't munmap unless base_/size_ seem sane
Applies on top of the patch in comment 6.
Reporter | ||
Comment 8•12 years ago
|
||
Comment on attachment 674594 [details] [diff] [review]
Change representation of Cfi rules internally
I fixed a few style issues that I knew would get dinged in review, and also fixed some of the Breakpad unit tests, and got this patch up for review:
https://breakpad.appspot.com/516004/
Reporter | ||
Comment 9•12 years ago
|
||
I rolled up all the remaining patches here along with the Breakpad bits of the patch from bug 812133 and pushed them to Try:
https://tbpl.mozilla.org/?tree=Try&rev=717a458f3518
It all built fine on my local Linux machine.
Reporter | ||
Comment 10•12 years ago
|
||
Comment on attachment 675297 [details] [diff] [review]
Change std::string keys to UniqueString*, second try
After working out a few small issues with Julian, and cleaning up some style nits, this patch is up for upstream review here:
https://breakpad.appspot.com/519002/
Reporter | ||
Comment 11•12 years ago
|
||
Comment on attachment 675301 [details] [diff] [review]
Replace eval-dictionary mappings with a specialised type, UniqueStringMap
This patch (with the "fix assignment operator" patch rolled into it) is up for upstream review:
https://breakpad.appspot.com/520002/
Reporter | ||
Updated•12 years ago
|
Attachment #678232 -
Attachment is obsolete: true
Reporter | ||
Comment 12•12 years ago
|
||
Comment on attachment 678240 [details] [diff] [review]
Misc fixes for arm-android
Uploaded this patch for upstream review:
https://breakpad.appspot.com/521002
I didn't really understand the purpose of the dump_symbols MmapWrapper change, so I left it out. If it turns out to be important we can put it back, but it feels hacky.
Reporter | ||
Comment 13•12 years ago
|
||
Patches included in:
https://hg.mozilla.org/integration/mozilla-inbound/rev/be126e0d29a1
Comment 14•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
You need to log in
before you can comment on or make changes to this bug.
Description
•