Open Bug 1743983 Opened 3 years ago Updated 1 year ago

Rewrite the minidump-analyzer in Rust

Tracking

()

Status:

NEW

People

(Reporter: gsvelto, Assigned: afranchuk)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

Bug 1743983 - Rewrite the minidump-analyzer in Rust r=gsvelto 2 years ago Alex Franchuk (deleted), text/x-phabricator-request		Details
Bug 1743983 pt2 - Remove the old minidump-analyzer r=gsvelto 1 year ago Alex Franchuk (deleted), text/x-phabricator-request		Details

Gabriele Svelto [:gsvelto]

Reporter

Description

•

3 years ago

As per title. The current minidump-analyzer takes the JSON annotations file as generated in a Firefox crash report plus the minidump and creates a stack trace that will be added to the JSON annotations. While doing so it also retrieves module signatures and adds them to the annotations.

We only ever implemented reading unwinding information on x86-64/Windows in bug 1333126, for all the other platforms we only used frame pointers or stack scanning.

A rewrite could be done in several steps:

Build a version that only uses stack scanning & frame pointers but which is otherwise compatible with the existing implementation
Add unwinding support for Windows (by using the symbolic-debuginfo crate to read the unwinding information from PE files)
Add unwinding support macOS (likewise but for Mach-O ones)
Add unwinding support for Linux/Android. This is trickier as I believe that symbolic-debuginfo has no way of surfacing this data - yet - so we might have to gimli/goblin directly to parse the data

Gian-Carlo Pascutto [:gcp]

Updated

•

2 years ago

Blocks: 1795864

Alex Franchuk

Assignee

Updated

•

2 years ago

Assignee: nobody → afranchuk

Alex Franchuk

Assignee

Comment 1

•

2 years ago

Once the stackwalking functionality is split out of the processor we can move forward, as we don't want to audit/vendor the huge dependencies of minidump-processor.

Alex Franchuk

Assignee

Comment 2

•

2 years ago

Attached file Bug 1743983 - Rewrite the minidump-analyzer in Rust r=gsvelto (deleted) — Details

This needs the minidump-walk-stack crate to be vendored. It builds when minidump and
minidump-walk-stack are changed to paths in the split-processor-crate branch of rust-minidump
(see https://github.com/rust-minidump/rust-minidump/pull/825).

Alex Franchuk

Assignee

Comment 3

•

1 years ago

I'm now just waiting on a new release of rust-minidump. Unfortunately the way vendoring works, I can't make progress or do it ahead of time without the crates being in crates.io (I tried directing them to both git urls and local paths as a means of working on it prior to publishing, but that seems to cause the packages to not be pulled in by vendoring properly).

Gabriele Svelto [:gsvelto]

Reporter

Comment 4

•

1 years ago

There's a bit more work that needs to be done. Steven Fackler did some testing and noticed that the logic that finds symbols has some issues. We also don't demangle symbols when using local debug information. Neither of these issues would preclude us from using the existing code for minidump-analyze because we won't be symbolicating the stacks on the user machine.

However the symbol information also contains the inlining information and with Steven's patch applied I get incoherent stacks. The latter needs to be fixed before we roll out the new minidump-analyzer as we need consistent stacks with what we get on Socorro.

Gabriele Svelto [:gsvelto]

Reporter

Comment 5

•

1 year ago

I've just published version 0.17.0 of the rust-minidump crate including the new minidump-unwind crate which we need here.

Alex Franchuk

Assignee

Comment 6

•

1 year ago

This is almost there. When vendoring dependencies, some files hit the size limit. We've reached out to gimli developers wrt the fixtures in that crate (which don't need to be vendored). Some of the other large files are necessary so we'll have to accept them.

:glandium brought up slight concern with vendoring symbolic since we hope to replace it with wholesym in the future (i.e. symbolic will live on in the repository forever). I researched a bit and came to the conclusion that it will be a non-trivial effort to update wholesym for our needs. We will want to expose CFI and probably add some crate features to be able to turn off networked stuff (which brings along unnecessary crates for our purposes). This is somewhat straightforward for macos and linux as macho-unwind-info and gimli provide CFI support. Windows (PE) will be a little more effort. But at this time I think it makes more sense to go ahead with using symbolic. It's entirely possible we'll use it for other things in the future, so I don't feel too bad about vendoring it in the repository. In total the source of the crates are about 1.1MB.

Phabricator Automation

Updated

•

1 year ago

Attachment #9328160 - Attachment description: WIP: Bug 1743983 - Rewrite the minidump-analyzer in Rust r=gsvelto → Bug 1743983 - Rewrite the minidump-analyzer in Rust r=gsvelto

Mike Hommey [:glandium]

Updated

•

1 year ago

Blocks: syn-2

Alex Franchuk

Assignee

Comment 7

•

1 year ago

Attached file Bug 1743983 pt2 - Remove the old minidump-analyzer r=gsvelto (deleted) — Details

Depends on D175252

Alex Franchuk

Assignee

Comment 8

•

1 year ago

Due to gsvelto's discovery that the new minidump-analyzer uses an unacceptable amount of memory (with a large portion being from internal symbolic data structures), we need to rework the approach here. We cannot use symbolic to enable stack walking.

Instead, I am planning to use framehop. As of now, minidump-analyzer only needs the return addresses (no symbolication). It would be possible to use framehop alone with minidump to read the stack/registers, without using minidump-unwind at all. However, framehop doesn't do stack scanning, so this would not work for platforms without CFI. So we should likely use framehop with minidump-unwind, and ensure that will work without requiring symbolication.

framehop also doesn't support PE unwind info, so that will need to be added.

gsvelto, how does this approach sound to you?

Flags: needinfo?(gsvelto)

Gabriele Svelto [:gsvelto]

Reporter

Comment 9

•

1 year ago

(In reply to Alex Franchuk from comment #8)

Due to gsvelto's discovery that the new minidump-analyzer uses an unacceptable amount of memory (with a large portion being from internal symbolic data structures), we need to rework the approach here. We cannot use symbolic to enable stack walking.

Instead, I am planning to use framehop. As of now, minidump-analyzer only needs the return addresses (no symbolication). It would be possible to use framehop alone with minidump to read the stack/registers, without using minidump-unwind at all. However, framehop doesn't do stack scanning, so this would not work for platforms without CFI. So we should likely use framehop with minidump-unwind, and ensure that will work without requiring symbolication.

framehop also doesn't support PE unwind info, so that will need to be added.

gsvelto, how does this approach sound to you?

This SGTM but before adding functionality to framehop I think it would be better to start with a minimum viable implementation on Linux - where it should already work well. We'll use that to validate if the approach works and then jump into adding whatever we need for the other platforms.

Flags: needinfo?(gsvelto)

Alex Franchuk

Assignee

Comment 10

•

1 year ago

Excellent idea, I will try Linux first.

Alex Franchuk

Assignee

Comment 11

•

1 year ago

I've confirmed that this works using framehop (and wholesym for completeness, which actually was able to get some symbols that symbolic couldn't). framehop sometimes performed functionally better than symbolic (getting a few extra frames, usually at the beginning). As far as performance, in debug builds framehop was about 10x faster than symbolic, and had very low memory usage.

There's one drawback right now, which is that framehop doesn't restore all registers for each unwound frame. This can be added in the future; I'm fine with adding a note to the documentation of the minidump-unwind DebugInfoSymbolProvider to that effect.

I'll move forward with adding the other platforms we need to framehop.

EDIT
I just did a release build comparison.
framehop + wholesym: 146MB peak memory usage, ~170ms usr time
symbolic: 322MB peak memory usage, 1.44s usr time

Debug builds of the framehop solution run faster than release builds of symbolic :)

Note that the memory usage is just for the simple case of an about:crashcontent minidump. This only loads libc.so and libxul.so (on linux). However the framehop solution works differently than the old way which lazily loaded modules: it loads all modules up-front, so it shouldn't use much more memory than that for any firefox minidumps.

Mike Hommey [:glandium]

Updated

•

1 year ago

No longer blocks: syn-2

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Rewrite the minidump-analyzer in Rust

Categories

(Toolkit :: Crash Reporting, task)

Tracking

()

People

(Reporter: gsvelto, Assigned: afranchuk)

References

(Blocks 2 open bugs)

Details

Crash Data

Security

(public)

User Story

Attachments

(2 files)

Description

Updated

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Updated

Updated

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Updated

Attachment

General

Description

File Name

Content Type