Closed Bug 1449518 Opened 7 years ago Closed 7 years ago

Win64 CFI unwind tests failing due to stack overflow on clang builds

Categories

(Toolkit :: Crash Reporting, defect, P1)

x86_64
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: ccorcoran, Assigned: ccorcoran)

References

Details

Attachments

(1 file)

The following tests fail due to a stack overflow as discovered by dmajor in Bug 1420947 comment 9: test_crash_win64cfi_alloc_small.js test_crash_win64cfi_infinite_entry_chain.js test_crash_win64cfi_infinite_code_chain.js test_crash_win64cfi_invalid_exception_rva.js test_crash_win64cfi_not_a_pe.js These tests all begin by crashing firefox via CRASH_X64CFI_ALLOC_SMALL. That crash attempts to trash RBP to obfuscate the stack, which works fine on MSVC builds, but under clang causes this stack overflow. https://searchfox.org/mozilla-central/rev/028cd8342899357d80fafba2d528a0fd5819e316/toolkit/crashreporter/test/win64UnwindInfoTests.asm#179 It's not really worth digging into why this difference occurs; this stack obfuscation is not essential to the test anyway.
Great find yet again! I see there are still some fail-if ccov lines left in the file. Were they fixed by your previous bug? If so then I'd recommend pushing a followup fix on that bug. Or, are they still an outstanding issue?
Comment on attachment 8963040 [details] Bug 1449518: More accurate testing for invalid CFI in Win64 unwind info CFI tests; https://reviewboard.mozilla.org/r/231878/#review237518 I don't really grok the details of why the bp-trashing is or isn't needed, but as long as this passes on both MSVC and clang-cl, and it still tests the thing you're wanting to test, then go for it.
Attachment #8963040 - Flags: review?(dmajor) → review+
Yep I submitted a patch to pass those other fail-if ccov tests in bug 1449392
Keywords: checkin-needed
Pushed by cbrindusan@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/21db86504db5 Remove unnecessary RBP trashing in x64CrashCFITest_ALLOC_SMALL; r=dmajor
Keywords: checkin-needed
Comment on attachment 8963040 [details] Bug 1449518: More accurate testing for invalid CFI in Win64 unwind info CFI tests; https://reviewboard.mozilla.org/r/231878/#review237846 Code analysis found 1 defect in this patch: - 1 defect found by mozlint You can run this analysis locally with: - `./mach lint path/to/file` (JS/Python) If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx ::: toolkit/crashreporter/test/unit/head_win64cfi.js:128 (Diff revision 2) > + dumpThisFrame(); > + info("Expected frame trust matched when it should not have."); > + ok(false); > + } > + } else { > - if (frame.trust !== expectedFrame.trust) { > + if (frame.trust !== expectedFrame.trust) { Error: Unexpected if as the only statement in an else block. [eslint: no-lonely-if]
Some background on these tests and the fix... These 4 tests... - test_crash_win64cfi_infinite_entry_chain.js - test_crash_win64cfi_infinite_code_chain.js - test_crash_win64cfi_invalid_exception_rva.js - test_crash_win64cfi_not_a_pe.js ...all crash firefox.exe, but instead of reading stack unwind info from firefox.exe itself, they get it from a test EXE file that's been doctored with certain corruptions. The point of testing these is to ensure that even under the worst conditions, the stack walker will at least continue to work as well as it used to before we started supporting Win64 unwind info. For example in "test_crash_win64cfi_infinite_code_chain.js", the EXE providing unwind info has an infinite loop in its entry chains. Here we want to make sure the unwind info parser 1) doesn't freeze in an infinite loop, 2) bails out and continues to walk the stack as it used to. Trashing RBP was an attempt to force the behavior of checking for the "as it used to" behavior. Without CFI, breakpad's stack scanner will try to deduce RBP. If it can, it will reconstruct the stack frame with "frame_pointer" trust. If not, it will scan the stack ("scan" trust). The problem with our method is that trashing RBP is unpredictable and caused the stack overflow as seen above. And since trashing RBP is not essential to this test, I stopped trashing RBP, but this means we cannot 100% predict when breakpad will choose to stack scan or use the frame pointer, as seen in the latest failure (comment 6). A more accurate way to test for fallback behavior is to just ensure it's NOT using CFI, so it doesn't matter whether breakpad chooses to stack scan or frame_pointer. This change is reflected in the latest patch.
Flags: needinfo?(ccorcoran)
Attachment #8963040 - Flags: review?(gsvelto)
Comment on attachment 8963040 [details] Bug 1449518: More accurate testing for invalid CFI in Win64 unwind info CFI tests; https://reviewboard.mozilla.org/r/231878/#review238160 This looks robust, great!
Attachment #8963040 - Flags: review?(gsvelto) → review+
Keywords: checkin-needed
Pushed by csabou@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/48dab6fbabf7 More accurate testing for invalid CFI in Win64 unwind info CFI tests; r=dmajor,gsvelto
Keywords: checkin-needed
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: