Closed Bug 1649668 Opened 4 years ago Closed 2 years ago

RustMozCrash [@ webrender::spatial_tree::SpatialTree::get_relative_transform]

Categories

(Core :: Graphics: WebRender, defect, P2)

RESOLVED FIXED
103 Branch
Tracking Status
firefox-esr68 --- disabled
firefox-esr78 --- disabled
firefox-esr91 --- disabled
firefox-esr102 --- disabled
firefox77 --- wontfix
firefox78 --- wontfix
firefox79 --- disabled
firefox80 --- disabled
firefox84 --- disabled
firefox85 --- disabled
firefox86 --- disabled
firefox87 --- disabled
firefox88 - disabled
firefox89 - disabled
firefox92 --- disabled
firefox93 --- disabled
firefox94 --- disabled
firefox101 --- disabled
firefox102 --- disabled
firefox103 --- fixed

People

(Reporter: tsmith, Unassigned)

References

(Depends on 1 open bug, Blocks 2 open bugs, Regression)

Details

(Keywords: crash, regression, testcase, Whiteboard: [bugmon:bisected,confirmed])

Crash Data

Attachments

(3 files, 1 obsolete file)

Attached file testcase.html (deleted) —
==48375==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x7f20bfe6c28b bp 0x7f2099493900 sp 0x7f20994934b0 T27)
==48375==The signal is caused by a WRITE memory access.
==48375==Hint: address points to the zero page.
    #0 0x7f20bfe6c28a in RustMozCrash (/home/user/workspace/browsers/m-c-20200629154604-fuzzing-asan-opt/libxul.so+0x1445728a)
    #1 0x7f20bea159dc in mozglue_static::panic_hook::h49c6b7e77d9abe99 src/mozglue/static/rust/lib.rs:89:8
    #2 0x7f20bea158ab in core::ops::function::Fn::call::h486500c193845745 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libcore/ops/function.rs:72:4
    #3 0x7f20befb3df3 in std::panicking::rust_panic_with_hook::hb976084785e50594 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/panicking.rs:474:16
    #4 0x7f20befb3bd9 in rust_begin_unwind /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/panicking.rs:378:4
    #5 0x7f20be0de3af in core::panicking::panic_fmt::h45f7d6868edb5678 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libcore/panicking.rs:85:13
    #6 0x7f20be0e4901 in core::option::expect_failed::h9a8bff6ff005b30d /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libcore/option.rs:1203:4
    #7 0x7f20bf88e1d9 in core::option::Option$LT$T$GT$::expect::h522d31ffa1f42323 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libcore/option.rs:347:20
    #8 0x7f20bf88e1d9 in webrender::spatial_tree::SpatialTree::get_relative_transform::h61ef8427f95075e0 src/gfx/wr/webrender/src/spatial_tree.rs:321:35
    #9 0x7f20bf8755b1 in webrender::prim_store::SpaceMapper$LT$F$C$T$GT$::set_target_spatial_node::h1cdd2a59ce0223eb src/gfx/wr/webrender/src/prim_store/mod.rs:256:28
    #10 0x7f20bfaf0a5f in webrender::picture::PicturePrimitive::post_update::hd2b524f5f78af2a6 src/gfx/wr/webrender/src/picture.rs:6157:12
    #11 0x7f20bfa83e29 in webrender::picture::PictureUpdateState::update::h3e49dd3f4d89b782 src/gfx/wr/webrender/src/picture.rs:4000:12
    #12 0x7f20bfa83d69 in webrender::picture::PictureUpdateState::update::h3e49dd3f4d89b782 src/gfx/wr/webrender/src/picture.rs:3989:20
    #13 0x7f20bfa83d69 in webrender::picture::PictureUpdateState::update::h3e49dd3f4d89b782 src/gfx/wr/webrender/src/picture.rs:3989:20
    #14 0x7f20bfa83d69 in webrender::picture::PictureUpdateState::update::h3e49dd3f4d89b782 src/gfx/wr/webrender/src/picture.rs:3989:20
    #15 0x7f20bfa7aa7f in webrender::picture::PictureUpdateState::update_all::had848fb1c75c812b src/gfx/wr/webrender/src/picture.rs:3901:8
    #16 0x7f20bfa7aa7f in webrender::frame_builder::FrameBuilder::build_layer_screen_rects_and_cull_layers::h5188ce4868864adf src/gfx/wr/webrender/src/frame_builder.rs:347:8
    #17 0x7f20bfa7aa7f in webrender::frame_builder::FrameBuilder::build::h978bbcf1d90801a1 src/gfx/wr/webrender/src/frame_builder.rs:593:34
    #18 0x7f20bfa66a54 in webrender::render_backend::Document::build_frame::h68f085ca7fba3b8b src/gfx/wr/webrender/src/render_backend.rs:643:24
    #19 0x7f20bfa581a3 in webrender::render_backend::RenderBackend::update_document::h266dc88e36fc7934 src/gfx/wr/webrender/src/render_backend.rs:1555:40
    #20 0x7f20bfa5421b in webrender::render_backend::RenderBackend::prepare_transactions::h502987b6e761b4fa src/gfx/wr/webrender/src/render_backend.rs:1392:31
    #21 0x7f20bfa5421b in webrender::render_backend::RenderBackend::process_api_msg::hb21a1245248c5af7 src/gfx/wr/webrender/src/render_backend.rs:1335:16
    #22 0x7f20bfa3f145 in webrender::render_backend::RenderBackend::run::hf179e180e0bd7e37 src/gfx/wr/webrender/src/render_backend.rs:961:20
    #23 0x7f20bfa39747 in webrender::renderer::Renderer::new::_$u7b$$u7b$closure$u7d$$u7d$::ha83aabab8cff91ca src/gfx/wr/webrender/src/renderer.rs:2629:12
    #24 0x7f20bfa39747 in std::sys_common::backtrace::__rust_begin_short_backtrace::h338c3b6f227cbc73 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/sys_common/backtrace.rs:130:4
    #25 0x7f20bfa38ded in std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h24989f571fd7dc4d /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/thread/mod.rs:475:16
    #26 0x7f20bfa38ded in _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::ha28cee2f14347c46 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/panic.rs:318:8
    #27 0x7f20bfa38ded in std::panicking::try::do_call::h5a67ad17d9149a22 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/panicking.rs:303:39
    #28 0x7f20bfa38ded in __rust_maybe_catch_panic /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libpanic_abort/lib.rs:30:4
    #29 0x7f20bfa38ded in std::panicking::try::h5de4d66cd712ff59 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/panicking.rs:281:12
    #30 0x7f20bfa38ded in std::panic::catch_unwind::h70fe94df26504fbf /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/panic.rs:394:13
    #31 0x7f20bfa38ded in std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h7f6186accc6d77b1 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/thread/mod.rs:474:29
    #32 0x7f20bfa38ded in core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hcff5e3ea22d786c6 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libcore/ops/function.rs:232:4
    #33 0x7f20befc547d in _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h553ef812d1929d1b /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/liballoc/boxed.rs:1017:8
    #34 0x7f20befc90cf in _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h51b51bce029ae491 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/liballoc/boxed.rs:1017:8
    #35 0x7f20befc90cf in std::sys_common::thread::start_thread::hca943f45f04c8e46 /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/sys_common/thread.rs:13:4
    #36 0x7f20befc90cf in std::sys::unix::thread::Thread::new::thread_start::h352e8a5875b189ee /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/sys/unix/thread.rs:80:16
    #37 0x7f20d58066b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
    #38 0x7f20d482c41c in clone /build/glibc-LK5gWL/glibc-2.23/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Flags: in-testsuite?

A Pernosco session is available here: https://pernos.co/debug/z2c8Y3ctuaGPShS2SlBF9Q/index.html

Attached file prefs.js (deleted) —
Keywords: bugmon

(I did not repro on Linux & Windows using WR and Nightly, but I'm using a default build, maybe the panic got compiled out).

Severity: -- → S3
Flags: needinfo?(gwatson)
OS: Unspecified → Linux
Priority: -- → P3
Hardware: Unspecified → Desktop
Whiteboard: [bugmon:bisected,confirmed]
Bugmon Analysis: Verified bug as reproducible on mozilla-central 20200702152109-2d709e60c76e.
The bug appears to have been introduced in the following build range:

Start: 79ac597adb58c9b4946b82a78eebe7b5aa71e19b (20190924010420)
End: dcf8bb9eb0db981075bc8864e0a0115b21980d89 (20190924010624)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=79ac597adb58c9b4946b82a78eebe7b5aa71e19b&tochange=dcf8bb9eb0db981075bc8864e0a0115b21980d89
Attached file without-offset-shorthand.html (deleted) —

dcf8bb9eb0db981075bc8864e0a0115b21980d89 Boris Chiou — Bug 1567330 - Add offset shorthand. r=emilio,birtles

bp-1c8bdc22-5314-42a1-94a2-7468d0200702

MOZ_CRASH Reason (Sanitized) invalid parent!

Release build gets main process crash if backdrop filter is enabled.

Debug build gets tab crash even without backdrop filter. Debug builds are kept for 1 year, so I can't search for an exact range, but it looks the same with 2019-07-05:

mozregression --launch 2020-07-01 --pref gfx.webrender.all:true layers.gpu-process.enabled:false layout.css.backdrop-filter.enabled:true -a https://bugzilla.mozilla.org/attachment.cgi?id=9161076 -B debug

0:38.11 INFO: b'Assertion failure: IsAncestor(aOne, aTwo) || IsAncestor(aTwo, aOne), at /builds/worker/checkouts/gecko/layout/painting/nsDisplayList.h:292'

https://searchfox.org/mozilla-central/rev/31d8600b73dc85b4cdbabf45ac3f1a9c11700d8e/layout/painting/nsDisplayList.h#292
bug 1298218 added the assert.
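To make the failing condition concrete, here is a minimal Rust sketch of an ancestor check over a parent-linked chain. It is not the Gecko implementation: `Asr`, `is_ancestor`, and `are_related` are hypothetical names, and treating `None` as a root that is an ancestor of everything is an assumption made for the illustration.

```rust
/// Hypothetical stand-in for an active scrolled root (ASR):
/// a node that only knows its parent.
pub struct Asr<'a> {
    pub name: &'static str,
    pub parent: Option<&'a Asr<'a>>,
}

/// Returns true if `ancestor` is `descendant` itself or lies on its parent chain.
/// `None` models the root and is treated as an ancestor of everything (assumption).
pub fn is_ancestor(ancestor: Option<&Asr>, descendant: Option<&Asr>) -> bool {
    let target = match ancestor {
        None => return true,
        Some(t) => t,
    };
    let mut current = descendant;
    while let Some(asr) = current {
        // Compare by identity, not by name.
        if std::ptr::eq(asr, target) {
            return true;
        }
        current = asr.parent;
    }
    false
}

/// The condition the assertion checks: one of the two must be an ancestor of the other.
pub fn are_related(one: Option<&Asr>, two: Option<&Asr>) -> bool {
    is_ancestor(one, two) || is_ancestor(two, one)
}
```

With two chains that share no node, `are_related` returns false for their leaves, which corresponds to the case where the assertion above would fire.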

Crash Signature: [@ core::option::expect_failed | webrender::spatial_tree::SpatialTree::get_relative_transform ]
Summary: RustMozCrash [@ webrender::spatial_tree::SpatialTree::get_relative_transform] → backdrop-filter: RustMozCrash [@ webrender::spatial_tree::SpatialTree::get_relative_transform]

Test case contains clip-path. Is bug 1579957 relevant?

This also doesn't appear to reproduce for me on a local build. Is it still occurring on a current nightly for anyone else?

Flags: needinfo?(gwatson)

Yes: bp-0c7840ea-26ae-4900-8539-a6fe30200712

This command opens and instantly crashes Firefox: mozregression --launch 20200712091716 --pref gfx.webrender.all:true layers.gpu-process.enabled:false layout.css.backdrop-filter.enabled:true -a https://bugzilla.mozilla.org/attachment.cgi?id=9161076

Crash signature changed:
bp-6bc53c5f-ab73-4c26-9cd3-c2e8e0201219 [@ core::option::expect_failed | webrender::spatial_tree::SpatialTree::get_relative_transform_with_face ]

Crash Signature: [@ core::option::expect_failed | webrender::spatial_tree::SpatialTree::get_relative_transform ] → [@ core::option::expect_failed | webrender::spatial_tree::SpatialTree::get_relative_transform ] [@ core::option::expect_failed | webrender::spatial_tree::SpatialTree::get_relative_transform_with_face ]
OS: Linux → All
Hardware: Desktop → All

This got more frequent last week, potentially starting with 20210223085042. Could you check what triggered this? (E.g. bug 1684781 landed for that build.)

Flags: needinfo?(gwatson)

Since the statuses are different for nightly and release, what's the status for beta?
For more information, please visit auto_nag documentation.

I'm not sure what would have caused this increase. The referenced bug affects mix-blend-mode, which I would not expect to have any effect on a test that uses backdrop-filter. I'm planning to spend some time next week looking into the current backdrop-filter impl again.

Flags: needinfo?(gwatson)

This has been hitting me for a while, mostly while using GitLab.

I can reproduce it with the attached testcase using mozregression --launch 2021-03-09 --pref layout.css.backdrop-filter.enabled:true -a https://bugzilla.mozilla.org/attachment.cgi\?id=9161076 (bp-0a080e17-a94f-41cd-b8d8-d160b0210309).

Would an updated Pernosco session here help?

Flags: needinfo?(gwatson)

With a full debug build, I see an assertion failure at [1] when running this test case on current m-c.

I'm not very familiar with the Gecko display list building code, but the assertion failure here is exactly the thing that would cause a panic to occur later in the WR code.

Specifically, get_relative_transform in WR is trying to build a transform matrix between two spatial nodes - it relies on one of the spatial nodes being an ancestor of the other to do this, which is the same condition the Gecko DL building assert is failing on.

So, I think this is a case of the Gecko DL being invalid and causing the panic down the pipeline in WR.

Markus, Matt, any ideas what could cause this or who would be a good candidate to investigate the Gecko assertion failure here?

[1] https://searchfox.org/mozilla-central/source/layout/painting/nsDisplayList.h#276
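The ancestor requirement described above can be illustrated with a minimal Rust sketch. This is not WebRender's code: `Tree`, `Node`, `NodeIndex`, `offset_to_ancestor`, and `relative_offset` are hypothetical names, and a single `f32` offset stands in for a real transform matrix. It only shows why a walk up the parent chain has nothing to return when neither node is an ancestor of the other, which is the kind of situation in which an `Option::expect` call like the one in the stack trace would panic.

```rust
/// Index of a node in the tree (hypothetical simplification).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeIndex(pub usize);

/// A spatial node reduced to its parent link and a 1D offset
/// that stands in for a transform relative to the parent.
pub struct Node {
    pub parent: Option<NodeIndex>,
    pub offset: f32,
}

pub struct Tree {
    pub nodes: Vec<Node>,
}

impl Tree {
    /// Accumulates offsets while walking from `child` up to `ancestor`.
    /// Returns `None` if the walk reaches the root without meeting `ancestor`.
    pub fn offset_to_ancestor(&self, child: NodeIndex, ancestor: NodeIndex) -> Option<f32> {
        let mut total = 0.0;
        let mut current = child;
        while current != ancestor {
            let node = &self.nodes[current.0];
            total += node.offset;
            // Running out of parents means `ancestor` is not on this chain.
            current = node.parent?;
        }
        Some(total)
    }

    /// Relative offset of `from` with respect to `to`, trying both directions.
    /// Returns `None` when neither node is an ancestor of the other.
    pub fn relative_offset(&self, from: NodeIndex, to: NodeIndex) -> Option<f32> {
        self.offset_to_ancestor(from, to)
            .or_else(|| self.offset_to_ancestor(to, from).map(|o| -o))
    }
}
```

A caller that is certain the display list is valid might unwrap the result with `expect`; with an invalid display list, where the two nodes sit in unrelated subtrees, that unwrap is where the panic would surface.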

Flags: needinfo?(mstange.moz)
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(gwatson)

I don't think I'll have time to look at this more closely, maybe Miko can take a look?

Flags: needinfo?(mstange.moz) → needinfo?(mikokm)

(In reply to Glenn Watson [:gw] from comment #17)

With a full debug build, I see an assertion failure at [1] when running this test case on current m-c.

I'm not very familiar with the Gecko display list building code, but the assertion failure here is exactly the thing that would cause a panic to occur later in the WR code.

Specifically, get_relative_transform in WR is trying to build a transform matrix between two spatial nodes - it relies on one of the spatial nodes being an ancestor of the other to do this, which is the same condition the Gecko DL building assert is failing on.

So, I think this is a case of the Gecko DL being invalid and causing the panic down the pipeline in WR.

Markus, Matt, any ideas what could cause this or who would be a good candidate to investigate the Gecko assertion failure here?

[1] https://searchfox.org/mozilla-central/source/layout/painting/nsDisplayList.h#276

I can reproduce this locally, investigating.

Assignee: nobody → mikokm
Status: NEW → ASSIGNED
Flags: needinfo?(mikokm)
Flags: needinfo?(matt.woodrow)

Note there is bug 1427792 open for the same assert, so the problem could be similar/related.

This testcase has clip-path and fixed position. Any time we have a clip-path that affects a placeholder for a fixed frame, we are probably going to have a bad time, because the content clip chain will have an ASR from the non-fixed frame tree, and the containing block clip inside the fixed subtree will be unrelated to that ASR.

(In reply to Timothy Nikkel (:tnikkel) from comment #20)

Note there is bug 1427792 open for the same assert, so the problem could be similar/related.

This testcase has clip-path and fixed position. Any time we have a clip-path that affects a placeholder for a fixed frame, we are probably going to have a bad time, because the content clip chain will have an ASR from the non-fixed frame tree, and the containing block clip inside the fixed subtree will be unrelated to that ASR.

Indeed, this very much seems like a dupe. If we encounter a backdrop-filter crash without the ASR assertion, we should open a new bug for that.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE

I'm not sure this is a dupe of bug 1427792, because the "fix" I implemented at https://bugzilla.mozilla.org/show_bug.cgi?id=1427792#c10 fixes bug 1427792, but it does not fix, for example, bug 1695957, which is also a case of hitting the same assert, but in a different way. From looking at the stack here, this seems more similar to bug 1695957 than bug 1427792.

(In reply to Timothy Nikkel (:tnikkel) from comment #22)

I'm not sure this is a dupe of bug 1427792, because the "fix" I implemented at https://bugzilla.mozilla.org/show_bug.cgi?id=1427792#c10 fixes bug 1427792, but it does not fix, for example, bug 1695957, which is also a case of hitting the same assert, but in a different way. From looking at the stack here, this seems more similar to bug 1695957 than bug 1427792.

This is good to know, I'm changing the duped bug to 1695957.

The main reason I marked this as a dupe was that I had locally reduced this testcase further, and determined that backdrop-filter was not relevant at all here. I am expecting us to hit the assert in webrender::spatial_tree::SpatialTree::get_relative_transform in various different ways as long as we have this ASR bug around.

(In reply to Miko Mynttinen [:miko] from comment #23)

I am expecting us to hit the assert in webrender::spatial_tree::SpatialTree::get_relative_transform in various different ways as long as we have this ASR bug around.

Depending on how we fix it, it may not be one ASR bug; it may be several related ASR bugs. I.e., I can write a patch that fixes bug 1427792 (and regresses a reftest) but does not fix bug 1695957. So it might make sense to keep these bugs open until we can fix them all.

(In reply to Timothy Nikkel (:tnikkel) from comment #24)

Depending on how we fix it, it may not be one ASR bug; it may be several related ASR bugs. I.e., I can write a patch that fixes bug 1427792 (and regresses a reftest) but does not fix bug 1695957. So it might make sense to keep these bugs open until we can fix them all.

Sounds reasonable. I might have jumped the gun on this one.

Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee: mikokm → nobody

FWIW, I just hit this (at least, assuming this is the one that's relevant) in https://crash-stats.mozilla.org/report/index/6c41c537-22c8-45c6-ab91-1be6a0210326 with 89.0a1 (2021-03-26; 20210326093737).

Seems to happen reliably when loading https://arstechnica.com/gadgets/2021/03/the-fairphone-2-hits-five-years-of-updates-with-some-help-from-lineageos/, and clicking "Show Purposes" on the cookie dialog I get (I'm in Europe).

(In reply to Dirkjan Ochtman (:djc) from comment #27)

FWIW, I just hit this (at least, assuming this is the one that's relevant) in https://crash-stats.mozilla.org/report/index/6c41c537-22c8-45c6-ab91-1be6a0210326 with 89.0a1 (2021-03-26; 20210326093737).

That crash seems to have build id 20210325085523. Do you hit this on 20210326093737 as well? I ask because I landed fatal asserts in that nightly only, and knowing that we can hit this crash in webrender without hitting those fatal asserts is very useful data.

(I don't get the cookie dialog here in North America, but I'll try to vpn to Europe.)

Flags: needinfo?(dirkjan)

Never mind, I was able to reproduce using a VPN. We hit this crash in webrender in a build with all of the display list building asserts enabled without hitting those asserts.

Flags: needinfo?(dirkjan)
Summary: backdrop-filter: RustMozCrash [@ webrender::spatial_tree::SpatialTree::get_relative_transform] → RustMozCrash [@ webrender::spatial_tree::SpatialTree::get_relative_transform]

Since we already have a reproducible testcase here and the issues in comment 27 and on seem to be a different problem I filed bug 1701361 to track.

[Tracking Requested - why for this release]:
I think the tracking 89 flag should be moved from bug 1695957 to this bug (possibly).

Changing the priority to p2 as the bug is tracked by a release manager for the current nightly.
See What Do You Triage for more information

Priority: P3 → P2

Is this crash signature spiking in 88.0b? A new Fission experiment just launched in 88.0b last week. But only about 6% of these crash reports from 88.0b have Fission enabled (while less than 3% of 88.0b users have Fission enabled), so Fission is probably unrelated.

Okay, so it's probably just bug 1701361 and fission is not involved. Thanks for the info.

I have just pushed the fix for bug 1701361 to autoland. If it sticks, we should hopefully be able to uplift to beta in a few days.

Status: REOPENED → NEW

Confirmed that bug 1701361 fixed the recent spike in 88/89 attributed to this bug. Looking back through the history, it looks like there's still a preexisting issue here, however? I'm untracking this for 88/89 since bug 1701361 already is but I'll leave any further resolution to someone who better understands the current status here.

Still reproducible: bp-90b36696-31d3-412e-a847-f10e60210909

[@ core::option::expect_failed | webrender::spatial_tree::SpatialTree::get_relative_transform_with_face ]
MOZ_CRASH Reason (Sanitized): invalid parent!
Crash Address 0x0000000000000000

https://searchfox.org/mozilla-central/rev/aa46c2dcccbc6fd4265edca05d3d00cccdfc97b9/gfx/wr/webrender/src/spatial_tree.rs#371

Setting firefox94 to affected due to bug 1578503 comment 14.

Fixed by bug 1729581, but will add a crashtest.

Assignee: nobody → botond
Depends on: 1729581
Attached file Bug 1649668 - Add a crashtest. r=tnikkel (obsolete) (deleted) —

Unfortunately the crashtest is failing with a different assertion ("item should have finite clip with respect to aASR").

That's also come up in bug 1729586, will wait for a resolution there in the hopes that fixes it.

Depends on: 1729586

It's unclear when a resolution to bug 1729586 will be forthcoming. There's a possibility it will involve backing out bug 1729581, which would cause the original crash to occur again, which would need additional investigation on the WebRender side.

Assignee: botond → nobody

Bugmon Analysis
Testcase crashes using the initial build (mozilla-central 20210417095008-4f124a8d83d1) but not with tip (mozilla-central 20220415213125-86271ddb1099.)
The bug appears to have been fixed in the following build range:

Start: fb938cb58404c2fcb957debda1cbfcfe76e99ef1 (20210917223519)
End: 8dc2af40602cb3eea6d5f3e505f7ba15b7f3ed4c (20210917223714)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=fb938cb58404c2fcb957debda1cbfcfe76e99ef1&tochange=8dc2af40602cb3eea6d5f3e505f7ba15b7f3ed4c
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Keywords: bugmon

No longer reproduces with updated backdrop-filter implementation.

Status: NEW → RESOLVED
Closed: 4 years ago → 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 103 Branch
Flags: in-testsuite? → in-testsuite-

Setting regressed_by field after analyzing regression range found by bugmon.

Regressed by: 1567330
Attachment #9241883 - Attachment is obsolete: true
Keywords: regression