Closed Bug 1466613 Opened 6 years ago Closed 6 years ago

Crash in static bool mozilla::wr::Moz2DRenderCallback MOZ_RELEASE_ASSERT(ret)

Categories

(Core :: Graphics: WebRender, defect, P2)

Unspecified
All
defect

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox-esr60 --- unaffected
firefox62 --- disabled
firefox63 --- disabled
firefox64 --- disabled
firefox65 --- fixed

People

(Reporter: MatsPalmgren_bugz, Assigned: kats)

References

(Blocks 1 open bug, Regression, )

Details

(4 keywords)

Crash Data

Attachments

(6 files, 1 obsolete file)

This bug was filed from the Socorro interface and is report bp-a1423f52-5a1d-4120-876c-39ad40180527. ============================================================= MOZ_CRASH Reason: MOZ_RELEASE_ASSERT(ret) Top 9 frames of crashing thread: 0 xul.dll static bool mozilla::wr::Moz2DRenderCallback gfx/webrender_bindings/Moz2DImageRenderer.cpp:324 1 xul.dll wr_moz2d_render_cb gfx/webrender_bindings/Moz2DImageRenderer.cpp:363 2 xul.dll static void rayon_core::job::{{impl}}::execute<closure> third_party/rust/rayon-core/src/job.rs:156 3 xul.dll static void rayon_core::registry::WorkerThread::wait_until<rayon_core::latch::CountLatch> third_party/rust/rayon-core/src/registry.rs:540 4 xul.dll static void std::sys_common::backtrace::__rust_begin_short_backtrace<closure, src/libstd/sys_common/backtrace.rs:133 5 xul.dll static void alloc::boxed::{{impl}}::call_box< src/liballoc/boxed.rs:815 6 xul.dll static void std::sys::windows::thread::{{impl}}::new::thread_start src/libstd/sys/windows/thread.rs:54 7 kernel32.dll BaseThreadInitThunk 8 ntdll.dll RtlUserThreadStart =============================================================
Blocks: wr-stability
Crash Signature: [@ static bool mozilla::wr::Moz2DRenderCallback] → [@ static bool mozilla::wr::Moz2DRenderCallback] [@ mozilla::wr::Moz2DRenderCallback ]
OS: Windows 10 → All
Crash Signature: [@ static bool mozilla::wr::Moz2DRenderCallback] [@ mozilla::wr::Moz2DRenderCallback ] → [@ static bool mozilla::wr::Moz2DRenderCallback] [@ mozilla::wr::Moz2DRenderCallback ] [@ wr_moz2d_render_cb ]
Bug 1470437 has a reproducible testcase
bp-294a6c6a-c031-45f4-85bb-cd6fd0180622 "good" = thread 'WRRenderBackend#1' panicked at 'Vector image error Unknown error', gfx/webrender/src/resource_cache.rs:854:29 "bad" = Assertion failure: ret, at /builds/worker/workspace/build/src/gfx/webrender_bindings/Moz2DImageRenderer.cpp:247 RUST_BACKTRACE=1 mozregression --good 2018-03-05 --bad 2018-03-16 -B debug --pref gfx.webrender.all:true startup.homepage_welcome_url:'https://bug1470437.bmoattachments.org/attachment.cgi?id=8987065' > 11:30.57 INFO: Last good revision: 28707a88967e92bee769fcbae13584adc75ced84 > 11:30.57 INFO: First bad revision: 7dad306cf8fede99816cf454466ff16cbf38053c > 11:30.57 INFO: Pushlog: > https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=28707a88967e92bee769fcbae13584adc75ced84&tochange=7dad306cf8fede99816cf454466ff16cbf38053c > 7dad306cf8fe Jeff Muizelaar — Bug 1391255. Crash earlier if recording playback fails. r=kats > + MOZ_RELEASE_ASSERT(ret);
Blocks: 1391255
Has Regression Range: --- → yes
Has STR: --- → yes
Keywords: testcase
Priority: -- → P1
Please also set the 'regression' keyword when you find a regression.
Keywords: regression
We can't release this to the field, but we can let this ride to beta -- unless the crash rate goes up when we enable WR in Nightly.
Priority: P1 → P2
Attached file ASAN crash report (deleted) —
I can reproduce it reliably while editing a google slide presentation, attached is the ASAN crash report.
(In reply to Francois Guerraz from comment #7) > Created attachment 9007176 [details] > ASAN crash report > > I can reproduce it reliably while editing a google slide presentation, > attached is the ASAN crash report. Any chance you can give repeatable steps to reproduce?
Flags: needinfo?(kubrick)
Yes, create a new presentation: https://docs.google.com/presentation/ Click the "blank template", start typing a title in the big title box, and voilà. It's pretty much unusable, anything I try to do with slides leads to a crash.
Flags: needinfo?(kubrick)
(In reply to Francois Guerraz from comment #9) > Yes, create a new presentation: > https://docs.google.com/presentation/ > Click the "blank template", start typing a title in the big title box, and > voilà. Those steps don't reproduce for me. What platform are you on?
Flags: needinfo?(kubrick)
If Google Slides is broken we need to block nightly.
Blocks: stage-wr-nightly
No longer blocks: stage-wr-trains
Priority: P2 → P1
I can't reproduce on Mac or Windows.
Also are you using the latest nightly?
I'm running the latest nightly, on Linux, (X)Wayland, HiDPI, webrender enabled, Iris Graphics 540, arch. but... I've tried to reproduce it with a fresh profile and it doesn't crash, but it very reliably crashes with my "usual" profile. I'll try to find out what it is that makes it die.
Flags: needinfo?(kubrick)
Summary: Crash in static bool mozilla::wr::Moz2DRenderCallback → Crash in static bool mozilla::wr::Moz2DRenderCallback [Google Slides]
Assignee: nobody → jmuizelaar
(In reply to Kartikaya Gupta (email:kats@mozilla.com) (parental leave) from comment #3) > Bug 1470437 has a reproducible testcase RUST_BACKTRACE=1 mozregression --launch 2018-08-08 -B debug --pref gfx.webrender.all:true -a https://bug1470437.bmoattachments.org/attachment.cgi?id=8987065 > 0:34.83 INFO: Assertion failure: ret, at /builds/worker/workspace/build/src/gfx/webrender_bindings/Moz2DImageRenderer.cpp:336 This bug still has its original crash reason while MOZ_RELEASE_ASSERT(aBlob.length() > sizeof(size_t)) from comment 15 and comment 17 is tracked in bug 1486198.
Summary: Crash in static bool mozilla::wr::Moz2DRenderCallback [Google Slides] → Crash in static bool mozilla::wr::Moz2DRenderCallback MOZ_RELEASE_ASSERT(ret) [Google Slides]
(In reply to Mayank Bansal from comment #17) > I got crashes like these when opening blank google sheets and typing something. But its not 100% reproducible. > https://crash-stats.mozilla.com/report/index/85aaeae6-440d-4664-ac92-35f760180909 Google Slides has the crash reason of bug 1486198. Moving it over there.
Assignee: jmuizelaar → nobody
Blocks: stage-wr-trains
No longer blocks: stage-wr-nightly
Priority: P1 → P2
Summary: Crash in static bool mozilla::wr::Moz2DRenderCallback MOZ_RELEASE_ASSERT(ret) [Google Slides] → Crash in static bool mozilla::wr::Moz2DRenderCallback MOZ_RELEASE_ASSERT(ret)
Flags: needinfo?(jmuizelaar)
Priority: P2 → P3
Priority: P3 → P2
I'll take a look at this. I can repro some sort of lockup using the testcase in bug 1470437.
Assignee: nobody → kats
The recorded stream has a CreateSimilarDrawTarget with a giant size, and trying to do that operation on a skia draw target fails. Presumably the better thing to do is to fail earlier on the content side when creating the stream.
I'm not sure this is the right approach.
Do you want to (a) try to detect and avoid it on the content side, or (b) handle it more gracefully on the compositor side? If (a) is your concern that we're going to be creating these temporary drawtargets? Or that it might still work on the content side but fail on the compositor side? Or something else? Another option for (a) side is to somehow extract the logic from skia drawtarget creation where it does the sanity checking of parameters and reuse that. If you prefer (b) I can try to write something that does that, I just assumed (a) was the way to go.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #27) > I'm not sure this is the right approach. To elaborate I don't want to pay the cost of allocating the surface during recording. What does the caller do to handle the error?
On the content side the call stack looks like this: #01: gfxCallbackDrawable::MakeSurfaceDrawable(gfxContext*, mozilla::gfx::SamplingFilter)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x121888e] #02: gfxCallbackDrawable::Draw(gfxContext*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits, double> const&, mozilla::gfx::ExtendMode, mozilla::gfx::SamplingFilter, double, mozilla::gfx::BaseMatrix<double> const&)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x1218bb8] #03: gfxUtils::DrawPixelSnapped(gfxContext*, gfxDrawable*, mozilla::gfx::SizeTyped<mozilla::gfx::UnknownUnits, double> const&, mozilla::image::ImageRegion const&, mozilla::gfx::SurfaceFormat, mozilla::gfx::SamplingFilter, unsigned int, double, bool)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x12c32fd] #04: mozilla::image::DynamicImage::Draw(gfxContext*, mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, mozilla::image::ImageRegion const&, unsigned int, mozilla::gfx::SamplingFilter, mozilla::Maybe<mozilla::SVGImageContext> const&, unsigned int, f[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x1340227] #05: DrawImageInternal(gfxContext&, nsPresContext*, imgIContainer*, mozilla::gfx::SamplingFilter, nsRect const&, nsRect const&, nsPoint const&, nsRect const&, mozilla::Maybe<mozilla::SVGImageContext> const&, unsigned int, mozilla::gfx::ExtendMode, float)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x388dd65] #06: nsLayoutUtils::DrawImage(gfxContext&, mozilla::ComputedStyle*, nsPresContext*, imgIContainer*, mozilla::gfx::SamplingFilter, nsRect const&, nsRect const&, nsPoint const&, nsRect const&, unsigned int, float)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3890395] #07: mozilla::nsImageRenderer::Draw(nsPresContext*, gfxContext&, nsRect const&, nsRect const&, nsRect const&, nsPoint const&, nsSize const&, mozilla::gfx::IntRectTyped<mozilla::CSSPixel> const&, float)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3c22970] #08: mozilla::nsImageRenderer::DrawLayer(nsPresContext*, gfxContext&, nsRect const&, nsRect const&, nsPoint const&, nsRect const&, nsSize const&, float)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3c23e5e] #09: nsCSSRendering::PaintStyleImageLayerWithSC(nsCSSRendering::PaintBGParams const&, gfxContext&, mozilla::ComputedStyle*, nsStyleBorder const&)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3bbe52a] #10: nsCSSRendering::PaintStyleImageLayer(nsCSSRendering::PaintBGParams const&, gfxContext&)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3bbd26e] #11: nsDisplayBackgroundImage::PaintInternal(nsDisplayListBuilder*, gfxContext*, nsRect const&, nsRect*)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3be8c50] #12: nsDisplayCanvasBackgroundImage::Paint(nsDisplayListBuilder*, gfxContext*)[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x39141ad] #13: mozilla::layers::PaintItemByDrawTarget(nsDisplayItem*, mozilla::gfx::DrawTarget*, mozilla::gfx::PointTyped<mozilla::LayoutDevicePixel, float> const&, nsDisplayListBuilder*, RefPtr<mozilla::layers::BasicLayerManager> const&, mozilla::gfx::SizeTyped<mozilla[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x10c8521] #14: mozilla::layers::WebRenderCommandBuilder::GenerateFallbackData(nsDisplayItem*, mozilla::wr::DisplayListBuilder&, mozilla::wr::IpcResourceUpdateQueue&, mozilla::layers::StackingContextHelper const&, nsDisplayListBuilder*, mozilla::gfx::RectTyped<mozilla::L[/Users/kats/zspace/mozilla-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x10c72a6] And the first couple of frames gracefully handle not getting back a drawtarget by just skipping the drawing. So we end up with a blank page on the testcase, which is the same as we get with non-WR. I can try to extract the logic for determining if the surface allocation parameters are valid and just use that without actually allocating the surface.
Is that the callstack for the crashtest?
Flags: needinfo?(kats)
Yeah. I don't have any other STR beyond that test case. it's possible there are other causes.
Flags: needinfo?(kats)
Ah, I see now that the test case use -moz-element. I'd be interested to know what mstange's thoughts on the best way to fix this are.
Flags: needinfo?(mstange)
Attached patch Alternative part 1 (deleted) — Splinter Review
Here's an alternative fix that works on the test case as well, and avoids the texture allocation by testing the size. If this approach is acceptable I can update the phabricator patch.
Attachment #9016717 - Attachment description: cancreate → Alternative part 1
There should be an optimization that avoids allocating a surface in the case where the image is not repeated. I don't know if that optimization still exists. It would also to be good to have a size check in -moz-element-specific code that refuses to create "dynamic images" of sizes that are larger than what we would allow for regular pixel images. And we probably want to keep some detection in the DrawTargetRecording code for large DTs, but maybe it should just be a crash so that we can root out badly-behaved consumers.
Flags: needinfo?(mstange)
(In reply to Markus Stange [:mstange] from comment #35) > There should be an optimization that avoids allocating a surface in the case > where the image is not repeated. I don't know if that optimization still > exists. I looked for something like this but it's hard to look for something that may or may not exist. I didn't find anything but that doesn't mean much. > It would also to be good to have a size check in -moz-element-specific code > that refuses to create "dynamic images" of sizes that are larger than what > we would allow for regular pixel images. Just so I'm clear, does this mean nsImageRenderer::DrawableForElement should return null for large image sizes? I'm not too familiar with this code. It looks like that function is also invoked from some border rendering code, do we want that affected as well? If not the check could go at the relelvant call site [1]. > And we probably want to keep some detection in the DrawTargetRecording code > for large DTs, but maybe it should just be a crash so that we can root out > badly-behaved consumers. Ok, but we should avoid crashing on the gpu side, right? It's better to crash on the content side when the bad consumer tries to create the DrawTargetRecording, that way we get possibly-useful stacks. So we'd want something like my patch (the one in phabricator) but that crashes instead of returns null when the underlying drawtarget returns null. [1] https://searchfox.org/mozilla-central/rev/39cb1e96cf97713c444c5a0404d4f84627aee85d/layout/painting/nsImageRenderer.cpp#507
Flags: needinfo?(mstange)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #36) > (In reply to Markus Stange [:mstange] from comment #35) > > There should be an optimization that avoids allocating a surface in the case > > where the image is not repeated. I don't know if that optimization still > > exists. > > I looked for something like this but it's hard to look for something that > may or may not exist. I didn't find anything but that doesn't mean much. We create a gfxCallbackDrawable for drawing -moz-element here: https://searchfox.org/mozilla-central/rev/6e0e603f4852b8e571e5b8ae133e772b18b6016e/layout/svg/nsSVGIntegrationUtils.cpp#1314 And here's gfxCallbackDrawable::Draw: https://searchfox.org/mozilla-central/rev/6e0e603f4852b8e571e5b8ae133e772b18b6016e/gfx/thebes/gfxDrawable.cpp#161-172 It only calls MakeSurfaceDrawable if the image is repeated (and in a few other cases). Otherwise, it draws directly onto the destination. At least that's the idea. > > It would also to be good to have a size check in -moz-element-specific code > > that refuses to create "dynamic images" of sizes that are larger than what > > we would allow for regular pixel images. > > Just so I'm clear, does this mean nsImageRenderer::DrawableForElement should > return null for large image sizes? I'm not too familiar with this code. It > looks like that function is also invoked from some border rendering code, do > we want that affected as well? If not the check could go at the relelvant > call site [1]. I think it would be reasonable to do that to nsImageRenderer::DrawableForElement in general. > > And we probably want to keep some detection in the DrawTargetRecording code > > for large DTs, but maybe it should just be a crash so that we can root out > > badly-behaved consumers. > > Ok, but we should avoid crashing on the gpu side, right? It's better to > crash on the content side when the bad consumer tries to create the > DrawTargetRecording, that way we get possibly-useful stacks. So we'd want > something like my patch (the one in phabricator) but that crashes instead of > returns null when the underlying drawtarget returns null. "when the underlying drawtarget would return null", yes. We don't actually want to allocate pixels on the recording side, but we do want the size check on the recording side.
Flags: needinfo?(mstange)
So it looks like we're hitting that MakeSurfaceDrawable call because the image is repeated, that seems to happen at [1]. The aExtendMode argument to that function is set to CLAMP but doTile is true. [1] https://searchfox.org/mozilla-central/rev/17f55aee76b7c4610a974cffd3453454e0c8de7b/layout/base/nsLayoutUtils.cpp#6948
Attachment #9015981 - Attachment is obsolete: true
Badly-behaved consumers of DrawTargetRecording can trigger recording of draw calls that will fail to allocate required draw targets when the recording is replayed. This patch tries to guard against this by detecting these situations at record-time rather than crashing at replay-time. When such a situation is detected, it will crash (for content processes, to catch such scenarios) or gracefully fail (for other processes). Depends on D8257
Pushed by kgupta@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e80a0b2502e9 Add a gfxCriticalNote to provide more details on replay failure. r=jrmuizel https://hg.mozilla.org/integration/autoland/rev/75de649833be Robustify DrawTargetRecording codepaths that create new drawtargets. r=mstange https://hg.mozilla.org/integration/autoland/rev/dae86d421021 Prevent creation of DynamicImage instances that are excessively large. r=mstange
No longer blocks: 1391255
Regressed by: 1391255
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: