Avoid SIGBUS from shared memory allocation failures on Linux/BSD
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox-esr78 | --- | disabled |
firefox78 | --- | wontfix |
firefox79 | --- | wontfix |
firefox80 | --- | fixed |
People
(Reporter: gwarser, Assigned: aosmond)
References
(Blocks 3 open bugs)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(2 files)
This bug is for crash report bp-9ed48f43-c4ba-4dd1-8a2e-847cc0190921.
Top 10 frames of crashing thread:
0 libc-2.29.so __memcpy_sse2_unaligned_erms
1 libxul.so mozilla::image::BlendAnimationFilter<mozilla::image::SurfaceSink>::DoAdvanceRow image/SurfaceFilters.h:683
2 libxul.so mozilla::image::nsGIFDecoder2::ReadLZWData image/decoders/nsGIFDecoder2.cpp:1003
3 libxul.so mozilla::Maybe<mozilla::Variant<mozilla::image::TerminalState, mozilla::image::Yield> > mozilla::image::StreamingLexer<mozilla::image::nsGIFDecoder2::State, 16ul>::ContinueUnbufferedRead<mozilla::image::nsGIFDecoder2::DoDecode image/StreamingLexer.h:554
4 libxul.so mozilla::image::nsGIFDecoder2::DoDecode image/decoders/nsGIFDecoder2.cpp:445
5 libxul.so mozilla::image::Decoder::Decode image/Decoder.cpp:133
6 libxul.so mozilla::image::AnimationSurfaceProvider::Run image/AnimationSurfaceProvider.cpp:210
7 libxul.so mozilla::image::DecodePoolWorker::Run image/DecodePool.cpp:271
8 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1225
9 libxul.so <name omitted> xpcom/threads/nsThreadUtils.cpp:486
Bisected to https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=afcc4787126167bc6c395bc11c97158724658704&tochange=5ad0fb1caddddb365936dc8e89ca85bba57c886f for now
I see this very often, but today was the first time I got the crash report dialog. The crash may be more frequent than the reports suggest, because it often goes unnoticed.
What it looks like: my screen starts flashing black, I see some graphical distortions, the cursor freezes, and finally I only get a notification from KWin: "Desktop effects were restarted due to a graphics reset"
Assignee
Comment 2•5 years ago
There are crashes from before the bisection range. They are slightly different because they have SIGSEGV instead of SIGBUS, but are otherwise very similar. They appear to have a related GPU crash, but I can't find them in crash stats.
> There are crashes before the bisection.
And not all crashes are reported(!): there is nothing in about:crashes.
Here, Swizzle somehow increases the chance of triggering this, and it is easily reproducible on https://blog.google/products/chrome/get-more-done-with-google-chrome/ (I bisected with this URL).
It crashed in both the GIF decoder and the WebP decoder, but both times in BlendAnimationFilter.
Anyway, this is an unsupported platform/hardware, so it is not important.
Comment 4•5 years ago
It looks like swizzling is no longer a suspect for causing this.
I cannot reproduce it anymore. Mozregression with --find-fix ends on 8 October, probably at https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=fbbdc8a6447094a7cc5ab2cf02eafc26eeeb2f03&tochange=ac3bcdd939b430cf82492c342f13038509d1387c
I still see crashes in "Crash data" above, so the crash is probably unrelated and this bug should be closed as invalid.
Assignee
Comment 6•5 years ago
We still saw a crash on Oct 30th, so I don't think this can be duped against that change. It probably changed a code path and avoided the problem in some cases.
Comment 7•5 years ago
Around 25 of these crashes in the last week, all on Nightly.
Comment 8•5 years ago
Bugbug thinks this bug is a regression, but please revert this change in case of error.
Assignee
Comment 9•5 years ago
We call ftruncate on the files backing the shared memory we allocate on Linux.
An alternative is fallocate or posix_fallocate:
https://linux.die.net/man/2/fallocate
https://linux.die.net/man/3/ftruncate
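To make the difference concrete, here is a minimal sketch (not the Firefox code) contrasting the two calls on an ordinary file descriptor. The helper names and the /tmp scratch file are assumptions made for illustration. ftruncate only sets the file size, while posix_fallocate also asks the filesystem to reserve the backing storage and reports failure through its return value.

```cpp
#include <cassert>
#include <cstdlib>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: creates an already-unlinked scratch file and returns
// its descriptor, or -1 on failure. The /tmp location is an assumption.
int make_temp_fd() {
  char path[] = "/tmp/shm-sketch-XXXXXX";
  int fd = mkstemp(path);
  if (fd >= 0) unlink(path);
  return fd;
}

// Number of 512-byte blocks currently allocated to the file, or -1 on error.
long long allocated_blocks(int fd) {
  struct stat st;
  if (fstat(fd, &st) != 0) return -1;
  return static_cast<long long>(st.st_blocks);
}

// Sets the file size only; storage may be left unallocated (a sparse file).
bool resize_sparse(int fd, off_t size) {
  assert(size >= 0);
  return ftruncate(fd, size) == 0;
}

// Sets the file size and reserves storage for the whole range.
// posix_fallocate returns an error number directly instead of setting errno.
bool resize_reserved(int fd, off_t size) {
  assert(size > 0);
  return posix_fallocate(fd, 0, size) == 0;
}
```

Comparing allocated_blocks(fd) after each call is one way to observe the difference; the exact block counts depend on the filesystem.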
Comment 10•5 years ago
The crash reason is SIGBUS and the crashing address is almost always a multiple of the page size, so this is very likely an OOM crash where the kernel could not find a free physical page to page in.
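As a small illustration of the page-size check mentioned above, the following sketch tests whether an address is page aligned. The function name is hypothetical and this is only a triage aid under the assumption that the crash address is available as an integer; it is not part of any crash-reporting tool.

```cpp
#include <cassert>
#include <cstdint>
#include <unistd.h>

// Hypothetical helper: returns true when addr is a multiple of the system
// page size, as reported by sysconf(_SC_PAGESIZE).
bool is_page_aligned(std::uintptr_t addr) {
  long page = sysconf(_SC_PAGESIZE);
  assert(page > 0);
  return addr % static_cast<std::uintptr_t>(page) == 0;
}
```

A sequential copy such as memcpy touches each new page first at its lowest address, which is consistent with faulting addresses landing on page boundaries.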
Comment 11•5 years ago
I just experienced this issue while browsing this page. At the time of the crash, I was reading the bottom of the page and the GIF was not visible.
The GIF which is played on this webpage is: https://media.giphy.com/media/JpG2A9P3dPHXaTYrwu/giphy.gif
Assignee
Comment 12•4 years ago
WebRender makes extensive use of shared memory buffers, particularly for
images decoded in the content process. These images can be arbitrarily
large, and there being insufficient memory for an allocation must be
handled gracefully.
On Linux, we will currently crash with a SIGBUS signal during image
decoding instead of just displaying the broken image tag. This is
because the pages backing the shared memory are only allocated when we
write to them. This blocks shipping WebRender on Linux.
This patch uses posix_fallocate to force the reservation of the pages,
and allows failing gracefully if they are unavailable.
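The approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: map_reserved and open_scratch_fd are hypothetical names, and a scratch file in /tmp stands in for a real shared memory object. The point is that the reservation either succeeds before any write happens or fails with an error the caller can handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for a shared memory object: an unlinked scratch file.
// Returns a descriptor, or -1 on failure.
int open_scratch_fd() {
  char path[] = "/tmp/shm-reserve-XXXXXX";
  int fd = mkstemp(path);
  if (fd >= 0) unlink(path);
  return fd;
}

// Reserves `size` bytes behind `fd`, then maps them shared and writable.
// Returns nullptr if the storage cannot be reserved or the mapping fails,
// so the caller can fall back (for example, to showing a broken image).
void* map_reserved(int fd, std::size_t size) {
  assert(size > 0);
  // posix_fallocate returns an error number (e.g. ENOSPC) rather than
  // setting errno; any nonzero value means the reservation failed.
  if (posix_fallocate(fd, 0, static_cast<off_t>(size)) != 0) return nullptr;
  void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  return p == MAP_FAILED ? nullptr : p;
}
```

A caller would check the result for nullptr before decoding into the buffer, and release it with munmap when done.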
Reporter
Comment 13•4 years ago
Also [@ __memcpy_ssse3 | mozilla::image::BlendAnimationFilter<T>::DoAdvanceRow ]? https://crash-stats.mozilla.org/report/index/055156b6-165c-4735-9b1b-740a00200626
I can reproduce the tab crash by opening four copies of https://blog.google/products/chrome/get-more-done-with-google-chrome/ (with a 2GB /dev/shm).
This does not seem to be related to the screen flickering/distortions and KWin reporting a graphics reset, as I thought it was when I created this bug. However, the desktop micro-freezes while the page loads when /dev/shm has little space available.
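For anyone trying to reproduce this, one way to see how much space is left on /dev/shm is statvfs. The sketch below is an assumption-laden illustration: available_bytes is a hypothetical helper, and it reports what the filesystem advertises as available to unprivileged users, which may differ from what the kernel can actually provide under memory pressure.

```cpp
#include <cassert>
#include <sys/statvfs.h>

// Hypothetical helper: bytes available to unprivileged users on the
// filesystem containing `path`, or -1 if statvfs fails.
long long available_bytes(const char* path) {
  assert(path != nullptr);
  struct statvfs vfs;
  if (statvfs(path, &vfs) != 0) return -1;
  return static_cast<long long>(vfs.f_bavail) *
         static_cast<long long>(vfs.f_frsize);
}
```

Calling available_bytes("/dev/shm") before and after loading the page would show how quickly the decoded frames consume the tmpfs.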
Comment 14•4 years ago
Comment 15•4 years ago
bugherder