Crash in [@ OOM | large | mozalloc_abort | hashbrown::raw::RawTable<T>::new_uninitialized<T> | webrender_bindings::moz2d_renderer::{{impl}}::create_blob_rasterizer ]
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking

| Release | Tracking | Status |
|---|---|---|
| firefox-esr68 | --- | unaffected |
| firefox-esr78 | --- | unaffected |
| firefox77 | --- | unaffected |
| firefox78 | --- | wontfix |
| firefox79 | --- | wontfix |
| firefox80 | --- | wontfix |
| firefox81 | --- | fixed |
People
(Reporter: gsvelto, Assigned: aosmond)
References
(Blocks 2 open bugs)
Details
(Keywords: crash, perf-alert)
Crash Data
Attachments
(1 file)
(deleted), text/x-phabricator-request
This bug is for crash report bp-e4802059-f714-46b0-84c6-e76460200523.
Top 9 frames of crashing thread:
0 mozglue.dll mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:33
1 mozglue.dll mozalloc_handle_oom memory/mozalloc/mozalloc_oom.cpp:51
2 xul.dll gkrust_shared::oom_hook::hook toolkit/library/rust/shared/lib.rs:221
3 xul.dll std::alloc::rust_oom ../4fb7144ed159f94491249e86d5bbd033b5d60550//src/libstd/alloc.rs:240
4 xul.dll alloc::alloc::handle_alloc_error ../4fb7144ed159f94491249e86d5bbd033b5d60550//src/liballoc/alloc.rs:268
5 xul.dll static hashbrown::raw::RawTable< /builds/worker/workspace/obj-build/toolkit/library/build/C:/Users/VssAdministrator/.cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.6.2/src/raw/mod.rs:393
6 xul.dll webrender_bindings::moz2d_renderer::{{impl}}::create_blob_rasterizer gfx/webrender_bindings/src/moz2d_renderer.rs:729
7 xul.dll webrender_api::resources::ApiResources::update gfx/wr/webrender_api/src/resources.rs:159
8 @0x49b86bea57
These are new OOM crashes which started with the introduction of hashbrown. The crashes aren't strange - these machines were running out of memory on their own - but it might be worth keeping an eye on them. Some of the OOM allocation sizes are several megabytes; in the case of this crash, over 50 MiB. I understand hashbrown uses open addressing and stores elements inline, so it's normal for it to allocate large arrays for tables with a lot of entries, but several megabytes still suggests the keys & entries are either very large or we're storing a ton of them.
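For a rough sense of scale, here is a minimal, hypothetical sketch (not WebRender code) of why a hashbrown-backed map can request a single allocation in the tens of megabytes: the table stores its entries inline in one contiguous buffer, so the request grows with capacity times entry size. The entry type and count below are made-up numbers.

```rust
// Illustrative only: approximate the single allocation a hashbrown-style
// table needs, assuming roughly one (K, V) slot plus one control byte per
// bucket of capacity.
use std::mem::size_of;

fn approx_table_bytes<K, V>(capacity: usize) -> usize {
    capacity * (size_of::<(K, V)>() + 1)
}

fn main() {
    // ~500k entries with a ~100-byte value already lands around 50 MiB.
    let bytes = approx_table_bytes::<u64, [u8; 96]>(500_000);
    println!("~{} MiB", bytes / (1024 * 1024));
}
```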
Comment 2•4 years ago
Jeff - do you think there is anything we should look at here?
Assignee
Comment 5•4 years ago
I think we can do better. We are sometimes asking for huge allocations, >10MB, when the blob update likely only touches a fraction of the entries. We don't need to clone the entire table, just the entries we need.
However, the entries themselves are relatively small: a rect, a size and an Arc. The table can only have a large footprint if there is a huge number of entries.
From my own testing, it seems easy to bloat the number of entries:
E.g. go to https://www.cbc.ca/news and scroll down to the bottom of the page and back up. If you add a print statement to watch the number of entries, you can see only a handful are ever updated at once, yet the table grows into the thousands quite quickly after only a few scrolling repetitions.
That seems more like a bug than something we should paper over without understanding more.
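As an illustration of the "clone only the entries the update needs" idea above, here is a minimal sketch; the key and value types are hypothetical stand-ins, not the real moz2d_renderer types.

```rust
use std::collections::HashMap;
use std::sync::Arc;

type BlobKey = u64; // stand-in for the real blob image key type

#[derive(Clone)]
struct BlobEntry {
    rect: (i32, i32, i32, i32),
    tile_size: u16,
    recording: Arc<Vec<u8>>, // cheap to clone: only bumps a refcount
}

// Instead of cloning `all` wholesale (one allocation proportional to the
// entire table), copy just the entries referenced by this update.
fn entries_for_update(
    all: &HashMap<BlobKey, BlobEntry>,
    requested: &[BlobKey],
) -> HashMap<BlobKey, BlobEntry> {
    requested
        .iter()
        .filter_map(|k| all.get(k).map(|e| (*k, e.clone())))
        .collect()
}
```

Per-entry cloning stays cheap because the heavy part of each value sits behind an Arc, so copying an entry never duplicates the recorded drawing commands.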
Assignee
Comment 6•4 years ago
We should probably be checking self.blob_image_handler and deleting the cached blob commands here. It appears nothing ever removes them, unless we clear the entire map by closing the tab, etc.
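A hedged sketch of that cleanup, assuming the handler owns a blob_commands map keyed by blob image key (the type and method names here are illustrative, not the real API):

```rust
use std::collections::HashMap;

type BlobImageKey = u64; // stand-in for the real key type
struct BlobCommand;      // stand-in for the cached command data

// Hypothetical handler shape, assumed for illustration only.
struct BlobImageHandler {
    blob_commands: HashMap<BlobImageKey, BlobCommand>,
}

impl BlobImageHandler {
    // The missing cleanup: when a blob image is deleted, evict its cached
    // commands too, instead of only dropping them when the whole map goes
    // away (e.g. on tab close).
    fn delete(&mut self, key: BlobImageKey) {
        self.blob_commands.remove(&key);
    }
}
```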
Assignee
Comment 7•4 years ago
The patch to fix the leak is nominally trivial (notwithstanding any issues we may run into once we start cleaning up data that might have complicated life expectancies).
I've noticed here:
... that we often have duplicate keys in the array. I'm investigating this as well now.
Assignee
Comment 8•4 years ago
Never mind, we get bailed out in the tile calculations.
Assignee
Comment 9•4 years ago
Comment 10•4 years ago
Comment 11•4 years ago
bugherder
Comment 12•4 years ago
Since the statuses are different for nightly and release, what's the status for beta?
For more information, please visit auto_nag documentation.
Comment 13•4 years ago
== Change summary for alert #26799 (as of Wed, 26 Aug 2020 04:09:55 GMT) ==
Improvements:
35% Heap Unclassified windows10-64-shippable-qr opt 94,848,234.77 -> 61,557,475.02
34% Heap Unclassified windows10-64-shippable-qr opt 94,131,926.94 -> 61,727,578.04
10% Explicit Memory windows10-64-shippable-qr opt 401,881,782.15 -> 360,629,770.98
6% Resident Memory windows10-64-shippable-qr opt 606,783,131.93 -> 567,448,454.36
6% Heap Unclassified linux1804-64-shippable-qr opt 312,788,771.65 -> 293,997,667.17
5% Heap Unclassified windows10-64-shippable-qr opt tp6 76,177,792.89 -> 72,087,993.19
4% Explicit Memory linux1804-64-shippable-qr opt 620,928,900.77 -> 598,200,568.41
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=26799
Comment 14•4 years ago
Wowza, those are some big improvements. Also, are we missing some memory reporters here given the massive reduction in heap unclassified?
Assignee
Comment 15•4 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #14)
> Wowza, those are some big improvements. Also, are we missing some memory reporters here given the massive reduction in heap unclassified?

It wasn't a leak in the sense that we would never free the memory, but if you stay on a page, it will accumulate over time as you interact with it. So I'm not too surprised to see the changes.
Comment 16•4 years ago
That didn't answer my question about missing memory reporters. Changes of that magnitude in heap unclassified would suggest to me that we're missing something here.
Assignee
Comment 17•4 years ago
Sorry, bug 1655039 is supposed to address blob images in general in the memory report. It is probably true that, had we completed that, it would have identified this issue. In this case, it was a structure growing that should never have gotten that big. I will make a note on that bug to make sure the blob commands hash map is included in that effort.
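For illustration, a rough sketch of what measuring that map could look like, assuming a plain malloc-size-of style helper rather than the real Gecko MallocSizeOf machinery:

```rust
use std::collections::HashMap;
use std::mem::size_of;

// Approximate the heap footprint of the table itself: roughly one (K, V)
// slot plus one control byte per bucket of capacity. Illustrative only.
fn approx_blob_commands_bytes<K, V>(map: &HashMap<K, V>) -> usize {
    map.capacity() * (size_of::<(K, V)>() + 1)
}
```

A real reporter would also want to measure the per-entry recordings, not just the table itself.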