Commit-space usage investigation
Categories
(Core :: General, enhancement, P3)
Tracking
()
People
(Reporter: gsvelto, Unassigned)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [MemShrink:P2][overhead:noted])
Comment 1•6 years ago
Comment 2•6 years ago (Reporter)
Comment 3•4 years ago (Reporter)
After discussing this with my team today I had another quick look at this, and here are a few data points and pointers on how to conduct this investigation.
First of all the measurements: in about:memory what you're looking for is a (significant) discrepancy between the memory we've explicitly allocated (and accounted for) and the memory that Windows considers committed. The former is the explicit entry under Explicit Allocations and the latter is address-space > commit > private. Here's an example from my main process:
Explicit Allocations
462.85 MB (100.0%) ++ explicit
Other Measurements
134,217,727.94 MB (100.0%) -- address-space
└────────1,478.58 MB (00.00%) -- commit
├────692.89 MB (00.00%) ++ mapped
├────513.42 MB (00.00%) -- private
│ ├──506.80 MB (00.00%) ── readwrite(segments=2564)
│ ├────2.82 MB (00.00%) ── execute-read(segments=23)
│ ├────2.40 MB (00.00%) ── readwrite+stack(segments=107)
│ ├────1.37 MB (00.00%) ── readwrite+guard(segments=107)
│ ├────0.02 MB (00.00%) ── noaccess(segments=5)
│ └────0.02 MB (00.00%) ── readonly(segments=3)
└────272.27 MB (00.00%) ++ image
This is not so bad: there's only around 50MiB of private committed memory that's not accounted for. Note that the private entry has several sub-entries: readwrite is usually regular allocations, execute-read is probably buffers for JIT'd code, readwrite+stack is obviously the stack, and readwrite+guard is guard pages.
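The comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of any existing tooling: the `parse_sizes` helper and its regular expression are assumptions about how a copied about:memory text dump looks, based solely on the excerpt quoted in this comment.

```python
import re

# Matches lines such as "├────513.42 MB (00.00%) -- private" and captures
# the size in MB and the entry name. The pattern is an assumption derived
# from the excerpt above, not a documented about:memory format.
ENTRY = re.compile(r"([\d,]+\.\d+) MB \(\d+\.\d+%\) (?:--|\+\+|──) ([\w+-]+)")


def parse_sizes(report):
    """Return {entry name: size in MB}; the first occurrence of a name wins."""
    sizes = {}
    for line in report.splitlines():
        match = ENTRY.search(line)
        if match:
            size_mb = float(match.group(1).replace(",", ""))
            sizes.setdefault(match.group(2), size_mb)
    return sizes


# Abbreviated copy of the main-process numbers quoted above.
MAIN_PROCESS = """\
462.85 MB (100.0%) ++ explicit
134,217,727.94 MB (100.0%) -- address-space
└────────1,478.58 MB (00.00%) -- commit
├────692.89 MB (00.00%) ++ mapped
├────513.42 MB (00.00%) -- private
│ ├──506.80 MB (00.00%) ── readwrite(segments=2564)
└────272.27 MB (00.00%) ++ image
"""

sizes = parse_sizes(MAIN_PROCESS)
# Private committed memory that the explicit tree does not account for.
unaccounted = sizes["private"] - sizes["explicit"]
print(f"unaccounted private commit: {unaccounted:.2f} MB")
```

Because names are stored in a flat dictionary, this only works reliably for top-level entries such as explicit and private; sub-entries like readwrite can appear under more than one parent in a fully expanded report.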
Here's another example, this time it's the gpu process:
Explicit Allocations
121.29 MB (100.0%) ++ explicit
Other Measurements
134,217,727.94 MB (100.0%) -- address-space
└────────1,080.50 MB (00.00%) -- commit
├────582.56 MB (00.00%) -- private
│ ├──299.79 MB (00.00%) ── readwrite+writecombine(segments=215)
│ ├──280.58 MB (00.00%) ── readwrite(segments=439)
│ ├────1.34 MB (00.00%) ── readwrite+stack(segments=38)
│ ├────0.74 MB (00.00%) ── readwrite+guard(segments=38)
│ ├────0.07 MB (00.00%) ── execute-read(segments=2)
│ ├────0.03 MB (00.00%) ── readonly(segments=6)
│ └────0.02 MB (00.00%) ── noaccess(segments=6)
├────255.39 MB (00.00%) ++ mapped
└────242.55 MB (00.00%) ++ image
Different story here: we've explicitly allocated ~120MiB of memory but over 580MiB is committed! The readwrite entry is over twice the size of our explicitly allocated memory, so something else is allocating memory; my guess is that it's the graphics drivers or the DirectX runtime. Then there's the readwrite+writecombine entry, which is the most suspicious of all. This is uncacheable memory with write-combining enabled, which is the hallmark of a buffer that must have been allocated by the graphics driver for use with the GPU. As you can see, it is very large.
Last but not least this is a content process:
web (pid 14288)
Explicit Allocations
493.66 MB (100.0%) ++ explicit
Other Measurements
134,217,727.94 MB (100.0%) -- address-space
└────────1,466.72 MB (00.00%) -- commit
├────643.57 MB (00.00%) ++ mapped
├────590.52 MB (00.00%) -- private
│ ├──524.88 MB (00.00%) ── readwrite(segments=769)
│ ├───59.04 MB (00.00%) ── readwrite+writecombine(segments=24)
│ ├────3.63 MB (00.00%) ── execute-read(segments=12)
│ ├────2.08 MB (00.00%) ── readwrite+stack(segments=65)
│ ├────0.86 MB (00.00%) ── readwrite+guard(segments=65)
│ ├────0.03 MB (00.00%) ── readonly(segments=6)
│ └────0.01 MB (00.00%) ── noaccess(segments=2)
└────232.64 MB (00.00%) ++ image
This is a mixed bag: the readwrite chunk is only a bit larger than our explicit allocations, but we've got a fairly hefty readwrite+writecombine chunk. I have no idea what that could be for: textures to back canvas elements maybe? Or buffers for video decoding?
Comment 4•4 years ago (Reporter)
In order to figure out where that memory is going, the best way would be to hook up a debugger to one of the affected processes and use it to get stack traces out of VirtualAlloc() and VirtualAllocEx() calls. In particular we're interested in calls that have the MEM_COMMIT flag set in the flAllocationType parameter, for those are the ones that actually commit the memory they're requesting. Additionally one could look for calls that have the PAGE_WRITECOMBINE flag set in the flProtect parameter. See this page for more info about that flag.
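As a rough illustration of the filtering described above, here is a small Python sketch that classifies a logged call by its flAllocationType and flProtect values. The constants are the values from the Windows SDK headers; the `classify_alloc` helper and its labels are hypothetical, and how the argument values get captured (debugger breakpoint script, tracing tool) is left open.

```python
# Flag values as defined in the Windows SDK headers.
MEM_COMMIT = 0x00001000
MEM_RESERVE = 0x00002000
PAGE_READWRITE = 0x04
PAGE_WRITECOMBINE = 0x400


def classify_alloc(fl_allocation_type, fl_protect):
    """Label a VirtualAlloc()/VirtualAllocEx() call for this investigation.

    Hypothetical helper: the labels are made up for illustration.
    """
    if not fl_allocation_type & MEM_COMMIT:
        # Reserves address space only; nothing is committed by this call.
        return "reserve-only"
    if fl_protect & PAGE_WRITECOMBINE:
        # Committed write-combined memory, likely a GPU-related buffer.
        return "commit+writecombine"
    return "commit"


print(classify_alloc(MEM_RESERVE, PAGE_READWRITE))
print(classify_alloc(MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE))
print(classify_alloc(MEM_COMMIT, PAGE_READWRITE | PAGE_WRITECOMBINE))
```

Only the calls labelled as committing would be worth collecting stacks for; the write-combined ones map to the readwrite+writecombine entries seen in about:memory.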
Comment 5•4 years ago (Reporter)
As discussed on Matrix it might also be worth checking some of the file mappings we have under the mapped and image entries (though the latter should be largely made up of xul.dll). In that case we'd have to get stacks for MapViewOfFile() and friends.
Comment 6•4 years ago
I updated the query in comment 0 [1]. It still shows the low-commit-space-situation is the highest. Looking at per-process view [2], the main process shows the biggest number.
[1] https://sql.telemetry.mozilla.org/queries/79144#196656
[2] https://sql.telemetry.mozilla.org/queries/79145#196658
Comment 7•4 years ago
I should also point out that Windows does not overcommit (ie, does not have an OOM killer). Reserved memory may not be committed, but it may not necessarily be excluded from commit space, either.
Comment 8•4 years ago (Reporter)
(In reply to Toshihito Kikuchi [:toshi] from comment #6)
I updated the query in comment 0 [1]. It still shows the low-commit-space-situation is the highest. Looking at per-process view [2], the main process shows the biggest number.
[1] https://sql.telemetry.mozilla.org/queries/79144#196656
[2] https://sql.telemetry.mozilla.org/queries/79145#196658
Thanks Toshihito!
(In reply to Aaron Klotz [:aklotz] from comment #7)
I should also point out that Windows does not overcommit (ie, does not have an OOM killer). Reserved memory may not be committed, but it may not necessarily be excluded from commit space, either.
Yes, what got me into this was the realization that the majority of our users experiencing OOM crashes on Windows had plenty of physical memory available at the time of the crash.
Comment 9•2 years ago
(In reply to Gabriele Svelto [:gsvelto] from comment #0)
Query #79144 appears to have been replaced with a different query(?), so here's an up-to-date version:
Comment 11•2 years ago (Reporter)
Number of crashes in the nightly channel
Comment 12•2 years ago
^ Yes, that.
(In reply to Gabriele Svelto [:gsvelto] from comment #3)
First of all the measurements: in about:memory what you're looking for is a (significant) discrepancy between the memory we've explicitly allocated (and accounted for) and the memory that Windows considers committed. The former is the explicit entry under Explicit Allocations and the latter is address-space > commit > private.
It appears that (nowadays?) one should also subtract, from the explicit entry, any decoded-nonheap entries found thereunder. (I infer from context that these are image-data stored in temporary-file-backed shared memory, which wouldn't count towards the commit charge.)
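A minimal sketch of that adjusted comparison, in Python. The `unaccounted_private_mb` helper is hypothetical, and the 40 MB decoded-nonheap figure below is invented purely for illustration; only the private and explicit numbers come from the content-process example in comment 3.

```python
def unaccounted_private_mb(private_mb, explicit_mb, decoded_nonheap_mb=0.0):
    """Private commit not explained by explicit allocations.

    decoded-nonheap entries are subtracted from explicit first, on the
    assumption stated above that they do not count towards the commit charge.
    """
    return private_mb - (explicit_mb - decoded_nonheap_mb)


# Content process from comment 3; 40.0 MB of decoded-nonheap is a made-up value.
gap = unaccounted_private_mb(590.52, 493.66, decoded_nonheap_mb=40.0)
print(f"{gap:.2f} MB")
```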