1446519 - Need a tool for measuring non-heap process memory

Reporter

Description

•

7 years ago

For bug 1436250 we will care about non-heap process memory. This includes whatever TEXT and DATA segment bits we can't share with other processes, for example. Right now we don't have a measurement for this in about:memory. Do we have some other tool that does the job? If not, we probably want to create one. A bare minimum output for such a tool is a "total" number; that will at least allow one to test the impact of a change. Better would be to have some sort of breakdown or some sort of guidance as to what might be taking a lot of non-heap memory or whatnot.

Andrew McCreight [:mccr8]

Comment 1

•

7 years ago

resident-unique is in about:memory and I think sounds like what you are interested in (well, you'd have to remove explicit from the total), but there's no breakdown of any kind.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 2

•

7 years ago

resident-unique may not cover everything we care about here. For example, if there's per-content-process shared memory that is only shared with the parent, then we'd want to include it in this metric, right? But yes, "resident-unique - explicit" is at least a start.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 3

•

7 years ago

Also, "resident-unique" is smaller than "explicit" for me, probably because some things are swapped out?

Andrew McCreight [:mccr8]

Comment 4

•

7 years ago

decommitted-arenas might account for some of that. (When an arena isn't being used, but it is in a JS GC chunk that still has live arenas, we decommit to release the physical memory.)

Boris Zbarsky [:bzbarsky]

Reporter

Comment 5

•

7 years ago

So for my main process right now I have:

  1,905.98 MB (100.0%) -- explicit
  701.14 MB ── resident-unique
  47.96 MB (100.0%) -- decommitted

For one of the web content processes, I have:

  461.17 MB (100.0%) -- explicit
  217.75 MB ── resident-unique
  134.25 MB (100.0%) -- decommitted

So yes, it can account for some of it. More so for the second case than the first one... ;)

My point is that we should have a tool that, like the main heap measurement tree of about:memory, can be used by non-experts. Even something as simple as putting all the relevant numbers in one spot would be a big help.
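To make the "some of it" comparison concrete, here is a small worked calculation using only the figures quoted in this comment. It is a sketch under one assumption that the thread does not confirm: that the decommitted total is counted entirely inside explicit, so it can be compared directly against the gap between explicit and resident-unique. The helper name `decommitted_share` is made up for illustration.

```python
# Figures in MB, copied from the about:memory output quoted in this comment.
PROCESSES = {
    "main": {"explicit": 1905.98, "resident_unique": 701.14, "decommitted": 47.96},
    "content": {"explicit": 461.17, "resident_unique": 217.75, "decommitted": 134.25},
}


def decommitted_share(proc):
    """Fraction of the (explicit - resident-unique) gap covered by decommitted.

    Assumption: decommitted memory is fully included in explicit.
    """
    gap = proc["explicit"] - proc["resident_unique"]
    return proc["decommitted"] / gap


if __name__ == "__main__":
    for name, proc in PROCESSES.items():
        gap = proc["explicit"] - proc["resident_unique"]
        share = decommitted_share(proc)
        print(f"{name}: gap {gap:.2f} MB, decommitted covers {share:.1%}")
```

Under that assumption, decommitted memory covers about 4% of the gap in the main process and about 55% in the content process, which matches the observation that it explains more of the second case than the first.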

Nicholas Nethercote [inactive]

Comment 6

•

7 years ago

The script attached to bug 1254777 is quite useful. It analyzes Linux libraries and binaries. Bloaty looks like a more advanced take on the same basic idea. It's available here: https://github.com/google/bloaty.

Marissa (Reese) Wood

Comment 7

•

7 years ago

It appears this is being worked on and is not blocking a release. Emma indicated that I should put this in Memory Allocator.

Component: General → Memory Allocator

Priority: -- → P3

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Comment 8

•

7 years ago

(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #2)
> resident-unique may not cover everything we care about here. For example,
> if there's per-content-process shared memory that is only shared with the
> parent, then we'd want to include it in this metric, right?

On Linux we have access to the Proportional Set Size, which is the sum over resident pages of (page size / n) where n is the number of places the page is mapped, and that's reported for each virtual memory area. We could also do different things for different types of memory: USS for data/relro, RSS for shared memory.

Also on Linux, we could add support for tagging IPC shared memory segments with names (that can be read out from procfs) if that turns out to be something we need more visibility into.
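As a rough illustration of the per-mapping RSS/PSS/USS idea above, the following sketch sums the relevant fields from smaps-style text. It is not the reporter anyone in this bug proposed or wrote. The function name `summarize_smaps` and the `SAMPLE` data are made up; the field names (`Rss`, `Pss`, `Private_Clean`, `Private_Dirty`) are the ones Linux reports in `/proc/<pid>/smaps`, and USS is taken here as the sum of the two private fields.

```python
import re
from collections import defaultdict

# Mapping header: "start-end perms offset dev inode [pathname]".
HEADER_RE = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$")
# Size field: "Name:   123 kB".
FIELD_RE = re.compile(r"^(\w+):\s+(\d+)\s+kB$")

# Made-up sample for illustration only; real input would come from
# reading /proc/<pid>/smaps on Linux.
SAMPLE = """\
00400000-0040b000 r-xp 00000000 08:01 1234 /usr/lib/libexample.so
Size:                 44 kB
Rss:                  40 kB
Pss:                  10 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
7f000000-7f100000 rw-p 00000000 00:00 0
Size:               1024 kB
Rss:                 100 kB
Pss:                 100 kB
Private_Clean:         0 kB
Private_Dirty:       100 kB
"""


def summarize_smaps(text):
    """Return {mapping name: {"rss", "pss", "uss"}} with values in kB."""
    totals = defaultdict(lambda: {"rss": 0, "pss": 0, "uss": 0})
    name = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            # Anonymous mappings have no pathname column.
            name = header.group(1).strip() or "[anon]"
            continue
        field = FIELD_RE.match(line.strip())
        if field is None or name is None:
            continue
        key, kb = field.group(1), int(field.group(2))
        if key == "Rss":
            totals[name]["rss"] += kb
        elif key == "Pss":
            totals[name]["pss"] += kb
        elif key in ("Private_Clean", "Private_Dirty"):
            totals[name]["uss"] += kb
    return dict(totals)


if __name__ == "__main__":
    for mapping, sizes in summarize_smaps(SAMPLE).items():
        print(mapping, sizes)
```

With per-mapping totals like these, a reporter could pick a different measure per mapping type, for example USS for data/relro mappings and RSS for shared memory, as suggested above.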

Eric Rahm [:erahm]

Comment 9

•

7 years ago

(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #3)
> Also, "resident-unique" is smaller than "explicit" for me, probably because
> some things are swapped out?

|explicit| can contain non-heap entries. The delta you're interested in is probably something like |resident-unique| - |heap-allocated|

(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #2)
> resident-unique may not cover everything we care about here. For example,
> if there's per-content-process shared memory that is only shared with the
> parent, then we'd want to include it in this metric, right?

Shared memory *should* be reported (possibly only in the parent process, I'd have to take a look at that reporter again) [1].

(In reply to Jed Davis [:jld] (⏰UTC-6) from comment #8)
> We could also do different things for different types of memory: USS for
> data/relro, RSS for shared memory.

Seems like we should just resurrect the system memory reporter, I think that covered a fair amount of this. I can probably add part of that back, but I might tag you or glandium in to flesh out the handling of smaps info though.

[1] https://searchfox.org/mozilla-central/rev/78dbe34925f04975f16cb9a5d4938be714d41897/ipc/glue/SharedMemory.cpp#31-39

Eric Rahm [:erahm]

Updated

•

6 years ago

Whiteboard: [overhead:noted]

Eric Rahm [:erahm]

Comment 10

•

6 years ago

A quick update on where we're at:
- Section sizes are now being tracked as build metrics as of bug 1463296
- Committed stack sizes are being worked on in bug 1446519
- Shared memory should already be reported

I'm not sure what else we want to add at this point.

Mike Hommey [:glandium]

Comment 11

•

6 years ago

> I'm not sure what else we want to add at this point.

System allocator-allocated memory? bug 828844 did it for Linux, bug 1194061 for Windows, but AFAIK we're still short on Android and Mac. I'm also not sure we track GPU memory on all platforms if at all.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 12

•

6 years ago

In theory we could have random mmap(MAP_ANONYMOUS) calls that are happening behind jemalloc's back. In practice, it's not clear how we'd detect those. What might be interesting is comparing the sum of all the bits we know about with what the OS thinks is going on, if we can ask the OS for the information we actually care about here. If they're close enough, we're done. If not, we need to think about what could be causing the discrepancy...

Kris Maglione [:kmag]

Comment 13

•

6 years ago

(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #12)
> In theory we could have random mmap(MAP_ANONYMOUS) calls that are happening
> behind jemalloc's back. In practice, it's not clear how we'd detect those.

I've been considering the possibility of interposing malloc calls in third-party libraries so we can get some handle of how much is being allocated by things like fontconfig. In theory, we could do the same for mmap.

That's a non-trivial but doable problem on Linux. I don't know enough about Windows or mach linkers to know how doable it is on those platforms.

> What might be interesting is comparing the sum of all the bits we know about
> with what the OS thinks is going on, if we can ask the OS for the
> information we actually care about here. If they're close enough, we're
> done. If not, we need to think about what could be causing the
> discrepancy...

We already do that, to various degrees on various platforms. Windows apparently has the concept of multiple heaps, and we have accounting for how much space the non-jemalloc heaps use. On other platforms, we have accounting for how much virtual memory is allocated.

The extra allocations are basically the difference between the sum of explicit allocations and the resident-unique numbers. I suppose having a separate reporter for that, similar to heap-unclassified, might make sense...
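The "difference between what we know about and what the OS reports" idea can be sketched as a remainder computation, in the spirit of heap-unclassified. This is only an illustration: the function names, the reporter keys, and the 5% tolerance are assumptions, not anything that exists in about:memory.

```python
def unclassified_mb(resident_unique_mb, known_reports_mb):
    """OS-reported unique resident memory minus everything reporters explain."""
    return resident_unique_mb - sum(known_reports_mb.values())


def close_enough(resident_unique_mb, known_reports_mb, tolerance=0.05):
    """True when the unexplained remainder is within `tolerance` of the OS figure.

    The default tolerance is an arbitrary illustrative choice.
    """
    gap = unclassified_mb(resident_unique_mb, known_reports_mb)
    return abs(gap) <= tolerance * resident_unique_mb


if __name__ == "__main__":
    # Hypothetical numbers in MB, not taken from any real process.
    reports = {"heap-allocated": 300.0, "shared-memory": 20.0, "stacks": 5.0}
    print(unclassified_mb(400.0, reports), close_enough(400.0, reports))
```

A large remainder would be the cue to go looking for allocations that bypass the known reporters, such as direct mmap calls.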

Mike Hommey [:glandium]

Comment 14

•

6 years ago

We *are* interposing malloc calls from third-party libraries on mac and linux. We just can't tell them apart.

Kris Maglione [:kmag]

Comment 15

•

6 years ago

(In reply to Mike Hommey [:glandium] from comment #14)
> We *are* interposing malloc calls from third-party libraries on mac and
> linux. We just can't tell them apart.

I mean specifically interposing calls from specific libraries, like we do for our bundled Hunspell.

Eric Rahm [:erahm]

Comment 16

•

6 years ago

(In reply to Kris Maglione [:kmag] from comment #13)
> (In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from
> comment #12)
> > In theory we could have random mmap(MAP_ANONYMOUS) calls that are happening
> > behind jemalloc's back. In practice, it's not clear how we'd detect those.
>
> I've been considering the possibility of interposing malloc calls in
> third-party libraries so we can get some handle of how much is being
> allocated by things like fontconfig. In theory, we could do the same for
> mmap.
>
> That's a non-trivial but doable problem on Linux. I don't know enough about
> Windows or mach linkers to know how doable it is on those platforms.

It seems like DMD is good enough for this, do we need an always-on thing?

Kris Maglione [:kmag]

Comment 17

•

6 years ago

(In reply to Eric Rahm [:erahm] from comment #16)
> It seems like DMD is good enough for this, do we need an always-on thing?

So, there are two problems with DMD:

1) It requires a special build, which basically means that it's easy to use it to find information about our own configurations, but extremely difficult to get information about what happens in the wild. What kind of memory are random graphics drivers using? How much memory are fontconfig and GTK using for ordinary users, compared to stock Ubuntu, or computers of people like me or jld?

2) It's kind of easy to ignore things that only show up in DMD. Even those of us who run it don't run it that often, and we generally have to do different ad-hoc analyses when we do. It's way easier to ignore, to use the same examples, the megabytes of data that GTK and fontconfig use when they only show up in obscure DMD reports than when they show up at the top of about:memory every time you open it.

Mike Hommey [:glandium]

Comment 18

•

6 years ago

It actually doesn't require a special build. We just need to finish bug 1409739.

Nicholas Nethercote [inactive]

Comment 19

•

6 years ago

> I mean specifically interposing calls from specific libraries, like we do for our bundled Hunspell.

To expand on that: we have `CountingAllocatorBase`. For third-party libraries that let you plug in your own allocator, we use it to count the memory allocated in that library. It's used by Hunspell, ICU, some media stuff, and (on Android) Freetype.
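For readers unfamiliar with the pattern, here is a toy model of the bookkeeping a counting allocator does. It is not `CountingAllocatorBase` itself, which is C++ code in the Firefox tree; the class and method names below are invented, and integer handles stand in for pointers.

```python
class CountingAllocator:
    """Toy model: wraps allocate/free and tracks how many bytes are live."""

    def __init__(self):
        self._live = {}        # handle -> size of each live allocation
        self._next_handle = 0
        self.amount = 0        # bytes currently allocated through this wrapper

    def malloc(self, size):
        """Record an allocation of `size` bytes and return an opaque handle."""
        handle = self._next_handle
        self._next_handle += 1
        self._live[handle] = size
        self.amount += size
        return handle

    def free(self, handle):
        """Release a previous allocation and subtract its size from the total."""
        self.amount -= self._live.pop(handle)


if __name__ == "__main__":
    allocator = CountingAllocator()
    first = allocator.malloc(100)
    allocator.malloc(50)
    allocator.free(first)
    print(allocator.amount)
```

A library that accepts custom allocation hooks would be handed `malloc` and `free` from one such wrapper, and a memory reporter would then read `amount` to attribute that memory to the library.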

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Categories

(Core :: Memory Allocator, enhancement, P3)

People

(Reporter: bzbarsky, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [overhead:noted])

Security

(public)
