Open Bug 813771 Opened 12 years ago Updated 2 years ago

Add memory reporter for zlib, for memory report dumping

Categories

(Core :: XPCOM, defect)

defect

Tracking

()

People

(Reporter: n.nethercote, Unassigned)

References

Details

(Whiteboard: [MemShrink:P3])

Attachments

(2 files, 6 obsolete files)

Attachment 683738 [details] (from bug 801580) shows 272 KiB taken up by the gzipper for the memory dumping, under stacks like this: MOZ_Z_deflateInit2_ modules/zlib/src/deflate.c:297 gz_init src/modules/zlib/src/gzwrite.c:44 MOZ_Z_gzwrite src/modules/zlib/src/gzwrite.c:197 nsGZFileWriter::Write(nsACString_internal const&) xpcom/base/nsGZFileWriter.cpp:73 nsIGZFileWriter::Write(char const*, unsigned int) B2G/objdir-gecko/dist/include/nsIGZFileWriter.h:51 nsMemoryInfoDumper::DumpMemoryReportsToFileImpl(nsAString_internal const&) xpcom/base/nsMemoryInfoDumper.cpp:651 nsMemoryInfoDumper::DumpMemoryReportsToFile(nsAString_internal const&, bool, bool) xpcom/base/nsMemoryInfoDumper.cpp:367 That's big enough that it would be worth having a reporter for. zlib is unmodified 3rd party code, which is bad. But you can provide a custom allocator (z_stream_s.{zalloc,zfree}), which is good. But we use gzdopen()/gz_write(), and that doesn't provide access to the relevant z_stream_s, which is bad. Hmm.
Whiteboard: [MemShrink]
Summary: Add support for dumping a heap profile using bionic's trace-malloc equivalent on B2G → Add memory reporter for zlib, for memory report dumping
Since the gz* functions have a serious lack of flexibility, it seems like the best thing to do would be to write the gz "by hand". We don't need a fancy, fully functional gzipper, all we need a the minimalistic container, which is just a 10-bytes header (magic number, version number, timestamp), a deflate stream, and a 8-bytes footer (crc32, uncompressed length)
OS: Gonk (Firefox OS) → All
Hardware: ARM → All
The header would be something like this: 0x1f, 0x8b, 0x08, 0x01, 0x00, 0x00, 0x00, 0x00, 0x02, 0xff 4th byte might be 0x00 if gzipping something other than text, and the 9th 0x04 if not compressing with maximum compression. See RFC 1952 for more details.
No longer depends on: 808932
Assignee: nobody → slin
Whiteboard: [MemShrink] → [MemShrink:P2]
Attached patch v1: Memory report for gzwrite (obsolete) (deleted) — Splinter Review
Reporting the memory allocation specific for the GZFileWriter, which uses gzwrite() and gzdopen().
(In reply to Shelly Lin [:shelly] from comment #3) > Created attachment 686066 [details] [diff] [review] > v1: Memory report for gzwrite > > Reporting the memory allocation specific for the GZFileWriter, which uses > gzwrite() and gzdopen(). This won't work on android and on linux distros building with system zlib.
(and we should keep zlib pristine)
It's a real shame that gzopen doesn't let us specify our own allocator. I wonder why that is.
Yes, gzopen and gzwrite don't let us specify a custom allocator, in fact, gzopen only allocates a data structure for gzFile, but the deflateInit2 (called from gzwrite) allocates about 272 KB by default. And since gzopen/gzwrite lack of providing access to the relevant z_stream_s, how can I get the internal allocated value without adding a return function to zlib?
By not using gzopen/gzwrite altogether, which should not be so difficult (see comment 1 and 2)
I see, I thought we want to add the memory usage report of gzopen/gzwrite to about:memory.
There's only one place using gzopen/gzwrite.
Attached patch v1: Add memory report for gzwrite (obsolete) (deleted) — Splinter Review
Add the default memory used by gzwrite from zlib.
Attachment #686066 - Attachment is obsolete: true
Comment on attachment 686977 [details] [diff] [review] v1: Add memory report for gzwrite Unfortunately this only sort of works for our purposes. Even if this memory reporter reports the correct values, it doesn't integrate with a tool we use called DMD. We use that tool to find heap blocks which aren't covered by a memory reporter. The way DMD figures out whether a heap block is covered by a memory reporter is by checking whether we call a special version of malloc_usable_size for a given heap block. See for example the code in extensions/spellcheck/hunspell/src/mozHunspell around HunspellReportMemoryAllocation. The approach here can't do that, since it never sees zlib's heap blocks. That's why I think we need to give zlib a custom allocator which (like the Hunspell one) increments and decrements a counter. That will work properly with DMD.
Comment on attachment 686977 [details] [diff] [review] v1: Add memory report for gzwrite Review of attachment 686977 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/base/nsGZFileWriter.cpp @@ +26,5 @@ > +{ > + return int64_t(sGZWriteMemAlloc); > +} > + > +static size_t getGZWritDefaulteMemAlloc() { Typo: "Writ" --> "Write". @@ +46,5 @@ > + return (1 << windowBits) * 2 * sizeof(Byte) + // Size of sliding window. > + // Size of array of window index. > + (1 << windowBits) * sizeof(unsigned short) + > + //Size of hash table. > + (1 << (def_mem_level + 7)) * sizeof(unsigned short) + This is presumably determined from zlib's internals. It's hard to understand and verify. Computing the size analytically like this is error-prone. gzopen/gzwrite are just convenience functions layered on top of lower-level functions like deflateInit2. So the right way to do this is to rewrite the code that uses gzopen/gzwrite in terms of these lower-level functions. That allows us to use a custom allocator which can measure exactly how much memory is used.
Thanks for all your advices, this draft of patch adds the memory report for zlib with DMD feature. moz_zlib_alloc.c contains the custom memory allocation functions and the function pointers for use of DMD memory report. moz_zutil.h is the bridge for nzGZFileWriter to set its function pointers. As for re-writing the gzopen/gzwrite, I would suggest it be a separate bug in this case.
Attachment #686977 - Attachment is obsolete: true
Comment on attachment 690254 [details] [diff] [review] v2: Draft of adding memory report for gzwrite with DMD Review of attachment 690254 [details] [diff] [review]: ----------------------------------------------------------------- This is definitely getting closer. ::: modules/zlib/src/Makefile.in @@ +23,4 @@ > > +DEFINES += -Dmalloc=moz_zlib_malloc > +DEFINES += -Dcalloc=moz_zlib_calloc > +DEFINES += -Dfree=moz_zlib_free This reminds me of hunspell_alloc_hooks.h and how it's used in mozilla-config.h. If you're going to force-replace malloc and free in zlib, it might be better to follow more closely the established approach used for hunspell. ::: modules/zlib/src/moz_zlib_alloc.c @@ +16,5 @@ > +void* moz_zlib_malloc(uInt size) > +{ > + void* result = malloc(size); > + if (zlibReportMemoryAllocation) { > + (*zlibReportMemoryAllocation)(result); You can just write |zlibReportMemoryAllocation(result);| ::: xpcom/base/nsGZFileWriter.cpp @@ +19,5 @@ > > +static size_t sGZWriteMemAlloc = 0; > + > +NS_MEMORY_REPORTER_MALLOC_SIZEOF_FUN(GZWriteMallocSizeOfForCounterInc, "gzwrite") > +NS_MEMORY_REPORTER_MALLOC_SIZEOF_FUN_UN(GZWriteMallocSizeOfForCounterDec) Be warned that bug 717853 will change this stuff slightly -- the first macro is renamed NS_MEMORY_REPORTER_MALLOC_SIZEOF_ON_ALLOC_FUN and the second is renamed NS_MEMORY_REPORTER_MALLOC_SIZEOF_ON_FREE_FUN. @@ +36,5 @@ > + return int64_t(sGZWriteMemAlloc); > +} > + > +NS_MEMORY_REPORTER_IMPLEMENT(GZWrite, > + "zlib-gzwrite", This should be "explicit/zlib-gzwrite". @@ +37,5 @@ > +} > + > +NS_MEMORY_REPORTER_IMPLEMENT(GZWrite, > + "zlib-gzwrite", > + KIND_OTHER, This should be KIND_HEAP. @@ +40,5 @@ > + "zlib-gzwrite", > + KIND_OTHER, > + nsIMemoryReporter::UNITS_BYTES, > + GetGZWriteSize, > + "Default memory used by gzwrite from zlib.") Remove "Default"? @@ +79,5 @@ > // gzdopen returns NULL on error. > NS_ENSURE_TRUE(mGZFile, NS_ERROR_FAILURE); > mInitialized = true; > > + NS_RegisterMemoryReporter(new NS_MEMORY_REPORTER_NAME(GZWrite)); You're registering a new memory reporter for every nsGZFileWriter that is created. You should save the created nsIMemoryReporter object in a nsCOMPtr member in nsGZFileWriter, and then unregister it in ~nsGZFileWriter.
Comment on attachment 690254 [details] [diff] [review] v2: Draft of adding memory report for gzwrite with DMD Review of attachment 690254 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/base/nsGZFileWriter.cpp @@ +70,5 @@ > nsresult rv = aFile->OpenANSIFileDesc("w", &file); > NS_ENSURE_SUCCESS(rv, rv); > > + mozSetGzMemReportFunc(GZWriteReportMemoryAllocation, > + GZWriteReportMemoryDeallocation); If another thread is using zlib, you'll be misreporting. That being said, is there a reason to restrict to the memory used by gzwrite? We have various code using zlib, and knowing the memory amounts they use could be useful. Note this unconditional call to mozSetGzMemReportFunc will break --with-system-zlib builds, including the android tinderbox builds (btw, i wonder why b2g doesn't use system zlib).
(In reply to Mike Hommey [:glandium] from comment #16) > Note this unconditional call to mozSetGzMemReportFunc will break > --with-system-zlib builds, including the android tinderbox builds (btw, i > wonder why b2g doesn't use system zlib). I just never got around to turning it on. We should probably do it since the system copy can have neon optimizations which.. we probably don't have, I'm guessing.
> I just never got around to turning it on. We should probably do it since the system copy > can have neon optimizations which.. we probably don't have, I'm guessing. I filed bug 820169.
An alternative approach here would be to use what's being discussed in bug 820133 comment 7. You'd do something like { MallocMemoryOwner owner("zlib"); gzwrite(...); } and then any malloc's made by gzwrite would be "charged" to "zlib", without any further hassle.
Attached patch v2: Add memory report for gzwrite with DMD (obsolete) (deleted) — Splinter Review
This is a revised version per previous feedback. Thanks for all your advices, and I think both approaches of replace-malloc discussed in bug 820133 is a better solution than using function pointers in this case.
Attachment #690254 - Attachment is obsolete: true
Comment on attachment 691185 [details] [diff] [review] v2: Add memory report for gzwrite with DMD Review of attachment 691185 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/base/nsGZFileWriter.cpp @@ +24,5 @@ > + > +static void GZWriteReportMemoryAllocation(void* ptr) > +{ > + sGZWriteMemAlloc += GZWriteMallocSizeOfOnAlloc(ptr); > +} After I suggested you look at hunspell, I realized that this isn't thread-safe. (Maybe that's ok for hunspell because it's only used from the main thread?) I'm not sure how much of a problem that is in this case.
> I'm not sure how much of a problem that is in this case. Might as well use atomic operations to be safe; it's not any harder. Shelly, the atomic ops are in pratom.h.
Attached patch v2: Add memory report for gzwrite with DMD (obsolete) (deleted) — Splinter Review
Added atomic protection. Thanks Justin :)
Attachment #691185 - Attachment is obsolete: true
That's still not protected. If another thread uses zlib while gzwrite is being called, you'll get wrong counts.
(In reply to Mike Hommey [:glandium] from comment #24) > That's still not protected. If another thread uses zlib while gzwrite is > being called, you'll get wrong counts. Hi Mike, did you mean the part of setting function pointers? Should it be something like: PR_Lock(); if (!funcPtrFromZlib) { funcPtrFromZlib = GZWritefunc; PR_Unlock(); } else { PR_Unlock(); } Also, the value |sGZWriteMemAlloc| in getter function |GetGZWriteSize()| should be atomic too.
No, I mean that if some other thread calls in zlib while you're doing gzwrite, then their use of memory will be misattributed to the gzwrite.
(In reply to Mike Hommey [:glandium] from comment #26) > No, I mean that if some other thread calls in zlib while you're doing > gzwrite, then their use of memory will be misattributed to the gzwrite. Then I think the discussion in bug 820133 would be the root solution for reporting all memory usage of zlib. Further more, it makes more sense to report all memory used by zlib that does not use a custom memory allocator, not just memory used by gzwrite, but it would be separate issue then.
Attached patch v2: Add memory report for gzwrite with DMD (obsolete) (deleted) — Splinter Review
Include |moz_zlib_alloc.h|, which I forgot in the last one.
Attachment #691239 - Attachment is obsolete: true
Attachment #691667 - Attachment is obsolete: true
Attachment #699289 - Flags: feedback-
Attachment #699289 - Flags: feedback-
Comment on attachment 699289 [details] [diff] [review] v2: Add memory report for gzwrite with DMD Hi Justin, I actually can't generate DMD reports(by running get_about_memory.py) if I enable the MOZ_DMD, is that because I'm overwriting the malloc in zlib?
Attachment #699289 - Flags: feedback?(justin.lebar+bug)
> is that because I'm overwriting the malloc in zlib? Probably not... Did you do |export MOZ_DMD=1| in your .userconfig and then rm -rf objdir-gecko? We can look at this tomorrow if you'd like.
Attached file backtrace with thread info (deleted) —
Hi Justin, thanks for the look up, this is the back trace of thread info.
Comment on attachment 699289 [details] [diff] [review] v2: Add memory report for gzwrite with DMD >+static PRInt32 sGZWriteMemAlloc = 0; >+static PRInt32 sMemReportFuncHasSet = 0; >+ >+NS_MEMORY_REPORTER_MALLOC_SIZEOF_ON_ALLOC_FUN(GZWriteMallocSizeOfOnAlloc) >+NS_MEMORY_REPORTER_MALLOC_SIZEOF_ON_FREE_FUN(GZWriteMallocSizeOfOnFree) >+ >+static void GZWriteReportMemoryAllocation(void* ptr) >+{ >+ PR_AtomicAdd(&sGZWriteMemAlloc, GZWriteMallocSizeOfOnAlloc(ptr)); >+} >+ >+static void GZWriteReportMemoryDeallocation(void* ptr) >+{ >+ PR_AtomicAdd(&sGZWriteMemAlloc, -GZWriteMallocSizeOfOnFree(ptr)); >+} >+ >+static int64_t GetGZWriteSize() >+{ >+ return int64_t(PR_AtomicAdd(&sGZWriteMemAlloc, 0)); >+} >+ >+NS_MEMORY_REPORTER_IMPLEMENT(GZWrite, >+ "explicit/zlib-gzwrite", >+ KIND_HEAP, >+ nsIMemoryReporter::UNITS_BYTES, >+ GetGZWriteSize, >+ "Memory used by gzwrite from zlib.") >+ > NS_IMPL_ISUPPORTS1(nsGZFileWriter, nsIGZFileWriter) > > nsGZFileWriter::nsGZFileWriter() > : mInitialized(false) > , mFinished(false) >-{} >+{ >+ if (!PR_AtomicSet(&sMemReportFuncHasSet, 1)) { >+ mozSetGzMemReportFunc(GZWriteReportMemoryAllocation, >+ GZWriteReportMemoryDeallocation); >+ } >+ mGZWriteMemReporter = new NS_MEMORY_REPORTER_NAME(GZWrite); >+ NS_RegisterMemoryReporter(mGZWriteMemReporter); >+} I like mozSetGZMemReportFunc (we should see if glandium does), but there are two problems with how it's used. The most basic one is that you have one memory reporter per nsGZFileWriter, but each memory reporter reports the full sGZWriteMemAlloc value. So if we have two nsGZFileWriter objects, each of them will have a reporter reporting all the memory from zlib, which causes us to blame zlib for twice as much memory usage! If you simply register the memory reporter once (if sMemReportFuncHasSet is 0, for example), that will solve this problem. You don't need to keep a reference to the memory reporter; NS_RegisterMemoryReporter will hold the reporter alive. The other problem is that, any memory allocated by zlib while a nsGZFileWriter object is alive will be counted under explicit/zlib-gzwrite, even if the allocation was not due to a nsGZFileWriter. Even worse, suppose I do the following sequence: 1. Create nsGZFileWriter 2. Create a raw gzwrite stream 3. Destroy nsGZFileWriter 4. Destroy raw gzwrite stream In this case, the memory allocated during step (2) is counted in the memory reporter, but when we free that memory in step (4), it is /not/ reflected in the memory reporter! So it looks like we're leaking when we're not. I'd be OK with a situation where we simply called mozSetGZMemReportFunc when we start up. I'm not sure how glandium would feel about that, though.
Attachment #699289 - Flags: feedback?(justin.lebar+bug)
Flags: needinfo?(mh+mozilla)
comment 4 still applies.
Flags: needinfo?(mh+mozilla)
Hi Justin, thanks for pointing that out! I like the idea of setting up memory report functions when start up, is the Init() of nsMemoryReporterManager a good place to put at? Although this will still report the usage of raw gzwrite, which I think we might also want to report that out anyway? But it'll be harder to find out the "exact" module who is using gzwrite.
> I like the idea of setting up memory report functions when start up, is the Init() of > nsMemoryReporterManager a good place to put at? That may be a bit too late; we want to register the report functions before anyone else has a chance to create gzip streams. Otherwise when those gzip streams close (potentially after we register the report functions), our count will go negative. If you put it somewhere in nsXPComInit.cpp, that would probably be early enough. But perhaps we can reconsider glandium's suggestion of writing the gz header manually? That's an ideal solution in that a) it works even if we use the system zlib, and b) we can expose nsGZWriter::SizeOfIncludingThis, which would let callers report /one specific/ gz writer in about:memory, giving us more insight into where their memory is going.
> But perhaps we can reconsider glandium's suggestion of writing the gz header > manually? Yes, that really is The Right Way To Do This.
Another thing: we want to measure the allocations done by the first nsGZFileWriter in nsMemoryInfoDumper::DumpMemoryReportsToFileImpl (the one used for the memory reports) but not those done by the second one (the one used by DMD).
Reading the zlib documentation is enlightening (i was doing that for something unrelated). It's possible to get deflate() to emit a simple gz header by using deflateInit2 and giving a windowBits value greater than 15: windowBits can also be greater than 15 for optional gzip encoding. Add 16 to windowBits to write a simple gzip header and trailer around the compressed data instead of a zlib wrapper. The gzip header will have no file name, no extra data, no comment, no modification time (set to zero), no header crc, and the operating system will be set to 255 (unknown). If a gzip stream is being written, strm->adler is a crc32 instead of an adler32.
Assignee: slin → nobody
Moving to xpcom, this isn't Firefox OS specific. Also marking as P3 as this is less important in the non-FxOS world.
Component: General → XPCOM
Product: Firefox OS → Core
Whiteboard: [MemShrink:P2] → [MemShrink:P3]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: