Closed Bug 503249 Opened 15 years ago Closed 13 years ago

fix the Valgrind annotations for jemalloc

Categories

(Core :: Memory Allocator, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: n.nethercote, Unassigned)

References

Details

Attachments

(4 files, 1 obsolete file)

Attached patch a good starting point (obsolete) (deleted) — Splinter Review
jemalloc's Valgrind annotations aren't quite right.  This might even be causing Valgrind to give some wrong messages:

- Bug 502474: these warnings don't look particularly suspect -- they're
all complaints about block overruns -- I'd lean towards believing
them.  Running with --disable-jemalloc will give a definitive answer,
though.

- Bug 502477: these look strange;  it's complaining about unaddressable
memory in the middle of a malloc'd block, which doesn't seem right.

- Bug 443965, Bug 457223: when Valgrind complains about something within
jemalloc itself it's almost certainly due to missing/incorrect
annotations within jemalloc.

There are two main problems I've identified with the annotations in jemalloc.c:

- MALLOCLIKE_BLOCK/FREELIKE_BLOCK are overused.  They should only be used for memory blocks that are being given to the client program, but they are currently also being used for memory blocks used only within jemalloc itself.  I can see why this is done like this -- FREELIKE_BLOCK marks memory as unaddressable, and so if jemalloc subsequently writes to it Memcheck will complain.  But there is another client request, VALGRIND_MAKE_MEM_UNDEFINED, that is more appropriate.

- The annotations are deep within the allocator (eg. in base_alloc()) rather than at the outer layer (eg. in malloc()).  This means that stack traces for annotated heap blocks have several internal jemalloc entries in them, which isn't of interest to most users.

I attach a patch that fixes these problems, but I'm not completely confident about it, because I don't understand jemalloc very well, particularly how it recycles freed blocks.  Also, there are lots of compile-time switches and I was only testing one configuration.  But hopefully it should be a good enough start that Jason (or someone else who knows jemalloc well) can improve upon it.  Key features:

- MALLOCLIKE_BLOCK/FREELIKE_BLOCK only appear at the outermost levels, ie. in malloc(), calloc(), free(), etc.  This results in better stack traces.

- VALGRIND_MAKE_MEM_UNDEFINED is used so that when jemalloc writes to blocks that have been freed (and thus marked by Memcheck as unaddressable) Memcheck won't complain.  In particular, the new malloc_fill() factors out a lot of cases where blocks are filled with 0x00 or 0xa5.  It also appears in arena_bin_nonfull_run_get();  that was necessary to get rid of lots of spurious Memcheck errors but I don't really understand the recycling that's going on here.

Overall it's about half as many annotations and results in nicer error messages.  There may be some extra MAKE_MEM_UNDEFINED annotations needed if there are other recycle-and-write points within jemalloc.

----

Another issue is redzones:  Valgrind's default malloc-replacement allocates blocks with 16-byte redzones at either end.  This is crucial for catching heap block overruns/underruns -- Memcheck can mark these redzones as unaddressable.  Because jemalloc packs heap blocks contiguously there is no such checking.  It's possible that you might get lucky and when you overrun a heap block you hit another one that was recently freed, in which case Memcheck will complain, but you might hit another allocated block and Memcheck won't complain.

It would be nice to add redzones to blocks when Valgrind is being used.  (Actually, it's only needed when Memcheck is being used, but we don't have a way of distinguishing those two cases...)  I was too timid to attempt that, it sounds like a job for the experts...

----

Anyway, I hope this all makes sense!  Please feel free to ask questions, there are two Valgrind developers CC'd to this bug report so you should get answers :)
Attached file better documentation (deleted) —
FWIW, I just updated the documentation for these client requests in Valgrind in an attempt to improve it.  I attach the new version, which may make things clearer... it overlaps quite a bit with what I wrote in comment 0.
Attached patch fixed patch (deleted) — Splinter Review
Somehow the patch headers were screwed up.  Here's another version that fixes them.
Attachment #387596 - Attachment is obsolete: true
Attached patch merged diff with m-c (deleted) — Splinter Review
This was an attempt at merging attachment 389842 [details] [diff] [review] with trunk, but I guess there are other changes necessary.
Blocks: 586962
FTR, I kinda think this is a losing proposition, because Memcheck's
requirements are fundamentally in conflict with Jemalloc's requirements.

Memcheck wants blocks to be widely spread out over the address space,
so it can paint no-access redzones before and after blocks, so as to
detect block overruns.  It also wants to delay the re-use of freed
blocks as long as possible, so that it can paint freed blocks as
no-access and keep them that way a long time, so as to detect accesses
to freed blocks long after they are freed.

Jemalloc wants the exact opposite: pack blocks tightly together, with
minimal or zero gaps.  This makes the redzone game largely ineffective.
It also wants to bring freed memory back into circulation ASAP, hence
making detection of freed memory less effective.

Possible alternatives:

* ensure we can always build with --disable-jemalloc (iow, make sure
  that stays alive, 

* and/or, teach valgrind to intercept calls to jemalloc's malloc/free
  etc, as it does now for glibc's versions.  This will require not 
  inlining jemalloc's malloc/free etc.
Everybody's using --disable-jemalloc, AFAICT.  That's what I do.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: