Closed Bug 547736 Opened 15 years ago Closed 10 years ago

Make SpiderMonkey helgrind-friendlier

Categories

(Core :: JavaScript Engine, defect)

Other Branch
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 551155

People

(Reporter: jorendorff, Assigned: jorendorff)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 3 obsolete files)

There are some great unsupported, undocumented helgrind features we can use in MOZ_VALGRIND builds to make helgrind understand ThinLocks. This is necessary to use helgrind usefully (and it's very useful—it found bug 547274, not to mention several bugs in shell workers). Julian explicitly told me *not* to check these in, but I want to anyway. This stuff is too valuable to live in a rotting patch queue on my disk. I'll fix them up or take them out when a new version of valgrind breaks them. Beyond that, the patch adds a JS_NON_THREADSAFE macro that marks fields that race in ways we don't care about (jitstats) or that are safe for subtle reasons beyond helgrind's understanding (protoHazardShape, gcPoke). The point of this change is that you can then suppress all messages about these fields by adding a few lines to your valgrind.supp file.
Attached patch v1 (obsolete) (deleted) — Splinter Review
Assignee: general → jorendorff
Attachment #428231 - Flags: review?
Attachment #428231 - Flags: review? → review?(igor)
(In reply to comment #1) > Created an attachment (id=428231) [details] > v1 Nitlet: > void > js_FinishLock(JSThinLock *tl) > { > #ifdef NSPR_LOCK > tl->owner = 0xdeadbeef; > if (tl->fat) > JS_DESTROY_LOCK(((JSLock*)tl->fat)); > #else > JS_ASSERT(tl->owner == 0); > JS_ASSERT(tl->fat == NULL); >+ ANNOTATE__MUTEX_DESTROY_PRE(tl); > #endif > } Really, the _PRE should happen before the real lock-destroy action, not after it(that's what the _PRE/_POST convention is for). Not that it makes any difference in this case. > #else >+ ANNOTATE__MUTEX_DESTROY_PRE(tl); > JS_ASSERT(tl->owner == 0); > JS_ASSERT(tl->fat == NULL); > #endif
(In reply to comment #0) > Julian explicitly told me *not* to check these in, but I want to > anyway. On consideration, it's probably harmless to land it. I was concerned that it would cause problems for people using older valgrinds (3.3.x or before). But I think that's ok because an older Memcheck should ignore any Helgrind annotations it sees. -------- One other comment. You could land this as-is, and have something that works with Helgrind in 3.4.x and 3.5.x. However, I have uncommitted (V) patches to add two more annotations, to disable and re-enable checking for a given address range. You can use them to mark stats counters or whatever that are non-thread-safe, with the result you would no longer need a suppressions file. Said patches are in a relatively good state and I would be happy to commit them. Downside is they would only take effect with trunk or future-3.6.x releases of Valgrind. +/* Tell H that an address range is not to be "tracked" until further + notice. This puts it in the NOACCESS state, in which case we + ignore all reads and writes to it. Useful for ignoring ranges of + memory where there might be races we don't want to see. If the + memory is subsequently reallocated via malloc/new/stack allocation, + then it is put back in the trackable state. Hence it is safe in + the situation where checking is disabled, the containing area is + deallocated and later reallocated for some other purpose. */ +#define VALGRIND_HG_DISABLE_CHECKING(_qzz_start, _qzz_len) \ + do { \ + unsigned long _qzz_res; \ + VALGRIND_DO_CLIENT_REQUEST( \ + (_qzz_res), 0, _VG_USERREQ__HG_ARANGE_MAKE_UNTRACKED, \ + (_qzz_start), (_qzz_len), 0, 0, 0 \ + ); \ + (void)0; \ + } while(0) +/* And put it back into the normal "tracked" state, that is, make it + once again subject to the normal race-checking machinery. This + puts it in the same state as new memory allocated by this thread -- + that is, basically owned exclusively by this thread. */ +#define VALGRIND_HG_ENABLE_CHECKING(_qzz_start, _qzz_len) \ + do { \ + unsigned long _qzz_res; \ + VALGRIND_DO_CLIENT_REQUEST( \ + (_qzz_res), 0, _VG_USERREQ__HG_ARANGE_MAKE_TRACKED, \ + (_qzz_start), (_qzz_len), 0, 0, 0 \ + ); \ + (void)0; \ + } while(0) + +
(In reply to comment #2) > > JS_ASSERT(tl->owner == 0); > > JS_ASSERT(tl->fat == NULL); > >+ ANNOTATE__MUTEX_DESTROY_PRE(tl); > > Really, the _PRE should happen before the real lock-destroy action, > not after it(that's what the _PRE/_POST convention is for). Not that > it makes any difference in this case. Those are just asserts, though. Actually destroying the lock is a no-op. Thanks very much for all the help and handholding, Julian. I'd like to get rid of the JS_NON_THREADSAFE junk, but balanced against requiring a pre-release valgrind it's sort of a toss-up. Please let me know when that feature is in, though, and I'll give it a spin...
Comment on attachment 428231 [details] [diff] [review] v1 Sorry for a long overdue review.
Attachment #428231 - Flags: review?(igor) → review+
Helgrind's annotation set got revamped/updated in valgrind trunk r11062 (2010-03-04 00:03:40 +0100 (Thu, 04 Mar 2010)) and I am hoping that it will be stable from now on, at least past a V 3.6.0 release. As a consequence the patch can be tidied up considerably: * the DO_CREQ_*_* macros can be removed * the ANNOTATE__ macros can be removed; any call to ANNOTATE__XX_YY now becomes VALGRIND_HG_XX_YY instead * added new macros VALGRIND_HG_{DIS,EN}ABLE_CHECKING(address, size) to disable/enable tracking on arbitrary address ranges. This could possibly be used to reduce or remove the need for js::NonThreadSafe<>, although it would need to be used carefully. See comments in helgrind.h about how to use, and caveats. With some care it should also be possible to remove the need for a suppressions file when Helgrinding SM. * I also added definitions of #define __VALGRIND__ 3 #define __VALGRIND_MINOR__ 6 so it's possible to compile in the new stuff without breaking on headers from older versions of valgrind
(In reply to comment #1) Jason: I'd be happy to update the patch to the new annotations if you can advise me how to run a test workload that exercises js in whatever way it needs to be exercised. I just about managed to build a threaded js following your guidance on irc yesterday.
(In reply to comment #8) > V1 patch re-done for annotation set for helgrind in valgrind-3.6 (WIP) This is much shorter and cleaner. The JS_NON_THREADSAFE macro and class NonThreadSafe<> are gone, and replaced by 4 lines in jscntxt.cpp, which tell the checker at runtime which fields it should ignore. Also, this patch does not require the use of a special suppressions file.
Oh, and there's a bug in js_InitTitle. js_FinishTitle calls js_FinishLock on title->lock, but js_InitTitle never calls js_InitLock on it. So Helgrind complains that js_FinishLock is called on a bunch of locks it doesn't know about, which is true.
(In reply to comment #10) > Oh, and there's a bug in js_InitTitle. File as a separate bug?
One conclusion from this is that jsengine races on at least the following fields, presumably harmless and by design, but worth documenting: JSRuntime::gcPoke JSRuntime::protoHazardShape JSContext::operationCallbackFlag JSGCStats::arenaStats I'm sure I also saw races on some debug printing stuff in nanojit/, but I can't reproduce those now.
Blocks: 551155
(In reply to comment #12) > > I'm sure I also saw races on some debug printing stuff in nanojit/, > but I can't reproduce those now. This is a total guess, but... try making LIR.cpp:LabelMap::buf[] smaller, say 1000 or 500 chars. That might make it more reproducible. I'm in the middle of fixing up nanojit's debug printing for bug 531687. Currently a couple of the printing functions share LabelMap::buf[] and always put stuff into it so they're almost certainly not re-entrant. I'm changing things so that a buffer is always passed in to these functions from outside.
(In reply to comment #12) > One conclusion from this is that jsengine races on at least the > following fields, presumably harmless and by design, but worth > documenting: > > JSRuntime::gcPoke > JSRuntime::protoHazardShape > > JSContext::operationCallbackFlag > > JSGCStats::arenaStats We only write to gcPoke, and from within requests, which are serialized with the GC (Igor, please check my memory here). rt->protoHazardShape is regenerated and read within requests, which could race, but when read it is then compared only to samples taken on the same thread, in thread-local storage, by loading the same member, in order to check assumptions about other writes to shared memory that happen after the protoHazardShapeWrite. We assume weak store order. operationCallbackFlag is safe, the reader is sampling often and races with writers on purpose. We'll see it set soon enough if a script is taking too much realtime. arenaStats are DEBUG only metering. /be
A revised version, which moves the "don't check this field" annotations for JSRuntime::gcPoke/protoHazardShape/gcStats to the right place -- JSRuntime::JSRuntime().
Attachment #430773 - Attachment is obsolete: true
These bugs are all part of a search I made for js bugs that are getting lost in transit: http://tinyurl.com/jsDeadEndBugs They all have a review+'ed, non-obsoleted patch and are not marked fixed-in-tracemonkey or checkin-needed but have not seen any activity in 300 days. Some of these got lost simply because the assignee/patch provider never requested a checkin, or just because they were forgotten about.
Version of patch used for race-checking SpiderMonkey during Jan 2011. Works pretty well. Note you'll need to use NSPR as built in the tree, not a system NSPR, since some of the annotations in the patch are for NSPR. Do not commit -- needs cleanup first.
Attachment #428231 - Attachment is obsolete: true
Attachment #431845 - Attachment is obsolete: true
Currently assuming the races on the following fields are harmless: JSRuntime::gcMallocBytes JSRuntime::gcPoke JSRuntime::protoHazardShape JSRuntime::stringMemoryUsed JSRuntime::mjitMemoryUsed JSRuntime::shapeGen Note this bug should be closed in favour of bug 551155, which contains a more comprehensive and up to date patch.
In comment 18, Julian says this Helgrind bug should be closed obsoleted by bug 551155.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: