378918 - Corrupted ThreadPrivate gcFreeLists with multiple JSRuntimes

Reporter

Description

•

18 years ago

After pulling down the latest JS trunk I now get the following abort: Assertion failure: requestDebit <= rt->requestCount, at .\jsgc.c:2638 Here's the API's being used. pContext1 = JS_NewContext(...) JS_SetContextThread(pContext1); JS_BeginRequest(pContext1); run script for awhile... object in script requests to run another script. Thread runs sub-script using same method as above. pContext2 = JS_NewContext(...) JS_SetContextThread(pContext2); //Associate the context with our thread. JS_BeginRequest(pContext2); script runs ok... JS_EndRequest(pContext2); JS_MaybeGC(pContext2); <- Dies here with ASSERT This always worked fine before I pulled down the latest from the trunk. Is the ASSERT broken or is my API usage? This code was working prior to the latest update from CVS (my last pull was a few months back) CC'ing those involved in GC work in the past few months. Mike M.

Brendan Eich [:brendan]

Comment 1

•

18 years ago

Mike, can you cvs up -D "last week" and other such formulations ("last month") and see when this broke? You will want to cvs up -A after to "unstick" from the last date and get back to the top of trunk. Thanks, /be

Mike Moening

Reporter

Comment 2

•

18 years ago

It looks like a long dormant bug that is only showing now since JS_MaybeGC is triggering a real GC, where before it was not. Its also more complicated that I wrote in the opening description. There are actually TWO JS_Runtimes at play here involving a SINGLE thread. Here it is again with more details added: 2 runtimes are created (RT-1 and RT-2) Runtime #1 is actually a debugger which uses JS internally and calls to Runtime #2 to do work. RT-1 pContext1 = JS_NewContext(...) RT-1 JS_SetContextThread(pContext1); RT-1 JS_BeginRequest(pContext1); RT-1 run script for awhile... RT-1 object in script requests to compile/execute a script on RT-2 RT-1 Thread calls class holding RT-2 and asks for compile/exec of script RT-1 Class calls RT-2 for compile/exec RT-2 pContext2 = JS_NewContext(...) RT-2 JS_SetContextThread(pContext2); //Associate the context with our thread. RT-2 JS_BeginRequest(pContext2); RT-2 script runs ok... RT-2 JS_EndRequest(pContext2); RT-2 JS_MaybeGC(pContext2); <- Dies here with ASSERT My guess is that somehow the threads contextList is getting messed by by moving between runtimes. (Just a guess here. I'm probably all wet.) I'm centering my energy on this code in GC: /* * Check all contexts on cx->thread->contextList for active requests, * counting each such context against requestDebit. */ head = &cx->thread->contextList; for (link = head->next; link != head; link = link->next) { acx = CX_FROM_THREAD_LINKS(link); JS_ASSERT(acx->thread == cx->thread); if (acx->requestDepth) requestDebit++; Any inside information about what this actually does would be helpful. Donka.

Mike Moening

Reporter

Comment 3

•

18 years ago

The fact that GC that is getting called is very weird since Runtime #2 has actually done no work at all (other than basically compiling a script) JS_MaybeGC runs the following code: if ((bytes > 8192 && bytes > lastBytes + lastBytes / 3) || rt->gcMallocBytes >= rt->gcMaxMallocBytes) { JS_GC(cx); The variable "bytes" is set to 55392. However "lastBytes" is set to a ZERO. Which means GC is running when it doesn't need to. Somebody please verify. We still need to fix the ASSERT but this appears to be cause of GC running. Mike M.

Mike Moening

Reporter

Comment 4

•

18 years ago

I found the problem! See internals of js_GetCurrentThread() JS is using ThreadPrivate storage to directly store a JSThread struct. This will not work when multiple runtimes are in use. The JSThread->contextList and JSThread->gcFreeLists get corrupted and contain contexts or free items from more than 1 JSRuntime. There needs to be one more level of indirection in the ThreadPrivate storage. It needs to be JSRuntime specific. In other words the ThreadPrivate storage needs to be a hash of JSRuntimes to JSThread struct. That way when the gcFreeLists and contextLists cannot be corrupted by usage of one or more JSRuntime's. Could somebody please recommend an appropriate storage mechanism for the hash table? Is there something you always use? I would be happy to attempt a patch with a little guidance... It looks like most of the changes will be in js_GetCurrentThread()... Mike M

Summary: JS_MaybeGC aborts on (requestDebit <= rt->requestCount) → Corrupted ThreadPrivate gcFreeLists with multiple JSRuntimes

Brendan Eich [:brendan]

Comment 5

•

18 years ago

A workaround: JS_ClearContextThread before switching runtimes in the same thread. /be

Assignee: general → brendan

Mike Moening

Reporter

Comment 6

•

18 years ago

That would fix the problem with cx->threadLinks and the GC ASSERT I was hitting. However, I'm afraid the gcFreeLists might still be messed up since the freeList could contain gc things from another runtime. Please correct me if I'm wrong...

Mike Moening

Reporter

Comment 7

•

18 years ago

Also this code: memset(cx->thread->gcFreeLists, JS_FREE_PATTERN, sizeof(cx->thread->gcFreeLists)); which happens in JS_ClearContextThread() might not be the safest thing to do given two runtimes. Does that introduce a leak?

Brendan Eich [:brendan]

Comment 8

•

18 years ago

Mike, the workaround applies in all cases because sharing of cx->thread among one or more cx in a runtime is managed by testing JS_CLIST_IS_EMPTY(&cx->thread->contextList) The gcFreeLists member of cx->thread cannot contain gc-things from another runtime if you always JS_ClearContextThread before using a different cx from another runtime on the same thread. The GC rebuilds the global freelist every time, throwing away thread-local freelists. Please try the workaround before trying to shoot it down, ok? /be

Mike Moening

Reporter

Comment 9

•

18 years ago

I'm not shooting it down at all. Just thinking through all possible ramifications. For example: What happens if some stuff is added to gcFreeList from runtime #2 and control is returned to runtime #1 without GC running on #2? BTW I did try it. I use a JS_ClearContextThread JS_SetContextThread pair around any thread calls that jump to a different runtime. So far it fixed the ASSERT I was seeing so far with GC being run on the second runtime. Do you have any anwsers on the question in comment #3? (lastBytes being zero) Also, what should I do with the bug?

Mike Moening

Reporter

Comment 10

•

18 years ago

Brendan, I just noticed the call to clear the gcFreeLists has a #ifdef DEBUG around it. I think that will need to be removed. Agree?

Brendan Eich [:brendan]

Comment 11

•

18 years ago

(In reply to comment #10) > Brendan, I just noticed the call to clear the gcFreeLists has a #ifdef DEBUG > around it. > I think that will need to be removed. Agree? No. JS_FREE_PATTERN is not even defined unless DEBUG is defined. This is just to make crash skidmarks recognizable. Again, the thread-local freelists are not consulted by the GC; it rebuilds only the runtime-global freelists, linking every single collected gc-thing on the appropriate list. There's no leaking, and no need to memset in js_ClearContextThread when js_SetContextThread will memset to 0 (null pointers) on reactivation of the JSThread. I took the bug. I'll work on a patch tomorrow. /be

Status: NEW → ASSIGNED

Brendan Eich [:brendan]

Comment 12

•

18 years ago

Attached patch proposed fix (obsolete) (deleted) — Details — Splinter Review

I haven't tested this, but I wanted to get design feedback as well as code review. /be

Attachment #263085 - Flags: review?(igor)

Igor Bukanov

Assignee

Comment 13

•

18 years ago

(In reply to comment #12) > Created an attachment (id=263085) [details] > proposed fix > > I haven't tested this, but I wanted to get design feedback as well as code > review. What about simply calling status = PR_NewThreadPrivateIndex(&threadTPIndex, ptr); once per runtime?

Brendan Eich [:brendan]

Comment 14

•

18 years ago

(In reply to comment #13) > (In reply to comment #12) > > Created an attachment (id=263085) [details] [details] > > proposed fix > > > > I haven't tested this, but I wanted to get design feedback as well as code > > review. > > What about simply calling > status = PR_NewThreadPrivateIndex(&threadTPIndex, ptr); > once per runtime? We'd run out of TPindexes in some embeddings. Some embeddings use an unbounded number of runtimes, one or more per thread. TPIndexes are allocated from a small fixed size array. /be

Igor Bukanov

Assignee

Comment 15

•

18 years ago

(In reply to comment #14) > > What about simply calling > > status = PR_NewThreadPrivateIndex(&threadTPIndex, ptr); > > once per runtime? > > We'd run out of TPindexes in some embeddings. Some embeddings use an unbounded > number of runtimes, one or more per thread. TPIndexes are allocated from a > small fixed size array. I was under impression that PR_NewThreadPrivateIndex uses a hashtable itself. But it indeed uses a fixed size array. Then I think using a hashtable for JSThread is a better option rather then trying to workaround the problem with gcRuntime. The code would be more straightforward and I suspect less in size.

Brian Crowder

Updated

•

18 years ago

Blocks: 365048

Brendan Eich [:brendan]

Comment 16

•

18 years ago

I don't want every user of cx->thread->gc* to have to hash rt to an entry struct. That recomputes what is fixed, since the cx is in only that particular rt and (if the storage is stable) we could have a cx->rtThread pointer computed once per cx. But I really don't see how hashing could yield a smaller patch. The multimple rt per thread case is frankly weird. So why should we bend the code much for it? /be

Igor Bukanov

Assignee

Comment 17

•

18 years ago

(In reply to comment #16) > I don't want every user of cx->thread->gc* to have to hash rt to an entry > struct. My suggestion was to create a separated JSThread per runtime in js_GetCurrentThread. > But I really don't see how hashing could yield a smaller patch. After looking into this more closely I realized that the code size would be definitely bigger. First it would be necessary to kill the hashtable and JSThread it holds in js_ThreadDestructorCB. Then separated code to remove JSThread is necessary in DestroyRuntime. > The multimple > rt per thread case is frankly weird. So why should we bend the code much for > it? My problem is that the patch adds extra checks per each JS_malloc and js_NewGCThing just to support this weired case. Perhaps a simpler solution would be to push the hashing above to the application? That is, the code can detect multiple runtimes that wants to use the same JSThread and report error requiring to implement some callback?

Brendan Eich [:brendan]

Comment 18

•

17 years ago

Mike, have you been running with this patch, or some other patch, or only at most one runtime per thread, or what? /be

Mike Moening

Reporter

Comment 19

•

17 years ago

I was running with the hack around from comment #5...calling JS_ClearContextThread before switching runtimes in the same thread. I don't like this hack but its better than nothing. I have not tested the patch.

timeless

Comment 20

•

17 years ago

Comment on attachment 263085 [details] [diff] [review] proposed fix I hit an assert using jsdb which this patch fixes.

timeless

Comment 21

•

17 years ago

Comment on attachment 263085 [details] [diff] [review] proposed fix I hit an assert using jsdb which this patch fixes. * Check all contexts on cx->thread->contextList for active requests, * counting each such context against requestDebit. This comment needs to be updated to specify "associated with this runtime" or something. * Set all thread local freelists to NULL. And this comment.

Igor Bukanov

Assignee

Comment 22

•

17 years ago

(In reply to comment #20) > (From update of attachment 263085 [details] [diff] [review]) > I hit an assert using jsdb which this patch fixes. Is this with multiple JSRuntime instances?

timeless

Comment 23

•

17 years ago

yes, jsdb has 4 :). unfortunately i don't currently have a nice js_spray thing with which to test the threaded portion of this patch in jsdb (but bug XXXX will provide one to js shell which would give jsdb a way for free), and i'd probably need to write some amusing debugger script that responded automatically to exceptions. But I believe this patch is correct. js_ClearContextThread(JSContext *cx) { JS_REMOVE_AND_INIT_LINK(&cx->threadLinks); #ifdef DEBUG if (JS_CLIST_IS_EMPTY(&cx->thread->contextList)) { memset(cx->thread->gcFreeLists, JS_FREE_PATTERN, sizeof(cx->thread->gcFreeLists)); + cx->thread->gcRuntime = NULL; That this is in a DEBUG statement whereas its counterpart isn't worries me. i'd rather: -#ifdef DEBUG if (JS_CLIST_IS_EMPTY(&cx->thread->contextList)) { +#ifdef DEBUG memset(cx->thread->gcFreeLists, JS_FREE_PATTERN, sizeof(cx->thread->gcFreeLists)); +#endif + cx->thread->gcRuntime = NULL; } -#endif I'm also slightly uncertain about cx->thread->gcRuntime. I'm fairly certain there's no api to access it, which means that anyone who calls SetContext will step on the previous owner, and the previous owner would magically need to know that they've been stepped on. In js_NewGCThing, I'm wondering why you don't just claim cx->thread->gcRuntime temporarily for the duration of the call. Only one piece of code can run on a thread at a time, so the read-save, write, write-restore pattern shouldn't really hurt anyone.

Brian Crowder

Comment 24

•

17 years ago

Bug 404879 is the bug you want for threaded testing in js-shell.

Mike Moening

Reporter

Comment 25

•

17 years ago

Brian, That multi-threaded js-shell will not uncover bugs like this one. It's and entirely different animal. timless, I share your concern in comment #23.

Brendan Eich [:brendan]

Comment 26

•

17 years ago

Comment on attachment 263085 [details] [diff] [review] proposed fix Better patch coming soon. It will require embeddings to #define the maximum number of runtimes used by a thread, which should not be too large. Timeless: jsdb is its own build with custom SpiderMonkey build, right? It better be, given the above. /be

Attachment #263085 - Attachment is obsolete: true

Attachment #263085 - Flags: review?(igor)

Igor Bukanov

Assignee

Comment 27

•

17 years ago

(In reply to comment #26) > (From update of attachment 263085 [details] [diff] [review]) > Better patch coming soon. It will require embeddings to #define the maximum > number of runtimes used by a thread, which should not be too large. But this still would not address the scalability problem with one runtime and many threads that Mike reported when the free lists are killed when there are other threads that would like to use them. Perhaps it is better to decouple thread local free-lists from the JSThread and use a global pool of thread local lists with JS_BeginRequest/JS_EndRequest getting/returning entries to the pool?

Brendan Eich [:brendan]

Updated

•

17 years ago

Flags: wanted1.9+

Priority: -- → P3

Target Milestone: --- → mozilla1.9beta4

Brendan Eich [:brendan]

Comment 28

•

17 years ago

(In reply to comment #27) > (In reply to comment #26) > > (From update of attachment 263085 [details] [diff] [review] [details]) > > Better patch coming soon. It will require embeddings to #define the maximum > > number of runtimes used by a thread, which should not be too large. > > But this still would not address the scalability problem with one runtime and > many threads that Mike reported when the free lists are killed when there are > other threads that would like to use them. No, it would -- what I'm working on does not share JSThread.gcFreeList among more than one runtime, it splits out a JSRuntimeThread per runtime using a JSThread. Patch soon. > Perhaps it is better to decouple thread local free-lists from the JSThread and > use a global pool of thread local lists with JS_BeginRequest/JS_EndRequest > getting/returning entries to the pool? That's what my patch does, more or less. It uses JS_BeginRequest to select the JSRuntimeThread used by the GC for fast lock-free allocation, etc., and "pops" that JSRuntimeThread (not really on a stack, an MRU list) in JS_EndRequest. But again, you have to #define JS_RUNTIMES_PER_THREAD to be > 1 to get this code. Without that macro, you get no extra indirection overhead -- you get today's code. This is the decision I'd like to make "stick". /be

Mike Moening

Reporter

Comment 29

•

17 years ago

>Perhaps it is better to decouple thread local free-lists from the JSThread and >use a global pool of thread local lists with JS_BeginRequest/JS_EndRequest >getting/returning entries to the pool? Yes that would be much better. Another thread running GC should not be sweeping all other threads free lists every time GC is run. That creates too much thrashing. Once a thread spins up let it keep its own free lists until the end of the request or maybe thread. When running many threads like this the total amount of memory consumed is not really that important. Speed and thread contention is the major concern.

Igor Bukanov

Assignee

Comment 30

•

17 years ago

(In reply to comment #27) > Perhaps it is better to decouple thread local free-lists from the JSThread and > use a global pool of thread local lists with JS_BeginRequest/JS_EndRequest > getting/returning entries to the pool? That in fact suggests a simple approach: 1. Modify JS_GC to assemble a free list as a list of lists where the number of entries on the each sublist corresponds to the current MAX_THREAD_LOCAL_THINGS. 2. Modify js_NewGCThing to transfer the sublist into JSContext (not JSThread as it is done now). Since the sublist is already assembled, it should speedup the allocation. 3. Modify js_EndRequest/js_DestroyContext to return the unused list back to the global pool. This should resolve this bug and address the scalability issues as well.

Priority: P3 → --

Target Milestone: mozilla1.9beta4 → ---

Igor Bukanov

Assignee

Comment 31

•

17 years ago

(In reply to comment #30) > This should resolve this bug and address the scalability issues as well. And this would avoid arbitrary limits on numbers of runtimes per thread.

Brendan Eich [:brendan]

Comment 32

•

17 years ago

Igor, there's more at stack in JSThread than the gcFreeList member -- especially the propertyCache should not be thrashed by GC on one of N (small N) runtimes. It is true we could put gcFreeList in each JSContext, but that's a lot of bloat and potential hoarding in embeddings like Firefox that have many (many) contexts per thread. Let me get this patch together, but please comment (MikeM, timeless: this means you) on the #define JS_RUNTIMES_PER_THREAD compile-time configuration requirement that I am proposing. /be

Brendan Eich [:brendan]

Comment 33

•

17 years ago

So long as JSThread exists, we have the # JSRuntimes > 1 vs. JSThread pigeon-hole problem. You can't avoid it just by moving gcFreeList into JSContext. We cannot have a property cache per context in Firefox. We must have lock-free, thread-local storage. Therefore JSThread is not going away. Therefore we must have a sane way of multiplexing its members that vary per runtime. By sane I mean "arbitrary", because it is not desirable to support indefinite number of runtimes per thread. /be

Brendan Eich [:brendan]

Comment 34

•

17 years ago

(In reply to comment #32) > Igor, there's more at stack Heh, "at stake". /be

Brendan Eich [:brendan]

Comment 35

•

17 years ago

If the number of runtimes that might share a thread can be bounded small enough, then we *can* just allocate more thread-private indexes from NSPR. That would be an even simpler solution. /be

Igor Bukanov

Assignee

Comment 36

•

17 years ago

(In reply to comment #29) Another thread running GC should not be > sweeping all other threads free lists every time GC is run. That creates too > much thrashing. This translates into keeping the existing free lists untouched during the GC. But this will prevent releasing arenas without live things back to the system if one wants to avoid scanning the free lists an extra time during the GC. If memory consumption is not that important, it can be a good trade off. But this is definitely not for a browser embedding.

Brendan Eich [:brendan]

Comment 37

•

17 years ago

(In reply to comment #36) > (In reply to comment #29) > Another thread running GC should not be > > sweeping all other threads free lists every time GC is run. That creates too > > much thrashing. I misunderstood MikeM here to be saying unrelated runtimes sharing a thread should not affect one anothers' per-thread-viewed-by-runtime freelists, which I agree with and my patch addresses. But you are right, that's not what he wrote :-/. > This translates into keeping the existing free lists untouched during the GC. > But this will prevent releasing arenas without live things back to the system > if one wants to avoid scanning the free lists an extra time during the GC. I don't buy it anyway, since the freelist is refilled on next js_NewGCThing, so it is purely an optimization to amortize runtime freelist locking overhead. > If memory consumption is not that important, it can be a good trade off. But > this is definitely not for a browser embedding. It's not clear MikeM needs it either. Or perhaps I should say: he may really need copying generational GC etc. -- the works. That is not coming except far down the road, probably via ActionMonkey. /be

Igor Bukanov

Assignee

Comment 38

•

17 years ago

(In reply to comment #32) > but that's a > lot of bloat and potential hoarding in embeddings like Firefox that have many > (many) contexts per thread. The bloat is exactly 10*sizeof(void*) bytes per context. And it can be made 4*sizeof(void*) if one allocates the lists for xml support lazily.

Igor Bukanov

Assignee

Comment 39

•

17 years ago

(In reply to comment #37) > I don't buy it anyway, since the freelist is refilled on next js_NewGCThing, so > it is purely an optimization to amortize runtime freelist locking overhead. The idea is to assemble during the GC the freed cells as a list of lists to skip refiling overhead in js_NewGCThing.

Mike Moening

Reporter

Comment 40

•

17 years ago

>1. Modify JS_GC to assemble a free list as a list of lists where the number of >entries on the each sublist corresponds to the current MAX_THREAD_LOCAL_THINGS. Right now MAX_THREAD_LOCAL_THINGS is set to 8. That doesn't seem like very many things to me. Maybe I'm not understanding right... Not sure I like the JS_RUNTIMES_PER_THREAD idea. If I right a piece of software that has 2 runtimes and I sit on top of somebody else's code that has X runtimes I really don't know that to set this value to. So long as we can start and thread and do as much work as possible with as many JSContexts as is required without GC interfence I'd be happy. The architecture is a pool of threads and a pool of contexts and a cache of pre-compiled scripts. Each thread runs through this basic sequence many times on 1 thread from the pool. 1) Get free JS context from the pool. 2) JS_SetContextThread //Associate the context with our thread. 3) JS_BeginRequest 4) Run 1+ scripts for awhile. 5) JS_SuspendRequest 6) wait for long running thing (like SQL query) 7) JS_ResumeRequest 8) JS_EndRequest(pContext2); 9) JS_MaybeGC(pContext2); 10) JS_SetContextThread(NULL) //Un-associate the context with our thread 11) Return free JS context to pool again. Unless the thread is shutting down I don't want to give back my GC things. I'm just going to need them again anyway in a very short while. That means once the thread is starting let it keep its free lists do GC only on that thread. Don't create a contention problem between threads over GC. In regards to your comment: >If memory consumption is not that important, it can be a good trade off. But >this is definitely not for a browser embedding. Memory consumption is not important at all. We are talking big servers with 2-12 Gigs of ram. Lock free allocations and GC is what is important. The multiple runtime thing is still a problem in the scenario too.

Mike Moening

Reporter

Comment 41

•

17 years ago

Why should 1 thread ever have to wait for another thread to run GC? Think massively parallel here.

Brendan Eich [:brendan]

Comment 42

•

17 years ago

Again, you can shrink the gcFreeList space hit on JSContext with more work, but that does not solve other JSThread members. And I do not understand the proposal to amortize runtime freelist locking harder by making the GC assemble a list of lists. How many free things would be pre-filled on the list of lists? We do not want to hoard. At some point the allocator exhausts the list -- then what? We have to be willing to run a GC or simply go to the runtime allocator and refill the thread-local list. Running the GC is costly. We already have code to refill the thread-local list. That is simple (relative to alternatives) and sufficient to amortize the runtime-allocator overhead down to the noise (we can adjust the number of things prefilled on the thread-local list). I'm leery of over-engineering for one case here, while still not solving the valid use-cases for fast/lock-free thread-local (not per-context) storage. /be

Brendan Eich [:brendan]

Comment 43

•

17 years ago

Mike: GC is a runtime-wide procedure. If you do not want threads to share objects, give them their own runtimes and they'll GC independently. You cannot have it both ways. /be

Mike Moening

Reporter

Comment 44

•

17 years ago

>Mike: GC is a runtime-wide procedure. If you do not want threads to share >objects, give them their own runtimes and they'll GC independently. You cannot >have it both ways. Why not? Ask for the sky... Yes we DO want to be able to share objects between threads. However right now every time GC() runs on any thread it steals the free things away from and blocks all other threads. Just do this at the end or the thread or the end of the request. Don't keep stealing free things from all threads and then put them back in again in the threads free list 1 microsecond later. That thrashing helps nobody. Right now we do not have fast/lock-free thread local allocation because the GC steals the memory back too fast. Think "How would I architect memory allocation if had to support 3000 concurrent requests with 8 processors running at the same time?" >At some point the allocator exhausts the list -- then what? Make more. Memory is cheap. I agree this is not something the typical browser would need.

Brendan Eich [:brendan]

Comment 45

•

17 years ago

MikeM: it sounds like you are running the GC too often. True? /be

Mike Moening

Reporter

Comment 46

•

17 years ago

Hard to say. Right now I do JS_MaybeGC() after every request. JS_GC() is run every 200 requests. This might be way too often...???? I do not know how to form a better rule... How can I get metrics on usage of memory so I can form a better plan? Here's the runtime and context setup. JS_NewRuntime() is being passed 1024*1024*128 or 128 megs JS_NewContext(rt, 8192) is passed 8K

Brendan Eich [:brendan]

Comment 47

•

17 years ago

JS_MaybeGC after every request is probably overkill, but is it actually GC'ing? That would be interesting to know. Typically it is used from the branch callback, which is not the operation count callback, and more efficient and tuning-friendly than before. Joe-Bob says check it out. Forcing a GC every 200 requests does not make sense to me. There's nothing about 200 that says you've got enough garbage for a GC to be worthwhile. JS_NewRuntime's parameter may be causing JS_MaybeGC to GC. Suggest making it 0xffffffff for now. JS_NewContext's parameter has nothing to do with GC. /be

Brendan Eich [:brendan]

Comment 48

•

17 years ago

(In reply to comment #47) > JS_MaybeGC after every request is probably overkill, but is it actually GC'ing? > That would be interesting to know. Typically it is used from the branch > callback, which is not s/not/now/ > the operation count callback, and more efficient and > tuning-friendly than before. Joe-Bob says check it out. See http://lxr.mozilla.org/mozilla/source/js/src/jsapi.h#2130 /be

Mike Moening

Reporter

Comment 49

•

17 years ago

I'm already using operation callback to shut down long running scripts. My limit is set to (256*4096). How can I ask the runtime how much it has left in an efficient way? Can I use this JS_SetGCCallback() to tell me when GC is being run? The remainder of this tuning stuff can be done offline via private email. Lets get back to the bug at hand. After all that was said how can we solve the bug and improve performance at the same time without imposing limits on # of runtimes?

Brendan Eich [:brendan]

Comment 50

•

17 years ago

(In reply to comment #49) > After all that was said how can we solve the bug and improve performance at the > same time without imposing limits on # of runtimes? There's no such thing as a free lunch. In other words, you can't always get what you want (Heinlein and The Stones, wow). Why do you "need" unlimited numbers of runtimes containing contexts executing scripts in the same thread? If you don't, what limited number do you expect, or can you live with? /be

Brendan Eich [:brendan]

Comment 51

•

17 years ago

"There ain't no such ...". Not my day. Happy to stick to the topic in this bug, but if it is distorted by your embedding doing too much GC, then we should not design around that distortion! /be

Mike Moening

Reporter

Comment 52

•

17 years ago

>embedding doing too much GC, then we should not design around that distortion! Agreed. >Why do you "need" unlimited numbers of runtimes containing contexts executing >scripts in the same thread? If you don't, what limited number do you expect, or >can you live with? I use two max. However if somebody used our JSMonkeyWrench DLL in their embedding it could bring up the total of runtimes depending on what they are doing. Sounds like others us 4? (don't know why...) So I guess 4...??? Maybe GC shouldn't go through ALL the threads and steal their GC things from the free lists. Just iterate as many of the thread freelists as it needs but no more. That would cut down the thrash quite a bit.

Brendan Eich [:brendan]

Comment 53

•

17 years ago

As Igor already noted, the GC could rebuild thread-local freelists, but it cannot just leave (some of) them to dangle -- they are not roots (must not be or you'll have bad leaks), therefore the things they point to will be swept up from under them and possible allocated to someone else, behind their back. /be

timeless

Comment 54

•

17 years ago

Attached patch the changes i have to js/src (for wes to see) (deleted) — Details — Splinter Review

brendan: i assume you're asking whether I'm running w/ a patch from this bug to deal w/ the fact that otherwise I'd be dead, the answer is yes. as for whether it's truly private, no, i patch a mozilla source tree and use the js3250 generated from it for jsdb.

Brendan Eich [:brendan]

Comment 55

•

17 years ago

The patch you're using is not the right approach. I'm not supporting it, but I do want to fix this bug -- after Firefox 3 / Gecko 1.9 priorities are done. /be

timeless

Comment 56

•

17 years ago

yeah, i understand, i'm not shipping it to any consumers. i just need something that doesn't crash/corrupt and works well enough to let me investigate other problems, which it does (i haven't actually gotten around to having much real concurrency, so i have no idea whether it actually scales, but for the time being not being patched was fatal).

Igor Bukanov

Assignee

Comment 57

•

17 years ago

Attached patch fix via local list pools (obsolete) (deleted) — Details — Splinter Review

The new patch implements that idea about pool of free list sets. The context gains a set from the pool during allocation and release the set back to the pool when it exits the requests. This way a set with few free lists available can be reused by another requests/context. The new implementation does not store anything in JSThread. As such it is immune to the multiple runtime problem.

Assignee: brendan → igor

Igor Bukanov

Assignee

Comment 58

•

17 years ago

Attached patch v2 (obsolete) (deleted) — Details — Splinter Review

fixing bugs: now the patch passes shell tests.

Attachment #308730 - Attachment is obsolete: true

Mike Moening

Reporter

Comment 59

•

17 years ago

Igor, can you land this fix?

Igor Bukanov

Assignee

Comment 60

•

17 years ago

(In reply to comment #59) > Igor, can you land this fix? > First this has to be tested and reviewed.

Brian Crowder

Comment 61

•

17 years ago

Igor: Any testing I can help with? You just want suite/mochis?

timeless

Comment 62

•

17 years ago

Comment on attachment 308772 [details] [diff] [review] v2 + if (keepCount != 0) { + do { + } while (--keepCount != 0); + } given that you aren't using keepCount outside or inside this block, while (keepCount--) { ... } should be fine. ReleaseGCFreeListsPool is this correctly named? the early return makes me worry. + freeLists = (JSGCFreeListSet *) malloc(sizeof *freeLists); + if (!freeLists) + return NULL; + memset(freeLists, 0, sizeof *freeLists); what's wrong w/ calloc? a comment explaining why you aren't using JS_malloc might be a good idea (I'm assuming it's because the callers handle error reporting themselves).

Igor Bukanov

Assignee

Comment 63

•

17 years ago

(In reply to comment #61) > Igor: Any testing I can help with? You just want suite/mochis? > suite/mochi/sunspider numbers (should not regress in the browser)/testing with firebug (it uses multiple runtimes)

Igor Bukanov

Assignee

Comment 64

•

17 years ago

Attached patch v3 (obsolete) (deleted) — Details — Splinter Review

The patch addresses the issues from the comment 62, see the delta below. timeless: I renamed ReleaseGCFreeListsPool into EmptyGCFreeListsPool. Does the new name reflects "remove all except the given amount of lists from the pool" semantics of the function? --- /home/igor/m/ff/mozilla/quilt.F21961/js/src/jsgc.c 2008-03-28 14:24:15.000000000 +0100 +++ js/src/jsgc.c 2008-03-28 14:23:03.000000000 +0100 @@ -1410,3 +1410,3 @@ CheckLeakedRoots(JSRuntime *rt); static void -ReleaseGCFreeListsPool(JSRuntime *rt, uintN keepCount); +EmptyGCFreeListsPool(JSRuntime *rt, uintN keepCount); @@ -1424,3 +1424,3 @@ js_FinishGC(JSRuntime *rt) #ifdef JS_THREADSAFE - ReleaseGCFreeListsPool(rt, 0); + EmptyGCFreeListsPool(rt, 0); JS_ASSERT(!rt->gcFreeListsPool); @@ -1689,3 +1689,3 @@ const JSGCFreeListSet js_GCEmptyFreeList static void -ReleaseGCFreeListsPool(JSRuntime *rt, uintN keepCount) +EmptyGCFreeListsPool(JSRuntime *rt, uintN keepCount) { @@ -1694,9 +1694,9 @@ ReleaseGCFreeListsPool(JSRuntime *rt, ui cursor = &rt->gcFreeListsPool; - if (keepCount != 0) { - do { - if (!(freeLists = *cursor)) - return; - memset(freeLists->array, 0, sizeof freeLists->array); - cursor = &freeLists->link; - } while (--keepCount != 0); + while (keepCount != 0) { + --keepCount; + freeLists = *cursor; + if (!freeLists) + return; + memset(freeLists->array, 0, sizeof freeLists->array); + cursor = &freeLists->link; } @@ -1739,6 +1739,8 @@ EnsureLocalFreeList(JSContext *cx) } else { - freeLists = (JSGCFreeListSet *) malloc(sizeof *freeLists); + /* + * No JS_malloc as the caller reports out-of-memory itself. + */ + freeLists = (JSGCFreeListSet *) calloc(1, sizeof *freeLists); if (!freeLists) return NULL; - memset(freeLists, 0, sizeof *freeLists); } @@ -3491,3 +3493,3 @@ js_GC(JSContext *cx, JSGCInvocationKind */ - ReleaseGCFreeListsPool(rt, 2); + EmptyGCFreeListsPool(rt, 2); #endif

Attachment #308772 - Attachment is obsolete: true

Attachment #312275 - Flags: review?(brendan)

Igor Bukanov

Assignee

Comment 65

•

17 years ago

Attached patch v3 with diff -U8, not -U1 (obsolete) (deleted) — Details — Splinter Review

Attachment #312275 - Attachment is obsolete: true

Attachment #312276 - Flags: review?(brendan)

Attachment #312275 - Flags: review?(brendan)

timeless

Comment 66

•

17 years ago

offhand, I think "Trim" is better than "Empty" or "Release", I think I *finally* understand what the function is doing. wrt jsdb, I finally have jsdb itself handling threads. Sadly jsd doesn't handle them properly and jsd is going to need some serious work to get there. (scatter currently asserts in jsd!jsd_GetValueString when called from any of the other threads). Once jsd itself is able to handle threads properly, I should have more feedback.

Igor Bukanov

Assignee

Comment 67

•

17 years ago

Attached patch v3b (obsolete) (deleted) — Details — Splinter Review

The new version syncs the patch with CVS head and renames EmptyGCFreeListsPool into uses TrimGCFreeListsPool.

Attachment #312276 - Attachment is obsolete: true

Attachment #312906 - Flags: review?(brendan)

Attachment #312276 - Flags: review?(brendan)

timeless

Updated

•

17 years ago

Blocks: 426529

Brendan Eich [:brendan]

Updated

•

17 years ago

Blocks: js1.8src

Igor Bukanov

Assignee

Updated

•

16 years ago

Blocks: 437325

Igor Bukanov

Assignee

Updated

•

16 years ago

No longer blocks: 437325

Igor Bukanov

Assignee

Comment 68

•

16 years ago

Attached patch v4 (obsolete) (deleted) — Details — Splinter Review

The patch is a port of the v3b version to mozilla-central. In addition I made sure that it compiles with !JS_THREADSAFE.

Attachment #312906 - Attachment is obsolete: true

Attachment #325543 - Flags: review?(brendan)

Attachment #312906 - Flags: review?(brendan)

Bob Clary [:bc] (inactive)

Updated

•

16 years ago

No longer blocks: js1.8src

Bob Clary [:bc] (inactive)

Updated

•

16 years ago

Blocks: js1.8

Bob Clary [:bc] (inactive)

Updated

•

16 years ago

No longer blocks: js1.8

Bob Clary [:bc] (inactive)

Updated

•

16 years ago

Blocks: js1.8src

Brendan Eich [:brendan]

Comment 69

•

16 years ago

Comment on attachment 325543 [details] [diff] [review] v4 Looks great -- want to get a cvs.mozilla.org version of this for js1.8src, feel free to attach one if handy. Thanks, /be

Attachment #325543 - Flags: review?(brendan) → review+

Igor Bukanov

Assignee

Comment 70

•

16 years ago

Attached patch v5 (obsolete) (deleted) — Details — Splinter Review

Here is a version for check in

Attachment #325543 - Attachment is obsolete: true

Attachment #326472 - Flags: review+

Igor Bukanov

Assignee

Comment 71

•

16 years ago

I pushed the patch from the comment 70 to mozilla-central: author Igor Bukanov <igor@mir2.org> Tue Jun 24 15:17:52 2008 +0200 (at Tue Jun 24 15:17:52 2008 +0200) changeset 15493 4699797259f1 parent 15487 0c8e64474660 parent 15492 76caed42cf7c child 15494 631c0cb99782 [Bug 378918] Scallable free lists for GC, r=brendan

Status: ASSIGNED → RESOLVED

Closed: 16 years ago

Resolution: --- → FIXED

Igor Bukanov

Assignee

Comment 72

•

16 years ago

I backed out the patch to investigate the leak problem on tinderbox: changeset: 15509:0442f945a866 tag: tip user: Igor Bukanov <igor@mir2.org> date: Tue Jun 24 18:55:06 2008 +0200 summary: [Bug 378918] backing out to investigate the tinderbox leak problem

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Igor Bukanov

Assignee

Comment 73

•

16 years ago

Attached patch v6 (deleted) — Details — Splinter Review

The new version is v5 synchronized with the trunk. I cannot reproduce the original leak fails locally so I will try to land this again assuming that the original failures were unrelated to the patch.

Attachment #326472 - Attachment is obsolete: true

Attachment #339961 - Flags: review+

Igor Bukanov

Assignee

Comment 74

•

16 years ago

landed - http://hg.mozilla.org/mozilla-central/rev/5aaa5bcc6024

Status: REOPENED → RESOLVED

Closed: 16 years ago → 16 years ago

Resolution: --- → FIXED

Bob Clary [:bc] (inactive)

Updated

•

16 years ago

Flags: in-testsuite-

Flags: in-litmus-

ash_mozilla

Comment 75

•

15 years ago

I think this bug has been fixed in CVS (for a SM1.8RC2), but jscntxt.c seems quite different to the version in these patches.

proposed fix 18 years ago Brendan Eich [:brendan] (deleted), patch		Details \| Diff \| Splinter Review
the changes i have to js/src (for wes to see) 17 years ago timeless (deleted), patch		Details \| Diff \| Splinter Review
fix via local list pools 17 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
v2 17 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
v3 17 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
v3 with diff -U8, not -U1 17 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
v3b 17 years ago Igor Bukanov (deleted), patch		Details \| Diff \| Splinter Review
v4 16 years ago Igor Bukanov (deleted), patch	brendan : review+	Details \| Diff \| Splinter Review
v5 16 years ago Igor Bukanov (deleted), patch	igor : review+	Details \| Diff \| Splinter Review
v6 16 years ago Igor Bukanov (deleted), patch	igor : review+	Details \| Diff \| Splinter Review