Closed Bug 641025 (IncrementalGC) Opened 14 years ago Closed 13 years ago

Incremental GC

Categories

(Core :: JavaScript Engine, defect)

16 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla16
blocking-kilimanjaro +
Tracking Status
firefox13 --- disabled
firefox14 --- disabled
firefox15 --- disabled
firefox16 --- fixed

People

(Reporter: billm, Assigned: billm)

References

(Blocks 2 open bugs, )

Details

(Keywords: addon-compat, Whiteboard: [Snappy:P1][k9o:p1:fx14][games:p1])

Attachments

(20 files, 17 obsolete files)

(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
gwagner
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
smaug
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
roc
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
(deleted), patch
cdleary
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
We would like to be able to divide GC time into short increments. Overall, we would still spend the same amount of time doing GC, but each individual pause would be shorter. Doing this for the sweep phase is mostly a matter of scheduling. For the mark phase, we need a write barrier and we need an explicit mark stack.
Alias: IncrementalGC
Keywords: footprint
Whiteboard: [MemShrink:P2]
Blocks: 505308
Whiteboard: [MemShrink:P2] → [MemShrink:P2][Snappy]
This won't affect memory consumption much, and is being tracked by project Snappy, so I'll un-MemShrink it.
Keywords: footprint
Whiteboard: [MemShrink:P2][Snappy] → [Snappy]
Blocks: GCHangs
Where is the work on the incremental GC tracked? This meta bug doesn't depend on any bug.
(In reply to Marco Castelluccio from comment #2) > Where is the work on the incremental GC tracked? This meta bug doesn't > depend on any bug. Bug 641027 was where the work that has landed so far was done.
I think it probably makes sense for this not to be a metabug. Also, I've pushed the current state of the incremental GC patch to the larch repo: http://hg.mozilla.org/projects/larch/ It's pretty unstable, but not entirely unusable.
Assignee: general → wmccloskey
Summary: [meta] Incremental GC → Incremental GC
Bill can you resolve this bug if you don't want to track this work here? It's snappy:p1 since people are already working on this.
Whiteboard: [Snappy] → [Snappy:P1]
Blocks: 703455
(In reply to Taras Glek (:taras) from comment #5) > Bill can you resolve this bug if you don't want to track this work here? > It's snappy:p1 since people are already working on this. My intention was to convert this from a metabug to a normal bug, not to close it. I still want to track incremental GC here. It's just that it's probably going to be one patch, so there's no need for any dependent bugs.
Depends on: 708091
Attached patch incremental GC patch (obsolete) (deleted) — Splinter Review
Hi Igor. We're trying to get incremental GC into Firefox 11. The patch has been mostly stable for the last few days, so I think it makes sense to put it up for review now. Could you let me know by Monday whether you'll be able to review it? The patch splits up a GC into multiple slices, with JS code allowed to run in between slices. The slices look like the following: slice 0: BeginMarkPhase, drainMarkStack slice 1: drainMarkStack ... slice n-1: drainMarkStack slice n: drainMarkStack, EndMarkPhase, SweepPhase So only marking is incremental. However, most sweeping is done on the background thread, so this is okay. Let me start by giving a few central rules for incremental GC: 1. During an incremental GC, if a memory location (except a root) is written to, then the value it previously held must be marked. The write barrier code (already landed) takes care of this. 2. Any object that is allocated during incremental GC must start out marked. 3. Roots are special memory locations that don't need write barriers. However, they must be marked in the first slice. Roots are things like the C stack and the VM stack, since it would be too expensive to put barriers on them. There are a few complications when implementing this in the JS engine. First, when JS is running, we accumulate additional objects to mark from the write barrier. The patch gives each compartment its own mark stack, and the write barrier pushes the objects it finds onto the mark stack of the object's compartment. This avoids threading issues. There is still a runtime-wide mark stack, which is used for most marking. Eventually, all the objects on a compartment's mark stack are popped off and pushed onto the runtime's mark stack, where their children will be marked. Since we don't want to allocate a big mark stack for each compartment, the compartment mark stacks are allocated dynamically as needed. If a malloc fails, we fall back to delayed marking. The runtime-wide mark stack is also resizable, although it has some ballast memory allocated when the runtime is created. The second complication is that any objects that are allocated during incremental GC (in between slices) must start out marked (rule 2 above). Instead of marking them during allocation, we set a bit on the arena that they're allocated into. Then during GC we mark the objects in arenas with this bit set. This is very similar to delayed marking, and it uses much of the same code. A third complication is with gray nodes that come from XPConnect. We're required to do gray marking only after all black marking is complete, as you know. However, the XPConnect roots don't have write barriers on them, so they must be marked in the first slice, according to rule 3 above. (I considered adding write barriers to them, but there are too many, and I don't understand them well.) Consequently, the patch uses a special JSTracer to accumulate gray roots from XPConnect into an array. This is done in the first slice. In the last slice, after all black marking is done, we mark all the elements in the array gray. This cannot be done incrementally, but usually there aren't very many gray nodes. Besides what I've just described, the patch has some additional pieces: * I had to rewrite the statistics gathering code to handle multiple GC slices. * I added some logging code to make it easier to see what's going on. I'm not sure whether this should land or not. I'll leave it up to you. * I added some validation code in DEBUG mode. At the end of incremental marking, it rescans the entire heap non-incrementally and checks that everything it finds is actually marked. * I split AutoGCSession into AutoHeapSession and AutoGCSession. The first one is used for anything that needs exclusive access to the runtime. The second one is only for GCs. Then I moved some of the starting and stopping logic for GCs into AutoGCSession. I think this is a little cleaner. * I instrumented the existing write barrier verifier to check rule 2 above, which says that objects allocated during an incremental GC must be marked. Please feel free to ask questions. I realize that this is a complex patch.
Attachment #580597 - Flags: review?(igor)
I've been using the /tinderbox-builds/larch-win64/ builds, and they're been reasonably stable, but I haven't see any noticeable changes/improvements in the incremental gc test from google spinning ball test. http://v8.googlecode.com/svn/branches/bleeding_edge/benchmarks/spinning-balls/index.html I'm guessing this will be improving soon?
(In reply to mdew from comment #8) > I'm guessing this will be improving soon? Yes. I still need to fix some issues with GC/CC scheduling. That's independent of this bug, though.
Few questions: 1. Is it possible to split the patch? For example, it seems the gray root changes can be factored into separated part. 2. Could you comment on the new heuristics and memory pressure accounting? 3. I do not see how large item marking plays together with object mutations during the GC. For example, if one calls Array.shift(1), then the current first element that will be discarded is properly marked using moveDenseArrayElements. However, other moved elements may just leave the range the big item on the mark stack was supposed to mark. 4. I am not sure on implications of the gray root change to weak map marking. Could you comment on that?
Attached patch gray marking change (obsolete) (deleted) — Splinter Review
This splits apart the gray marking changes.
Attachment #580597 - Attachment is obsolete: true
Attachment #580597 - Flags: review?(igor)
Attachment #581038 - Flags: review?(igor)
Attached patch incremental GC patch (obsolete) (deleted) — Splinter Review
This is everything else.
Attachment #581039 - Flags: review?(igor)
(In reply to Igor Bukanov from comment #10) > Few questions: > > 1. Is it possible to split the patch? For example, it seems the gray root > changes can be factored into separated part. I split out the gray marking. Let me know if you want me to do any other splits. > 2. Could you comment on the new heuristics and memory pressure accounting? The previous heuristic was to GC if gcBytes goes above gcLastBytes * GC_HEAP_GROWTH_FACTOR. Now there are three numbers: 1. We start an incremental GC when gcBytes >= gcLastBytes * GC_HEAP_SOFT_GROWTH_FACTOR 2. We try to end an incremental GC when gcBytes == gcLastBytes * GC_HEAP_TARGET_GROWTH_FACTOR 3. If gcBytes >= gcLastBytes * GC_HEAP_SOFT_GROWTH_FACTOR, we immediately switch to non-incremental GC and finish the GC In every GC slice, we mark approximately GC_INCREMENT_BYTES bytes (it's approximate because we only account for objects marked; if there are a lot of types or shapes, we'll GC for too long). Based on this, we have to decide how long to wait between slices. There are two triggers here. In MaybeGC, we make sure that we never go for more than 1/2 second (GC_IDLE_INCREMENTAL_SPAN) between incremental slices. The other trigger is based on allocation. Between each slice, we allow up to comp->gcIncrementBytes to be allocated. This trigger is designed to ensure that we never use more than GC_HEAP_TARGET_GROWTH_FACTOR overhead. I worked this out by some magic that I can no longer remember. I'll try to dredge up the equation I used. By the way, I realize I should add some comments to the patch about this, as well as some stuff in comment 7. I will do that. > 3. I do not see how large item marking plays together with object mutations > during the GC. For example, if one calls Array.shift(1), then the current > first element that will be discarded is properly marked using > moveDenseArrayElements. However, other moved elements may just leave the > range the big item on the mark stack was supposed to mark. The large mark stack stuff is not used for the moveDenseArrayElements barrier code. Instead, it calls prepareElementRangeForOverwrite, which invokes a write barrier on each element individually. So I don't think there's a problem here. There is a chance that we'll overflow the mark stack, but we'll still mark every element's arena for delayed marking, so we'll be safe. > 4. I am not sure on implications of the gray root change to weak map > marking. Could you comment on that? It should work the same as before. The gray root buffering doesn't actually do any marking until the end, which is when we did it before. At the end of black marking, we do weak marking. Then the buffered up gray roots are marked and drained. Then more weak marking happens for the gray objects.
(In reply to Bill McCloskey (:billm) from comment #13) > > 3. I do not see how large item marking plays together with object mutations > > during the GC. For example, if one calls Array.shift(1), then the current > > first element that will be discarded is properly marked using > > moveDenseArrayElements. However, other moved elements may just leave the > > range the big item on the mark stack was supposed to mark. > > The large mark stack stuff is not used for the moveDenseArrayElements > barrier code. Instead, it calls prepareElementRangeForOverwrite, which > invokes a write barrier on each element individually. So I don't think > there's a problem here. There is a chance that we'll overflow the mark > stack, but we'll still mark every element's arena for delayed marking, so > we'll be safe. Consider the following situation: a large item is pushed to the stack and points to a range [A, B) of the dense array slots. This implies that [0, A) is already scanned. Now between the incremental GC slices the code calls Array.shift(). That moves the elements in the range [1,length) to [0,length-1). As the result of that move the write barriers correctly schedule for marking the old values at indexes 0 and length-1. However, the large item would not be updated to point to [A-1,B-1). It still will point to [A,B) and, as such, will miss marking the object that was moved to the index A-1 from the index A. A simple remedy for this would be to reset all large items to point to the start of the slots array. That would also be compatible with changes from the bug 708382, although with changes from that bug it would be necessary to modify the structure of the stored large item to properly recover the object containing slots.
(In reply to Igor Bukanov from comment #14) I see now, you're right. It's also broken in a slightly different way. In between incremental slices, the slots for a dense array could be reallocated. Then the large mark item will be totally invalid. I'll work on a fix.
Comment on attachment 581038 [details] [diff] [review] gray marking change Review of attachment 581038 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsapi.h @@ +2809,5 @@ > > /* > * API for JSTraceCallback implementations. > */ > +# define JS_TRACER_INIT_AUX(trc, rt_, cx_, callback_) \ Hm, in all cases AFAICS you have non-null cx and in that case runtime is cx->runtime. So why this and its callers like rayRootMarker::start(JSRuntime *rt, JSContext *cx) that take both rt and cx are necessary? ::: js/src/jsgc.cpp @@ +1808,5 @@ > } > JS_ASSERT(!markLaterArenas); > } > > +GrayRootMarker::GrayRootMarker(JSRuntime *rt) Why the runtime argument is necessary here? @@ +1867,5 @@ > + } > +} > + > +void > +GrayRootMarker::push(void *thing, JSGCTraceKind kind) name nit: we do not have a stack here. So better name would be "append". ::: js/src/jsgc.h @@ +1727,5 @@ > + void start(JSRuntime *rt, JSContext *cx); > + void finish(); > + void reset(); > + > + void markAll(JSTracer *trc); This should take GCMarker I suppose. Alternatively, what about simply adding the fields from GrayRootMarker to GCMarker and making that one a field in JSRuntime? Then during collection of the grey roots JSTracer::callback can be simply set to the new callback and then reset back to NULL. @@ +1728,5 @@ > + void finish(); > + void reset(); > + > + void markAll(JSTracer *trc); > + bool canMarkAll() { return !allocFailed; } Comment that fallible gray root collection is a hack that will be removed after we get proper write barriers for xpconnect. @@ +1730,5 @@ > + > + void markAll(JSTracer *trc); > + bool canMarkAll() { return !allocFailed; } > + > + static void PushFunction(JSTracer *trc, void *thing, JSGCTraceKind kind); I read the name as "Push something related to JS function", not as "a function that oushes". I would prefer to call this either "PushingCallback" or just TracerCallback to emphasis that the function implements JSTracer::callback. @@ +1735,5 @@ > + > + private: > + void push(void *thing, JSGCTraceKind kind); > + > + static const size_t BlockSize = 512; Hm, what is wrong with Vector<StackElem... SystemAllocPolicy> here?
Attachment #581038 - Flags: review?(igor)
Attached patch gray marking v2 (obsolete) (deleted) — Splinter Review
I made the changes you requested, except I didn't fold the gray marker into the GCMarker. In the incremental GC patch, each compartment gets its own GCMarker. So I would like to keep it as small as possible.
Attachment #581038 - Attachment is obsolete: true
Attachment #581344 - Flags: review?(igor)
Comment on attachment 581344 [details] [diff] [review] gray marking v2 Review of attachment 581344 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsgc.cpp @@ +2109,3 @@ > JS_REQUIRES_STACK void > MarkRuntime(JSTracer *trc) > { Assert here that trc->callback != GrayRootMarker::TracerCallback ::: js/src/jsgc.h @@ +1724,5 @@ > + * Gray marking must be done after all black marking is complete. However, we do > + * not have write barriers on XPConnect roots. Therefore, XPConnect roots must > + * be accumulated in the first slice of incremental GC. We accumulate these > + * roots in the GrayRootMarker and then mark them later, after black marking is > + * complete. Nit: Add FIXME bug xxx reference for the xpconnect write barriers bug. Also comment that the accumulation can fail. But in that case the incremental GC will be disabled and the roots will be scanned after the marking phase.
Attachment #581344 - Flags: review?(igor) → review+
Thanks. I'm working on rebasing over your mark stack changes. I'll try to split out that part of the patch as well. I should have it by the end of the day.
Attached patch mark stack changes, v1 (obsolete) (deleted) — Splinter Review
Here are the mark stack changes. I still need to make some fixes for the rest of the patch. I'll post that later.
Attachment #581457 - Flags: review?(igor)
Attached patch mark stack changes, v2 (obsolete) (deleted) — Splinter Review
Fixed a bug. Igor, I'm having difficulty figuring out a good way to fix the value range problem. I would like to be able to do it as you suggested. However, it's hard to know how to recover the starting and ending values. We use scan_value_array for several things: 1. dense array elements 2. fixed slots 3. dynamic slots I could add an entry for the object to the stack, but I would also have to tag it with the kind of slots array we want (1, 2, or 3). That seems ugly. Can you think of anything better? I thought of just pushing all the slots before leaving a GC slice, but that potentially will cause us to push a huge number of slots on the mark stack. That is unappealing. The easiest option is to never leave a slice until the stack contains no value ranges. However, that's also pretty unappealing.
Attachment #581457 - Attachment is obsolete: true
Attachment #581457 - Flags: review?(igor)
Attachment #581507 - Flags: review?(igor)
(In reply to Bill McCloskey (:billm) from comment #21) > I could add an entry for the object to the stack, but I would also have to > tag it with the kind of slots array we want (1, 2, or 3). That seems ugly. I do not think a tag is necessary as we simply can restart the scanning of the whole object replacing any triple (begin, end, object) on the stack with the object when we restart the GC. Another option is to record on the stack (obj, slot_kind, index) with slot_kind stored together with the index. That complicates the slot range reconstruction, but minimizes the stack usage. I have no idea what would be faster, but it should be easy to measure.
Comment on attachment 581507 [details] [diff] [review] mark stack changes, v2 Review of attachment 581507 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jscntxt.h @@ +422,5 @@ > uint32 gcEmptyArenaPoolLifespan; > /* We access this without the GC lock, however a race will not affect correctness */ > volatile uint32 gcNumFreeArenas; > uint32 gcNumber; > + js::GCMarker gcTracer; Any reason to call this gcTracer and not gcMarker? @@ +427,2 @@ > js::GCMarker *gcIncrementalTracer; > js::GrayRootMarker gcGrayTracer; So we get 2 different tracers stored in JSRuntime that tightly coupled with each other. IMO merging this into one class that would change JSTracer::callback during the gray root accumulation would result in less complex code. @@ +430,5 @@ > bool gcChunkAllocationSinceLastGC; > int64 gcNextFullGCTime; > int64 gcJitReleaseTime; > JSGCMode gcMode; > volatile jsuword gcBarrierFailed; The patch should remove the no longer necessary gcMarkStackArray. ::: js/src/jsgc.cpp @@ +2827,5 @@ > > + rt->gcTracer.start(cx); > + > + GCMarker *gcmarker = &rt->gcTracer; > + JS_ASSERT(IS_GC_MARKING_TRACER(gcmarker)); JS_ASSERT(!gcmarker->callback) gives stronger assert. ::: js/src/jsgc.h @@ +864,3 @@ > ArenaHeader::getNextDelayedMarking() const > { > + return reinterpret_cast<ArenaHeader *>(nextDelayedMarking << ArenaShift); Use &reinterpret_cast<Arena *>(nextDelayedMarking << ArenaShift)->aheader here. I have a patch that moves the headers out of of the arena and such usage simplifies merging with changes. @@ +871,4 @@ > { > JS_ASSERT(!hasDelayedMarking); > hasDelayedMarking = 1; > + nextDelayedMarking = aheader->address() >> ArenaShift; Use arenaAddress() here. @@ +1592,4 @@ > }; > > template<class T> > struct MarkStack { Have you considered to use Vector<, SystemAllocPolicy> in place of markstack? As the class supports preallocated fixed-sized buffer it could be a good fit here. The drawback is that currently the fixed space in the Vector is bounded to 1K, but I would like to see the data that show that we can do observably faster than Vector<uintptr_t, 128, SystemAllocPolicy>. @@ +1689,5 @@ > + limit = newStack + newcap; > + return true; > + } > + > + MarkStack() Move the constructor before ~MarkStack.
Attachment #581507 - Flags: review?(igor)
Attached patch using (obj, begin, end) for slots scanning (obsolete) (deleted) — Splinter Review
The patch expands (begin, end) pair on marking stack to (ownerObject, begin, end), so the incremental GC can properly reset all scanning ranges. I do not see a regression in the marking time with this change in various tests. As a bonus the patch allowed to remove DelayMarkingValueArray as on the full stack the code can simply delay marking for the object that owns the slot. As alternative I tried another approach with pushing just (obj, index) pairs. However, that observably regressed the marking time during V8 and on http://29a.ch/2010/6/2/realtime-raytracing-in-javascript .
Can we move "mark stack changes" patch that moves the gcmarker into JSRuntime to a separated bug? IMO it is quite worthy change on its own.
Blocks: 694883
Attachment #581039 - Attachment is obsolete: true
Attachment #581039 - Flags: review?(igor)
Attachment #581344 - Attachment is obsolete: true
Attachment #581507 - Attachment is obsolete: true
Attachment #581672 - Attachment is obsolete: true
Attached patch track arenas allocated during incremental GC (obsolete) (deleted) — Splinter Review
I'm going to start fresh here. I filed bug 718570, which subsumes the gray marking and mark stack changes. Hopefully that should be a quick review. This patch tracks objects that are allocated during an incremental GC. It reuses some of the delayed marking machinery for this purpose. Currently, all objects that are allocated during an incremental GC are traced through using the normal mechanisms. However, this isn't really necessary. In theory, we just need to avoid sweeping them--we shouldn't have to mark them or trace through them. However, not marking them could cause problems for the cycle collector. And not tracing through them can cause problems because some code assumes that the trace hook is called for all objects that survive a GC. In the future, I think we can work around these problems. But for Firefox 12, I'd rather be conservative. Consequently, we mark and trace through everything. But I've added rudimentary support for directly checking the allocatedDuringGC bit instead of checking isMarked(). This will allow us to optimize more easily in the future.
Attachment #589118 - Flags: review?(igor)
Attached patch change isMarked to !IsAboutToBeFinalized (obsolete) (deleted) — Splinter Review
This patch builds on top of the previous one and converts a bunch of isMarked calls outside the GC to IsAboutToBeFinalized. This is for the reason mentioned in the previous comment.
Attachment #589120 - Flags: review?(igor)
Attached patch time budgeting (obsolete) (deleted) — Splinter Review
This patch adds support for end the GC mark phase early if it takes too long.
Attachment #589121 - Flags: review?(igor)
This patch handles the problem brought up in comment 14 and comment 15. It solves the comment 14 problem by removing the optimization in moveDenseArrayElements. For the problem in comment 15, it converts all the slots pointers on the mark stack to use slot indices. This is only done when leaving an incremental GC slice. I wasn't able to use the simpler solution because I found a real-world web site with a gigantic array, where remarking the already-marked slots of an object took nearly the entire slice. So in each slice we were only marking a few hundred array elements before having to leave.
Attachment #589122 - Flags: review?(igor)
Attached patch code movement for GC phases (deleted) — Splinter Review
This patch is a pretty straightforward refactoring of the Begin/EndMarkPhase and SweepPhase code.
Attachment #589123 - Flags: review?(igor)
Attached patch AutoHeapSession/AutoGCSession (deleted) — Splinter Review
This patch renames AutoGCSession to AutoHeapSession. Then it makes a new AutoGCSession (that derives from AutoHeapSession) that does GC-specific work (i.e., not to be used by TraceRuntime). I think this is a nice cleanup.
Attachment #589124 - Flags: review?(igor)
Attached patch incremental GC (deleted) — Splinter Review
This is the actual implementation of incremental GC slices. Incremental GC is only possible if certain constraints (listed in IsIncrementalGCSafe) hold. If they ever cease to hold, we use ResetIncrementalGC to undo all the incremental GC work, and then we do a non-incremental GC.
Attachment #589125 - Flags: review?(igor)
I'll be out of town until Sunday, and I probably won't have much internet access. I'm really hoping to make Firefox 12, which means that I'll have a week to fix everything when I get back. Igor, even if one patch looks bad, could you still review the other ones? Otherwise I'm worried that we won't be able to get everything done by January 31st. There's still some orange to fix and some fuzzbugs. I also need to avoid benchmark and Talos regressions. I haven't posted the actual code to trigger incremental GCs yet. I'll do that next week, since I expect it will change somewhat during tuning. If you want to see all the code at once, it's in the larch repo, which is up to date as of now.
Comment on attachment 589118 [details] [diff] [review] track arenas allocated during incremental GC Review of attachment 589118 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsgc.cpp @@ +1946,2 @@ > } > + aheader->markOverflow = 0; This is wrong - during the above JS_TraceChildren or PushArena we can experience the stack overflow for the current arena that sets markOverflow and the code clears it here. ::: js/src/jsgc.h @@ +414,5 @@ > */ > public: > size_t hasDelayedMarking : 1; > + size_t allocatedDuringIncremental : 1; > + size_t markOverflow : 1; I think using the delayed marking stack also to record arenas that requires marking of all unmarked things because they could have been allocated unnecessary complicates logic. First, those arenas with allocated things has to be scanned only once when the incremental GC restarts, not repeatedly as could be the case with the delayed marking. This suggests to scan all arenas in the compartment and push unmarked things from allocatedDuringIncremental arenas to the marking stack at the beginning of the the next incremental GC step. Second, we do not even need allocatedDuringIncremental flag and the code that sets it during allocation. Rather we can record in ArenaLists the arenas that ArenaLists::lists.cursor pointed to at the start of the last incremental GC step. Then at the start of the next step the code just needs to push unmarked things in the arenas between the recorded cursor and the new cursor. As a bonus both these options allows to remove IsAboutToBeFinalizedInCompartment as the marking code would only see properly marked things. @@ +1166,5 @@ > } > } > } > > + inline void setAllocatedDuringGC(JSCompartment *comp); The semantics here is to require marking of all allocated things in the arenas where the current allocation spans points to. Moreover, it is very unclear why one needs this. So, as this is only used from StartVerifyBarriers, rename this into startVerifyBarriers and comment in the implementation about what is going on.
Attachment #589118 - Flags: review?(igor) → review-
(In reply to Igor Bukanov from comment #34) > > + inline void setAllocatedDuringGC(JSCompartment *comp); > > The semantics here is to require marking of all allocated things in the > arenas where the current allocation spans points to. Moreover, it is very > unclear why one needs this. So, as this is only used from > StartVerifyBarriers, rename this into startVerifyBarriers and comment in the > implementation about what is going on. I was wrong here about the purpose of the function - you going to use it later when pausing the incremental GC. So a better name here would be prepareForIncrementalGC or something.
Comment on attachment 589120 [details] [diff] [review] change isMarked to !IsAboutToBeFinalized With IsAboutToBeFinalizedInCompartment eliminated as suggested in the previous patch I do not see why the patch here is necessary. Besides, currently IsAboutToBeFinalized and isMarked has clear separation - the former should be used during the finalization/sweeping when all the marking is done while the latter is for the marking phase.
Comment on attachment 589122 [details] [diff] [review] save value ranges when leaving incremental slice Review of attachment 589122 [details] [diff] [review]: ----------------------------------------------------------------- I was thinking to replace any array tag by the corresponding object and always rescan the object under the incremental GC for simplicity. However, SavedValueArrayTag implementation here is surprisingly compact and self-contained. Nice! ::: js/src/jsgcmark.cpp @@ +1089,5 @@ > + HeapValue *start; > + uintptr_t index; > + }; > + JSObject *obj; > +}; add static assert that sizeof(ValueArrayLayout) == 2 * sizeof(uintptr_t) ::: js/src/jsobjinlines.h @@ +607,5 @@ > JSObject::moveDenseArrayElements(uintN dstStart, uintN srcStart, uintN count) > { > JS_ASSERT(dstStart + count <= getDenseArrayCapacity()); > JS_ASSERT(srcStart + count <= getDenseArrayCapacity()); > Comment here that memmove is not used so HeapValue can take about the barriers.
Attachment #589122 - Flags: review?(igor) → review+
Comment on attachment 589121 [details] [diff] [review] time budgeting Review of attachment 589121 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsgc.h @@ +1767,5 @@ > +struct TimeBudget { > + bool limited; > + int64_t budget; /* in microseconds */ > + int64_t startTime; > + int counter; What is the reason for the explicit limited field (that required an extra branch in a hot isOverBudget) rather than setting the budget field to int64_max and the counter to intptr_t_max? ::: js/src/jsgcmark.cpp @@ +1204,5 @@ > + runtime->gcCheckCompartment = runtime->gcCurrentCompartment; > + } > + ~AutoCheckCompartment() { runtime->gcCheckCompartment = NULL; } > + } acc(rt); > +#endif This is not related to the budget changes, right?
Thanks. These are all good suggestions. Is there any chance that you could write a patch for the allocatedDuringIncremental changes you suggested? They sound good to me. You could use larch for testing.
(In reply to Igor Bukanov from comment #38) > Comment on attachment 589121 [details] [diff] [review] > > What is the reason for the explicit limited field (that required an extra > branch in a hot isOverBudget) rather than setting the budget field to > int64_max and the counter to intptr_t_max? I will fix that. > ::: js/src/jsgcmark.cpp > @@ +1204, > > + runtime->gcCheckCompartment = runtime->gcCurrentCompartment; > > + } > > + ~AutoCheckCompartment() { runtime->gcCheckCompartment = NULL; } > > + } acc(rt); > > +#endif > > This is not related to the budget changes, right? Now that we return early from drain, I wanted to use RAII for setting gcCheckCompartment.
(In reply to Bill McCloskey (:billm) from comment #39) > Thanks. These are all good suggestions. Is there any chance that you could > write a patch for the allocatedDuringIncremental changes you suggested? They > sound good to me. You could use larch for testing. I will do it on Thursday.
Question about marking of things allocated between GC slices. The current code in hg.mozilla.org/projects/larch/ marks not only newly allocated GC things but also recursively any thing they point to (the PushArena code). I assume this is done because the code cannot distinguish between truly newly allocated things or any other unmarked GC things in the arena containing the newly allocated thing, right? That is, if the GC could have distinguish between those, then it could just mark the newly allocated things. If another GC thing is stored in a newly allocated thing, it should either be reachable from the initial roots or be newly allocated itself. In both cases it should be marked on its own without the need for the recursive marking. However, if the GC can mistake an unreachable thing for a newly allocated thing, then it must mark all the things recursively. Otherwise after the finalization is run, we would have a GC thing that points to dead things. This will lead to a crash during the next GC when we finally finalize that unreachable thing.
Attachment #589123 - Flags: review?(igor) → review+
Attachment #589124 - Flags: review?(igor) → review+
(In reply to Igor Bukanov from comment #42) > Question about marking of things allocated between GC slices. > > The current code in hg.mozilla.org/projects/larch/ marks not only newly > allocated GC things but also recursively any thing they point to (the > PushArena code). I assume this is done because the code cannot distinguish > between truly newly allocated things or any other unmarked GC things in the > arena containing the newly allocated thing, right? > > That is, if the GC could have distinguish between those, then it could just > mark the newly allocated things. If another GC thing is stored in a newly > allocated thing, it should either be reachable from the initial roots or be > newly allocated itself. In both cases it should be marked on its own without > the need for the recursive marking. > > However, if the GC can mistake an unreachable thing for a newly allocated > thing, then it must mark all the things recursively. Otherwise after the > finalization is run, we would have a GC thing that points to dead things. > This will lead to a crash during the next GC when we finally finalize that > unreachable thing. Yes, this is all correct. I forgot about the point in the last paragraph--I guess you could solve this (for the new patch) by always allocating into fresh arenas during an incremental gc. However, I think I migh prefer to be conservative and stick with the current code. We can always fix it in the next cycle.
Attached patch alternative arena marking (obsolete) (deleted) — Splinter Review
The patch is for larch to avoid using the delayed marking to record arenas with newly allocated things. It still uses ArenaHaeder::allocatedDuringIncremental, but that is decoupled from the delayed marking. Rather the code just enumerates all relevant compartment arenas and mark things there accordingly. Implementing that idea of recording a position into the arena list is turned out to be more complex than I expected. I will do it later.
Attachment #590480 - Flags: review?(wmccloskey)
Comment on attachment 589125 [details] [diff] [review] incremental GC Review of attachment 589125 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsgc.cpp @@ +68,5 @@ > + * 2. Any object that is allocated during incremental GC must start out marked. > + * 3. Roots are special memory locations that don't need write > + * barriers. However, they must be marked in the first slice. Roots are things > + * like the C stack and the VM stack, since it would be too expensive to put > + * barriers on them. Explain also some details: conditions when we can run the incremental GC the role of JSCompartment::barrierMarker_ and how it is also used for the verification why active frames should be marked why the code needs to deal with newly created compartments specially @@ +3641,5 @@ > + IncrementalGCSlice(cx, gckind); > + else > + MarkAndSweep(cx, gckind); > + > + if (rt->gcIncrementalState == NO_INCREMENTAL) { Add #ifdef DEBUG here - this will help not only the compiler but also the reader to grasp that the code just checks.
Attachment #589125 - Flags: review?(igor) → review+
Comment on attachment 590480 [details] [diff] [review] alternative arena marking Review of attachment 590480 [details] [diff] [review]: ----------------------------------------------------------------- Thanks Igor. However, I'm afraid that this patch will perform worse than what's in larch. It looks like it iterates over every arena at the beginning of every GC slice. The reason I reused the delayed marking code is because I wanted to reuse the linked list of arenas. I think we should either keep the current approach or go with something that knows how to traverse only the arenas where allocatedDuringIncremental is known to be set (or at least likely to be set).
(In reply to Bill McCloskey (:billm) from comment #46) > I think we should either keep the current approach Ok, lets keep the current approach but fix the bug from the start of the comment 34 and improve in another bug. I have a patch in work that marks all unnallocated things when the code transfer the free list from the arena to JSCompartment::arenas and then unmark the unused part when the remaining free list is sent back to arena. But that requires some work to ensure that unmarking always happens.
Yes, but how does it do in producing the least number of page faults?
(In reply to Robert Claypool from comment #48) > Yes, but how does it do in producing the least number of page faults? If the goal is to minimize the number of page faults, then a non-incremental GC is a clear win over incremental one. If the goal to minimize the pause time, then moving the page faults from the worst contributor to the pause time to more snappy parts is a win even if the total number of page faults increases.
So one goal is to move the times that there are page faults to the times that the program isn't doing much of anything. How does this cause page faults to increase?
Attached patch MarkRoot assertions (deleted) — Splinter Review
This patch adds a bunch of asserts to MarkRoot to ensure that it's only called during root marking.
Attachment #591354 - Flags: review?(igor)
Attachment #591354 - Flags: review?(igor) → review+
Attached patch changes to statistics gathering (obsolete) (deleted) — Splinter Review
This patch revises the statistics code so that it works with incremental GC. It totally changes the format of both the textual output (MOZ_GCTIMER) and the error console output. I left the MOZ_GCTIMER=stdout output the same, since I think Gregor uses this for piping to gnuplot. The format of the output is now pretty much the same between MOZ_GCTIMER and the error console. It was a big pain having them be separate, and there's now too much data to fit on a single line anyway. The output has the form: TotalTime: <total time in GC>ms, Type: <compartment|global>, MMU(20ms): x%, MMU(50ms): x%, MaxPause: x ms Then there is a line for each slice, which lists how long the slice took and how much time it spent in each phase. Short phases are skipped. Then there is a line at the end for the total time spent in each phase over all slices. There is a new telemetry histogram for the length of a GC slice. The MMU requires some explanation. I think the comment in the patch gives the best description I can. Basically, it's a better way of measuring responsiveness than max pause time, since it penalizes you if you do a lot of short slices without giving JS code time to run in between.
Attachment #591559 - Flags: review?(igor)
Sorry, forgot to qref.
Attachment #591559 - Attachment is obsolete: true
Attachment #591559 - Flags: review?(igor)
Attachment #591562 - Flags: review?(igor)
Attached patch time budgeting, v2 (deleted) — Splinter Review
I removed the limit field as requested. As I said above, the AutoCheckCompartment is useful because now we can leave drainMarkStack in multiple places.
Attachment #589121 - Attachment is obsolete: true
Attachment #589121 - Flags: review?(igor)
Attachment #591567 - Flags: review?(igor)
This patch updates the original allocatedDuringIncremental patch, fixing the bug in it. It removes the IsAboutToBeFinalizedInCompartment stuff, which I don't think is really necessary. I'm also obsoleting the isMarked->IsAboutToBeFinalized patch. I would like to do this eventually, but I don't think it's required for incremental GC.
Attachment #589118 - Attachment is obsolete: true
Attachment #589120 - Attachment is obsolete: true
Attachment #590480 - Attachment is obsolete: true
Attachment #589120 - Flags: review?(igor)
Attachment #590480 - Flags: review?(wmccloskey)
Attachment #591578 - Flags: review?(igor)
Comment on attachment 591562 [details] [diff] [review] changes to statistics gathering, v2 (In reply to Bill McCloskey (:billm) from comment #52) > Created attachment 591559 [details] [diff] [review] > changes to statistics gathering > > This patch revises the statistics code so that it works with incremental GC. > > It totally changes the format of both the textual output (MOZ_GCTIMER) and > the error console output. I left the MOZ_GCTIMER=stdout output the same, > since I think Gregor uses this for piping to gnuplot. Actually it is the current MOZ_GCTIMER!=stdout format that is piped to gnuplot, see js/src/gnuplot/gcTimer.gnu. So the changes clearly brokes that. > > The format of the output is now pretty much the same between MOZ_GCTIMER and > the error console. It was a big pain having them be separate, and there's > now too much data to fit on a single line anyway. The format was not intended foe human consumption. So I ask Gregor for his opinion about the changes here. For me they look OK.
Attachment #591562 - Flags: review?(igor) → review?(anygregor)
Comment on attachment 591567 [details] [diff] [review] time budgeting, v2 Review of attachment 591567 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsgc.cpp @@ +1988,5 @@ > aheader->hasDelayedMarking = 0; > markLaterArenas--; > markDelayedChildren(aheader); > + > + if (budget.checkOverBudget()) Shouldn't this use the light check isOverBudget? IIRC time sampling is rather expensive even compared with the arena scanning. Either way it is your call.
Attachment #591567 - Flags: review?(igor) → review+
Comment on attachment 591578 [details] [diff] [review] track arenas allocated during incremental GC, v2 Review of attachment 591578 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsgc.cpp @@ +1918,5 @@ > > void > GCMarker::markDelayedChildren(ArenaHeader *aheader) > { > + JSGCTraceKind traceKind = MapAllocToTraceKind(aheader->getAllocKind()); traceKind is only used with aheader->markOverflow
Attachment #591578 - Flags: review?(igor) → review+
Attachment #591562 - Flags: review?(anygregor) → review+
Attached patch barrier verifier improvements (deleted) — Splinter Review
This patch makes some improvements to the barrier verifier. Previously, the verifier only checked that we don't miss write barriers. Now it also looks for cases where we miss read barriers on weak pointers or where new objects are not marked black. It still takes an initial heap snapshot. Then, at the end of verification, it checks that any objects now in the heap that were not in the snapshot are marked. This catches (most) read barrier problems and allocation problems. One important thing is that we need to be able to use the same set of conservative stack roots in the second verification pass that we used in the first. Otherwise it might say that we missed an object when we really didn't. So I made a debug-only vector of saved conservative roots. If you pass a special flag into MarkConservativeRoots, it will use these roots instead of doing the stack scan. This code is also used in the next patch in the series. I imagine we could use it for other kinds of verification, too.
Attachment #591632 - Flags: review?(igor)
Attached patch incremental marking validation (deleted) — Splinter Review
This patch is more a more direct kind of verification than the write barrier verifier. After incremental marking finishes, it does another marking pass, this time non-incrementally. The mark bitmaps from the first and second marking passes are compared, as follows. Assume an object is marked with color C_inc in the incremental bitmap and C_noninc in the non-incremental bitmap. 1. If C_noninc is black, then C_inc must be black or else we assert. 2. If C_inc is gray, then C_noninc must be gray or white. Otherwise we assert. This extra debug marking is pretty slow, but it's caught some bugs. We might consider turning it off eventually (or maybe making it optional, like gc zeal). I'd rather leave it on for now.
Attachment #591635 - Flags: review?(igor)
Comment on attachment 591632 [details] [diff] [review] barrier verifier improvements Review of attachment 591632 [details] [diff] [review]: ----------------------------------------------------------------- This may conflict with the bug 675078, so rebasing is in due...
Attachment #591632 - Flags: review?(igor) → review+
This patch adds the code that allows incremental GC to be invoked, both through JSAPI and through XPConnect. It also adds an about:config option for incremental GC, and it renames some of the GC entry points from js_ to js::. Finally, it changes how the mNeedGCBeforeCC flag is set. Currently it's set by XPConnect when calling nsXPConnect::Collect. However, I think that the JSGC_END callback is a better place, since it happens at the end of a GC rather than the beginning, and it's less likely that we'll miss calling it.
Attachment #592342 - Flags: review?(igor)
Attached patch add a GC slice callback (deleted) — Splinter Review
The DOMGCFinishedCallback is a little bit too limited for incremental GC. We need to be able to tell nsJSEnvironment when a slice begins and ends as well as when a full GC ends. This patch converts the GCFinishedCallback to a GCSliceCallback, which gives more info about the GC. The patch doesn't actually change DOMGCFinishedCallback much; that happens in a later patch. I also added a generic descriptor struct to pass to the slice callback. I think this should make it easier to implement bug 531396, since we will be able to pass the JSON string without changing the function signature.
Attachment #592345 - Flags: review?(igor)
This code implements the jsgc.cpp-internal triggering mechanisms for incremental GC. I have a separate patch for nsJSEnvironment.cpp. Most of the GCs triggered from jsgc.cpp are converted to use incremental GC. There are a few additional changes: - Last-ditch GCs continue to be non-incremental to ensure we reclaim memory right away. However, I ran into problems with this, because we sometimes do last-ditch GCs when we don't really need to. This happens if we call TriggerGC, and then we allocate before checking the interrupt flags. So I changed the conditions where we do a last-ditch GC. If we actually are out of memory, we will still do one, of course. - Incremental GCs are designed to happen *before* gcBytes hits gcTriggerBytes (either via MaybeGC or a timer). If we ever do hit gcTriggerBytes, we now reset the incremental GC (i.e., we fall back to non-incremental GC). The goal is to avoid doing incremental GC in really allocation-heavy sites like benchmarks.
Attachment #592348 - Flags: review?(igor)
Attached patch changes to timer-triggered GCs (deleted) — Splinter Review
This patch adjusts the nsJSEnvironment.cpp triggering code. Olli, I ended up changing how the GCFinished callback works in the "add a GC slice callback" patch, so you might want to look at that first. The goal of this code is to: - Use incremental GCs for timer-triggered GCs - Avoid running the cycle collector during an incremental GC - If we're in the middle of an incremental GC, slices should be no more than 100ms apart. Otherwise it might finish too slowly.
Attachment #592349 - Flags: review?(bugs)
Attached patch support for frame-based triggering (obsolete) (deleted) — Splinter Review
This code adds support to the JS engine for "animation frame events". This event is supposed to happen whenever the browser draws a frame. Ideally, we want to do one incremental GC slice per animation frame. There is some extra code here to spread out the slices in some cases.
Attachment #592351 - Flags: review?(igor)
This calls AnimationFrameEvent from nsPresShell::Paint. Rob, I know you wanted this to work somewhat differently. If you don't want to r+ the patch, could you file a bug for the DidPaint issues we talked about and make it block this bug? I'd rather deal with that later, but it's up to you.
Attachment #592352 - Flags: review?(roc)
Attached patch support for tracefiles (obsolete) (deleted) — Splinter Review
This last patch is kind of optional. It adds support for generating trace files that can be used with the code in bug 684072. I've found it to be extremely useful in tracking down GC pauses. It makes it really easy to see where the framerate drops and then zero in on what's causing it to drop. However, it does clutter up the code somewhat. Maybe you have some ideas, Igor? Also, this is the last patch!
Attachment #592354 - Flags: review?(igor)
Comment on attachment 592352 [details] [diff] [review] call AnimationFrameEvent from nsPresShell::Paint Review of attachment 592352 [details] [diff] [review]: ----------------------------------------------------------------- I just landed a patch for bug 721294 on inbound. Now you should be able to do this if (!aWillSendDidPaint) in PresShell::Paint (at the end, *after* painting, please), and in PresShell::DidPaint() (which is only be called once per paint, now). Also, I think the XPConnect method should be called NotifyDidPaint.
Attachment #592352 - Flags: review?(roc) → review-
Comment on attachment 592349 [details] [diff] [review] changes to timer-triggered GCs > nsJSContext::CycleCollectNow(nsICycleCollectorListener *aListener) > { > if (!NS_IsMainThread()) { > return; > } > >+ if (sCCLockedOut) { >+ // If we're in the middle of an incremental GC, CC later. >+ PokeCC(); >+ return; >+ } This will need to be changed after I've pushed my patches. >- sCCollectedWaitingForGC = 0; >- if (sGCTimer) { >- // If we were waiting for a GC to happen, kill the timer. >+ // Prevent cycle collections during incremental GC. >+ if (progress == js::GC_CYCLE_BEGIN) >+ sCCLockedOut = true; >+ else if (progress == js::GC_CYCLE_END) >+ sCCLockedOut = false; if (expr) { stmt; } else if (expr) { stmt; } And could you please change the parameter names. In Gecko the coding style is aParameter.
Attachment #592349 - Flags: review?(bugs) → review+
Attached patch support for NotifyDidPaint from nsPresShell (obsolete) (deleted) — Splinter Review
Thanks for fixing that, Rob. Here's the updated patch.
Attachment #592352 - Attachment is obsolete: true
Attachment #592530 - Flags: review?(roc)
This renames AnimationFrameEvent to NotifyDidPaint.
Attachment #592351 - Attachment is obsolete: true
Attachment #592351 - Flags: review?(igor)
Attachment #592531 - Flags: review?(igor)
Comment on attachment 592530 [details] [diff] [review] support for NotifyDidPaint from nsPresShell Review of attachment 592530 [details] [diff] [review]: ----------------------------------------------------------------- This code is OK but unfortunately you missed a couple of early return paths in PresShell::Paint.
Comment on attachment 591635 [details] [diff] [review] incremental marking validation Review of attachment 591635 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsgc.cpp @@ +3071,5 @@ > + > + js::gc::State state = rt->gcIncrementalState; > + rt->gcIncrementalState = NO_INCREMENTAL; > + > + WeakMapBase::resetWeakMapList(rt); Why is this necessary here? In general, comment about anything that looks like chnage the global state in debug-only code. @@ +3082,5 @@ > + map.init(); > + > + for (GCChunkSet::Range r(rt->gcChunkSet.all()); !r.empty(); r.popFront()) { > + ChunkBitmap *bitmap = &r.front()->bitmap; > + uintptr_t *entry = (uintptr_t *)js_malloc(sizeof(bitmap->bitmap)); assert here and in other places that malloc does not fail.
Attachment #591635 - Flags: review?(igor) → review+
Comment on attachment 592530 [details] [diff] [review] support for NotifyDidPaint from nsPresShell Should use a RAII wrapper to avoid early return bugs
Attachment #592342 - Flags: review?(igor) → review+
Attachment #592345 - Flags: review?(igor) → review+
Attachment #592348 - Flags: review?(igor) → review+
Comment on attachment 592531 [details] [diff] [review] support for frame-based triggering, v2 Review of attachment 592531 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsfriendapi.h @@ +648,5 @@ > + * Signals a good place to do an incremental slice, because the browser is > + * drawing a frame. > + */ > +extern JS_FRIEND_API(void) > +NotifyDidPaint(JSContext *cx); Can we split this into query and GCSlice pair like: WantGCSlice(JSRuntime *rt)/RunGCSlice(JSContext *cx)? This way NotifyDidPain can avoid XPCCallContext ccx(NATIVE_CALLER) overhead when the incremental GC is not running. ::: js/xpconnect/idl/nsIXPConnect.idl @@ +734,5 @@ > */ > void GarbageCollect(in PRUint32 reason, in PRUint32 kind); > > /** > + * Signals a good place to do an incremental slice, because the browser is s/an incremental slice/an incremental GC slice
Attachment #592531 - Flags: review?(igor) → review+
With the current implementation of regular expressions we have a potential problem with the incremental GC. As regular expression private data object is cleared during marking (see regexp_trace in js/src/vm/RegExpObject.cpp), then during the incremental GC that private could be recreated after the regexp object is marked and purged. Then this private will survive the GC. I do not know how bad is this, but one way or another this has to be addressed.
In the code we have js_SetReservedSlot called by JS_SetReservedSlot that does not have any barriers and js::SetReservedSlot exposed via jsfriendapi.h that do apply barriers. Since the former is also used to store GC things, perhaps we should fix this?
Another issue is about private data structure holding GC things. They all must use barriers, right? However, this implies that one can not use plain C to implement them. In turn this means JSObject::trace pretty much inmplementable in C...
(In reply to Igor Bukanov from comment #78) > In the code we have js_SetReservedSlot called by JS_SetReservedSlot that > does not have any barriers and js::SetReservedSlot exposed via jsfriendapi.h > that do apply barriers. Since the former is also used to store GC things, > perhaps we should fix this? js_SetReservedSlot calls JSObject::setSlot, which is barriered (since its assignment goes through a HeapValue, which handles barriers automatically). js::SetReservedSlot is different because it can't use a HeapValue, since we don't expose HeapValue through jsapi or jsfriendapi. So it needs to do the barrier manually.
Bill, do you already have performance numbers or do you need some help with performance tests, regression findings.....? What's the current state here?
(In reply to Gregor Wagner [:gwagner] from comment #81) > Bill, do you already have performance numbers or do you need some help with > performance tests, regression findings.....? What's the current state here? I'm working on fixing some bugs. I haven't done much performance testing. It looks like SunSpider is unchanged, but I saw some problems in V8 that I didn't have time to track down. If you have time to help, that would be great. Everything is on larch. However, I've been messing around with the value returned by gcZeal() is jscntxt.h to get better testing on tinderbox. So you might have to fix that first.
This one uses RAII.
Attachment #592530 - Attachment is obsolete: true
Attachment #592530 - Flags: review?(roc)
Attachment #596253 - Flags: review?(roc)
One last patch. This does what we talked about with regard to trace hooks. It adds a new flag on the JSClass that should only be set if the class implements write barriers correctly. If we ever create an object whose class has a trace hook and doesn't have this flag set, then we permanently disable incremental GC as long as the browser is running. Hopefully this should be sufficient to solve the problem of misbehaving add-ons.
Attachment #596258 - Flags: review?(igor)
(In reply to Bill McCloskey (:billm) from comment #84) > Created attachment 596258 [details] [diff] [review] > disable incremental GC in the presence of unknown trace hooks > > One last patch. This does what we talked about with regard to trace hooks. > It adds a new flag on the JSClass that should only be set if the class > implements write barriers correctly. If we ever create an object whose class > has a trace hook and doesn't have this flag set, then we permanently disable > incremental GC as long as the browser is running. > > Hopefully this should be sufficient to solve the problem of misbehaving > add-ons. Will this be completely silent or be notified in e.g. about:support?
The idea for about:support seemed good. I also added a telemetry histogram for the same thing.
Attachment #596529 - Flags: review?(igor)
Comment on attachment 596529 [details] [diff] [review] tell telemetry and about:support if incremental GC disabled >+++ b/dom/interfaces/base/nsIDOMWindowUtils.idl >@@ -988,16 +988,22 @@ interface nsIDOMWindowUtils : nsISupport > * > */ > boolean getFileReferences(in AString aDatabaseName, in long long aId, > [optional] out long aRefCnt, > [optional] out long aDBRefCnt, > [optional] out long aSliceRefCnt); > > /** >+ * Return whether incremental GC has been disabled due to a binary add-on. >+ */ >+ [implicit_jscontext] >+ boolean isIncrementalGCDisabled(); You should update the uuid of the interface when changing it.
(In reply to Olli Pettay [:smaug] from comment #87) > You should update the uuid of the interface when changing it. Yeah, good point. Fixed locally.
Comment on attachment 592354 [details] [diff] [review] support for tracefiles Review of attachment 592354 [details] [diff] [review]: ----------------------------------------------------------------- I do not understand the goal of this. If the idea is to have this not as a separated patch but as a part of the code, then there should be a runtime switch to enable/disable this based on some preference like environment variable or about:config. ::: js/src/jscntxt.cpp @@ +1472,5 @@ > } > > JS_ASSERT(!resolvingList); > + > + GetEventStream()->flush(); flush() at arbitrary points is bad especially with events that have embedded time stamps. The time stamps should allow for time-based flush when the file is flushed automatically if the new event comes a second or similar threshold later than the last flush. ::: js/src/jsfriendapi.cpp @@ +640,5 @@ > GCSlice(cx, NULL, GC_NORMAL, gcreason::REFRESH_FRAME); > > rt->gcInterFrameGC = false; > + > + static int64_t prev = 0; In view of webworkers the variable should not be static. Just put that to the event stream and add a version of the event that calls PRMJ_Now() as necessary so the code path here would be minimally destructive. ::: js/src/jstracefile.cpp @@ +3,5 @@ > + * > + * ***** BEGIN LICENSE BLOCK ***** > + * Version: MPL 1.1/GPL 2.0/LGPL 2.1 > + * > + * The contents of this file are subject to the Mozilla Public License Version Use the new MPL 2.0 license block like in gc/Memory.h @@ +50,5 @@ > +static char gEventStreamStorage[sizeof(EventStream)]; > + > +static void *gOwnerThread; > + > +EventStream * What is wrong with stuffing this stuff to JSRuntime and avoiding own gOwnerThread management? ::: js/src/jstracefile.h @@ +2,5 @@ > + * vim: set ts=8 sw=4 et tw=78: > + * > + * ***** BEGIN LICENSE BLOCK ***** > + * Version: MPL 1.1/GPL 2.0/LGPL 2.1 > + * Use the new MPL 2.0 license block like in gc/Memory.h
Attachment #592354 - Flags: review?(igor)
Comment on attachment 596258 [details] [diff] [review] disable incremental GC in the presence of unknown trace hooks Review of attachment 596258 [details] [diff] [review]: ----------------------------------------------------------------- r+ conditional that I got interaction between IsIncrementalGCSafe and gcIncrementalEnabled in a right way. ::: js/src/jscntxt.h @@ +359,5 @@ > + /* > + * We disable incremental GC if we encounter a js::Class with a trace hook > + * that does not implement write barriers. > + */ > + bool gcIncrementalDisabled; I prefer to flip this into gcIncrementalEnabled as if (not_good) gcIncrementalEnabled = false; ... if (!gcIncrementalEnabled) bad_rare_case_recovery(); is perceived easier than if (not_good) gcIncrementalDisabled = true ... if (gcIncrementalDisabled) bad_rare_case_recovery(); This is because the pattern (not_good false) and (!check bad_rare_case) are easier to recognize together than (not_good true) and (check bad_rare_case) ::: js/src/jsobj.cpp @@ +2881,5 @@ > if (!obj) > return NULL; > > + if (clasp->trace && !(clasp->flags & JSCLASS_IMPLEMENTS_BARRIERS)) > + cx->runtime->gcIncrementalDisabled = true; I do not see how a creation of !JSCLASS_IMPLEMENTS_BARRIERS class aborts an already running incremental GC. Is it done as a part of IsIncrementalGCSafe call that is called before any incremental slice? If so comment on that here. If not, we must implement that to protect against the code that is not updated for the incremental GC. Also, what about calling abort() here? Will that survived the tryserver?
Attachment #596258 - Flags: review?(igor) → review+
Comment on attachment 596529 [details] [diff] [review] tell telemetry and about:support if incremental GC disabled Review of attachment 596529 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsfriendapi.cpp @@ +721,5 @@ > prev = now; > } > > +extern JS_FRIEND_API(bool) > +IsIncrementalGCDisabled(JSContext *cx) Change the argument to JSRuntime *
Attachment #596529 - Flags: review?(igor) → review+
(In reply to Igor Bukanov from comment #90) > I do not see how a creation of !JSCLASS_IMPLEMENTS_BARRIERS class aborts an > already running incremental GC. Is it done as a part of IsIncrementalGCSafe > call that is called before any incremental slice? If so comment on that > here. If not, we must implement that to protect against the code that is not > updated for the incremental GC. Yes. We check IsIncrementalGCAllowed (which calls IsIncrementalGCSafe) before every slice. If it returns false, the current incremental GC is canceled and a new, non-incremental GC is done. > Also, what about calling abort() here? Will that survived the tryserver? I hope so :-). I'll try doing this after I land. If it turns anything up, I'll file follow-up bugs.
Attached patch rebase over RegExp changes (deleted) — Splinter Review
This patch updates how regular expression objects are handled now that the refcounting is gone.
Attachment #596802 - Flags: review?(christopher.leary)
Comment on attachment 596802 [details] [diff] [review] rebase over RegExp changes Review of attachment 596802 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/RegExpObject.cpp @@ +364,5 @@ > NULL, /* hasInstance */ > regexp_trace > }; > > +RegExpShared::RegExpShared(JSContext *cx, RegExpFlag flags) Could we make this take a JSRuntime * instead of a JSContext? ::: js/src/vm/RegExpObject.h @@ +310,5 @@ > + * 1. Any RegExpShared with pointers from the C++ stack is not deleted. > + * 2. Any RegExpShared that was installed in a RegExpObject during an > + * incremental GC is not deleted. This is because the RegExpObject may have > + * been traced through before the new RegExpShared was installed, in which > + * case there would be a dangling pointer. Rephrasing nit: "This is because the RegExpObject may have been traced through before the new RegExpShared was installed, in which case deleting the RegExpShared would turn the RegExpObject's reference into a dangling pointer." @@ +323,5 @@ > detail::RegExpCode code; > uintN parenCount; > RegExpFlag flags; > + size_t activeUseCount; /* See comment above. */ > + uint32_t gcNumberWhenUsed; /* See comment above. */ As discussed, let's go with uint64_t to be a little safer, and in the future we could use a "color" based approach if that were considered easier to understand.
Attachment #596802 - Flags: review?(christopher.leary) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/2a8ceeb27f7c This is with the tracefile stuff removed; I'll file a follow-up bug to do that, since I think it's important but I don't want to wait on it. Incremental GC is disabled on Android, since it was causing tons of jsreftest timeouts. I'll look into those next week. Additionally, content/media/test/test_closing_connections.html has been temporarily disabled. We're pretty sure that incremental GC was triggering an existing bug where it times out. Work on that is happening in bug 634564 (thanks Matthew).
Target Milestone: --- → mozilla13
I forgot to say that a small regression on Dromaeo CSS is expected (maybe 1% or less; it seems to jump around a lot). This seems to be caused by the extra overhead of allocating new objects black, which adds an extra branch to the allocation slow path. We'll be able to fix this in the future, but I think for now we should take it.
What does MMU mean in the log. Well, there are two MMUs.
(In reply to Olli Pettay [:smaug] from comment #97) > What does MMU mean in the log. Well, there are two MMUs. See this comment: https://bugzilla.mozilla.org/show_bug.cgi?id=719492#c2 (which is just copy and pasted from a comment Bill added to the GC somewhere)
Any idea why GC::GarbageCollectNow is still by far the top thing in about:jank? MMUs are all the time 0%.
though, I think about:jank doesn't do what I'd expect it do to. It does report also some fast operations, if they just happen to happen during a >100ms period where also other stuff is happening.
Do we run incremental GC during the same cycle as we paint? Perhaps NotifyDidPaint should post a runnable to run incremental GC. That would let event loop to spin.
Though, I guess that shouldn't affect to GC::GarbageCollectNow (which is about 40% of about:jank on this build)
Actually, about:jank seems to kill performance pretty badly on this machine (64 bit linux), so better to not trust it too much.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Blocks: 719492
I re-ran the google incremental test http://v8.googlecode.com/svn/branches/bleeding_edge/benchmarks/spinning-balls/index.html and its still experience large spikes.. (using 18-Feb-2012 14:42 win64/snapshot) Chrome: 8000/320 15(max = 557) ms 3554 frames Score 12 0-10ms => 20 10-20ms => 3484 20-30ms => 31 30-40ms => 13 40-50ms => 4 110-120ms => 1 550-560ms => 1 Firefox: 8000/320 17(max = 102) ms 3587 frames Score 48 0-10ms => 105 10-20ms => 3440 20-30ms => 26 30-40ms => 2 80-90ms => 2 90-100ms => 10 100-110ms => 2 I was expecting or hoping the incremental garbage collector would perform better on this bench or is there other bugzilla ID's involved in improving this?
For me, I saw a large improvement, though for me Chrome had a much higher score. Bug 704716 is about specifically that benchmark, so it may be a better place for extended discussion.
(In reply to mdew from comment #105) > I re-ran the google incremental test > http://v8.googlecode.com/svn/branches/bleeding_edge/benchmarks/spinning- > balls/index.html and its still experience large spikes.. (using 18-Feb-2012 > 14:42 win64/snapshot) > > Chrome: > 8000/320 15(max = 557) ms 3554 frames > Score 12 > > 0-10ms => 20 > 10-20ms => 3484 > 20-30ms => 31 > 30-40ms => 13 > 40-50ms => 4 > 110-120ms => 1 > 550-560ms => 1 > > Firefox: > 8000/320 17(max = 102) ms 3587 frames > > Score 48 > 0-10ms => 105 > 10-20ms => 3440 > 20-30ms => 26 > 30-40ms => 2 > 80-90ms => 2 > 90-100ms => 10 > 100-110ms => 2 > > I was expecting or hoping the incremental garbage collector would perform > better on this bench or is there other bugzilla ID's involved in improving > this? Does about:support show incremental gc being enabled?
Under about:support, Incremental GC:0 about:config, javascript.options.mem.gc_incremental;true javascript.options.mem.gc_incremental_slice_ms;10 javascript.options.mem.gc_per_compartment;true javascript.options.gc_on_memory_pressure;true
(In reply to mdew from comment #108) > Under about:support, > Incremental GC:0 > about:config, > javascript.options.mem.gc_incremental;true > javascript.options.mem.gc_incremental_slice_ms;10 > javascript.options.mem.gc_per_compartment;true > javascript.options.gc_on_memory_pressure;true It's probably disabled due to https://bugzilla.mozilla.org/attachment.cgi?id=596258&action=diff
Depends on: 728683
(In reply to Bill McCloskey (:billm) from comment #84) > Created attachment 596258 [details] [diff] [review] > disable incremental GC in the presence of unknown trace hooks > > One last patch. This does what we talked about with regard to trace hooks. > It adds a new flag on the JSClass that should only be set if the class > implements write barriers correctly. If we ever create an object whose class > has a trace hook and doesn't have this flag set, then we permanently disable > incremental GC as long as the browser is running. > > Hopefully this should be sufficient to solve the problem of misbehaving > add-ons. Can you name some of the cases which will render IGC disabled ? From bug 728683 , one case is check for updates . Is there a way to enable it again once it automatically gets disabled ? or its simply switching back the pref to true ?
Update: When I updated to today's Nightly which has IGC, I opened the about:support page as soon as my Nightly started and still I had the entry IGC as 0. The about:config entry is still true.
(In reply to scrapmachines from comment #110) > Can you name some of the cases which will render IGC disabled ? > From bug 728683 , one case is check for updates . > > Is there a way to enable it again once it automatically gets disabled ? or > its simply switching back the pref to true ? I believe the idea is that it should only get turned off in the presence of unsafe addons with certain behaviors. If the browser itself is exhibiting that unsafe behavior, then you'll just have to wait for it to be patched. I filed an issue for the MemChaser addon ( https://github.com/whimboo/memchaser/issues/66 ) to track when incremental GC is disabled in a build that supports it. I think that could be useful in finding instances of this.
(In reply to Andrew McCreight [:mccr8] from comment #112) > (In reply to scrapmachines from comment #110) > > Can you name some of the cases which will render IGC disabled ? > > From bug 728683 , one case is check for updates . > > > > Is there a way to enable it again once it automatically gets disabled ? or > > its simply switching back the pref to true ? > > I believe the idea is that it should only get turned off in the presence of > unsafe addons with certain behaviors. If the browser itself is exhibiting > that unsafe behavior, then you'll just have to wait for it to be patched. It can definitely be disabled by the browser itself. See bug 728686.
It's definitely an oversight if incremental GC is disabled in a browser without add-ons. I really should have taken Igor's advice and tested this aspect more thoroughly. Hopefully, fixing bug 728686 will solve the problems that people are having. Thanks for tracking that down, Kyle.
Depends on: 728692
Depends on: 728699
Depends on: 728701
Depends on: 728702
(In reply to Andrew McCreight [:mccr8] from comment #112) > I believe the idea is that it should only get turned off in the presence of > unsafe addons with certain behaviors. Can you make sure this becomes part of the AMO review process?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (away February 21 to March 17) from comment #115) > Can you make sure this becomes part of the AMO review process? Do you mean to say that Add0ons will not be given full clearance if they trigger IGC to be disabled, even if they do not leak memory, and are feature wise ok , etc. ?
(In reply to scrapmachines from comment #116) > (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (away February > 21 to March 17) from comment #115) > > Can you make sure this becomes part of the AMO review process? > > Do you mean to say that Add0ons will not be given full clearance if they > trigger IGC to be disabled, even if they do not leak memory, and are feature > wise ok , etc. ? Yes, but the only addons that can disable IGC are ones that implement their own custom JSAPI stuff (a custom JSClass with a trace hook, to be precise), so this should be very rare.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #117) > Yes, but the only addons that can disable IGC are ones that implement their > own custom JSAPI stuff (a custom JSClass with a trace hook, to be precise), > so this should be very rare. Can you name a couple? for reference ?
Question about the comment at https://hg.mozilla.org/mozilla-central/file/1016e3e84271/js/src/gc/Barrier.h#l142 : * One additional note: not all object writes need to be barriered. Writes to * newly allocated objects do not need a barrier as long as the GC is not * allowed to run in between the allocation and the write. In these cases, we * use the "obj->field.init(value)" method instead of "obj->field = value". * We use the init naming idiom in many places to signify that a field is being * assigned for the first time, and that no GCs have taken place between the * object allocation and the assignment. I do not see where this requirement "the GC is not allowed to run in between the allocation and the write" comes from. I.e. what exactly the GC that run after the allocation but before the assigned can do to prevent the init from working?
(In reply to Igor Bukanov from comment #119) > I do not see where this requirement "the GC is not allowed to run in between > the allocation and the write" comes from. I.e. what exactly the GC that run > after the allocation but before the assigned can do to prevent the init from > working? You're right. If we're only talking about incremental GC, there's nothing wrong with a GC happening in between allocation and init(). I wrote that comment a while ago and I'm not sure what I was thinking. I may have been thinking forward to generational GC. In that case, it's safe to skip the barrier on a new object that you know was allocated in the nursery. However, if a GC happens in between the allocation and the write, then the object will be moved to the old space and then it's not safe to skip the barrier. However, I'm not sure what we're actually going to do for generational. Right now, the post barrier is always invoked in init(), just to be safe. I haven't been very careful about ensuring that there are no GCs between allocation and init(), since it's not required for incremental barriers. If we decide to remove the post barrier from init(), we'll have to audit all the uses of init() to make sure that no GCs can happen in between. That sounds pretty unpleasant, so I'm guessing we just won't do that optimization. Except maybe in the JIT, where we would only have to check one or two places.
It makes some sense to me to never move a value until at least one write has been performed.
Depends on: 728976
This seems to be in the regression range for some Dromaeo DOM and V8 regressions...
(In reply to Boris Zbarsky (:bz) from comment #122) > This seems to be in the regression range for some Dromaeo DOM and V8 > regressions... I looked at the graphs for a while, and I am seeing a consistent Dromaeo DOM regression on MacOS, but not on other platforms. And I don't really see anything consistent happening with V8 at all. If you see something else, can you link to the graph? I'll investigate the MacOS regression.
Depends on: 729171
Besides normal GC things we also have the script filename table. Its entries are marked during normal GC. However I do not see any code that marks the entry there for newly allocated scripts. So presumably live filename can be collected.
(In reply to Igor Bukanov from comment #124) > Besides normal GC things we also have the script filename table. Its entries > are marked during normal GC. However I do not see any code that marks the > entry there for newly allocated scripts. So presumably live filename can be > collected. You're right. Do you want to patch, or should I?
Depends on: 729427
(In reply to scrapmachines from comment #118) > (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #117) > > Yes, but the only addons that can disable IGC are ones that implement their > > own custom JSAPI stuff (a custom JSClass with a trace hook, to be precise), > > so this should be very rare. > > Can you name a couple? for reference ? I'm not actually aware of any that do this.
Depends on: 730283
No longer blocks: 731437
Depends on: 731437
> Can you name a couple? for reference ? StatusBarEX
Depends on: 732787
Depends on: 732719
anyone created a bug for Firefox Sync disabling incremental gc? i can confirm an existing session with it enabled will disable when it syncs at start, as well as creating a new sync account in a completely new profile doing it as well.
(In reply to Danial Horton from comment #127) > > Can you name a couple? for reference ? > > StatusBarEX (In reply to Danial Horton from comment #128) > anyone created a bug for Firefox Sync disabling incremental gc? > > i can confirm an existing session with it enabled will disable when it syncs > at start, as well as creating a new sync account in a completely new profile > doing it as well. I believe these are both a result of bug 728686. There's no need to file any new bugs until that is fixed.
Depends on: 735944
Depends on: 731423
Depends on: 731837
Depends on: 729773
Attached file Support for tracefiles, rebased to m-c 147c0d893cdb (obsolete) (deleted) —
Attachment #592354 - Attachment is obsolete: true
My last one applied cleanly, but had some build bustage.
Attachment #609923 - Attachment is obsolete: true
is there any plans to stop falling back to GC when a compartment is created or destroyed? nearly all my GC's are full because of certain pages refreshing frames (like a router control panel that refreshes its stats every few seconds)
(In reply to Danial Horton from comment #132) > is there any plans to stop falling back to GC when a compartment is created > or destroyed? nearly all my GC's are full because of certain pages > refreshing frames (like a router control panel that refreshes its stats > every few seconds) Yeah, this is a problem. I filed bug 739899 for it.
Depends on: 735099
blocking-kilimanjaro: --- → ?
Whiteboard: [Snappy:P1] → [Snappy:P1][k9o:p?:fx?]
+ for k9o. It already shipped, so that doesn't mean a lot. The real action now is in but 735099.
blocking-kilimanjaro: ? → +
Whiteboard: [Snappy:P1][k9o:p?:fx?] → [Snappy:P1][k9o:p1:fx14]
Depends on: 750959
No longer depends on: 731423
Whiteboard: [Snappy:P1][k9o:p1:fx14] → [Snappy:P1][k9o:p1:fx14][games:p1]
Blocks: 750449
Bug 734946 suggests that incremental GC is disabled in FF13. Setting the flags to reflect that. Please let me know if that isn't the case.
Depends on: 757483
No longer depends on: 757483
In the release notes of Firefox 13 there's also the incremental gc (https://www.mozilla.org/en-US/firefox/13.0/releasenotes/), but it's disabled by default.
(In reply to Marco Castelluccio from comment #137) > In the release notes of Firefox 13 there's also the incremental gc > (https://www.mozilla.org/en-US/firefox/13.0/releasenotes/), but it's > disabled by default. Good point. I emailed akeybl about this.
Depends on: 761739
(In reply to Marco Castelluccio from comment #137) > In the release notes of Firefox 13 there's also the incremental gc > (https://www.mozilla.org/en-US/firefox/13.0/releasenotes/), but it's > disabled by default. This was only up for a couple of hours, and was absent from the 13.0beta notes for the entire cycle. Fixed now.
Is there no way to get this into fx15? This help remove frame rate stalls and would be a big win to get in as soon as possible. Just want to make sure that we are considering if this is worth doing since 15 is still in Aurora and the cause of the pref off looks like a bug that has been fixed.
Version: unspecified → 16 Branch
Target Milestone: mozilla13 → mozilla16
Depends on: 774859
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: