Open Bug 1308039 Opened 8 years ago Updated 1 year ago

Yield to painting during GC

Categories

(Core :: JavaScript: GC, defect, P3)

defect

Tracking

()

REOPENED
mozilla52
Performance Impact low

People

(Reporter: billm, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf:responsiveness, triage-deferred)

Attachments

(6 files, 2 obsolete files)

Bug 1279086 added the ability to paint (for tab switching purposes) while we're running JS code. We've gotten reports in bug 1300411 that people are getting the tab switch spinner because of GC. I don't know of any reason why we couldn't paint during GC as well. This bug will attempt to do that. The main danger I see here is that the painting code could invoke barriers. I'm going to add release-mode assertions to ensure that doesn't happen. I'm also unsure of where we're actually spending time during these long GCs. Most of the long GCs I see in bug 1300411 are non-incremental. Our GC logging doesn't seem to give any reason why we're doing non-incremental GCs, so it would be good to figure out why that's happening. I guess we can start by yielding during the mark phase since that's the easiest. That won't help at all if the long GCs are incremental, but it will help if they're non-incremental. Of course, we should probably avoid doing this if we're close to OOMing.
(In reply to Bill McCloskey (:billm) from comment #0) > The main danger I see here is that the painting code could invoke barriers. > I'm going to add release-mode assertions to ensure that doesn't happen. Is that really a problem? For painting during the CC, I was able to structure it so that the CC is basically leaving a slice before we do the paint, so you can run anything. Maybe starting and stopping a GC slice has more work so you can't just do that.
(In reply to Bill McCloskey (:billm) from comment #0) > The main danger I see here is that the painting code could invoke barriers. > I'm going to add release-mode assertions to ensure that doesn't happen. Would that occur if painting code ends up running JS? Or can it happen in other cases too?
(In reply to Andrew McCreight [:mccr8] from comment #1) > (In reply to Bill McCloskey (:billm) from comment #0) > > The main danger I see here is that the painting code could invoke barriers. > > I'm going to add release-mode assertions to ensure that doesn't happen. > > Is that really a problem? For painting during the CC, I was able to > structure it so that the CC is basically leaving a slice before we do the > paint, so you can run anything. Maybe starting and stopping a GC slice has > more work so you can't just do that. Besides non-incremental GCs, there are (or used to be) phases of the GC that take a long time and cannot be incremental.
(In reply to :Ehsan Akhgari from comment #2) > (In reply to Bill McCloskey (:billm) from comment #0) > > The main danger I see here is that the painting code could invoke barriers. > > I'm going to add release-mode assertions to ensure that doesn't happen. > > Would that occur if painting code ends up running JS? Or can it happen in > other cases too? It could happen if painting runs JS, but it could also happen if painting code touches JS objects that are stored in the DOM (which could happen even if we don't run JS code). I doubt that happens, but I would like to be certain.
(In reply to Bill McCloskey (:billm) from comment #3) > Besides non-incremental GCs, there are (or used to be) phases of the GC that > take a long time and cannot be incremental. Well, for the CC, if we're graph building in a non-incremental CC, I can still interrupt that, and I just later treat it like an incremental CC to do the fixup. I guess that does not really work for the GC. And I don't know how much painting is happening during which phases of the CC. Getting some telemetry on that, somehow, would be good so we don't wear ourselves out optimizing situations that rarely happen.
Attached patch Convert AutoAssertOnGC to release assertion (obsolete) (deleted) — Splinter Review
If there's a problem with this painting code, it's likely to happen rarely and not on tryserver. So I'd like to get this assertion on nightly. I only checked the places where AutoAssertOnGC is used directly since there aren't too many and they don't look super hot. I left AutoCheckCannotGC alone because it's used in many more places and I wasn't sure how hot they are.
Attachment #8798719 - Flags: review?(jcoppeard)
Attached patch Add AutoAssertOnBarrier (deleted) — Splinter Review
This patch adds an RAII class to assert if we use ExposeGCThingToActiveJS. This is something I want to avoid during the special paints since it could trip up the GC.
Attachment #8798720 - Flags: review?(jcoppeard)
This patch ensures that we don't access DOM reflectors from our special tab switching paints. That could cause us to set mark bits, which would be bad if we're calling the painting code from inside a GC.
Attachment #8798721 - Flags: review?(mrbkap)
Attached patch GC interrupt callbacks (deleted) — Splinter Review
This patch adds a concept of a GC interrupt callback. It's similar to the normal interrupt callback mechanism, but it runs during a GC. So far I'm only checking whether we should run the callback during phases that are already incremental, but we can augment that based on telemetry data about what is slow. In theory we should be able to invoke this at any time during a GC.
Attachment #8798722 - Flags: review?(jcoppeard)
Attached patch Preempt GC to paint (deleted) — Splinter Review
This patch allows us to paint during a GC.
Attachment #8798723 - Flags: review?(mrbkap)
Comment on attachment 8798719 [details] [diff] [review] Convert AutoAssertOnGC to release assertion Review of attachment 8798719 [details] [diff] [review]: ----------------------------------------------------------------- I agree with having these assertions enabled more often.
Attachment #8798719 - Flags: review?(jcoppeard) → review+
Comment on attachment 8798720 [details] [diff] [review] Add AutoAssertOnBarrier Review of attachment 8798720 [details] [diff] [review]: ----------------------------------------------------------------- Looks good. (This has a similar effect to but is not the same as AutoEnterCycleCollection. Possible we could re-implement the latter in terms of this.) ::: js/public/GCAPI.h @@ +520,5 @@ > #endif > }; > > /** > + * Assert if a GC occurs while this class is live. This class does not disable The comment should talk about barriers rather than GC. ::: js/src/jsgc.cpp @@ +7016,5 @@ > +JS::AutoAssertOnBarrier::AutoAssertOnBarrier(JSContext* cx) > + : context(cx), > + prev(cx->runtime()->allowGCBarriers()) > +{ > + context->runtime()->allowGCBarriers_ = false; Can you assert the previous state here and also in the destructor?
Attachment #8798720 - Flags: review?(jcoppeard) → review+
Comment on attachment 8798722 [details] [diff] [review] GC interrupt callbacks Review of attachment 8798722 [details] [diff] [review]: ----------------------------------------------------------------- The slice budget pointer is a little unpleasant, but that's OK if we need to be able to interrupt GCs quickly... do you have any data for this? It seems like we should still get reasonably fast interrupts even without zeroing the counter to get a full check. More generally, we need try and do fewer non-incremental GCs.
Attachment #8798722 - Flags: review?(jcoppeard) → review+
(In reply to Jon Coppeard (:jonco) from comment #13) > Comment on attachment 8798722 [details] [diff] [review] > GC interrupt callbacks > > Review of attachment 8798722 [details] [diff] [review]: > ----------------------------------------------------------------- > > The slice budget pointer is a little unpleasant, but that's OK if we need to > be able to interrupt GCs quickly... do you have any data for this? It seems > like we should still get reasonably fast interrupts even without zeroing the > counter to get a full check. Yeah, you're probably right. I'll see how long the different phases go before a full check.
Attachment #8798721 - Flags: review?(mrbkap) → review+
Attachment #8798723 - Flags: review?(mrbkap) → review+
(In reply to Bill McCloskey (:billm) from comment #14) > (In reply to Jon Coppeard (:jonco) from comment #13) > > Comment on attachment 8798722 [details] [diff] [review] > > GC interrupt callbacks > > > > Review of attachment 8798722 [details] [diff] [review]: > > ----------------------------------------------------------------- > > > > The slice budget pointer is a little unpleasant, but that's OK if we need to > > be able to interrupt GCs quickly... do you have any data for this? It seems > > like we should still get reasonably fast interrupts even without zeroing the > > counter to get a full check. > > Yeah, you're probably right. I'll see how long the different phases go > before a full check. I was all set to take out this part, but it seems like it actually matters. To test this I turned off incremental GC, visited [1], waited for a long pause, and then switched tabs. Without the requestFullCheck() call, I got the spinner a lot more. That might mean we need to fix how the counter works since ideally there shouldn't be much time in between checks. [1] http://people.mozilla.org/~wmccloskey/incremental-blog/example-pause.html
(In reply to Jon Coppeard (:jonco) from comment #12) > ::: js/src/jsgc.cpp > @@ +7016,5 @@ > > +JS::AutoAssertOnBarrier::AutoAssertOnBarrier(JSContext* cx) > > + : context(cx), > > + prev(cx->runtime()->allowGCBarriers()) > > +{ > > + context->runtime()->allowGCBarriers_ = false; > > Can you assert the previous state here and also in the destructor? I added an assert in the destructor that the flag is still false. Is that what you meant? I'm not sure what else to assert in the constructor. allowGCBarriers_ could be either true or false depending on whether AutoAssertOnBarrier is nested.
Backed out for spidermonkey bustage: https://hg.mozilla.org/integration/mozilla-inbound/rev/5fa834fe9b96d1b6b1cf99d14335b0beb1bd3811 https://hg.mozilla.org/integration/mozilla-inbound/rev/8ee921cb1b8512917c15a974a0540ac9da959509 https://hg.mozilla.org/integration/mozilla-inbound/rev/2a4eb2a3f21f531cf3aee827a165e80a9b2d6b72 https://hg.mozilla.org/integration/mozilla-inbound/rev/d38210acd0dcbfc7cdca146c7d4eb44793611859 https://hg.mozilla.org/integration/mozilla-inbound/rev/847976827ad6029e10137ff005ae4df158498757 Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=cf4c2f956c4cf4fd3703620a0752981576e7c567 Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=37280281&repo=mozilla-inbound [task 2016-10-07T19:41:02.840026Z] 19:41:02 INFO - /usr/bin/ccache /home/worker/workspace/build/src/gcc/bin/gcc -m32 -march=pentiumpro -std=gnu99 -o /home/worker/workspace/build/src/obj-firefox/security/nss/lib/libpkix/pkix_pl_nss/pki/pkix_pl_publickey.o -c -Os -gdwarf-2 -fPIC -Di386 -DLINUX2_1 -m32 -pipe -ffunction-sections -fdata-sections -DLINUX -Dlinux -DHAVE_STRERROR -Wall -DNSS_NO_GCC48 -DXP_UNIX -UDEBUG -DNDEBUG -D_REENTRANT -DNSS_NO_INIT_SUPPORT -DUSE_UTIL_DIRECTLY -DNO_NSPR_10_SUPPORT -DSSL_DISABLE_DEPRECATED_CIPHER_SUITE_NAMES -I/home/worker/workspace/build/src/obj-firefox/dist/include/nspr -I/home/worker/workspace/build/src/obj-firefox/dist/include/nspr -I/home/worker/workspace/build/src/obj-firefox/dist/include/nss -I/home/worker/workspace/build/src/obj-firefox/dist/private/nss pkix_pl_publickey.c [task 2016-10-07T19:41:02.965818Z] 19:41:02 INFO - In file included from /home/worker/workspace/build/src/obj-firefox/js/src/Unified_cpp_js_src30.cpp:38:0: [task 2016-10-07T19:41:02.965910Z] 19:41:02 INFO - /home/worker/workspace/build/src/js/src/vm/ArrayBufferObject.cpp: In member function 'void js::ArrayBufferObject::changeViewContents(JSContext*, js::ArrayBufferViewObject*, uint8_t*, js::ArrayBufferObject::BufferContents)': [task 2016-10-07T19:41:02.965965Z] 19:41:02 INFO - /home/worker/workspace/build/src/js/src/vm/ArrayBufferObject.cpp:350:62: error: no matching function for call to 'js::ArrayBufferViewObject::dataPointerUnshared(JS::AutoCheckCannotGC&)' [task 2016-10-07T19:41:02.965991Z] 19:41:02 INFO - uint8_t* viewDataPointer = view->dataPointerUnshared(nogc); [task 2016-10-07T19:41:02.966017Z] 19:41:02 INFO - ^ [task 2016-10-07T19:41:02.966045Z] 19:41:02 INFO - /home/worker/workspace/build/src/js/src/vm/ArrayBufferObject.cpp:350:62: note: candidate is: [task 2016-10-07T19:41:02.966072Z] 19:41:02 INFO - In file included from /home/worker/workspace/build/src/js/src/vm/GlobalObject.h:18:0, [task 2016-10-07T19:41:02.966098Z] 19:41:02 INFO - from /home/worker/workspace/build/src/js/src/jscompartment.h:22, [task 2016-10-07T19:41:02.966124Z] 19:41:02 INFO - from /home/worker/workspace/build/src/js/src/jsweakmap.h:13, [task 2016-10-07T19:41:02.966151Z] 19:41:02 INFO - from /home/worker/workspace/build/src/js/src/builtin/TypedObject.h:11, [task 2016-10-07T19:41:02.966176Z] 19:41:02 INFO - from /home/worker/workspace/build/src/js/src/jsobjinlines.h:18, [task 2016-10-07T19:41:02.966204Z] 19:41:02 INFO - from /home/worker/workspace/build/src/js/src/proxy/ScriptedProxyHandler.cpp:13, [task 2016-10-07T19:41:02.966234Z] 19:41:02 INFO - from /home/worker/workspace/build/src/obj-firefox/js/src/Unified_cpp_js_src30.cpp:2: [task 2016-10-07T19:41:02.966272Z] 19:41:02 INFO - /home/worker/workspace/build/src/js/src/vm/ArrayBufferObject.h:443:14: note: uint8_t* js::ArrayBufferViewObject::dataPointerUnshared(const JS::AutoAssertOnGC&) [task 2016-10-07T19:41:02.966295Z] 19:41:02 INFO - uint8_t* dataPointerUnshared(const JS::AutoAssertOnGC&); [task 2016-10-07T19:41:02.966309Z] 19:41:02 INFO - ^ [task 2016-10-07T19:41:02.966350Z] 19:41:02 INFO - /home/worker/workspace/build/src/js/src/vm/ArrayBufferObject.h:443:14: note: no known conversion for argument 1 from 'JS::AutoCheckCannotGC' to 'const JS::AutoAssertOnGC&' [task 2016-10-07T19:41:02.966371Z] 19:41:02 INFO - gmake[5]: *** [Unified_cpp_js_src30.o] Error 1
Flags: needinfo?(wmccloskey)
Attached patch Make AutoReleaseAssertOnGC (obsolete) (deleted) — Splinter Review
Unfortunately bug 1307296 makes it so that AutoCheckCannotGC needs to be convertable to AutoAssertNoGC. My patch removed that conversion for opt builds. I messed around with a few things, but this is the simplest. It just adds a separate AutoReleaseAssertNoGC. AutoCheckCannotGC still inherits from AutoAssertNoGC (which is a DEBUG assert). If we decide to assert for every AutoCheckCannotGC use then we can remove one of these classes. But I don't feel comfortable doing that.
Attachment #8798719 - Attachment is obsolete: true
Flags: needinfo?(wmccloskey)
Attachment #8799096 - Flags: review?(jcoppeard)
Comment on attachment 8799096 [details] [diff] [review] Make AutoReleaseAssertOnGC Can you look at this Steve? I'd like to get it landed soon.
Attachment #8799096 - Flags: review?(jcoppeard) → review?(sphink)
Attached patch AutoRequireNoGC (deleted) — Splinter Review
This patch implements the abstract class we discussed.
Attachment #8799096 - Attachment is obsolete: true
Attachment #8799096 - Flags: review?(sphink)
Attachment #8799564 - Flags: review?(sphink)
Comment on attachment 8799564 [details] [diff] [review] AutoRequireNoGC Review of attachment 8799564 [details] [diff] [review]: ----------------------------------------------------------------- r=me, but I thought you wanted a release assert version too? Or maybe that was a workaround for something else.
Attachment #8799564 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink] [:s:] from comment #22) > Comment on attachment 8799564 [details] [diff] [review] > AutoRequireNoGC > > Review of attachment 8799564 [details] [diff] [review]: > ----------------------------------------------------------------- > > r=me, but I thought you wanted a release assert version too? Or maybe that > was a workaround for something else. Yeah, that was a workaround. See comment 19.
I'm anxious to get this landed so we can get data, so I landed the fix in comment 25 to eliminate a rooting hazard. Steve, is there a better way I should be doing this? I assume AutoCheckCannotGC doesn't help when you're doing an indirect call?
Flags: needinfo?(sphink)
(In reply to Bill McCloskey (:billm) from comment #16) > I added an assert in the destructor that the flag is still false. Is that > what you meant? I'm not sure what else to assert in the constructor. Ah right, I didn't spot that these could nest. This is great. Sorry I missed your review request.
(In reply to Bill McCloskey (:billm) from comment #26) > I'm anxious to get this landed so we can get data, so I landed the fix in > comment 25 to eliminate a rooting hazard. Steve, is there a better way I > should be doing this? I assume AutoCheckCannotGC doesn't help when you're > doing an indirect call? Right, AutoCheckCannotGC won't work there because the analysis will think the indirect call could GC. Your fix is good. Except that it's redundant with the earlier AutoAssertOnGC. And since you have a cx, you may as well pass it in to prevent it from having to dig through TLS. So I'd replace both the current AutoAssertOnGC and AutoSuppressGCAnalysis with a single JS::AutoSuppressGCAnalysis nogc(cx); I know it reads a little funny. Maybe it should be called AutoSuppressGCAnalysisAndAssertOnGC. But meh.
Flags: needinfo?(sphink)
Bill, do you think it makes sense to backport the allowGCBarriers() asserts from here to Aurora? Bug 1279086 is on Aurora, but this isn't. (Maybe the stronger assertions aren't needed for bug 1279086?)
Yeah, it might make sense. We should at least backport the fixes that affect that bug.
Depends on: 1310351
Depends on: 1310550
I don't think I want to remove the AutoAssertOnGC. AutoSuppressGCAnalysis inherits from AutoAssertNoAlloc, which is a bit different from AutoAssertOnGC. AutoAssertNoAlloc is a DEBUG-only assertion, and it triggers in slightly different circumstances. It's kind of weird that we have two assertions that are so similar, but I guess it kind of makes sense. We can allocate even when GC is suppressed, and AutoAssertNoAlloc will still trigger then. AutoAssertOnGC won't. That probably matters for the assertion we have in ZoneCellIter.
Attachment #8802240 - Flags: review?(sphink)
Comment on attachment 8802240 [details] [diff] [review] Add cx arg to AutoSuppressGCAnalysis Review of attachment 8802240 [details] [diff] [review]: ----------------------------------------------------------------- Hm. Fair enough. We were discussing removing AutoAssertNoAlloc in bug 1310147, but it's true that it's not strictly the same thing. Anyway, for now you may as well leave it.
Attachment #8802240 - Flags: review?(sphink) → review+
Depends on: 1310335
Depends on: 1325813
Backout by wmccloskey@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/4e9c1a72a8ec Back out - Preempt GC to paint (a=backout) https://hg.mozilla.org/integration/mozilla-inbound/rev/0001674978a5 Back out - GC interrupt callbacks (a=backout) https://hg.mozilla.org/integration/mozilla-inbound/rev/7faa70924168 Back out - Add AutoAssertOnBarrier to painting code (a=backout)
I guess I'll re-open this since it's no longer fixed. Discussion of the backout in bug 1328423.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
It seems the backout improved performance on Octane Gameboy, Splay and SplayLatency, according to AWFY.
Assignee: wmccloskey → nobody
Keywords: triage-deferred
Priority: -- → P3
Whiteboard: [fxperf]
This is probably still worth doing - albeit perhaps differently, given the regressions that showed up when this originally landed.
Whiteboard: [fxperf] → [fxperf:p3]
Severity: normal → S3

I'm not sure if this is still relevant, I think GC might be done in smaller slices now than it was 7 years ago.

Performance Impact: --- → ?
Whiteboard: [fxperf:p3]

The calculator isn't really well suited for this issue, but it seems low performance impact to me.

Performance Impact: ? → low
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: