1341474 - FinishAnyIncrementalGC can be really expensive

Reporter

Description

8 years ago

Olli, Bill, I think we once discussed this but I don't remember what the conclusion was. But I keep seeing this coming up in profiles, so filing a bug about it in the hope that we can do something better. This profile shows this function taking 1.2 seconds in the parent process's main thread: <https://perfht.ml/2l5CkJT>

(no longer active)

Reporter

Updated

8 years ago

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 1

8 years ago

This means GC is too slow, which also means we have too much JS. (IMO we should try to convert more code to C++ [or, if one solves the memory management when using Rust, then Rust would be fine too].)

One issue I've seen in the parent process is that since almost all the JS uses the system zone, we end up collecting the whole world almost all the time. I wonder if we could split the system zone into several zones: some zone for rarely used .jsms, and then perhaps a separate zone for each top-level window.
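To illustrate the zone-splitting idea in this comment, here is a toy model, not SpiderMonkey code. It assumes that a zone GC visits only the zone that triggered it and that a full GC visits every zone; cross-zone edges are ignored. The function name `traced_mb`, the zone names, and all sizes are invented for the example.

```python
def traced_mb(zones, triggering_zone, full_gc=False):
    """Return how many MB the collector visits in this toy model.

    zones: dict mapping a (hypothetical) zone name to its size in MB.
    A zone GC visits only the triggering zone; a full GC visits all zones.
    """
    if full_gc:
        return sum(zones.values())
    return zones[triggering_zone]


# Invented sizes: everything in one system zone vs. the same heap split up.
single_zone = {"system": 700}
split_zones = {"rare-jsms": 300, "window-1": 200, "window-2": 200}

# With a single zone, any GC trigger visits the whole heap.
whole_world = traced_mb(single_zone, "system")

# With split zones, a trigger in one window visits only that window's zone,
# unless the collection is escalated to a full GC.
one_window = traced_mb(split_zones, "window-1")
escalated = traced_mb(split_zones, "window-1", full_gc=True)
```

In this model the benefit of splitting disappears whenever a collection becomes a full GC, so the split only helps to the extent that zone GCs are the common case.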

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 2

8 years ago

FWIW, per telemetry (which is showing data only up to Feb 5 atm), this happens very rarely. 0.01% of the CCs end up finishing iGC.

Andrew McCreight [:mccr8]

Comment 3

8 years ago

Basically, the CC wants to run every X seconds. If an IGC takes a really long time, then the CC synchronously finishes off the GC so it can run, which can take a long time. We could increase X a little, which might help, though it can also cause memory to increase.

Also note that sometimes the GC takes a long time due to problems in the GC. For instance, Terrence had an issue in his own browsing session where some XPConnect gunk had to get traced in every GC slice, and there was a lot of it, so almost all of the GC slice was spent on that overhead, and practically no forward progress was being made. Obviously, if we can find and fix issues like that, it would be better than poking around the margins with heuristic tweaks.

One heuristic I've thought would be handy for things like this would be to increase the slice time as a GC/CC progresses longer. It would be better to have, say, four 250ms slices than a single one-second slice.
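The slice-time heuristic from this comment can be sketched as a small simulation. This is an illustration under stated assumptions, not Gecko's scheduler: `slice_budget_ms`, `simulate`, and every parameter value (10ms base slice, 250ms cap, 5-second ramp, 100ms between slices) are hypothetical, and elapsed time is approximated by the slice start times only.

```python
def slice_budget_ms(elapsed_ms, base_ms=10.0, max_ms=250.0, ramp_ms=5000.0):
    """Hypothetical policy: the slice budget grows linearly from base_ms
    to max_ms as the incremental GC has been running for longer."""
    fraction = min(max(elapsed_ms / ramp_ms, 0.0), 1.0)
    return base_ms + (max_ms - base_ms) * fraction


def simulate(total_work_ms, deadline_ms, interval_ms, budget_fn):
    """Return the list of pause lengths for one incremental GC.

    A slice runs every interval_ms. If work remains at deadline_ms
    (standing in for the CC wanting to run), the remainder is finished
    in one synchronous pause, as the comment describes.
    """
    pauses = []
    remaining = total_work_ms
    now = 0.0
    while remaining > 0 and now < deadline_ms:
        pause = min(budget_fn(now), remaining)
        pauses.append(pause)
        remaining -= pause
        now += interval_ms
    if remaining > 0:
        pauses.append(remaining)  # forced synchronous finish
    return pauses


# Fixed 10ms slices cannot finish 1000ms of work before the deadline,
# so the last pause is the forced finish of everything left over.
fixed = simulate(1000.0, 5000.0, 100.0, lambda elapsed: 10.0)

# Growing slices finish the same work before the deadline in this setup.
growing = simulate(1000.0, 5000.0, 100.0, slice_budget_ms)

worst_fixed = max(fixed)
worst_growing = max(growing)
```

With these made-up numbers, the fixed policy ends in one long forced pause, while the growing policy trades that for several medium-length slices, which is the trade-off the comment argues for.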

(no longer active)

Reporter

Comment 4

8 years ago

(In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment #1)
> One issue I've seen in parent process is that since almost all the JS uses
> system zone, we end up collecting the whole world almost all the time.

FWIW anecdotally, I do see more GC related issues in the parent process than content processes, and I always wondered why that is...

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 5

8 years ago

(In reply to :Ehsan Akhgari from comment #4)
> (In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment
> #1)
> > One issue I've seen in parent process is that since almost all the JS uses
> > system zone, we end up collecting the whole world almost all the time.
>
> FWIW anecdotally, I do see more GC related issues in the parent process than
> content processes, and I always wondered why that is...

This is still really surprising. Typically there's not *that* much chrome JS. It would be good to dig into why your GC times are so horrible. Can you post about:memory for the chrome process? The next step would be to collect GC statistics.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 6

8 years ago

I see
434,651,456 B (39.44%) -- js-non-window
in parent process.

Bug 1287330 should help quite a bit in cases where user has many un-restored tabs. But even after that we have tons of system zone compartments. I see 455 such compartments.

(no longer active)

Reporter

Comment 7

8 years ago

Attached file about:memory (deleted)

(In reply to Bill McCloskey (:billm) from comment #5)
> (In reply to :Ehsan Akhgari from comment #4)
> > (In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment
> > #1)
> > > One issue I've seen in parent process is that since almost all the JS uses
> > > system zone, we end up collecting the whole world almost all the time.
> >
> > FWIW anecdotally, I do see more GC related issues in the parent process than
> > content processes, and I always wondered why that is...
>
> This is still really surprising. Typically there's not *that* much chrome
> JS. It would be good to dig into why your GC times are so horrible. Can you
> post about:memory for the chrome process? The next step would be to collect
> GC statistics.

Not sure why. Here's an about:memory. What type of GC statistics did you have in mind?

(no longer active)

Reporter

Comment 8

8 years ago

(In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment #6)
> I see
> 434,651,456 B (39.44%) -- js-non-window
> in parent process.
>
> Bug 1287330 should help quite a bit in cases where user has many un-restored
> tabs.

Which I do. Another thing that worries me is the high amount of heap-unclassified.

(no longer active)

Reporter

Updated

8 years ago

Depends on: 1287330

Dão Gottwald [:dao]

Comment 9

8 years ago

(In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment #6)
> I see
> 434,651,456 B (39.44%) -- js-non-window
> in parent process.
>
> Bug 1287330 should help quite a bit in cases where user has many un-restored
> tabs.

More work needs to be done, see bug 906076 comment 215.

Depends on: lazytabs
No longer depends on: 1287330

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 10

8 years ago

Ugh, okay. I guess the unrestored tabs will do it. No need for a GC log. You have about 700MB of JS in the parent process, and marking takes roughly 1ms/MB. So if you include sweep time, that's not unreasonable.

I think the unrestored tabs should each be in their own zone. But the TabChildGlobal will live in the system zone. And you might have hit a full (non-zone) GC anyway.
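The back-of-the-envelope estimate in this comment can be written out as follows. This is only the arithmetic from the thread: the 700MB heap and the 1ms/MB marking rate come from this comment, and the 1.2-second figure comes from the profile in the description. The function name is hypothetical, and treating the whole 1.2 seconds as mark plus sweep is an assumption made for the illustration.

```python
def estimate_mark_ms(heap_mb, mark_ms_per_mb=1.0):
    """Estimated marking time for a heap of heap_mb megabytes,
    using the rough 1ms/MB rate quoted in the comment."""
    return heap_mb * mark_ms_per_mb


heap_mb = 700.0        # approximate JS heap in the parent process (from the comment)
observed_ms = 1200.0   # time reported in the profile (from the description)

mark_ms = estimate_mark_ms(heap_mb)

# Assumption: everything not spent marking was spent sweeping.
implied_sweep_ms = observed_ms - mark_ms
```

Under that assumption, marking accounts for roughly 700ms and the remaining 500ms would be attributed to sweeping, which is consistent with the comment's conclusion that the observed time is "not unreasonable" for a heap of this size.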

Dão Gottwald [:dao]

Comment 11

7 years ago

Is this bug still valid / useful / actionable?

Flags: needinfo?(ehsan)

(no longer active)

Reporter

Comment 12

7 years ago

I haven't seen this in a while, and I don't think the bug in its current form is actionable any more. I'd be happy to reopen if I saw it again.

Status: NEW → RESOLVED

Closed: 7 years ago

Flags: needinfo?(ehsan)

Resolution: --- → INCOMPLETE

Categories

(Core :: XPCOM, defect)

People

(Reporter: ehsan.akhgari, Unassigned)

Security

(public)

Attachments

(1 file)
