Open Bug 1057530 (GC.60fps)
Opened 10 years ago · Updated 2 years ago

[meta] Reduce our GC max-pause

Categories: Core :: JavaScript: GC, defect, P3
Status: NEW
People: Reporter: terrence; Unassigned
References: Depends on 9 open bugs; Blocks 1 open bug
Keywords: meta
Whiteboard: [platform-rel-Games]
Attachments: 1 file (deleted, text/plain)
Right now, on a fresh profile with everything at defaults, our max pause is ~40-50ms, and it appears to come entirely in the last slice. The GC dump is attached. Most importantly, the last slice's times break down like this:
Wait Background Thread: 5.8ms
Mark: 2.7ms
  Mark Roots: 1.6ms
Sweep: 30.4ms
  Mark During Sweeping: 8.1ms
    Mark Weak: 2.3ms
    Mark Gray: 5.5ms
    Mark Gray and Weak: 0.2ms
  Finalize Start Callback: 0.5ms
  Sweep Atoms: 3.9ms
  Sweep Compartments: 11.2ms
    Sweep Discard Code: 1.1ms
    Sweep Tables: 7.0ms
      Sweep Cross Compartment Wrappers: 0.8ms
      Sweep Base Shapes: 3.2ms
      Sweep Initial Shapes: 1.7ms
      Sweep Type Objects: 0.6ms
    Discard Analysis: 2.9ms
      Discard TI: 1.0ms
      Sweep Types: 1.9ms
  Sweep Object: 0.8ms
  Sweep String: 0.3ms
  Sweep Script: 0.6ms
  Sweep Shape: 1.9ms
  Sweep JIT code: 0.1ms
  Finalize End Callback: 1.7ms
  Deallocate: 0.5ms
End Callback: 0.2ms
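As a sanity check, the top-level phases in the breakdown above should sum to roughly the observed ~40-50ms pause. A quick sketch (the data structure is hypothetical; the values are copied from the dump):

```javascript
// Sum the top-level phases of the last slice from the dump above.
// (Hypothetical phase map for illustration; sub-phases are already
// included in their parents' totals, so only top levels are summed.)
const topLevelPhases = {
  "Wait Background Thread": 5.8,
  "Mark": 2.7,          // includes Mark Roots
  "Sweep": 30.4,        // includes all Sweep sub-phases
  "End Callback": 0.2,
};

const totalMs = Object.values(topLevelPhases).reduce((a, b) => a + b, 0);
console.log(totalMs.toFixed(1)); // → "39.1", i.e. the ~40ms pause reported
```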
Comment 1•10 years ago (Reporter)
Also, this is on a fast machine with plenty of memory and not on any particular site.
Comment 2•10 years ago
(In reply to Terrence Cole [:terrence] from comment #1)
> Also, this is on a fast machine with plenty of memory and not on any
> particular site.
Win 8.1 laptop, i7-4500u, 8GB RAM, m-c built today (opt build without debug symbols).
Also, the telemetry histograms support these values: e.g. on Firefox 34, GC_MAX_PAUSE_MS's median is 55.09ms: http://telemetry.mozilla.org/#filter=nightly%2F34%2FGC_MAX_PAUSE_MS&aggregates=multiselect-all!Submissions!Mean!5th%20percentile!25th%20percentile!median!75th%20percentile!95th%20percentile&evoOver=Builds&locked=true&sanitize=true&renderhistogram=Graph
Comment 3•10 years ago
I think it's important to decide up front whether the goal is to improve the empty profile use case or the lots-of-tabs users. Part of the problem with GC is that we don't have test cases that are at all representative of how users actually use the GC. areweslimyet.com is probably the closest, but it takes forever to run.
We could also consider incorporating more GC data into telemetry.
Comment 4•10 years ago
The goal is, IMO, to have the majority of GCs to not have slices above 10ms. Right now, from my experiments (using the patch from bug 1019611) that every 5-30ms we get a 30-60ms max pause.
I used a clean profile with various scenarios, e.g. a constant simple animation on http://testufo.com, just opening and closing tabs, the BananaBread WebGL Mozilla demo, scrolling up and down on cnn.com, and others. The pattern is quite consistent: we get a max pause of ~40ms, typically every 10-30s.
Each of those - if it happens during animation - is a clearly visible dropped frame. The testufo.com site also identifies it as "not smooth".
I didn't test with multiple tabs, but I'd say that if it can't keep a simple animation consistently smooth with a single tab on a clean profile, then what chance do we have with multiple tabs and dirty profiles?
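Pauses of this size can be detected from content JavaScript by watching for gaps between requestAnimationFrame timestamps. A minimal sketch (the threshold and function name are illustrative, not testufo.com's actual method):

```javascript
// Count dropped frames from a series of rAF timestamps (ms).
// A gap larger than ~1.5 frame periods (illustrative threshold)
// means at least one vsync was missed.
function countDroppedFrames(timestamps, fps = 60) {
  const frameMs = 1000 / fps;
  let dropped = 0;
  for (let i = 1; i < timestamps.length; i++) {
    const gap = timestamps[i] - timestamps[i - 1];
    if (gap > 1.5 * frameMs) {
      // e.g. a ~40ms GC pause at 60fps shows up as a ~50ms gap,
      // i.e. two missed frames
      dropped += Math.round(gap / frameMs) - 1;
    }
  }
  return dropped;
}

// In a browser you would feed it live timestamps:
// const ts = [];
// function tick(t) { ts.push(t); requestAnimationFrame(tick); }
// requestAnimationFrame(tick);
```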
Comment 5•10 years ago
(In reply to Avi Halachmi (:avih) from comment #4)
> The goal is, IMO, to have the majority of GCs to not have slices above 10ms.
> Right now, from my experiments (using the patch from bug 1019611) that every
> 5-30ms we get a 30-60ms max pause.
Typo: every 5-30s there's a GC pause of 30-60ms, which results in visibly dropped frame[s] if it happens while animating or gaming.
Comment 6•10 years ago
(In reply to Avi Halachmi (:avih) from comment #4)
> The goal is, IMO, to have the majority of GCs to not have slices above 10ms.
If we wanted to really define a _good_ goal, I'd say that slices should typically be no more than 5ms and, if possible, spread further apart (e.g. maybe every 5-10 frames rather than every frame).
10ms taken out of a frame could still often cause a frame drop IMO, since it's 60% of the entire 16.7ms frame time, so it squeezes the frame's work quite a bit.
FWIW, I was also observing CC max slices (bug 1019101), which happen a bit more frequently than GCs (typically every 6-7s) and are shorter: typically 1-7ms, averaging about 4ms I think. On most of the CCs I couldn't notice frame drops, and testufo.com also didn't detect "stutter" during CCs.
Comment 7•10 years ago
OK. I have to stress, though, that 5ms is extremely ambitious. I worked on a research Java GC (Metronome) in 2006 and it managed to achieve pause times in that range, but the throughput was really bad. Another system, the Java GC from Azul, is "pauseless", but it requires OS support. Also, Java GC is a lot easier than JavaScript GC, especially in a browser. I think that to get even close to 10ms would take a year or more of work.
A more reasonable goal would be to try to make GCs less frequent (with better scheduling) and smaller (per-tab rather than for all the tabs). Also, thinking ahead, electrolysis will split apart the JS heaps for the parent and the child processes. As a consequence, GCs in both processes will get faster.
Comment 8•10 years ago
Yeah, I didn't pretend to be able to estimate the difficulty of a 5ms goal. But both my understanding of animations and my observations of the CC pauses tell me that 5ms is a good goal to strive for.
My estimate is that 10ms taken out of a frame is likely to drop frames frequently enough, especially if we're CPU-intensive as in gaming, and that consecutive frames with a 10ms pause each are even more likely to (though according to Terrence we don't typically have consecutive >10ms slices; he gave me this example for slices: [10, 1, 2, 1, 1, 3, 2, 2, 1, 39]).
I do agree, however, that having a frame or two dropped every 2 minutes is way better than every 18s, which is what I observed.
Another approach, which again I can't assess its difficulty, is to make the GCs/slices much more frequent, but only if you can make sure that typically all of them are very quick (say, 1-2 ms for the 75th percentile, and not more than 5ms for the rest up to 99th percentile).
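A percentile goal like this can be checked mechanically against a recorded list of slice times. A small sketch (nearest-rank percentile; the function name is illustrative, and the sample slice times are the ones Terrence gave above):

```javascript
// Nearest-rank percentile of an array of slice times (ms).
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Terrence's example slice times from comment 8:
const slices = [10, 1, 2, 1, 1, 3, 2, 2, 1, 39];
console.log(percentile(slices, 75)); // → 3
console.log(percentile(slices, 99)); // → 39
// Against the proposed goals (<=2ms at p75, <=5ms at p99),
// this example GC misses both.
```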
Comment 9•10 years ago (Reporter)
Useful GC benchmarking links for Avi:
http://v8.googlecode.com/svn/branches/bleeding_edge/benchmarks/spinning-balls/index.html
http://29a.ch/2010/6/2/realtime-raytracing-in-javascript
https://videos.cdn.mozilla.net/uploads/mozhacks/flight-of-the-navigator/
(In reply to Avi Halachmi (:avih) from comment #8)
> Another approach, which again I can't assess its difficulty, is to make the
> GCs/slices much more frequent, but only if you can make sure that typically
> all of them are very quick (say, 1-2 ms for the 75th percentile, and not
> more than 5ms for the rest up to 99th percentile).
I don't understand this. What do you mean by more frequent? We're doing a slice at the top of each frame right now so more frequent would not be any different from a longer slice time.
Comment 10•10 years ago
(In reply to Terrence Cole [:terrence] from comment #9)
> (In reply to Avi Halachmi (:avih) from comment #8)
> > Another approach, which again I can't assess its difficulty, is to make the
> > GCs/slices much more frequent, but only if you can make sure that typically
> > all of them are very quick (say, 1-2 ms for the 75th percentile, and not
> > more than 5ms for the rest up to 99th percentile).
>
> I don't understand this. What do you mean by more frequent? We're doing a
> slice at the top of each frame right now so more frequent would not be any
> different from a longer slice time.
Obviously more frequent than once per frame would have a negative effect. I suggested an idea where the next GC would start sooner, in the hope that its slices might be shorter. I can't assess how practical or useful it would be.
Comment 11•10 years ago (Reporter)
(In reply to Avi Halachmi (:avih) from comment #10)
> Obviously more frequent than once a frame would have negative effect. I
> suggested an idea where the next GC would start sooner in the hope that its
> slices might be shorter. I can't assess how practical or useful it would be.
Ah, no. The minimum length of the longest slices is proportional to the retained size, not the freed size, so this would be ineffective.
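The point about retained size can be illustrated with a toy model in which marking is incremental (budgeted slices) but the heavy final slice does work proportional to the live set, so collecting more often, and thus freeing less garbage each time, does not shrink it. This is a sketch of the scheduling argument, not SpiderMonkey's actual collector:

```javascript
// Toy incremental GC: returns the work done per slice (abstract
// "work units", 1 unit ≈ 1 object visited).
function simulateGC(liveObjects, deadObjects, markBudget) {
  const sliceWork = [];
  // Incremental marking: only live objects are traced, in budgeted slices.
  let remaining = liveObjects;
  while (remaining > 0) {
    const slice = Math.min(markBudget, remaining);
    sliceWork.push(slice);
    remaining -= slice;
  }
  // Final slice: modeled as proportional to the retained set (tables,
  // shapes, compartments). deadObjects is intentionally unused here:
  // in this model, the freed volume doesn't drive the long slice.
  sliceWork.push(liveObjects);
  return sliceWork;
}

// Freeing vastly more garbage leaves the longest slice unchanged:
const a = simulateGC(1000, 100, 200);
const b = simulateGC(1000, 100000, 200);
console.log(Math.max(...a) === Math.max(...b)); // → true (both 1000)
```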
Updated•10 years ago
Depends on: ParallelSweeping
Updated•10 years ago
Depends on: ConcurrentMarking
Updated•10 years ago
Alias: GC.60fps
Updated•10 years ago
No longer depends on: ConcurrentMarking
Updated•9 years ago
Depends on: ConcurrentGC
Updated•8 years ago
Whiteboard: [platform-rel-Games]
Updated•8 years ago
platform-rel: --- → ?
Updated•8 years ago
platform-rel: ? → ---
Updated•3 years ago
Assignee: terrence.d.cole → nobody
Status: ASSIGNED → NEW
Comment 14•2 years ago
In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.
Severity: critical → --