Closed Bug 1258172 Opened 9 years ago Closed 9 years ago

Super janky scrolling in inverted allocations tree

Categories

(DevTools :: Memory, defect, P2)

defect

Tracking

(firefox48 fixed)

RESOLVED FIXED
Firefox 48
Tracking Status
firefox48 --- fixed

People

(Reporter: fitzgen, Assigned: fitzgen)

References

(Blocks 1 open bug)

Details

Attachments

(4 files)

STR: * Open memory tool on irccloud.com * Enable recording allocations * Refresh * Take snapshot * View -> "aggregates" * Group by -> "inverted call stack" Profile: http://media.fitzgeraldnick.com/perf-profile-of-memory-profiling-irccloud.json Tons of time in minor gc, and some in major gc too but not much. I think this is a fishy amount of minor gc'ing, its just too much time.
Has STR: --- → yes
~40% of time in nursery collections ~16% of time in major GC
Repeatedly getting ~10ms nursery collections inside Tree#_dfsFromRoots. Nursery collections shouldn't usually be more than 1-2ms or so. ni myself to re-test with JS_GC_PROFILE_NURSERY=1 env variable to see where nursery time is being spent
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(nfitzgerald)
New STR: * Open memory tool * View -> "aggregate" * Group by -> "Inverted call stack" * Import this snapshot: http://media.fitzgeraldnick.com/dumping-grounds/irccloud-with-stacks.fxsnapshot ---------------------------------------------------- If I change Tree#_dfsFromRoots' for/of loop into a C-style for loop, then performance improves quite a bit, but is still fairly suboptimal. This is worrying because the temporary iteration protocol objects are /exactly/ the kinds of objects that generational GC is supposed to make "almost free" to allocate.
Here is the results with `JS_GC_PROFILE_NURSERY=5000`: > MinorGC: Reason PRate Size Time canIon mkVals mkClls mkSlts mcWCll mkGnrc ckTbls mkRntm mkDbgr clrNOC collct tenCB swpABO updtIn frSlts clrSB sweep resize pretnr logPtT > MinorGC: FULL_STORE_BUFFER 1.1% 9 12636 0 129 0 106 1 11650 0 118 0 2 88 1 10 3 0 10 517 1 0 > MinorGC: FULL_STORE_BUFFER 11.8% 16 5554 0 431 283 152 13 660 0 102 0 1 3360 1 14 4 5 15 510 0 0 > MinorGC: FULL_STORE_BUFFER 0.7% 14 6064 0 151 1 86 1 4883 0 60 0 1 48 0 4 2 0 9 720 98 0 > MinorGC: FULL_STORE_BUFFER 11.6% 16 5671 0 246 201 76 8 364 0 93 0 1 3532 3 15 7 9 10 1105 0 0 > MinorGC: FULL_STORE_BUFFER 1.4% 7 23299 0 127 1 74 0 22531 0 87 1 0 84 0 5 2 0 12 374 1 0 > MinorGC: FULL_STORE_BUFFER 1.1% 9 21642 0 102 0 82 0 20882 0 75 0 1 59 1 3 2 0 10 424 1 0 > MinorGC: FULL_STORE_BUFFER 1.1% 9 5353 0 105 0 94 1 4620 0 55 0 1 39 0 3 1 0 9 423 0 0 > MinorGC: FULL_STORE_BUFFER 1.1% 9 5432 1 106 0 73 1 4689 1 60 1 0 44 1 4 2 0 8 441 0 0 > MinorGC: FULL_STORE_BUFFER 1.1% 9 9130 0 117 1 87 1 8330 0 76 1 0 58 0 4 3 0 10 442 0 0 > MinorGC: FULL_STORE_BUFFER 1.1% 9 21082 0 103 0 82 1 20320 0 77 1 0 61 0 4 1 0 10 420 0 0 > MinorGC: FULL_STORE_BUFFER 1.1% 9 10835 0 120 11 89 1 10074 0 63 0 0 40 1 4 1 0 9 420 1 0 > MinorGC: FULL_STORE_BUFFER 1.1% 9 20792 0 129 0 96 1 19975 0 78 0 1 58 0 4 2 0 9 437 1 0 > MinorGC: FULL_STORE_BUFFER 1.1% 9 169353 0 106 0 92 1 168437 0 102 1 1 93 1 5 4 0 13 494 0 0 Looks like mkGnrc is dominating, which is similar to bug 1237058.
Seems to involve a lot of CCW wrapping, just like bug 1237058. Seems to be the same nursery perf cliff.
Flags: needinfo?(nfitzgerald)
Terrence, I remember you telling me how using stable hashing for the CCW map (bug 1225577) could help with reducing mkGnrc time, but I have completely forgotten why/how that might be. Can you re-enlighten me? Thanks!
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #6) > Terrence, I remember you telling me how using stable hashing for the CCW map > (bug 1225577) could help with reducing mkGnrc time, but I have completely > forgotten why/how that might be. Can you re-enlighten me? Thanks! Terrence: ^
Flags: needinfo?(terrence)
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #6) > Terrence, I remember you telling me how using stable hashing for the CCW map > (bug 1225577) could help with reducing mkGnrc time, but I have completely > forgotten why/how that might be. Can you re-enlighten me? Thanks! It would help because we're moving the update from the generic buffer to the normal store buffer. This allows for deduplication and avoids a ton of complexity in the update loop. Given that inlining the normal update loop saved us 60% of the time we were spending there, I think this has the potential to be a huge win. I've already got patches doing this. I'll dust those off and re-request review today.
Flags: needinfo?(terrence)
Note: I don't plan on landing this until I can confirm that Terrence's ongoing nursery + CCW work helps in this Real World test case.
Attachment #8734121 - Flags: review?(jimb)
Assignee: nobody → nfitzgerald
Status: NEW → ASSIGNED
Comment on attachment 8734121 [details] [diff] [review] Use C-style for loops instead of for/of to cut down on the number of CCWs when using the Tree component Review of attachment 8734121 [details] [diff] [review]: ----------------------------------------------------------------- This looks fine to me, with no changes. But: - If you don't know that it's worth it to pull `length` out of the loop like that, it's pretty distracting. - Some of these could become `.forEach` calls and not be quite so ugly. Up to you.
Attachment #8734121 - Flags: review?(jimb) → review+
Terrence, we're getting a little close to uplift, so I want to land this before then. We can always locally back out this patch to retest your nursery + wrappers changes.
Keywords: checkin-needed
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 48
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: