Closed Bug 1373816 Opened 7 years ago Closed 7 years ago

Several 15+ms reflows when loading a facebook group page

Categories

(Core :: General, defect)

55 Branch
x86_64
Windows 10
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mchang, Assigned: arus)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [QRC][QRC_NeedAnalysis])

+++ This bug was initially created as a clone of Bug #1363424 +++ Steps to reproduce: 1. Launch browser. 2. Fill Facebook.com in Navigation Start, then press enter. 3. Login with credentials. 4. Record with Gecko Profile: Tap on a facebook group, and wait to have the first item displayed. 5. Share the profile. Gecko profile: https://perfht.ml/2q0q2Yr Notes: https://docs.google.com/spreadsheets/d/1Kxn850VasyaG_WfRg3pMInW0hbIT8LP7pRPBYTIpdbM/edit?pli=1#gid=679575660 Seeing lots of layout flushes, construction of frames, and reflows here that take quite a while. Daniel, can you please take a look? Thanks!
Flags: needinfo?(dholbert)
Sure, I'll take a look early next week. (Already quite busy for the rest of today.)
First glance: the worst thing I'm seeing in that profile happens at around 7.96 seconds, and it's a JS call to getBoundingClientRect() which forces a 19ms restyle followed immediately by a 20ms reflow. (So the overall getBoundingClientRect() call takes 39ms.) Link: http://bit.ly/2rAupvG It looks like it's some deeply nested frame construction, followed by an extremely deeply nested block-reflow (probably of all the frames that were just constructed).
Summary: 15+ms Reflows during facebook → Several 15+ms reflows when loading a facebook group page
Much of it looks like style system stuff to me; it'd be interesting to know if/how much Stylo improves this. It appears that these logs are from a build that's several weeks old. I see FramePropertyTable in there, which we have replaced now, and lot's of time in bidi resolution, which we've also optimized since then. I'd expect the reflow part to be significantly improved in a recent build...
From comment 3, can you please try profiling again?
Flags: needinfo?(arus)
Assignee: nobody → arus
Assignee: arus → jyavenard
Assignee: jyavenard → nobody
Back to arus to retest with a more recent Nightly build.
Assignee: nobody → arus
Tested again, on Acer Aspire e15, Win 10 Pro, x64 build: 20170620030208 profile link: https://perf-html.io/public/830842928218cd9298c70db69582695fc21e8b07/calltree/?thread=2
Flags: needinfo?(arus)
Blocks: 1363424
No longer depends on: 1363424
From comment 6, I don't see reflows anymore.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
Flags: needinfo?(dholbert)
(In reply to Mason Chang [:mchang] from comment #7) > From comment 6, I don't see reflows anymore. Hmm... I see two long reflows there. If I look at the Timeline view, for content process 2, I see: (1) a 66ms reflow in this region: https://perf-html.io/public/830842928218cd9298c70db69582695fc21e8b07/timeline/?hiddenThreads=&range=42.5154_42.5816&thread=5&threadOrder=0-2-3-5-1-4 THOUGH, strangely, the "Call Tree" view only shows 6ms spent in reflow there (and a long period with no samples in the middle of this range). I wonder if Windows context-switched us out in the middle of our reflow here? (2) a 15ms reflow in this region: https://perf-html.io/public/830842928218cd9298c70db69582695fc21e8b07/timeline/?hiddenThreads=&range=51.4052_51.4206&thread=5&threadOrder=0-2-3-5-1-4 Here there's no gap of missing samples, but the "Call Tree" view only reports 8ms -- a fraction of the time that I'm expecting...
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Having said that, things are *hopefully* OK here, because: - Both of the reflows noted in comment 8 are from the refresh driver ticks and are "interruptible", so we should be able to bail if we can tell they're taking too long... - ehsan says the huge gap of missing samples in the first (66ms) reflow is due to thread starvation -- the sampler thread didn't get a chance to run for that period of time. So we don't know what was happening there. It's even possible (likely?) that we got context-switched out entirely the OS was handling some other process (which is maybe why we didn't get a chance to take advantage of our interruptibleness & bail midway through that 66ms period). Alin, sorry to bother you -- would you mind profiling one more time? I'm curious whether that "66ms reflow" was really something expensive in reflow (and our sampler thread just didn't get to observe it), or if it was just a fluke with the OS interrupting us to do some background work...
Flags: needinfo?(arus)
Alin, Please profile again based on Daniel's comment 9.
Tested again, on Acer Aspire e15, Win 10 Pro, x64 nightly build: 20170726030207 profile link: https://perfht.ml/2eNLCMk https://perf-html.io/public/371f3f0e2de5840fdfea78d46e35967044fe1e45/calltree/?hiddenThreads=&thread=2&threadOrder=0-2-3-4-6-1-5 Let me know if further profiles are needed here. Regards, Alin
Flags: needinfo?(arus)
Depends on: 1384669
Thanks, Alin! The reflows in that profile are all pretty short -- definitely none on the order of 60ms anymore, so I think ehsan's suspicion (of the OS context-switching away from us) was on target. In this latest profile, we spend a *total* of 64ms in reflow, over that whole 9.5second profile, with the longest contiguous spike being a 20ms reflow near the end. And the next-longest reflow is 13ms. The reflows mostly come from calls to .getBoundingClientRect() and .offsetWidth on DOM elements. (Both of those APIs cause layout flushes.) I noticed one possible tiny win, from looking at inverted stacks, and filed bug 1384669. Beyond that, I think this is WORSKFORME as far as what was reported in comment 0 goes ("lots of layout flushes, construction of frames, and reflows here that take quite a while.")
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.