Closed Bug 1377253 Opened 7 years ago Closed 6 years ago

Reflow is slow on twitter direct messages (and other sites), with lots of reflow time spent in nested flex reflow

Categories

(Core :: Layout, defect, P2)

Product:

Component:

Type:

defect

Priority:

P2

Severity:

normal

Tracking

()

Status:

RESOLVED WORKSFORME

Performance Impact

high

Tracking Flags:

Tracking

Status

platform-rel

---

?

firefox57

---

fix-optional

firefox58

---

affected

People

(Reporter: dholbert, Assigned: dholbert)

References

(Blocks 6 open bugs)

Details

(Keywords: perf, Whiteboard: [platform-rel-Twitter])

Daniel Holbert [:dholbert]

Assignee

Description

•

7 years ago

This is a helper bug for bug 1373023. When you type characters on Twitter, we spend some time in synchronous reflows (triggered by each character that you type). Profile, from bug 1373023 comment 5: http://bit.ly/2sBRXNT It looks like each PresShell::DoReflow call there is ~28ms long, with most of the time spent in nested flexbox reflow. The textbox does have four layers of flex container ancestors (with other non-flex-container ancestors in between the layers, which I'll disregard for the moment): > <div id="dm_dialog" class="DMDialog modal-container"> > <div class="DMActivity DMConversation js-ariaDocument u-chromeOverflowFix DMActivity--open"> > <div class="DMActivity-container"> > <form class="DMComposer tweet-form"> So that's why we're nested several recursive levels into nsFlexContainerFrame::Reflow stuff. As I recall, we're not particularly careful about tracking dirtiness for specific flex items. If a flex container has one dirty descendant & receives a call to Reflow as a result, it just reflows all of its descendants, because the new layout information from the dirty descendant can influence the results of the flexbox layout algorithm & the sizes of all the other flex items. We probably can & should cache some more state between reflows, so that we can reflow dirty flex-items-and-their-subtrees more selectively & see if their flex-relevant measurements have changed in a way that'd influence the flex layout algorithm. (And if they haven't changed, then we can skip redoing flex layout & skip reflowing all the other flex items.)

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Summary: Reflow is slow on twitter direct messages → Reflow is slow on twitter direct messages (with reflow time mostly spent in nested flex reflow)

Daniel Holbert [:dholbert]

Assignee

Comment 1

•

7 years ago

(In reply to Daniel Holbert [:dholbert] from comment #0) > If a flex container has one dirty descendant & > receives a call to Reflow as a result, it just reflows all of its > descendants, (It's actually not quite this bad -- the whole subtree won't necessarily get reflowed. For example, if one of the descendants is a block and that block isn't marked as dirty, then I'm pretty sure that block won't get reflowed or will have a trivial reflow. But, each flex item will get an opportunity to reflow itself, which may or may not be fast depending on what sort of container it is & what dirty bits are set.)

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Flags: needinfo?(dholbert)

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Blocks: FastReflows

Flags: needinfo?(dholbert)

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Assignee: nobody → dholbert

Status: NEW → ASSIGNED

Whiteboard: [qf]

Comment hidden (obsolete)

Comment hidden (obsolete)

Comment hidden (obsolete)

Daniel Holbert [:dholbert]

Assignee

Comment 5

•

7 years ago

Here's a profile I took locally, using latest Nightly on my Linux64 laptop, when typing (1 character per second) into the text field for a DM conversation between me & jwatt (with a large paragraph already present in the textbox before I started profiling here): https://perf-html.io/public/9aedaaa3b6a5cb8ef6c738966c47a136b8d09bed/calltree/?hiddenThreads=&search=reflow&thread=3&threadOrder=0-2-3-1 (Note that this profile doesn't have the fix for bug 1375491, which will reduce the frequency of layout-flushes here -- that's on inbound and will be in tomorrow's Nightly, I assume.) The layout flushes in my profile are a bit shorter than the ones in overholt's profile from comment 0 -- around 10ms per character press. (This is probably just due to differing amounts of content & machine capability.)

Flags: needinfo?(dholbert)

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Blocks: 1376536

Naveed Ihsanullah [:naveed]

Updated

•

7 years ago

Whiteboard: [qf] → [qf:p1]

Astley Chen (inactive)

Updated

•

7 years ago

Priority: -- → P1

Daniel Holbert [:dholbert]

Assignee

Comment 6

•

7 years ago

Here's another profile from today's Nightly, now with 0.1ms samples (for better resolution). I typed 9 characters into the textbox for the same twitter conversation with jwatt, with a long "blah blah" paragraph pre-entered into the textbox before my typed characters just in case it matters. https://perfht.ml/2t832cV (already filtered for "reflow") The layout flushes for each of my 9 keypresses are the "double humps" of reflow activity (and there are 4 other brief reflows from JS calls to offsetWidth). Each keypress seems to trigger 25-30ms of reflow activity, in this profile.

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Summary: Reflow is slow on twitter direct messages (with reflow time mostly spent in nested flex reflow) → Reflow is slow on twitter direct messages (and other sites), with lots of reflow time spent in nested flex reflow

Daniel Holbert [:dholbert]

Assignee

Comment 7

•

7 years ago

Calling this qf:p2 since I haven't gotten to it yet & risk tolerance is ramping down, and I don't think I'll have this ready in time for 57 given that. p2 -> to do right after 57.

Whiteboard: [qf:p1] → [qf:p2]

Emma Humphries ☕️🎸🧞‍♀️✨ (she/they) [:emceeaich] (Pacific Time) use needinfo

Updated

•

7 years ago

Keywords: stale-bug

Jet Villegas (inactive)

Updated

•

7 years ago

Keywords: stale-bug

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Blocks: 1393120

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Blocks: 1367004

Jet Villegas (inactive)

Updated

•

7 years ago

status-firefox57: --- → fix-optional

Astley Chen (inactive)

Updated

•

7 years ago

status-firefox58: --- → affected

Priority: P1 → P2

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1375032

Firefox Bug Husbandry Bot

Updated

•

7 years ago

Keywords: perf

Brian Grinstead [:bgrins]

Updated

•

7 years ago

Blocks: 1408190

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Whiteboard: [qf:p2] → [qf:p1]

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [qf:p1] → [qf:i60][qf:p1]

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [qf:i60][qf:p1] → [qf:f60][qf:p1]

Adam Stevenson [:adamopenweb]

Updated

•

7 years ago

platform-rel: --- → ?

Whiteboard: [qf:f60][qf:p1] → [qf:f60][qf:p1] [platform-rel-Twitter]

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Blocks: 1422367

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [qf:f60][qf:p1] [platform-rel-Twitter] → [qf:f61][qf:p1] [platform-rel-Twitter]

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Blocks: 1438768

Emma Humphries ☕️🎸🧞‍♀️✨ (she/they) [:emceeaich] (Pacific Time) use needinfo

Updated

•

7 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1439394

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Blocks: flexbox-perf-issues

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Depends on: 1282821

Daniel Holbert [:dholbert]

Assignee

Updated

•

7 years ago

Depends on: 1374540
No longer depends on: 1282821

(no longer active)

Comment 8

•

7 years ago

Hi Daniel, I was wondering if this is still on your radar to fix? thanks!

Flags: needinfo?(dholbert)

Mike Conley (:mconley) (:⚙️)

Updated

•

6 years ago

Whiteboard: [qf:f61][qf:p1] [platform-rel-Twitter] → [qf:f64][qf:p1] [platform-rel-Twitter]

Daniel Holbert [:dholbert]

Assignee

Updated

•

6 years ago

Blocks: 1425106

Daniel Holbert [:dholbert]

Assignee

Updated

•

6 years ago

Blocks: 1465205

Jean Gong :jgong

Updated

•

6 years ago

Whiteboard: [qf:f64][qf:p1] [platform-rel-Twitter] → [qf:p1:f64][platform-rel-Twitter]

Bobby Holley (:bholley)

Updated

•

6 years ago

Blocks: layout-perf

Daniel Holbert [:dholbert]

Assignee

Updated

•

6 years ago

Blocks: 1473110

Daniel Holbert [:dholbert]

Assignee

Updated

•

6 years ago

Blocks: 1474813

Daniel Holbert [:dholbert]

Assignee

Updated

•

6 years ago

No longer blocks: 1474813

Brad Werth [:bradwerth]

Comment 9

•

6 years ago

(In reply to Daniel Holbert [:dholbert] from comment #0) > We probably can & should cache some more state between reflows, so that we > can reflow dirty flex-items-and-their-subtrees more selectively & see if > their flex-relevant measurements have changed in a way that'd influence the > flex layout algorithm. (And if they haven't changed, then we can skip > redoing flex layout & skip reflowing all the other flex items.) Since this plan was outlined, we may have incidentally made a start on this solution. In Bug 1409083, we added computed-on-demand cached flex item and flex line sizing information for use by devtools. It seems like this mechanism could be altered to solve this problem. Here's a sketch of how it could be done, with some questions: 1) Pare down the cached information to the minimum amount necessary to solve this problem and change the devtools cached information to be separate/additive. a) Is mainSize the only information we need? Seems like crossSize might not affect the space available to siblings. b) Is it still appropriate to hang the cached data off of a property? Or is there a more memory/performance friendly way to store the cached data, given that we'll need it for every flex item for every future reflow? 2) In DoFlexLayout(), where we check item->HadMeasuringReflow(), we would do something similar to do a running check of all items on the line. As long as each one of them is the same size as the cached value, we don't need to reflow it. Once we do, then we have to reflow the following children. a) Do we actually have sizes for the flex items here if they haven't had a measuring reflow? b) The iteration order here is the order the items appear in the flex line; not the order that they were frozen by the flex algorithm. If that matters, how do we account for that? I'm sure there are other tricky issues to consider. If this is outline is deemed plausible, I would be happy to take the bug and attempt it.

Brian Grinstead [:bgrins]

Updated

•

6 years ago

Blocks: 1449346

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 10

•

6 years ago

Slow flex reflows show up in https://www.huffingtonpost.com/ too.

Daniel Holbert [:dholbert]

Assignee

Updated

•

6 years ago

Depends on: 1490890

Daniel Holbert [:dholbert]

Assignee

Updated

•

6 years ago

Depends on: 1492538

Daniel Holbert [:dholbert]

Assignee

Comment 11

•

6 years ago

I think this particular case with Twitter is better now. At least: - I verified that keypress-triggered reflows aren't reflowing frames that are part of the conversation backscroll. So: that's a sign that we're not reflowing unboundedly-expensive arbitrary amounts of content. (That's what was unboundedly-slow over in bug 1376536.) - I captured a profile of me typing "123456" (with a pause between each character) into a twitter DM with ~medium amount of backscroll (25 messages), and I got this profile: https://perfht.ml/2T0qv9p We're spending 16ms of the time there with DoReflow in the backtrace (which averages to 16ms/6-keypresses = 2.6ms per keypress, though that's actually broken down into several different reflows per keypress since Twitter is doing a lot at once, and none of the individual reflows is longer than 2ms as you can see in the "Marker Chart" view.) Meanwhile, if I perform the same experiment in Chrome, they report spending over twice as much time in layout (34ms and 36ms in the two tests that I did). Compare this (particularly <2ms reflow times) to the ~28ms reflow times in comment 0, and it seems likely this is fixed. Overholt: if you happen to still have the expensive-to-type-in twitter-DM-conversation available to test with (The one that you used in bug 1373023), it'd be great if you could reprofile just to double check my measurements! But for now, I'm going to call this WORKSFORME.

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

Flags: needinfo?(dholbert) → needinfo?(overholt)

Resolution: --- → WORKSFORME

Daniel Holbert [:dholbert]

Assignee

Comment 12

•

6 years ago

(In reply to Olli Pettay [:smaug] (high review load) from comment #10) > Slow flex reflows show up in https://www.huffingtonpost.com/ too. BTW, I just profiled a pageload of huffingtonpost.com: https://perfht.ml/2zGiayB The "marker table" UI there shows that there are lots of reflows, though most are quite short. The longest one is right at the beginning (which makes sense), and it's 51ms long. The second-longest one is the 10th reflow, and it's 12.2ms long. So, the 51ms-long-reflow is a bit concerning, though it's an entirely unreasonable time for reflow-to-completion of a full complex page. (And interruptible reflow means we're probably still somewhat responsive during that 51ms, if the user were to click or anything, since it's the initial reflow of the page.) Also: for comparison, I profiled the same page in Chrome dev, and in that profile, I'm seeing the first layout being 83.38ms long, and then there's another layout ~100ms later that is 42.7ms long, and then there's another one ~50ms later that is 91.81ms long, and there are several more in the 20-30ms range later on. All in all, they report 425.7ms of time spent on Layout during the load of this page, whereas we report 194.8ms of layout (time with DoReflow in the backtrace). So at the moment, we seem to be doing a great deal better than Chrome is doing, in terms of layout on huffingtonpost.com loading (fewer long layouts, shorter duration for longest layout, and < 50% of the time spent in layout overall). So I don't think there's likely to be much that's actionable there at the moment.

Daniel Holbert [:dholbert]

Assignee

Comment 13

•

6 years ago

(BTW: if anyone digs in further on huffingtonpost and disagrees with my assessment in comment 12, let's track it in a new bug.)

Daniel Holbert [:dholbert]

Assignee

Comment 14

•

6 years ago

(In reply to Daniel Holbert [:dholbert] from comment #12) > BTW, I just profiled a pageload of huffingtonpost.com: > https://perfht.ml/2zGiayB > [...] > So, the 51ms-long-reflow is a bit concerning, though it's > an entirely unreasonable time for reflow-to-completion of a full complex > page. Sorry, I left out a word -- I meant to say it's *not* an entirely unreasonable time. :)

Andrew Overholt [:overholt]

Comment 15

•

6 years ago

(In reply to Daniel Holbert [:dholbert] from comment #11) > > Overholt: if you happen to still have the expensive-to-type-in > twitter-DM-conversation available to test with (The one that you used in bug > 1373023), it'd be great if you could reprofile just to double check my > measurements! But for now, I'm going to call this WORKSFORME. I tried it and it feels no laggier than Notepad. I didn't even bother profiling because it seems perfectly fine. Thanks!

Flags: needinfo?(overholt)

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

3 years ago

Performance Impact: --- → P1

Whiteboard: [qf:p1:f64][platform-rel-Twitter] → [platform-rel-Twitter]

You need to log in before you can comment on or make changes to this bug.