Closed Bug 1591366 Opened 5 years ago Closed 3 years ago

Bad performance with deeply-nested display: grid elements

Categories

(Core :: Layout: Grid, defect, P3)

70 Branch
Unspecified
All
defect

Tracking

()

RESOLVED FIXED
94 Branch
Webcompat Priority P2
Performance Impact high
Tracking Status
firefox-esr68 --- wontfix
firefox70 --- wontfix
firefox71 --- wontfix
firefox72 --- wontfix
firefox94 --- fixed

People

(Reporter: alexandre.rico, Assigned: sefeng)

References

(Regressed 1 open bug)

Details

(Keywords: perf, perf:resource-use, testcase)

Attachments

(5 files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0

Steps to reproduce:

This is a test file I wrote so you can reproduce the bug.
Basically when you open this HTML in Firefox, it freeze the browser and the computer CPU usage go to 100%.
Delete 3 <div> and it work fine.
The more you had <div> the longer it take to render.

<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Insert title here</title>
</head>
<body>
<div style="padding: 10px; display: grid; grid-template-rows: min-content auto;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<div style="display: grid;">
<span>TEST</span>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>

Actual results:

Taking way too long to display this HTML. And my CPU goes to 100% compare to Chrome with the same file.

Expected results:

Render immediately the HTML.

Severity: normal → critical
Status: UNCONFIRMED → NEW
Has STR: --- → yes
Component: Untriaged → Layout: Grid
Ever confirmed: true
Keywords: perf, testcase
OS: Unspecified → All
Product: Firefox → Core

So regression range contains "enable CSS grid": bug 1000592. Probably not a regression at all then?

But yeah it seems there's some exponential reflow going on here.

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

As per comment 2, not really a regression, but it does look like performance takes a nosedive with the nested display:grid elements here; Mats, any thoughts on how to improve this?

Severity: critical → normal
Flags: needinfo?(mats)
Keywords: regression
Priority: -- → P3
Summary: Bad performance with display: grid CSS → Bad performance with deeply-nested display: grid elements
Whiteboard: [qf]
Whiteboard: [qf] → [qf:p1:resources]
Whiteboard: [qf:p1:resources] → [qf:p1:resource]

Here's a case that likes to reproduce on firefox linux nightly, noting that it seems like the problem actually triggers after initial layout when I scroll the window.

Here's a profile that I gathered from the attached HTML file using "Firefox Platform" profiling settings AFTER the process hosting the file became unresponsive and hitting "capture" after what seemed like "long enough": https://share.firefox.dev/2V38Y2G

Edit: Note that the test file contains an external script reference to "simple.bundle.js" in the header that is moot and can be ignored. (I inlined the relevant parts of the script later in the file, along with the data that generates the dynamic HTML which isn't really all that complex.)

Daniel, can you please provide some optimizations that we could do for grids? Probably similar to the ones that we did for flexbox?

Flags: needinfo?(dholbert)

Yeah, bug 1524056 approximately covers adding some similar optimizations there. It's a substantial amount of work (mostly to reason about all the edge cases, the invariants that need to be maintained, etc); it's on the layout team's radar and is hopefully getting done sometime this year, but I'm not sure precisely when.

Flags: needinfo?(dholbert)

Specifically: CachedBAxisMeasurement and CachedFlexItemData are the two flexbox helper-structs that we may want to create some analogous versions of, over in our grid layout code. But it will take a good bit of investigation & reasoning about grid layout to determine what parts of these optimizations are sound, what sorts of conditions need to cause these cached values to be invalidated, etc.

I am starting to look into or learning CachedBAxisMeasurement and CachedFlexItemData, to see what we could do for grid. Just sort of mentions that I am looking into this bug.

Here's the reporter's testcase from comment 0, fwiw.

And here's a performance profile of me loading a local copy of this testcase in a fresh user-profile:
https://share.firefox.dev/3fdtoQa

Attachment #9221148 - Attachment description: reporter's testcase from comment 0 → reporter's testcase from comment 0 (triggers ~8-second reflow)
Assignee: nobody → sefeng
Status: NEW → ASSIGNED
Attachment #9222994 - Attachment description: WIP: Bug 1591366 - Improve nested grid layout performance by caching the computed BAxis → Bug 1591366 - Improve nested grid layout performance by caching the computed BAxis
Attachment #9222994 - Attachment description: Bug 1591366 - Improve nested grid layout performance by caching the computed BAxis → WIP: Bug 1591366 - Improve nested grid layout performance
Attachment #9222994 - Attachment description: WIP: Bug 1591366 - Improve nested grid layout performance → Bug 1591366 - Improve nested grid layout performance

Something like this should give us 64 bits to play with also on 32-bit systems...

Flags: needinfo?(mats)
Attachment #9237462 - Attachment is patch: true
Attachment #9237462 - Attachment mime type: application/octet-stream → text/plain
Attachment #9222994 - Attachment description: Bug 1591366 - Improve nested grid layout performance → Bug 1591366 - Improve nested grid layout performance r=mats

Depends on D115711

Attachment #9240105 - Attachment description: Bug 1591366 - Always reserve 64 bits for frame property value r=mats → Bug 1591366 - Always reserve 64 bits for frame property value r=dholbert
Webcompat Priority: --- → P2
Pushed by sefeng@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a2b6f6abda6e Improve nested grid layout performance r=mats https://hg.mozilla.org/integration/autoland/rev/19be43686dc0 Always reserve 64 bits for frame property value r=dholbert

That's actually a test "unexpected-pass"! Interesting. Looks like that test is currently marked as expected-fail on all platforms, and started passing on Android with this patch, for some reason. This patch isn't really supposed to change behavior (aside from making us faster), so it shouldn't really be a thorough "fix" for any tests like that one.

That test's expected-failure is tracked in bug 1521088. Given that bug's relation to incremental reflow (e.g. from browser-window resizes), it's not entirely surprising that the test's behavior might change from a patch like this one that reduces the amount of redundant reflow that we're doing.

IMO, our simplest option for the moment is just to change the test annotation[1] to have an Android-specific [PASS, FAIL] status (marking it as random on that platform, basically) -- assuming that the (occasional?) pass status is Android-specific.

[1] https://searchfox.org/mozilla-central/source/testing/web-platform/meta/css/css-grid/table-grid-item-dynamic-004.html.ini

Pushed by sefeng@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0479e0a4cc15 Improve nested grid layout performance r=mats https://hg.mozilla.org/integration/autoland/rev/19499821190d Always reserve 64 bits for frame property value r=dholbert
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 94 Branch
Flags: needinfo?(sefeng)
Depends on: 1732082

Note: it seems that this did indeed fix things for attachment 9158023 [details] (asuth's persnosco testcase, "slow-grid.html bad things happen on scroll"). I can scroll that testcase in current Nightly with no noticeable rendering delays (whereas if I turn off the pref, I see extreme many-second delays and sometimes hangs).

However, the reporter's testcase that was quoted in comment 0 (which I attached as attachment 9221148 [details]) seems to still be pretty slow -- its render time has been cut in half (so it takes ~4 seconds instead of ~8 seconds, roughly; vs. it has no render delay in Chrome).

That testcase is approximately the same as the testcase in bug 1682686, so let's track that remaining slowness over there.

Blocks: 1732082
No longer depends on: 1732082
Regressions: 1733978
Regressions: 1735376
Regressions: 1735931
Performance Impact: --- → P1
Whiteboard: [qf:p1:resource]
Duplicate of this bug: 1717355
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: