Open Bug 1839036 Opened 1 year ago Updated 1 year ago

High CPU use when browser is idle

Status:

UNCONFIRMED

Project Flags:

Performance Impact

low

People

(Reporter: curlypaul924, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf:resource-use)

Attachments

(5 files)

Screenshot_20230617_160054.png 1 year ago Paul Brannan (deleted), image/png		Details
Screenshot_20230617_162131 - firefox main thread.png 1 year ago Paul Brannan (deleted), image/png		Details
Screenshot_20230617_162232 - firefox renderer thread.png 1 year ago Paul Brannan (deleted), image/png		Details
Screenshot_20230617_162307 - firefox glean.dispatcher.png 1 year ago Paul Brannan (deleted), image/png		Details
Screenshot_20230617_162334 - firefox compositor thread.png 1 year ago Paul Brannan (deleted), image/png		Details

Paul Brannan

Reporter

Description

•

1 year ago

Attached image Screenshot_20230617_160054.png (deleted) — Details

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0

Steps to reproduce:

Use the browser for a while (many tabs open)
Firefox eventually uses nearly 500% cpu (8-core machine)
Using about:processes kill all processes (note: none of the processes killed were showing as using CPU)

Actual results:

After killing all processes, Firefox was still using nearly 500% CPU (visible in top and in about:processes).

Expected results:

With no active processes, Firefox should be using close to 0% CPU.
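For readers comparing the numbers in this report: top on Linux reports CPU use per core by default, so a multi-threaded process can exceed 100%. The sketch below only illustrates that arithmetic. The function names are made up for this example, and the 5 CPU-seconds per wall-clock second is simply the value that would produce the reported figure.

```python
def cpu_percent(cpu_seconds_delta: float, wall_seconds_delta: float) -> float:
    """Per-core CPU percentage: CPU time consumed divided by elapsed wall time."""
    return 100.0 * cpu_seconds_delta / wall_seconds_delta


def share_of_machine(per_core_percent: float, cores: int) -> float:
    """Convert a per-core percentage into a share of the whole machine."""
    return per_core_percent / cores


# Illustrative input: 5 CPU-seconds consumed during 1 second of wall time.
busy = cpu_percent(cpu_seconds_delta=5.0, wall_seconds_delta=1.0)
print(busy)                             # 500.0
print(share_of_machine(busy, cores=8))  # 62.5
```

Under these assumptions, "nearly 500%" on the 8-core machine means roughly five cores fully busy, or a bit under two thirds of the machine.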

Paul Brannan

Reporter

Comment 1

•

1 year ago

Profile: https://share.firefox.dev/3JjvYUw

The renderer is spending 100% of time in mozilla::wr::RenderThread::HandleFrameOneDocInner.

The main thread is spending its time in g_main_context_poll.

Paul Brannan

Reporter

Comment 2

•

1 year ago

Attached image Screenshot_20230617_162131 - firefox main thread.png (deleted) — Details

Paul Brannan

Reporter

Comment 3

•

1 year ago

Attached image Screenshot_20230617_162232 - firefox renderer thread.png (deleted) — Details

Paul Brannan

Reporter

Comment 4

•

1 year ago

Attached image Screenshot_20230617_162307 - firefox glean.dispatcher.png (deleted) — Details

Paul Brannan

Reporter

Comment 5

•

1 year ago

Attached image Screenshot_20230617_162334 - firefox compositor thread.png (deleted) — Details

Paul Brannan

Reporter

Comment 6

•

1 year ago

Attached output from perf top for the busiest threads. A lot of time is spent in futex_wait/futex_wait. There is also a lot of time spent scheduling the threads (as if they are waking very briefly and then yielding the CPU). The high cost of the Linux scheduler for threads with this work pattern is something that afaict isn't captured by the Firefox Profiler.
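One way to check the "waking very briefly and then yielding" pattern without perf is to sample the per-thread context-switch counters that Linux exposes in /proc/<pid>/task/<tid>/status. The following is a minimal sketch, not something taken from this bug: the helper names are hypothetical, and the sample text and counts are invented so the example runs anywhere.

```python
import re
from pathlib import Path


def parse_ctxt_switches(status_text: str) -> dict:
    """Extract the context-switch counters from the text of a /proc status file."""
    counts = {}
    for key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
        match = re.search(rf"^{key}:\s+(\d+)$", status_text, re.MULTILINE)
        if match:
            counts[key] = int(match.group(1))
    return counts


def read_thread_status(pid: int, tid: int) -> str:
    """Read the status file of one thread (Linux only; not called below)."""
    return Path(f"/proc/{pid}/task/{tid}/status").read_text()


def wakeups_per_second(before: dict, after: dict, interval_s: float) -> float:
    """Approximate sleep/wake cycles per second from voluntary context switches."""
    delta = after["voluntary_ctxt_switches"] - before["voluntary_ctxt_switches"]
    return delta / interval_s


# Invented samples standing in for two reads taken 2 seconds apart.
SAMPLE_BEFORE = "Name:\tRenderer\nvoluntary_ctxt_switches:\t1000\nnonvoluntary_ctxt_switches:\t20\n"
SAMPLE_AFTER = "Name:\tRenderer\nvoluntary_ctxt_switches:\t6000\nnonvoluntary_ctxt_switches:\t25\n"

rate = wakeups_per_second(
    parse_ctxt_switches(SAMPLE_BEFORE), parse_ctxt_switches(SAMPLE_AFTER), 2.0
)
print(rate)  # 2500.0
```

A thread that blocks and wakes thousands of times per second while doing little work per wakeup would fit the scheduling overhead described above. What rate counts as abnormal for a given Firefox thread is not established in this report.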

BugBot [:suhaib / :marco/ :calixte]

Comment 7

•

1 year ago

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Widget: Gtk

Product: Firefox → Core

Martin Stránský [:stransky] (ni? me)

Comment 8

•

1 year ago

A possible dupe of Bug 1826291.

Can you try Wayland backend?
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Testing_Mozilla_binaries

Thanks.

Flags: needinfo?(curlypaul924)

Martin Stránský [:stransky] (ni? me)

Updated

•

1 year ago

Priority: -- → P3

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1826291

Paul Brannan

Reporter

Comment 9

•

1 year ago

(In reply to Martin Stránský [:stransky] (ni? me) from comment #8)

> A possible dupe of Bug 1826291.

Hmm, interesting thought. On the surface the conditions are different (I did not have any active background windows), but the effect is similar.

I agree that aggressively swapping buffers for a window that is actively drawing but not visible is not ideal. In this case all the background windows are idle/inactive. There should be zero work done for windows that are inactive, as there is no active content to render.

> Can you try Wayland backend?
> https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Testing_Mozilla_binaries
>
> Thanks.

Will the Wayland backend work under X11? I am not running Wayland.

Flags: needinfo?(curlypaul924)

Martin Stránský [:stransky] (ni? me)

Comment 10

•

1 year ago

(In reply to Paul Brannan from comment #9)

> Will the Wayland backend work under X11? I am not running Wayland.

No, Wayland needs a different environment. You need a Wayland compositor running (Sway, for instance, or Mutter in Wayland mode), see:
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems#Testing_different_Wayland_compositor
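To tell which kind of session a machine is running before trying the Wayland backend, the usual environment variables can be inspected. This is a rough sketch under the assumption that the desktop sets WAYLAND_DISPLAY, XDG_SESSION_TYPE, and DISPLAY in the conventional way. The function name is made up for this example and the heuristic is not authoritative.

```python
import os


def session_type(env) -> str:
    """Guess the display session type from common environment variables."""
    # A Wayland compositor normally exports WAYLAND_DISPLAY to its clients.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    # Login managers commonly set XDG_SESSION_TYPE to "wayland" or "x11".
    xdg = env.get("XDG_SESSION_TYPE", "").lower()
    if xdg in ("wayland", "x11"):
        return xdg
    # DISPLAY alone suggests a plain X11 session.
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


print(session_type(os.environ))
```

If this reports "x11", as in the reporter's setup, the Wayland backend cannot be tested until the session itself runs under a Wayland compositor. Once it does, the backend is typically selected by launching Firefox with MOZ_ENABLE_WAYLAND=1 in the environment.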

Paul Brannan

Reporter

Comment 11

•

1 year ago

I realized the profiler's screenshots obscure the work the browser is doing. Here is an updated profile with screenshots disabled: https://share.firefox.dev/3Nl39Ie

The above profile has two tabs showing content (one from reddit, one from HN), and extensions are enabled. I tried to get another profile with extensions disabled and showing all threads: https://share.firefox.dev/3NEFx2O

But the second profile seems to have hit a different bug. The renderer and compositor threads are now idle, and there are multiple pool-firefox threads visible in top (afaik I've never seen this before).

It's possible this bug is related to bug#1581169 as right-clicking and saving images is part of the process I used to reproduce the bug.

Patricia Lawless

Updated

•

1 year ago

Performance Impact: --- → ?

Daniel Holbert [:dholbert]

Comment 12

•

1 year ago

(In reply to Paul Brannan from comment #11)

> I realized the profiler's screenshots obscure the work the browser is doing. Here is an updated profile[...]

Thanks. So based on the profiles from comment 11, there seems to be quite a lot of garbage collection happening in the parent process, which is showing up as a fair amount of jank. Let's classify under JS:Garbage Collection, assuming comment 11 is representative of the same original issue here.

That wouldn't correspond to your original report of 500% CPU usage (I'd expect at most 100% from that single process), but it is a user-visible perf issue and it looks like more than I would expect.

Iain, could you take a look at the comment 11 profiles (particularly the second one with extensions disabled) and see if you have any theories about what's going on?

Component: Widget: Gtk → JavaScript: GC

Daniel Holbert [:dholbert]

Comment 13

•

1 year ago

The Performance Impact Calculator has determined this bug's performance impact to be low. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

[x] Causes severe resource usage

Keywords: perf:resource-use

Justin Link

Updated

•

1 year ago

Performance Impact: ? → low

Iain Ireland [:iain]

Comment 14

•

1 year ago

The second profile in comment 11 looks to me like the parent process is otherwise idle and we are taking advantage of the opportunity to do a major GC. I'm a little surprised that we're managing to spend 7 mostly uninterrupted seconds running a GC without finishing, but maybe I'm underestimating the scope of the heap. The first second is spent marking, and then we keep sweeping until the end of the profile.

Looking at the first profile, I note that the parent process is doing some non-GC work, but it's still the case that every half-second we're running a GCSlice with a 100ms budget. We're already sweeping when the profile starts, and we're still sweeping when it ends.
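To put the slice numbers in perspective, the arithmetic can be sketched as follows. The 100 ms budget and half-second interval come from the comment above. The 6 seconds of total sweeping work is an illustrative assumption based on the lower end of the "6-10+ seconds" figure, and the calculation assumes every slice uses its full budget, so it is an upper bound rather than a measurement.

```python
import math


def gc_duty_cycle(slice_budget_ms: float, slice_interval_ms: float) -> float:
    """Fraction of main-thread time spent in GC slices (upper bound)."""
    return slice_budget_ms / slice_interval_ms


def slices_needed(total_gc_work_ms: float, slice_budget_ms: float) -> int:
    """Number of full-budget slices required to finish the given amount of work."""
    return math.ceil(total_gc_work_ms / slice_budget_ms)


BUDGET_MS = 100          # from the comment: GCSlice with a 100ms budget
INTERVAL_MS = 500        # from the comment: one slice every half-second
SWEEP_WORK_MS = 6000     # assumption for illustration only

duty = gc_duty_cycle(BUDGET_MS, INTERVAL_MS)
count = slices_needed(SWEEP_WORK_MS, BUDGET_MS)
wall_clock_s = count * INTERVAL_MS / 1000

print(duty)          # 0.2
print(count)         # 60
print(wall_clock_s)  # 30.0
```

Under these assumptions, the slices alone could occupy up to 20% of the parent process main thread, and 6 seconds of sweeping would stretch over about 30 seconds of wall-clock time. That is consistent with sweeping being in progress both at the start and at the end of a profile.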

Jon, is it concerning/noteworthy if the parent process spends 6-10+ seconds sweeping during a single major GC?

Flags: needinfo?(jcoppeard)

BugBot [:suhaib / :marco/ :calixte]

Comment 15

•

1 year ago

The component has been changed since the backlog priority was decided, so we're resetting it.
For more information, please visit BugBot documentation.

Priority: P3 → --

Paul Brannan

Reporter

Comment 16

•

1 year ago

I tried to reproduce the bug today in a controlled experiment (fresh profile, no extensions) but with no success. I visited multiple websites and saved over 500 images in a single directory using right-click and save as. The browser became very slow when saving files, same as bug#1581169, but CPU usage remained low after the file was saved. So it appears this is a different bug from that one at least. I did get a profile from saving an image and will add it to that bug report.

I will continue trying to reproduce the bug and see if there's a particular website that triggers it. For now, all I know is that after browsing for a while Firefox gets sluggish, and when I look at about:processes for a busy process, there is no obvious culprit.

Jon Coppeard (:jonco) (PTO until 14th September)

Comment 17

•

1 year ago

(In reply to Iain Ireland [:iain] from comment #14)

> The second profile in comment 11 looks to me like the parent process is otherwise idle and we are taking advantage of the opportunity to do a major GC. I'm a little surprised that we're managing to spend 7 mostly uninterrupted seconds running a GC without finishing, but maybe I'm underestimating the scope of the heap. The first second is spent marking, and then we keep sweeping until the end of the profile.

This is unusual to say the least. Telemetry shows the 95th percentile of sweep time in the parent process is 34 milliseconds. That suggests some kind of memory leak in the parent process.

Looking at the flame graph, most of the time is spent tracking cycle collector gray roots, i.e. C++ objects.

(In reply to Paul Brannan from comment #16)
If you reproduce this again, can you measure the memory use with about:memory and post the results for the parent process?

Flags: needinfo?(jcoppeard)

Bryan Thrall [:bthrall]

Updated

•

1 year ago

Blocks: jsperf

Severity: -- → S3

Priority: -- → P3
