High CPU load on macOS during file download
Categories
(Firefox :: Downloads Panel, defect, P2)
Tracking
()
People
(Reporter: jonas, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: perf:resource-use, power, Whiteboard: [necko-triaged])
Attachments
(6 files)
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:77.0) Gecko/20100101 Firefox/77.0
Steps to reproduce:
- Start firefox in safe mode with fresh profile (also works if using without safe mode and non-fresh profile)
- Download large-ish file (3GiB)
I obtained a trace using the macOS "Instruments" application in the hope that this helps debugging the issue. I can reproduce this all the time so if I should try it with some other tool I'm happy to do so.
Actual results:
Download works fine but 1 of 4 CPUs is at 100% load all the time (makes fans spin on MacBook)
Expected results:
Downloads should not be so CPU heavy
Top result from profiler:
3.99 s 59.1% 0 s -[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:] 0x268c81
Comment 2•4 years ago
|
||
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Comment 3•4 years ago
|
||
Jonas H, thanks for taking the time to report this issue, can you please capture a performance profile and attach it to this bug?
You can get more info on how to install and use the Cleopatra add-on (that helps you get the performance profile) by going to:
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler
https://perf-html.io/
Please note that this addon works only with FF Nightly, so this means you need to be able to reproduce the issue on Nightly first. You can download the latest Nightly from
https://www.mozilla.org/en-US/firefox/channel/desktop/
Updated•4 years ago
|
Here we go! https://share.firefox.dev/2ATqe3G
Comment 5•4 years ago
|
||
Markus, where should I direct reports with performance profiles like these?
Comment 6•4 years ago
|
||
Core :: Performance is a good place. Or if enough information is known that we know that the bug is already in the correct component, then adding "[qf]" to the whiteboard will also put the bug into the Perf Triage queue.
Comment 8•4 years ago
|
||
Could you try this in Firefox Nightly (rather than release) and get another Instruments recording? This will have more useful stacks and symbols.
Is there a way to upload a 20MiB file? Otherwise, can I send it to your e-mail address, Markus?
Reporter | ||
Comment 10•4 years ago
|
||
Reporter | ||
Comment 11•4 years ago
|
||
I have another file that's too large for upload.
Comment 12•4 years ago
|
||
(In reply to Jonas H. from comment #9)
Is there a way to upload a 20MiB file? Otherwise, can I send it to your e-mail address, Markus?
You can try sending it to my bugzilla email address but I'm not sure that attachments larger than 20MB will make it through. But the other profile you uploaded looks like it has all the necessary information, thanks!
In the profile from attachment 9163591 [details], 45% of the CPU usage is spent inside the following stack:
19 libfreebl3.dylib 21832.0 SHA256_Compress_Generic
18 libfreebl3.dylib 21825.0 SHA256_Update_Generic
17 libsoftokn3.dylib 21825.0 NSC_DigestUpdate
16 libnss3.dylib 21825.0 PK11_DigestOp
15 XUL 21825.0 mozilla::net::DigestOutputStream::Write(char const*, unsigned int, unsigned int*)
14 XUL 21825.0 nsStreamCopierIB::ConsumeInputBuffer(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*)
13 XUL 21825.0 nsPipeInputStream::ReadSegments(nsresult (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*)
12 XUL 21825.0 nsStreamCopierIB::DoCopy(nsresult*, nsresult*)
11 XUL 21825.0 nsAStreamCopier::Process()
10 XUL 21825.0 nsAStreamCopier::Run()
6% are spent in the write
system call, another 6% in recvfrom and poll, a bit in memcpy, and a fair bit in starting and stopping threads of a threadpool in nsStreamTransportService::Dispatch
which I filed as bug 1652898.
The sha256 hashing was added in bug 829832 and seems to be used for "reputation checking": https://searchfox.org/mozilla-central/rev/82c04b9cad5b98bdf682bd477f2b1e3071b004ad/toolkit/components/downloads/DownloadIntegration.jsm#494-509
I've never seen Firefox block a download on macOS so I'm not sure if this hash is actually used these days.
Jonas, what is your download speed?
If I profile a download on my machine, I get a very different CPU distribution. My download speed is 3.0MB/s, and I see 50% of the CPU usage in the poll
system call and only 1.5% in SHA256_Compress_Generic
on my machine. Actually, it seems like poll
is keeping an entire core busy, which seems bad... but that's a different issue from the one reported in this bug, and I've filed it as bug 1652901.
Comment 13•4 years ago
|
||
The component has been changed since the backlog priority was decided, so we're resetting it.
For more information, please visit auto_nag documentation.
Reporter | ||
Comment 14•4 years ago
|
||
Can you try this download? https://transfer.sh/iCiI5/Instruments2.tar.bz2
I am not sure if the Instruments3 profile is correct; it didn't show CPU spikes in the Instruments UI.
Reporter | ||
Comment 15•4 years ago
|
||
On good days I have 400Mbit/s download, but mostly it's somewhere in the 30–100Mbit/s range. I did the benchmarks with a localhost download though; the speed I got there was around 85MiB/s while wget gets around 600MiB/s.
Comment 16•4 years ago
|
||
(In reply to Jonas H. from comment #14)
Can you try this download? https://transfer.sh/iCiI5/Instruments2.tar.bz2
The download worked but the file doesn't contain the necessary information; please use the "Time Profiler" tool as before. This will contain call stacks of Firefox C++ code.
Reporter | ||
Comment 17•4 years ago
|
||
Reporter | ||
Comment 18•4 years ago
|
||
Reporter | ||
Comment 19•4 years ago
|
||
Added 2 new files:
localhost.zip: unthrottled download from localhost (webfsd); same setup as "Instruments3" and "Instruments2" files. ~80–85MiB/s download speed. ~150% CPU load according to htop all the time.
wifi.zip: download speed throttled to "WiFi" using Firefox developer tools. ~3.7MiB/s download speed. ~140% CPU load for the first few seconds, then ~30% CPU load.
Updated•4 years ago
|
Comment 20•4 years ago
|
||
Thanks. localhost.zip looks the same as comment 10; 43% of CPU in SHA256 hashing. wifi.zip looks different; only 5% of the CPU usage is from SHA256 hashing, and much of the rest spent in overhead from the devtools network monitor.
Comment 21•4 years ago
|
||
This more looks like something for PSM or even NSS.
Maybe this NS_AsyncCopy
needs a larger chunk size? https://searchfox.org/mozilla-central/rev/72a1334982cadde0ca8b3d3583877afea80f5639/netwerk/base/BackgroundFileSaver.cpp#601
I'm not surprised that hashing a 3GiB file takes some CPU time - short of somehow making sha256 faster, nothing we change in PSM or NSS will improve this situation.
Reporter | ||
Comment 23•4 years ago
|
||
Here's a new profile with latest nightly. Not sure if it has any new info; I was just annoyed by Firefox draining my battery for a simple download so I did another profile.
Reporter | ||
Comment 24•4 years ago
|
||
(Oops, pressed submit too early.)
Download was ~15GB file from Internet Archive. You can see from the CPU usage graphs that download speed varied a lot (they are proportional). The download attempt profiled in the Instruments profile was 200MB in ~150s, so a mean speed of 1.33MB/s. Even at this rate Firefox was by far the most energy consuming application running on the system, according to Activity Monitor :-(
Reporter | ||
Comment 25•4 years ago
|
||
Reporter | ||
Comment 26•4 years ago
|
||
Reporter | ||
Comment 27•4 years ago
|
||
Had to split profile into two chunks because of upload limit. Please use cat to obtain the zip file. MD5 checksums:
baefc791578ef73b51462c5626605531 Instruments.trace.zip (orig)
1b713ff9de6ae730a5d4e3941d46d79b Instruments.trace.zip1
c51bf6b088fb192cf23cdacd0192e950 Instruments.trace.zip2
Comment 28•4 years ago
|
||
P2 as this is reproducible consistently, but not for majority of users.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 29•4 years ago
|
||
Note that I can see a similar high CPU load of the main process when downloading the following file:
https://www.microsoft.com/en/microsoft-teams/download-app#desktopAppDownloadregion
My download rate is around 5MB/s, and checking the download with Safari it only uses at maximum 10% of CPU and not >100%.
Comment 30•4 years ago
|
||
On my machine, with today's Nightly, on macOS, it seems like there might be extra CPU usage if the build gets into a certain "state": After a fresh restart, downloading takes about 70%-90% CPU. But once I get into that other state, I see CPU usage of > 150%. In the Gecko profiler the difference is how frequently we tick the refresh driver: If we only tick it whenever the percentage display in the downloads button's progress bar is updated, then we see the lower CPU usage. But sometimes we tick the refresh driver on every vsync while the download is going on, because we think there is some other animation. This causes the higher CPU usage.
I'm not sure what "other animation" that is, and how to get into that state.
The 70% are roughly 40% main thread, 15% socket thread (recv + TLS decryption), 15% hashing on a background thread.
Comment 31•4 years ago
|
||
Interesting. So as workaround I could move the download button from the toolbar into the overflow menu. Doing that the CPU load decreases to around 35%. That's way lesser but still a lot more compared to Safari.
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•3 years ago
|
Comment 32•3 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #30)
On my machine, with today's Nightly, on macOS, it seems like there might be extra CPU usage if the build gets into a certain "state": After a fresh restart, downloading takes about 70%-90% CPU. But once I get into that other state, I see CPU usage of > 150%. In the Gecko profiler the difference is how frequently we tick the refresh driver: If we only tick it whenever the percentage display in the downloads button's progress bar is updated, then we see the lower CPU usage. But sometimes we tick the refresh driver on every vsync while the download is going on, because we think there is some other animation. This causes the higher CPU usage.
I'm not sure what "other animation" that is, and how to get into that state.
Your description here reminds me bug 1742393, that was fixed 5 months ago.
Is there anything actionable here other than bug 1709175?
Comment 33•3 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #32)
Your description here reminds me bug 1742393, that was fixed 5 months ago.
Interesting, I agree it does sound like that.
Is there anything actionable here other than bug 1709175?
Probably not.
Updated•3 years ago
|
Description
•