[meta] Pre-allocated content processes are not always idle but consume a bit of CPU
Categories
(Core :: DOM: Content Processes, defect, P3)
Tracking
()
Fission Milestone | Future |
People
(Reporter: whimboo, Unassigned)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Keywords: meta, power, Whiteboard: fission-perf)
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:86.0) Gecko/20100101 Firefox/86.0 ID:20210122212755
With Firefox Nightly and Fission turned on, I have noticed that the 3 pre-allocated processes are not always idle. They consume a bit of CPU all the time (max 0.1%), which I think should not happen.
There was some discussion on Matrix between Nika and Florian, and it looks like Nika figured it out, so I will let her reply here.
Comment 1•4 years ago
There are a bunch of different issues that contribute to this. We should track it as a meta bug so they can be fixed individually.
Comment 2•4 years ago
With some help from and discussion with Florian and others, we first checked which threads were waking up, then captured a quick profile to find the specific messages being sent back and forth from the content process. Florian's profile is here: https://share.firefox.dev/3aaI7Zy. A couple of cases here lead to the content process being woken up:
The preallocated content process periodically receives messages broadcast by the parent process. The main culprits are:
GetMemoryUniqueSetSize (bug 1689182) - this is called by memory telemetry at a regular interval to collect the unique set size of each content process, and is implemented by pinging each process and asking it to compute its own unique set size. In bug 1652813 support was added to get this information from the parent process without waking content processes, so we should switch the memory collection telemetry over to that approach.
DataStoragePut (bug 1689191) - this is called by mozilla::DataStorage on a regular basis, and broadcasts a message to all content processes to update some state related to our SSL implementation. I don't know enough about this data to know why it changes so frequently, or whether we could avoid broadcasting it to every process. This message only fires a couple of times in the profile Florian captured, but it was firing in bursts every second or so when I recorded my live profile, which leads me to believe the rate of updates is related to the browser being in use.
In addition, the preallocated content process appears to wake itself with a timer on a somewhat regular basis to send AccumulateChildKeyedHistograms and RecordDiscardedData messages to the parent process. These appear to fire in pairs every 2 seconds or so. I was initially worried that this could be caused by a loop of IPC collecting telemetry when sending telemetry data; however, the telemetry code explicitly waits to disarm the timer until after the IPC data has been sent in order to avoid looping (https://searchfox.org/mozilla-central/rev/b9384b091e901b3283ce24b6610e80699d79fd06/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp#301-302). The most likely explanation is that receiving the other IPC messages mentioned here causes the timer to start firing again, although this should be verified once those issues are fixed.
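The suspected interaction can be illustrated with a toy model. This is only a sketch, not the real TelemetryIPCAccumulator: `BatchingAccumulator` and all of its methods are hypothetical names, and it assumes that recording a sample arms a batch timer when none is armed, that the timer is disarmed only after the batch has been sent, and that sending records a sample of its own. Under those assumptions, sending does not re-arm the timer, but handling an incoming broadcast that records telemetry does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a child-process telemetry batcher. Every name here is
// hypothetical; this is not the real TelemetryIPCAccumulator.
class BatchingAccumulator {
 public:
  // Record one sample and arm the batch timer if it is not armed yet.
  void Accumulate(int aSample) {
    mPending.push_back(aSample);
    if (!mTimerArmed) {
      mTimerArmed = true;
      ++mTimesArmed;
    }
  }

  // Timer callback: send first, disarm afterwards.
  void OnTimerFired() {
    assert(mTimerArmed);
    SendBatch();
    mTimerArmed = false;
  }

  // Models a parent broadcast whose handler records a telemetry sample.
  void HandleIncomingMessage() { Accumulate(1); }

  bool TimerArmed() const { return mTimerArmed; }
  std::size_t TimesArmed() const { return mTimesArmed; }
  std::size_t BatchesSent() const { return mBatchesSent; }

 private:
  // Stand-in for the IPC send of the pending batch.
  void SendBatch() {
    mPending.clear();
    ++mBatchesSent;
    // Assumption: the send itself records a sample. The timer is still
    // armed at this point, so this does not schedule another wakeup.
    Accumulate(0);
  }

  std::vector<int> mPending;
  bool mTimerArmed = false;
  std::size_t mTimesArmed = 0;
  std::size_t mBatchesSent = 0;
};
```

In this model the process goes quiet after one send unless something external calls `HandleIncomingMessage()`, which is consistent with the idea that the periodic broadcasts are what keep the timer firing.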
Comment 3•4 years ago
M8 unless we find this is a bigger problem than we currently believe.
Comment 4•4 years ago
Randell, what are your thoughts on this with Nika's findings in comment 2?
Comment 5•4 years ago
This is almost solely an issue with power use. This gets more important on mobile devices, but can have a small impact on laptops. These are relatively cheap operations on a preallocated process, so the power impact isn't high, especially if the browser is otherwise active. Disabling these until the process is allocated to something is very feasible, but it adds some small complexity and also increases the overhead of allocating a process from the preallocation cache, which may cause a small regression for some page loads.
M8 or even MVP seems reasonable.
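The trade-off described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Gecko code: `Process`, `Broadcaster`, `Broadcast`, and `Assign` are hypothetical names. The broadcaster skips processes that have not been assigned yet and replays the latest state in one step when a process is taken from the preallocation cache, which is where the extra allocation overhead would come from.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a content process.
struct Process {
  bool assigned = false;    // false while it sits in the preallocation cache
  std::size_t wakeups = 0;  // how often a message woke this process
  std::map<std::string, std::string> state;
};

// Hypothetical broadcaster that leaves unassigned processes asleep.
class Broadcaster {
 public:
  explicit Broadcaster(std::vector<Process>& aProcs) : mProcs(aProcs) {}

  // Remember the newest value and deliver it to assigned processes only.
  void Broadcast(const std::string& aKey, const std::string& aValue) {
    mLatest[aKey] = aValue;
    for (Process& p : mProcs) {
      if (!p.assigned) {
        continue;  // do not wake an idle preallocated process
      }
      p.state[aKey] = aValue;
      ++p.wakeups;
    }
  }

  // Called when a preallocated process is handed out. Replays the
  // coalesced state in a single wakeup and returns how many entries
  // had to be copied, i.e. the catch-up cost paid at allocation time.
  std::size_t Assign(Process& aProc) {
    aProc.assigned = true;
    aProc.state = mLatest;
    ++aProc.wakeups;
    return mLatest.size();
  }

 private:
  std::vector<Process>& mProcs;
  std::map<std::string, std::string> mLatest;
};
```

Coalescing by key keeps the catch-up cost proportional to the amount of distinct state rather than to the number of broadcasts that were skipped.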
Comment 6•4 years ago
Not a user-perceivable performance impact so pushing it to MVP.
Comment 7•3 years ago (Reporter)
I can still see PContent::Msg_FlushFOGData observer notifications being dispatched from these pre-allocated processes. Here is a profile from a recent Nightly build: https://share.firefox.dev/3AsTxnM
It looks like this is coming from Glean, so I assume we should try to stop these notifications?
Comment 8•3 years ago (Reporter)
Chris, could you please have a look at my last comment? Thanks!
Comment 9•3 years ago
Yup, this sure appears to be from FOG, the layer integrating the Glean SDK into Firefox Desktop.
FOG will, after 5s of idle, ask content processes to hand up any data they have kicking around. We use ContentParent::GetAll to get a list of all the content processes that might be harbouring unsent data and ask them all to flush. Most have nothing to send at the moment, but soon any might.
Is there a way to identify processes as being pre-allocated? We can exclude them from the iteration easily enough. We've been deliberately vague in the documentation about how we schedule these flushes, so we have flexibility here (though we will want to make a specific note that telemetry accumulated in pre-allocated content processes will be specifically excluded).
We're also interested in learning more about how to best do efficient and non-intrusive opportunistic IPC flushes, like in bug 1641989. So if the current approach is completely wrong and should be revisited, we'll simply need to prioritize the work. We're not attached to it the way it is : )
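The exclusion idea could look roughly like the sketch below. It is a self-contained illustration rather than FOG or ContentParent code: `ContentProcessInfo`, its `preallocated` flag, and `SelectFlushTargets` are hypothetical, and how a real process would be identified as pre-allocated is exactly the open question above.

```cpp
#include <vector>

// Hypothetical description of one content process.
struct ContentProcessInfo {
  int id;
  bool preallocated;  // assumption: some way exists to tell this apart
};

// Return the ids of the processes that should be asked to flush,
// skipping those that are still sitting in the preallocation cache.
std::vector<int> SelectFlushTargets(
    const std::vector<ContentProcessInfo>& aAll) {
  std::vector<int> targets;
  for (const ContentProcessInfo& p : aAll) {
    if (!p.preallocated) {
      targets.push_back(p.id);
    }
  }
  return targets;
}
```

With this filter, any data accumulated while a process is still pre-allocated would only be flushed after the process has been assigned, which matches the documentation note proposed above.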
Comment 10•3 years ago (Reporter)
Chris, thank you for the quick reply! I just noticed that there is one dependency open for this meta, which is bug 1689446. I wonder if the Telemetry part should block that other bug (which might in turn be a meta bug?).
Comment 11•3 years ago
Or be blocked by. It seems as though bug 1689446 is thinking about rethinking GetAll in a way that FOG could then use...
Comment 12•3 years ago
Moving this meta bug from Fission MVP to Future. The one remaining blocking bug 1689446 doesn't need to block Fission MVP.