xpcshell logcat spam: ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings + ServiceChildProcess: This process belongs to a different GeckoRuntime owner
Categories
(GeckoView :: General, defect, P3)
Tracking
(Not tracked)
People
(Reporter: robwu, Unassigned)
References
Details
When I run a xpcshell test on Android, I get tons of logspam that is a repetition of the following two lines:
D ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings: FOREGROUND priority, 0 importance, 3 successful binds, 0 failed binds, 0 successful unbinds
W ServiceChildProcess: This process belongs to a different GeckoRuntime owner: a6e047f4-b59e-44e6-8fe7-2e77dfa27779 process: b4d03bfd-e47c-4c35-80d3-687a1b78ffce
Here is an example of a 350 MB logcat-emulator-5554.log from a try push:
- 1 963 881 lines
- 1 527 627 lines with the above two messages.
To get anything useful out of adb logcat locally, I currently have to filter logcat as follows:
adb logcat | grep -vE 'ServiceChildProcess: This process belongs to a different GeckoRuntime owner|ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices'
If you want to reproduce, create a .mozconfig
to develop for Android (artifact builds are OK):
./mach android-emulator
./mach test toolkit/components/extensions/test/xpcshell/test_ext_alarms_does_not_fire.js
and in another terminal, adb logcat
:
07-21 19:14:47.812 5136 5152 I Gecko : {"action":"log","time":1689959687807,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_alarms_does_not_fire.js | Starting test_cleared_alarm_does_not_fire"}
07-21 19:14:47.813 5136 5152 I Gecko :
07-21 19:14:47.813 5136 5152 I Gecko : {"action":"log","time":1689959687813,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"(xpcshell/head.js) | test test_cleared_alarm_does_not_fire pending (2)"}
07-21 19:14:47.848 5136 5152 I Gecko :
07-21 19:14:47.848 5136 5152 I Gecko : {"action":"log","time":1689959687848,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"\"Extension attached\""}
07-21 19:14:47.848 5136 5152 I Gecko :
07-21 19:14:47.848 5136 5152 I Gecko : {"action":"log","time":1689959687848,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"(xpcshell/head.js) | test run_next_test 1 finished (2)"}
07-21 19:14:47.962 5136 5152 D : HostConnection::get() New Host Connection established 0x7bdbf081f6a0, tid 5152
07-21 19:14:47.967 5136 5152 D EGL_emulation: eglCreateContext: 0x7bdbf08a7440: maj 3 min 0 rcv 3
07-21 19:14:47.968 5136 5152 D EGL_emulation: eglMakeCurrent: 0x7bdbf08a7440: ver 3 0 (tinfo 0x7bdbf08ba6e0)
07-21 19:14:47.972 5136 5152 E eglCodecCommon: glUtilsParamSize: unknow param 0x00008cdf
07-21 19:14:47.972 5136 5152 E eglCodecCommon: glUtilsParamSize: unknow param 0x00008824
07-21 19:14:47.989 5136 5168 D ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings: FOREGROUND priority, 0 importance, 3 successful binds, 0 failed binds, 0 successful unbinds
07-21 19:14:47.997 5053 5070 W ServiceChildProcess: This process belongs to a different GeckoRuntime owner: a5be9210-1374-4549-bb9f-86271cd1137b process: dcc7ad4b-a327-40fe-8490-5baa279e0cda
07-21 19:14:48.011 5136 5168 D ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings: FOREGROUND priority, 0 importance, 3 successful binds, 0 failed binds, 0 successful unbinds
07-21 19:14:48.013 5053 5070 W ServiceChildProcess: This process belongs to a different GeckoRuntime owner: a5be9210-1374-4549-bb9f-86271cd1137b process: dcc7ad4b-a327-40fe-8490-5baa279e0cda
The last two lines repeat 1237 times in this one test run, interleaved with other usual log output. Out of the 3.2k log lines, 2.5k was from this logspam...
Reporter | ||
Updated•1 year ago
|
Reporter | ||
Comment 1•1 year ago
|
||
Jamie - the logging was introduced in bug 1741156. Do you know what's going on here?
Besides being a pain during development, this also results in gigabytes, if not terabytes of storage space consumption in CI.
Comment 2•1 year ago
|
||
I think what's happening here is that in xpcshell tests we create many GeckoRuntime
s so that we can run multiple tests in parallel. Each of these runtimes attempts to launch their own GPU process but android only allows one instance of each service, in this case org.mozilla.gecko.process.GeckoChildProcessServices$gpu
. So after the first runtime successfully launches the GPU process, each subsequent runtime attempting to do so will instead notify the existing instance we attempted to start it. The existing instance will see that it belongs to a different runtime, and log this warning, and we will repeatedly retry launching it.
This logging existed before bug 1741156, back when content processes were the only type of child process. While we only declare one gpu
process service (and one socket
, one utility
, etc), we declare many content process ones, eg tab1
, tab2
, etc. If gecko attempts to launch a content process and picks a service already in use then we log this warning and retry with a random tabN
until we find one which succeeds. However, with the GPU process (and socket process, utility process, etc) this will never succeed as we've only declared one service for the process type. Note that for the content processes we can encounter this warning during normal operation because multiple content processes are allowed, whereas for GPU and other process types we can only encounter it during tests due to having multiple GeckoRuntimes.
I'm unsure what the consequences of this are. Perhaps we can simply remove the logging. Or perhaps this is an actual problem. We only support multiple GeckoRuntimes during tests so it's not an issue in the wild, but perhaps it might causes issues during tests. Maybe we need to declare multiple services for each type of child process and use the same logic as for content processes to select a random service name during launch. (or perhaps even a generic childN
services could be used for both content and non-content child processes.) Kayacan, perhaps you or somebody else on the geckoview team might be better placed to answer that?
Reporter | ||
Comment 3•1 year ago
|
||
This logging existed before bug 1741156,
Bug 1696460 added the original logging in the allow-multiple-runtimes patch, specifically https://hg.mozilla.org/mozilla-central/rev/9c4199568bbc4f6f1a7c439149c744156892c788
If gecko attempts to launch a content process and picks a service already in use then we log this warning and retry with a random
tabN
until we find one which succeeds. However, with the GPU process (and socket process, utility process, etc) this will never succeed as we've only declared one service for the process type.
So that would require revisiting the logic here: https://searchfox.org/mozilla-central/rev/385f408040a8bb5fd9d168fb16a4e9071a13938d/mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/GeckoProcessManager.java#822-828
Comment 4•1 year ago
|
||
Hey Rob, agreed! And thanks Jaime for the great details there. We need to revisit the logic there to get rid of those spamming logs. There should not be multiple instances of the GeckoRuntime except the tests, as Jaime said. Removing the logs or, better, refactoring the logic of retrying to start the child service process would solve this issue. I will give it P3 and S3 at the moment to keep it in the backlog for the upcoming sprints. Worth to note that Fission is being implemented on Android at the moment, so the child service process management logic will very possibly be changing. So this maybe addressed there as well. There's no ticket/plan for that at the moment, that i know of. When a ticket is created for the process management on GeckoView layer for the Fission project, I'll link it here.
Description
•