Open Bug 1844829 Opened 1 year ago Updated 1 year ago

xpcshell logcat spam: ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings + ServiceChildProcess: This process belongs to a different GeckoRuntime owner

Categories

(GeckoView :: General, defect, P3)

All
Android

Tracking

(Not tracked)

People

(Reporter: robwu, Unassigned)

References

Details

When I run a xpcshell test on Android, I get tons of logspam that is a repetition of the following two lines:

D ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings: FOREGROUND priority, 0 importance, 3 successful binds, 0 failed binds, 0 successful unbinds
W ServiceChildProcess: This process belongs to a different GeckoRuntime owner: a6e047f4-b59e-44e6-8fe7-2e77dfa27779 process: b4d03bfd-e47c-4c35-80d3-687a1b78ffce

Here is an example of a 350 MB logcat-emulator-5554.log from a try push:

  • 1 963 881 lines
  • 1 527 627 lines with the above two messages.

To get anything useful out of adb logcat locally, I currently have to filter logcat as follows:

adb logcat | grep -vE 'ServiceChildProcess: This process belongs to a different GeckoRuntime owner|ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices'

If you want to reproduce, create a .mozconfig to develop for Android (artifact builds are OK):

./mach android-emulator
./mach test toolkit/components/extensions/test/xpcshell/test_ext_alarms_does_not_fire.js

and in another terminal, adb logcat :

07-21 19:14:47.812  5136  5152 I Gecko   : {"action":"log","time":1689959687807,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_alarms_does_not_fire.js | Starting test_cleared_alarm_does_not_fire"}
07-21 19:14:47.813  5136  5152 I Gecko   :
07-21 19:14:47.813  5136  5152 I Gecko   : {"action":"log","time":1689959687813,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"(xpcshell/head.js) | test test_cleared_alarm_does_not_fire pending (2)"}
07-21 19:14:47.848  5136  5152 I Gecko   :
07-21 19:14:47.848  5136  5152 I Gecko   : {"action":"log","time":1689959687848,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"\"Extension attached\""}
07-21 19:14:47.848  5136  5152 I Gecko   :
07-21 19:14:47.848  5136  5152 I Gecko   : {"action":"log","time":1689959687848,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"(xpcshell/head.js) | test run_next_test 1 finished (2)"}
07-21 19:14:47.962  5136  5152 D         : HostConnection::get() New Host Connection established 0x7bdbf081f6a0, tid 5152
07-21 19:14:47.967  5136  5152 D EGL_emulation: eglCreateContext: 0x7bdbf08a7440: maj 3 min 0 rcv 3
07-21 19:14:47.968  5136  5152 D EGL_emulation: eglMakeCurrent: 0x7bdbf08a7440: ver 3 0 (tinfo 0x7bdbf08ba6e0)
07-21 19:14:47.972  5136  5152 E eglCodecCommon: glUtilsParamSize: unknow param 0x00008cdf
07-21 19:14:47.972  5136  5152 E eglCodecCommon: glUtilsParamSize: unknow param 0x00008824
07-21 19:14:47.989  5136  5168 D ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings: FOREGROUND priority, 0 importance, 3 successful binds, 0 failed binds, 0 successful unbinds
07-21 19:14:47.997  5053  5070 W ServiceChildProcess: This process belongs to a different GeckoRuntime owner: a5be9210-1374-4549-bb9f-86271cd1137b process: dcc7ad4b-a327-40fe-8490-5baa279e0cda
07-21 19:14:48.011  5136  5168 D ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings: FOREGROUND priority, 0 importance, 3 successful binds, 0 failed binds, 0 successful unbinds
07-21 19:14:48.013  5053  5070 W ServiceChildProcess: This process belongs to a different GeckoRuntime owner: a5be9210-1374-4549-bb9f-86271cd1137b process: dcc7ad4b-a327-40fe-8490-5baa279e0cda

The last two lines repeat 1237 times in this one test run, interleaved with other usual log output. Out of the 3.2k log lines, 2.5k was from this logspam...

Summary: xpcshell logcat spam: erviceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings + ServiceChildProcess: This process belongs to a different GeckoRuntime owner → xpcshell logcat spam: ServiceAllocator: org.mozilla.gecko.process.GeckoChildProcessServices$gpu updateBindings + ServiceChildProcess: This process belongs to a different GeckoRuntime owner

Jamie - the logging was introduced in bug 1741156. Do you know what's going on here?

Besides being a pain during development, this also results in gigabytes, if not terabytes of storage space consumption in CI.

Flags: needinfo?(jnicol)

I think what's happening here is that in xpcshell tests we create many GeckoRuntimes so that we can run multiple tests in parallel. Each of these runtimes attempts to launch their own GPU process but android only allows one instance of each service, in this case org.mozilla.gecko.process.GeckoChildProcessServices$gpu. So after the first runtime successfully launches the GPU process, each subsequent runtime attempting to do so will instead notify the existing instance we attempted to start it. The existing instance will see that it belongs to a different runtime, and log this warning, and we will repeatedly retry launching it.

This logging existed before bug 1741156, back when content processes were the only type of child process. While we only declare one gpu process service (and one socket, one utility, etc), we declare many content process ones, eg tab1, tab2, etc. If gecko attempts to launch a content process and picks a service already in use then we log this warning and retry with a random tabN until we find one which succeeds. However, with the GPU process (and socket process, utility process, etc) this will never succeed as we've only declared one service for the process type. Note that for the content processes we can encounter this warning during normal operation because multiple content processes are allowed, whereas for GPU and other process types we can only encounter it during tests due to having multiple GeckoRuntimes.

I'm unsure what the consequences of this are. Perhaps we can simply remove the logging. Or perhaps this is an actual problem. We only support multiple GeckoRuntimes during tests so it's not an issue in the wild, but perhaps it might causes issues during tests. Maybe we need to declare multiple services for each type of child process and use the same logic as for content processes to select a random service name during launch. (or perhaps even a generic childN services could be used for both content and non-content child processes.) Kayacan, perhaps you or somebody else on the geckoview team might be better placed to answer that?

Flags: needinfo?(jnicol) → needinfo?(kkaya)

This logging existed before bug 1741156,

Bug 1696460 added the original logging in the allow-multiple-runtimes patch, specifically https://hg.mozilla.org/mozilla-central/rev/9c4199568bbc4f6f1a7c439149c744156892c788

If gecko attempts to launch a content process and picks a service already in use then we log this warning and retry with a random tabN until we find one which succeeds. However, with the GPU process (and socket process, utility process, etc) this will never succeed as we've only declared one service for the process type.

So that would require revisiting the logic here: https://searchfox.org/mozilla-central/rev/385f408040a8bb5fd9d168fb16a4e9071a13938d/mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/GeckoProcessManager.java#822-828

Hey Rob, agreed! And thanks Jaime for the great details there. We need to revisit the logic there to get rid of those spamming logs. There should not be multiple instances of the GeckoRuntime except the tests, as Jaime said. Removing the logs or, better, refactoring the logic of retrying to start the child service process would solve this issue. I will give it P3 and S3 at the moment to keep it in the backlog for the upcoming sprints. Worth to note that Fission is being implemented on Android at the moment, so the child service process management logic will very possibly be changing. So this maybe addressed there as well. There's no ticket/plan for that at the moment, that i know of. When a ticket is created for the process management on GeckoView layer for the Fission project, I'll link it here.

Severity: -- → S3
Flags: needinfo?(kkaya)
Priority: -- → P3
You need to log in before you can comment on or make changes to this bug.