Closed Bug 603147 Opened 14 years ago Closed 11 years ago

Intermittent mochitest-plain, mochitest-chrome, mochitest-other zombiecheck | child process NNNN still alive after shutdown

Categories

(Core Graveyard :: Plug-ins, defect)

x86
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: philor, Unassigned)

Details

(Keywords: intermittent-failure, Whiteboard: [see comment 416])

Not the first time I've seen this, but the first time it was on my own push so I had to file it.

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1286681368.1286682878.19416.gz#err0
Rev3 WINNT 6.1 mozilla-central opt test mochitest-other on 2010/10/09 20:29:28
s: talos-r3-w7-023

15252 INFO TEST-START | Shutdown
15253 INFO Passed: 14011
15254 INFO Failed: 0
15255 INFO Todo: 113
15256 INFO SimpleTest FINISHED
*** registerContentHandler(application/vnd.mozilla.maybe.feed,http://mochi.test:8888/tests/toolkit/components/places/tests/chrome/demohandler.html?feedurl=%s,Demo handler)
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
INFO | automation.py | Application ran for: 0:07:33.978000
INFO | automation.py | Reading PID log: c:\users\cltbld\appdata\local\temp\tmpmm14hepidlog
==> process 1028 launched child process 1712
==> process 1028 launched child process 4076
==> process 1028 launched child process 1936
==> process 1028 launched child process 3348
INFO | automation.py | Checking for orphan process with PID: 1712
INFO | automation.py | Checking for orphan process with PID: 4076
INFO | automation.py | Checking for orphan process with PID: 1936
TEST-UNEXPECTED-FAIL | automation.py | child process 1936 still alive after shutdown
INFO | automation.py | Checking for orphan process with PID: 3348
SUCCESS: The process with PID 372 has been terminated.
SUCCESS: The process with PID 1168 has been terminated.
WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!
INFO | runtests.py | Running tests: end.
ERROR: The process "2476" not found.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1288060048.1288061504.26823.gz
Rev3 WINNT 6.1 mozilla-central opt test mochitest-other on 2010/10/25 19:27:28

15312 INFO TEST-PASS | chrome://mochitests/content/chrome/widget/tests/test_wheeltransaction.xul | passed: Very large delta scrolling (h-2)
15313 INFO TEST-END | chrome://mochitests/content/chrome/widget/tests/test_wheeltransaction.xul | finished in 9316ms
15314 INFO TEST-START | Shutdown
15315 INFO Passed: 14062
15316 INFO Failed: 0
15317 INFO Todo: 112
15318 INFO SimpleTest FINISHED
*** registerContentHandler(application/vnd.mozilla.maybe.feed,http://mochi.test:8888/tests/toolkit/components/places/tests/chrome/demohandler.html?feedurl=%s,Demo handler)
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
INFO | automation.py | Application ran for: 0:06:18.228000
INFO | automation.py | Reading PID log: c:\users\cltbld\appdata\local\temp\tmpwp8zo2pidlog
==> process 3604 launched child process 3676
==> process 3604 launched child process 1912
==> process 3604 launched child process 1376
==> process 3604 launched child process 3720
INFO | automation.py | Checking for orphan process with PID: 3676
INFO | automation.py | Checking for orphan process with PID: 1912
INFO | automation.py | Checking for orphan process with PID: 1376
TEST-UNEXPECTED-FAIL | automation.py | child process 1376 still alive after shutdown
INFO | automation.py | Checking for orphan process with PID: 3720
SUCCESS: The process with PID 784 has been terminated.
ERROR: The process with PID 3368 could not be terminated. Reason: There is no running instance of the task.
SUCCESS: The process with PID 200 has been terminated.
WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!
INFO | runtests.py | Running tests: end.
program finished with exit code 0
elapsedTime=386.085000
TinderboxPrint: mochitest-chrome<br/>14062/0/112
https://tbpl.mozilla.org/php/getParsedLog.php?id=10302386&tree=Firefox
Rev3 WINNT 6.1 mozilla-central opt test mochitest-other on 2012-03-22 19:50:54 PDT for push 2cec1f79a141
{
==> process 1808 launched child process 2120
...
INFO | automation.py | Checking for orphan process with PID: 2120
TEST-UNEXPECTED-FAIL | automation.py | child process 2120 still alive after shutdown
...
ERROR: The process "2056" not found.
}
This one is from mochitest-plain-1... not sure if it's worth a separate bug... https://tbpl.mozilla.org/php/getParsedLog.php?id=12835269&tree=Mozilla-Aurora
Summary: Intermittent "child process nnnn still alive after shutdown" during mochitest-chrome shutdown → Intermittent "child process nnnn still alive after shutdown" during mochitest-chrome or mochitest-1 shutdown
Whiteboard: [orange]
Joel, this warning was added by you in 0aeedccc0125; could you take a look at this (non auto-starrable) toporange? Cheers :-)
Flags: needinfo?(jmaher)
Actually this was added by bsmedberg much earlier in bug 523208; I just refactored it to work with remote. Looking at orange factor, this is only on Windows 7 and mostly on mochitest 1 and mochitest-other. The plugin process is probably the other process that is sitting around. Maybe bsmedberg can provide more insight?
Flags: needinfo?(jmaher) → needinfo?(benjamin)
What kind of insight are you looking for? We don't know from the log whether the child process is a plugin process or a content process, but it's definitely not good that we're not shutting that process down cleanly. It looks like the existing code which writes the "process log" (http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/base/process_util_win.cc#168) doesn't say *why* we're launching those processes, but you could probably add some additional debug spew to the plugin and content parents to log the PID of their child when they are created, so we at least know which process isn't being cleaned up.
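For anyone triaging these logs in the meantime, here is a rough sketch of pulling the launch records and the zombiecheck failure out of a log and lining them up. The two regexes mirror the log lines quoted earlier in this bug; `find_zombies` is a hypothetical helper name, not something that exists in automation.py. It only reports which launch the surviving child was. It still cannot say whether that child is a plugin or a content process, which is exactly the gap the extra debug spew would fill.

```python
import re

# Line formats copied from the logs quoted in this bug.
LAUNCH_RE = re.compile(r"==> process (\d+) launched child process (\d+)")
ZOMBIE_RE = re.compile(r"child process (\d+) still alive after shutdown")


def find_zombies(log_text):
    """Return (parent_pid, child_pid, launch_index) for each child that the
    zombiecheck reported as still alive. launch_index is 0-based, in the
    order the launches appear in the PID log.

    Hypothetical helper for illustration; not part of the test harness.
    """
    launches = [(int(p), int(c)) for p, c in LAUNCH_RE.findall(log_text)]
    zombies = {int(pid) for pid in ZOMBIE_RE.findall(log_text)}
    return [
        (parent, child, index)
        for index, (parent, child) in enumerate(launches)
        if child in zombies
    ]
```

Run against the first log in this bug, this would pick out child 1936 of parent 1028. In both of the full logs quoted above, the surviving child happens to be the third one launched, though two samples are not enough to conclude anything from that.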
Flags: needinfo?(benjamin)
Summary: Intermittent "child process nnnn still alive after shutdown" during mochitest-chrome or mochitest-1 shutdown → Intermittent "child process nnnn still alive after shutdown" during mochitest-chrome or mochitest shutdown
Whiteboard: [see comment 416]
Please can you find an owner for this intermittent-failure - the current overall tree intermittent failure rate is spiralling out of control & the majority of bugs are unowned (see dev.platform thread).
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #416)
> but you could probably add some additional debug spew to the plugin and
> content parents to log the PID of their child when they are created, so we
> at least know which process isn't being cleaned up.

I'm guessing roughly at:
http://mxr.mozilla.org/mozilla-central/source/dom/ipc/ContentParent.cpp#1097
and
http://mxr.mozilla.org/mozilla-central/source/dom/plugins/ipc/PluginProcessParent.cpp#78
Yeah?
Depends on: 854407
Adjusting summary so this can be suggested by TBPL once bug 854407 propagates.
Summary: Intermittent "child process nnnn still alive after shutdown" during mochitest-chrome or mochitest shutdown → Intermittent mochitest-plain, mochitest-chrome zombiecheck | child process NNNN still alive after shutdown
Depends on: 855681
Depends on: 855686
Summary: Intermittent mochitest-plain, mochitest-chrome zombiecheck | child process NNNN still alive after shutdown → Intermittent mochitest-plain, mochitest-chrome, mochitest-other zombiecheck | child process NNNN still alive after shutdown
Per the platform meeting just now, BenT is looking at both bug 855681 and bug 855686.
Go figure, the logging patch from bug 855686 made this disappear. Since it would be a stretch to call that a fix, resolving this WFM.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(benjamin)
Resolution: --- → WORKSFORME
Product: Core → Core Graveyard