Closed Bug 539295 Opened 15 years ago Closed 15 years ago

sporadic issue with mochitest-ipcplugins: "missing output line for total leaks"

Categories

(Core :: IPC, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1

People

(Reporter: dholbert, Assigned: cjones)

References

()

Details

(Keywords: intermittent-failure)

Attachments

(2 files)

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263319076.1263326866.16722.gz
Linux mozilla-central debug test everythingelse on 2010/01/12 09:57:56
s: moz2-linux-slave15
{ TEST-UNEXPECTED-FAIL | plugin process 31456 | automationutils.processLeakLog() | missing output line for total leaks! }
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263312578.1263318564.21326.gz
Linux mozilla-central debug test everythingelse on 2010/01/12 08:09:38
s: moz2-linux-slave26
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263309338.1263317368.7376.gz
Linux mozilla-central debug test everythingelse on 2010/01/12 07:15:38
s: moz2-linux-slave41
Not sure what component this should go in. IPC? Plugins? Testing? I filed this in RelEng for now, since the failure suggests that it could be an issue with the testing framework. Feel free to relocate if another component makes more sense.
I think that this is a crash or something in the plugin process and that the harness is working correctly, but we're trying to get crash reporting hooked up today to help figure that out.
Component: Release Engineering → IPC
Product: mozilla.org → Core
QA Contact: release → ipc
Version: other → unspecified
FWIW, if you think something is a test harness issue, you'd file it in Testing:Whatever. RelEng should just be for "I think the build machine is broken" or "buildbot itself is broken".
It doesn't appear that the plugin process is crashing (or at least, not in a way which generates a minidump). Alternate theories:
* The plugin process is shutting down after the main process, and racing with the automation script which reads its log
* The plugin process is randomly exiting early (but not crashing in a way which would produce a minidump)
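To make the first theory concrete, here is a minimal Python sketch of how a harness could end up reporting this failure when it reads a log that the plugin process never finished writing. The function name `check_leak_log` and the `TOTAL <bytes>` line shape are assumptions made up for illustration; this is not the real `automationutils.processLeakLog()` code or the real leak log format.

```python
import re

# Assumed summary-line shape for this sketch only; the real leak log differs.
TOTAL_LINE = re.compile(r"^TOTAL\s+(\d+)$", re.MULTILINE)


def check_leak_log(log_text):
    """Return a short verdict for one process's leak log text (hypothetical helper)."""
    match = TOTAL_LINE.search(log_text)
    if match is None:
        # A truncated or never-finished log has no summary line, so the
        # harness cannot distinguish "no leaks" from "log never written".
        return "missing output line for total leaks!"
    leaked = int(match.group(1))
    return "no leaks" if leaked == 0 else f"leaked {leaked} bytes"
```

If the plugin process exits or is killed before the summary line reaches disk, a check like this fails even though nothing crashed and nothing necessarily leaked.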
Assignee: nobody → benjamin
Linux mozilla-central debug test everythingelse on 2010/01/12 13:17:12
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263331032.1263338310.14006.gz
I'm having trouble reproducing locally, but that may be because I'm massively multi-core.
debugged: this is easy to reproduce if you insert a sleep(120) here:
http://hg.mozilla.org/mozilla-central/annotate/2f969cc4f104/toolkit/xre/nsEmbedFunctions.cpp#l330
It appears that this thread is dying either before or during XRE_LogTerm. cjones, I suspect that the parent is killing it off before the leak log is fully written.
The parent process appears to be killing it in:
#0 0x000000336ae331c0 in kill () from /lib64/libc.so.6
#1 0x00007f9fde37e345 in KillProcess (this=0x7f9fc8406b80) at ../../../src/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:126
#2 0x00007f9fde37e45f in ~ChildReaper (this=0x515a) at ../../../src/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:91
#3 0x00007f9fde359802 in MessageLoop::DeletePendingTasks (this=0x7f9fd6a07ee0) at ../../../src/ipc/chromium/src/base/message_loop.cc:408
#4 0x00007f9fde359e22 in ~MessageLoop (this=0x7f9fd6a07ee0) at ../../../src/ipc/chromium/src/base/message_loop.cc:143
#5 0x00007f9fde3653fb in base::Thread::ThreadMain (this=0x7f9fdbdbfa80) at ../../../src/ipc/chromium/src/base/thread.cc:175
#6 0x00007f9fde376506 in ThreadFunc (closure=0x515a) at ../../../src/ipc/chromium/src/base/platform_thread_posix.cc:26
#7 0x000000336ba073da in start_thread () from /lib64/libpthread.so.0
#8 0x000000336aee627d in clone () from /lib64/libc.so.6
Main thread is in:
#0 0x000000336ba07cb5 in pthread_join () from /lib64/libpthread.so.0
#1 0x00007f9fde3654e9 in base::Thread::Stop (this=0x7f9fdbdbfa80) at ../../../src/ipc/chromium/src/base/thread.cc:114
#2 0x00007f9fde313499 in ~BrowserProcessSubThread (this=0x7f9fd6a089e0) at ../../../src/ipc/glue/GeckoThread.cpp:117
#3 0x00007f9fde38b911 in mozilla::ShutdownXPCOM (servMgr=0x7fffe6d1b320) at ../../../src/xpcom/build/nsXPComInit.cpp:913
#4 0x00007f9fdda18e50 in ~ScopedXPCOMStartup (this=0x7fffe6d1b9c0) at ../../../src/toolkit/xre/nsAppRunner.cpp:1042
#5 0x00007f9fdda1b3ad in XRE_main (argc=<value optimized out>, argv=<value optimized out>, aAppData=<value optimized out>) at ../../../src/toolkit/xre/nsAppRunner.cpp:3520
#6 0x0000000000401b4a in main (argc=5, argv=0x7fffe6d1bc78) at ../../../src/browser/app/nsBrowserApp.cpp:158
PluginModuleParent::~PluginModuleParent has already been called, here:
#0 ~PluginModuleParent (this=0x7f7464c11400) at ../../../src/dom/plugins/PluginModuleParent.cpp:81
#1 0x00007f747887da55 in ~nsNPAPIPlugin (this=0x7f7464c097a0) at ../../../../../src/modules/plugin/base/src/nsNPAPIPlugin.cpp:290
#2 0x00007f747887ce20 in nsNPAPIPlugin::Release (this=0x7f7464c097a0) at ../../../../../src/modules/plugin/base/src/nsNPAPIPlugin.cpp:221
#3 0x00007f7478890e1e in nsPluginTag::TryUnloadPlugin (this=0x7f7464c27200) at ../../../../dist/include/nsCOMPtr.h:640
#4 0x00007f7478887d79 in nsPluginHost::Destroy (this=0x7f7464ff3e80) at ../../../../../src/modules/plugin/base/src/nsPluginHost.cpp:2218
#5 0x00007f747888d15b in nsPluginHost::Observe (this=0x7f7464ff3e80, aSubject=<value optimized out>, aTopic=0x7f7478b93ca0 "xpcom-shutdown", someData=<value optimized out>) at ../../../../../src/modules/plugin/base/src/nsPluginHost.cpp:4673
Assignee: benjamin → jones.chris.g
Blocks: 531142
This is at dbaron's suggestion. Any other conditions under which we want lenient reaping?
Attachment #421501 - Flags: review?(benjamin)
Comment on attachment 421501 [details] [diff] [review]
Use lenient reaping for NS_BUILD_REFCNT_LOGGING builds
the ifdefs are kinda ugly, but ok
Attachment #421501 - Flags: review?(benjamin) → review+
Comment on attachment 421499 [details] [diff] [review]
Add an extra EnsureProcessTerminated() parameter to control how lenient to be wrt child shutdown
I'd sub the 'grim' param with 'force' since it's not immediately clear what 'grim' means. But r=me in any case.
Attachment #421499 - Flags: review?(bent.mozilla) → review+
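As a rough illustration of what a "lenient" versus forceful reaping switch changes, the Python sketch below spawns a child that takes a moment to write its log, then reaps it in one of two modes. Everything here is hypothetical: `reap_child`, `run_once`, the `force` flag, the 10-second grace period, and the `TOTAL 0` line are stand-ins for this sketch, not the EnsureProcessTerminated() signature or the actual patch.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child: simulates slow shutdown work, then writes its "leak log".
CHILD_SOURCE = """
import sys, time
time.sleep(0.5)              # stand-in for slow shutdown (cf. the sleep(120) repro)
with open(sys.argv[1], "w") as log:
    log.write("TOTAL 0\\n")  # made-up summary line, not the real log format
"""


def reap_child(child, force, grace_seconds=10.0):
    """Reap `child` and return its exit status.

    force=True kills immediately; force=False gives the child a grace period
    to exit on its own and only kills it if that period runs out.
    """
    if force:
        child.kill()
        return child.wait()
    try:
        return child.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        child.kill()
        return child.wait()


def run_once(force):
    """Spawn the child, reap it, and report whether its log was written."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "leak.log")
        child = subprocess.Popen([sys.executable, "-c", CHILD_SOURCE, log_path])
        reap_child(child, force)
        return os.path.exists(log_path)
```

In this sketch, `run_once(force=True)` kills the child before it writes anything, which is the analogue of the missing "total leaks" line, while `run_once(force=False)` lets the log be completed. The tradeoff is that a lenient parent can be held up by a child that is slow to exit, which is presumably why the leniency was limited to refcount-logging builds.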
(In reply to comment #10)
> (From update of attachment 421501 [details] [diff] [review])
> the ifdefs are kinda ugly, but ok
Agreed. I don't really know why this isn't also a problem on Windows, but I'm no win32 API guru.
Please leave open until it hits m-c.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Version: unspecified → Trunk
Yeah, that's this.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263576045.1263578203.23322.gz Linux mozilla-central debug test mochitest-other on 2010/01/15 09:20:45 s: moz2-linux-slave19
Status: ASSIGNED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → FIXED
Flags: in-testsuite+
Target Milestone: --- → mozilla1.9.3a1
Phil, could you file a new bug? We found and fixed the most common cause, but it's possible there are others lurking!
(In reply to comment #19)
> Is
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264019073.1264020660.32626.gz#err6
> this, post-push, or something else doing the same thing?
This error is on Windows. The fix in this bug was Linux-only.
(In reply to comment #12)
> (In reply to comment #10)
> > (From update of attachment 421501 [details] [diff] [review] [details])
> > the ifdefs are kinda ugly, but ok
>
> Agreed. I don't really know why this isn't also a problem on Windows, but I'm
> no win32 API guru.
Can a Windows guy take a look at this?
Blocks: 540967
To the bug 540967 cave, Robin!
Whiteboard: [orange]