Closed Bug 1239258 Opened 9 years ago Closed 5 years ago

Intermittent TEST-UNEXPECTED-FAIL | leakcheck | plugin process: missing output line for total leaks!

Categories

(Core Graveyard :: Plug-ins, defect, P3)

defect

Tracking

(firefox46 affected)

RESOLVED WORKSFORME
Tracking Status
firefox46 --- affected

People

(Reporter: nigelb, Unassigned)

References

()

Details

(Keywords: intermittent-failure)

https://treeherder.mozilla.org/logviewer.html#?job_id=19725226&repo=mozilla-inbound 01:01:20 INFO - --DOMWINDOW == 1 (0x7fcd1477c800) [pid = 5218] [serial = 201] [outer = (nil)] [url = chrome://browser/content/browser.xul] 01:01:20 INFO - --DOMWINDOW == 0 (0x7fcd31c49400) [pid = 5218] [serial = 5] [outer = (nil)] [url = resource://gre-resources/hiddenWindow.html] 01:01:21 INFO - nsStringStats 01:01:21 INFO - => mAllocCount: 845393 01:01:21 INFO - => mReallocCount: 102250 01:01:21 INFO - => mFreeCount: 845393 01:01:21 INFO - => mShareCount: 1123197 01:01:21 INFO - => mAdoptCount: 26878 01:01:21 INFO - => mAdoptFreeCount: 26878 01:01:21 INFO - => Process ID: 5218, Thread ID: 140519937627968 01:01:21 INFO - TEST-INFO | Main app process: exit 0 01:01:21 INFO - runtests.py | Application ran for: 0:09:57.302575 01:01:21 INFO - zombiecheck | Reading PID log: /tmp/tmpcKwv82pidlog 01:01:21 INFO - ==> process 5218 launched child process 5275 01:01:21 INFO - ==> process 5218 launched child process 5371 01:01:21 INFO - ==> process 5218 launched child process 5380 01:01:21 INFO - ==> process 5218 launched child process 5389 01:01:21 INFO - zombiecheck | Checking for orphan process with PID: 5275 01:01:21 INFO - zombiecheck | Checking for orphan process with PID: 5371 01:01:21 INFO - zombiecheck | Checking for orphan process with PID: 5380 01:01:21 INFO - zombiecheck | Checking for orphan process with PID: 5389 01:01:21 INFO - Stopping web server 01:01:21 INFO - Stopping web socket server 01:01:21 INFO - Stopping ssltunnel 01:01:21 INFO - TEST-INFO | leakcheck | default process: leak threshold set at 0 bytes 01:01:21 INFO - TEST-INFO | leakcheck | plugin process: leak threshold set at 0 bytes 01:01:21 INFO - TEST-INFO | leakcheck | tab process: leak threshold set at 10000 bytes 01:01:21 INFO - TEST-INFO | leakcheck | geckomediaplugin process: leak threshold set at 20000 bytes 01:01:21 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, plugin process 5380 01:01:21 INFO - |<----------------Class--------------->|<-----Bytes------>|<----Objects---->| 01:01:21 INFO - | | Per-Inst Leaked| Total Rem| 01:01:21 INFO - 0 |TOTAL | 47 0| 848 0| 01:01:21 INFO - nsTraceRefcnt::DumpStatistics: 28 entries 01:01:21 INFO - TEST-PASS | leakcheck | plugin process: no leaks detected! 01:01:21 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 5218 01:01:21 INFO - |<----------------Class--------------->|<-----Bytes------>|<----Objects---->| 01:01:21 INFO - | | Per-Inst Leaked| Total Rem| 01:01:21 INFO - 0 |TOTAL | 22 0|46519274 0| 01:01:21 INFO - nsTraceRefcnt::DumpStatistics: 1465 entries 01:01:21 INFO - TEST-PASS | leakcheck | default process: no leaks detected! 01:01:21 WARNING - TEST-UNEXPECTED-FAIL | leakcheck | plugin process: missing output line for total leaks! 01:01:21 INFO - TEST-INFO | leakcheck | missing output line from log file /tmp/tmpnS2WoB.mozrunner/runtests_leaks_plugin_pid5371.log 01:01:21 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, tab process 5275 01:01:21 INFO - |<----------------Class--------------->|<-----Bytes------>|<----Objects---->| 01:01:21 INFO - | | Per-Inst Leaked| Total Rem| 01:01:21 INFO - 0 |TOTAL | 22 4648|26517147 34| 01:01:21 INFO - 16 |AsyncTransactionTrackersHolder | 72 72| 331 1| 01:01:21 INFO - 84 |CompositorChild | 880 880| 1 1| 01:01:21 INFO - 86 |CondVar | 40 120| 880 3| 01:01:21 INFO - 262 |IPC::Channel | 16 32| 18 2| 01:01:21 INFO - 329 |MessagePump | 16 16| 50 1| 01:01:21 INFO - 339 |Mutex | 32 96| 8520 3| 01:01:21 INFO - 363 |PCompositorChild | 776 776| 1 1| 01:01:21 INFO - 371 |PImageBridgeChild | 920 920| 1 1| 01:01:21 INFO - 445 |RefCountedMonitor | 80 160| 13 2| 01:01:21 INFO - 447 |RefCountedTask | 16 64| 26 4| 01:01:21 INFO - 507 |StoreRef | 16 32| 21 2| 01:01:21 INFO - 572 |WaitableEventKernel | 72 72| 53 1| 01:01:21 INFO - 578 |WeakReference<MessageListener> | 16 32| 6705 2| 01:01:21 INFO - 630 |base::Thread | 48 48| 3 1| 01:01:21 INFO - 663 |ipc::MessageChannel | 512 1024| 13 2| 01:01:21 INFO - 1136 |nsTArray_base | 8 48| 8771838 6| 01:01:21 INFO - 1146 |nsThread | 256 256| 49 1| 01:01:21 INFO - nsTraceRefcnt::DumpStatistics: 1255 entries 01:01:21 INFO - TEST-INFO | leakcheck | tab process: leaked 1 AsyncTransactionTrackersHolder 01:01:21 INFO - TEST-INFO | leakcheck | tab process: leaked 1 CompositorChild
Andrew, Can you take a look at this one? It's kind of weird.
Flags: needinfo?(continuation)
What this assertion means is that the plugin process crashed before it was able to output a leak log. It is odd that the actual crash doesn't show up in the summary. That's probably a separate bug. I looked at the two logs for this that are in orange factor. If you look at the line after the TEST-UNEXPECTED-FAIL line, it has the file name for the leak log, which contains the PID for the process that crashed: runtests_leaks_plugin_pid5371.log. You can then look back in the log to see where that process was created, and how it crashed. The crash looks like this: [NPAPI 4969] ###!!! ASSERTION: plug removed: 'glib assertion', file /builds/slave/m-in-l64-d-0000000000000000000/build/src/toolkit/xre/nsSigHandlers.cpp, line 140 #01: libglib-2.0.so.0 + 0x4ef61 #02: libglib-2.0.so.0 + 0x4f172 #03: libnptest.so + 0x53d4 #04: libgtk-x11-2.0.so.0 + 0x136dd8 #05: libgobject-2.0.so.0 + 0xfca2 [...] #17: libglib-2.0.so.0 + 0x48124 #18: base::MessagePumpForUI::RunWithDispatcher(base::MessagePump::Delegate*, base::MessagePumpForUI::Dispatcher*) [ipc/chromium/src/base/message_pump_glib.cc:192] #19: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:235] #20: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:520] #21: XRE_InitChildProcess [toolkit/xre/nsEmbedFunctions.cpp:629] #22: content_process_main(int, char**) [ipc/contentproc/plugin-container.cpp:240] #23: libc.so.6 + 0x2176d #24: _start ** (plugin-container:4969): ERROR **: plug removed [Parent 4811] WARNING: [PluginModuleParent::ActorDestroy] abnormal shutdown without minidump!: file /builds/slave/m-in-l64-d-0000000000000000000/build/src/dom/plugins/ipc/PluginModuleParent.cpp, line 1592 In both logs, the crash is happening in test_fullscreen-api.html, which was just enabled for e10s in bug 1191597. I don't know if this is a problem with the full screen API or with some odd plugin thing. Xidorn, do you know what is happening here? Thanks.
Blocks: 1191597
Component: General → Plug-ins
Flags: needinfo?(continuation) → needinfo?(quanxunzhen)
Product: Firefox → Core
It looks like bug 1237853 was filed on this issue already, but I guess it wasn't expected to be intermittently happening still?
I filed bug 1239343 for the test harness not noticing the fatal plugin assertion.
(In reply to Andrew McCreight [:mccr8] from comment #2) > What this assertion means is that the plugin process crashed before it was > able to output a leak log. It is odd that the actual crash doesn't show up > in the summary. That's probably a separate bug. > > I looked at the two logs for this that are in orange factor. If you look at > the line after the TEST-UNEXPECTED-FAIL line, it has the file name for the > leak log, which contains the PID for the process that crashed: > runtests_leaks_plugin_pid5371.log. You can then look back in the log to see > where that process was created, and how it crashed. > > The crash looks like this: > > [NPAPI 4969] ###!!! ASSERTION: plug removed: 'glib assertion', file > /builds/slave/m-in-l64-d-0000000000000000000/build/src/toolkit/xre/ > nsSigHandlers.cpp, line 140 > #01: libglib-2.0.so.0 + 0x4ef61 > #02: libglib-2.0.so.0 + 0x4f172 > #03: libnptest.so + 0x53d4 FWIW, this is (per running addr2line on libnptest.so): DeleteWidget /builds/slave/m-in-l64-d-0000000000000000000/build/src/dom/plugins/test/testplugin/nptest_gtk2.cpp:264 which is explicitly calling g_error.
(In reply to Andrew McCreight [:mccr8] from comment #3) > It looks like bug 1237853 was filed on this issue already, but I guess it > wasn't expected to be intermittently happening still? Before I add the workaround, it was a permafail. I guess the workaround was not reliable enough, and there might be some race condition here. To totaly resolve this, that bug need to be fixed. But if we do not get much progress there while this intermittent gets more annoying, we probably can make the workaround a bit more reliable via allowing more time before closing the window. (BTW, why there is no red circle next to "Bugzilla@Mozilla" today given there is a ni? for me... probably a bug need to be filed...)
Flags: needinfo?(quanxunzhen)
Depends on: 1237853
Ed, is there anything we can do to make this failure starrable in Treeherder?
Flags: needinfo?(emorley)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #12) > Ed, is there anything we can do to make this failure starrable in Treeherder? It's already starrable, the bug summary is just incomplete. The algorithm (https://github.com/mozilla/treeherder/blob/cda4b33cef44192a73546be51c593ba0ad838e05/treeherder/model/error_summary.py#L97-L121) is roughly as follows: 1) Try leak-specific search if line is of form "leakcheck | .*\d+ bytes leaked (Object-1, Object-2, Object-3, ...)" 2) Else if in pipe-delimited format && middle token not blacklisted/too short a string, then search using the middle token 3) Otherwise try the full line In this case, we reach #3, since the middle token is "leakcheck", which is in the blacklist: https://github.com/mozilla/treeherder/blob/cda4b33cef44192a73546be51c593ba0ad838e05/treeherder/model/error_summary.py#L171
Flags: needinfo?(emorley)
Summary: Intermittent leakcheck | plugin process: missing output line for total leaks! → Intermittent TEST-UNEXPECTED-FAIL | leakcheck | plugin process: missing output line for total leaks!
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
I looked at one log, and it looked like there was an assertion. https://public-artifacts.taskcluster.net/JYpgqo41R-OxQaoEn2ibnQ/0/public/logs/live_backing.log I wonder why the log parser is missing this. Maybe the weird character. 09:02:04 INFO - [NPAPI 3029] ###!!! ASSERTION: plug removed: 'glib assertion', file /home/worker/workspace/build/src/toolkit/xre/nsSigHandlers.cpp, line 140
Oh I already noticed that in comment 2...
Bug 1239362 is now fixed, but this error is still occurring sometimes, and the "plug removed" assertion is not involved.
Depends on: 1239362
The volume did seem to go down quite a bit after that bug landed. I looked at the two failures in the last week. If you look at the line after the UNEXPECTED-FAIL line, it gives the PID of the process that failed to create a leak log: TEST-INFO | leakcheck | missing output line from log file /tmp/tmpDERMDD.mozrunner/runtests_leaks_plugin_pid3384.log Then you can search for the pid in the log. For the Linux one, I see something about "received an X Window System error". Maybe that is related? That log is odd because it looks like shutdown gets far enough to print out nsStringStats. It looks like that's happening during dom/html/test/test_fullscreen-api.html. For the Windows one, I'm not sure if there are any errors in the log around when the plugin process is created, but it looks like it is created during dom/plugins/test/mochitest/test_hangui.xul so maybe something went wrong with that.
(In reply to Andrew McCreight [:mccr8] from comment #46) > For the Linux one, I see something about "received an X Window System > error". Maybe that is related? [...] It looks like that's > happening during dom/html/test/test_fullscreen-api.html. BadWindow with request_code 3, which is X_GetWindowAttributes, does sound related, thanks, and likely still due to bug 1237853. GDK2 (used in the plugin process) is calling exit(1), without using g_error(), and so this is missing our X/GDK error handler and not triggering a crash report. I suspect it would be possible to InstallX11ErrorHandler() (designed for GDK2) for the plugin process to get slightly better error messages and trigger the crash reporter.
That actually sounds similar to bug 1309420 comment 14: when we assert, we are okay, but when we hit some X windows error, we leak (probably because we exit without shutting things down properly). I just worked around that by disabling the test, which wasn't running in debug builds either.
Depends on: 1331320
Recent occurrences here are actually Bug 1497029 as all failure logs have 22:39:17 INFO - File "Z:\task_1538864482\build\tests\bin\fix_stack_using_bpsyms.py", line 76, in __init__ 22:39:17 INFO - (rva, size, line, filenum) = line.split(None) before the errors, so please disregard the next Orange factor comment regarding the failures on the 6th of October.

No failures since Dec 2018.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.