Closed Bug 1009765 Opened 10 years ago Closed 10 years ago

Add GMP plugin crashes to FHR crash reporting

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
Firefox 34

People

(Reporter: benjamin, Assigned: adw)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [openh264-uplift])

Attachments

(2 files, 7 obsolete files)

We will have a new plugin type, GMP plugins for OpenH264. Bug 1009760 is going to implement crash reporting for these plugins and we need to add these crashes to FHR. This can probably trivially extend the work that adw did in bug 983313.
Flags: firefox-backlog+
Whiteboard: p=3 → p=3 [qa+]
Points: --- → 3
QA Whiteboard: [qa+]
Whiteboard: p=3 [qa+]
Points: 3 → 5
Whiteboard: [openh264-uplift]
Hi Marco -- We need this for OpenH264 in Fx33. It will need to be uplifted. Did this get pulled for this sprint (starting today)? Thanks.
Flags: needinfo?(mmucci)
Hi Maire, Drew selected this work at the planning meeting today for Iteration 34.1. I'm just late in updating Bugzilla.
Assignee: nobody → adw
Status: NEW → ASSIGNED
Iteration: --- → 34.1
Flags: needinfo?(mmucci)
Attached patch patch (obsolete) (deleted) — Splinter Review
Georg, I'd ask Benjamin to review but he's away. Could you review, or do you know who might be good? I think this is all there is to it... This extends the work in bug 983313. The two other bugs that made this possible are bug 1039575, which hooked up GMP abnormal shutdown to CrashReporterParent, and bug 1039579, which calls CrashReporterParent::NotifyCrashService for GMP crashes. How can I manually test this, let alone write an automated test for it?
Attachment #8461186 - Flags: review?(georg.fritzsche)
Attached patch patch v2 (obsolete) (deleted) — Splinter Review
Oops, this fixes uninitialized variable warnings I somehow missed.
Attachment #8461186 - Attachment is obsolete: true
Attachment #8461186 - Flags: review?(georg.fritzsche)
Attachment #8461194 - Flags: review?(georg.fritzsche)
Comment on attachment 8461194 [details] [diff] [review] patch v2 Review of attachment 8461194 [details] [diff] [review]: ----------------------------------------------------------------- Is the FHR provider modification [1] coming in a separate patch? Do we get GMP hang reports too? For manual testing you can: * make sure OpenH264 is installed (happens automatically after a minute or so, check in Tools -> Addons -> Plugins) * start an h264 call: http://mozilla.github.io/webrtc-landing/pc_test_h264.html * kill the plugin-container process For that you'll need to: * apply attachment 8460226 [details] [diff] [review] * undo attachment 8459601 [details] [diff] [review] * hopefully not run into deadlocks For automated testing of the full path we need to be able to crash the fake gmp plugin we're using. I suspect it doesn't have support for that yet, maybe jesup knows? Although the patch looks good to me, i'm not really familiar with the crash-service etc. - maybe Ted is a better contact? [1] in services/healthreport: providers.jsm, tests/xpcshell/test_provider_crashes.js, docs/dataformat.rst
Attachment #8461194 - Flags: review?(georg.fritzsche) → feedback+
(In reply to Georg Fritzsche [:gfritzsche] from comment #5) > For automated testing of the full path we need to be able to crash the fake > gmp plugin we're using. I suspect it doesn't have support for that yet, > maybe jesup knows?
Flags: needinfo?(rjesup)
(In reply to Georg Fritzsche [:gfritzsche] from comment #5) > Although the patch looks good to me, i'm not really familiar with the > crash-service etc. - maybe Ted is a better contact?
Flags: needinfo?(ted)
(In reply to Georg Fritzsche [:gfritzsche] from comment #6) > (In reply to Georg Fritzsche [:gfritzsche] from comment #5) > > For automated testing of the full path we need to be able to crash the fake > > gmp plugin we're using. I suspect it doesn't have support for that yet, > > maybe jesup knows? FWIW, our NPAPI test plugins have crash() and hang() methods accessible from script, which is how we get proper test coverage for those code paths.
(In reply to Georg Fritzsche [:gfritzsche] from comment #5) > Is the FHR provider modification [1] coming in a separate patch? Yeah, at least later, maybe in the same patch. I wanted to make sure the CrashReporterParent changes were correct first. > Do we get GMP hang reports too? What's the hang code path? For normal plugins, I see PluginModuleParent::TerminateChildProcess is called in a couple of cases, and that annotates the crash report with "PluginHang". I grepped content/media/gmp/ for "hang" and only found that GMPProcessChild starts and stops the BackgroundHangMonitor, but no one else uses it, and it looks like BackgroundHangMonitor only reports to telemetry anyway. > For manual testing you can: Thanks, I'll try that. > Although the patch looks good to me, i'm not really familiar with the > crash-service etc. - maybe Ted is a better contact? gps could probably review that part, or would you prefer that I ask him to review all the changes? (In reply to Georg Fritzsche [:gfritzsche] from comment #8) > FWIW, our NPAPI test plugins have crash() and hang() methods accessible from > script, which is how we get proper test coverage for those code paths. Yes, but it sounds like you're saying you don't know whether GMP have similar methods?
(In reply to Drew Willcoxon :adw from comment #9) > > Do we get GMP hang reports too? > > What's the hang code path? For normal plugins, I see > PluginModuleParent::TerminateChildProcess is called in a couple of cases, > and that annotates the crash report with "PluginHang". I grepped > content/media/gmp/ for "hang" and only found that GMPProcessChild starts and > stops the BackgroundHangMonitor, but no one else uses it, and it looks like > BackgroundHangMonitor only reports to telemetry anyway. Ok, looks like we don't handle or expect GMP hangs right now. > > Although the patch looks good to me, i'm not really familiar with the > > crash-service etc. - maybe Ted is a better contact? > > gps could probably review that part, or would you prefer that I ask him to > review all the changes? gps & me splitting the review would work for me. > (In reply to Georg Fritzsche [:gfritzsche] from comment #8) > > FWIW, our NPAPI test plugins have crash() and hang() methods accessible from > > script, which is how we get proper test coverage for those code paths. > > Yes, but it sounds like you're saying you don't know whether GMP have > similar methods? Right, hence the needinfo? for jesup. I think we should have support for that in our test GMP plugin. However, given the current pressure for getting things working in 33 it's unlikely to be added now.
Flags: needinfo?(ted)
(In reply to Georg Fritzsche [:gfritzsche] from comment #10) > Right, hence the needinfo? for jesup. I think we should have support for > that in our test GMP plugin. > However, given the current pressure for getting things working in 33 it's > unlikely to be added now. Filed bug 1044408 on this.
Attached patch CrashReporterParent.cpp change (deleted) — Splinter Review
Georg, I'm separating this into this CrashReporterParent.cpp change, and CrashManager, CrashService, FHR changes that I'll post next and ask gps to review. I'm not sure if that's a good separation, so please let me know if not. (In reply to Georg Fritzsche [:gfritzsche] from comment #5) > For manual testing you can: > * make sure OpenH264 is installed (happens automatically after a minute or > so, check in Tools -> Addons -> Plugins) > * start an h264 call: > http://mozilla.github.io/webrtc-landing/pc_test_h264.html > * kill the plugin-container process I wasn't able to test this manually. I hit this after killing the plugin container: > * thread #25: tid = 0x5df30c, 0x0000000102683119 XUL`mozilla::WebrtcGmpVideoEncoder::SetRates_g(this=0x0000000125de54c0, aNewBitRate=413, aFrameRate=28) + 137 at WebrtcGmpVideoCodec.cpp:312, name = 'GMPThread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT) > frame #0: 0x0000000102683119 XUL`mozilla::WebrtcGmpVideoEncoder::SetRates_g(this=0x0000000125de54c0, aNewBitRate=413, aFrameRate=28) + 137 at WebrtcGmpVideoCodec.cpp:312 > 309 WebrtcGmpVideoEncoder::SetRates_g(uint32_t aNewBitRate, uint32_t aFrameRate) > 310 { > 311 MOZ_ASSERT(mGMP); > -> 312 GMPErr err = mGMP->SetRates(aNewBitRate, aFrameRate); > 313 if (err != GMPNoErr) { > 314 return WEBRTC_VIDEO_CODEC_ERROR; > 315 }
Attachment #8461194 - Attachment is obsolete: true
Attachment #8464292 - Flags: review?(georg.fritzsche)
Attached patch CrashManager, CrashService, FHR changes (obsolete) (deleted) — Splinter Review
Attachment #8464293 - Flags: review?(gps)
Attached patch CrashManager, CrashService, FHR changes v2 (obsolete) (deleted) — Splinter Review
Removed a stray dump().
Attachment #8464293 - Attachment is obsolete: true
Attachment #8464293 - Flags: review?(gps)
Attachment #8464320 - Flags: review?(gps)
Comment on attachment 8464320 [details] [diff] [review] CrashManager, CrashService, FHR changes v2 Review of attachment 8464320 [details] [diff] [review]: ----------------------------------------------------------------- I don't think you uploaded the correct patch.
Attachment #8464320 - Flags: review?(gps)
Attachment #8464320 - Attachment is obsolete: true
Attachment #8464325 - Flags: review?(gps)
Attachment #8464292 - Flags: review?(georg.fritzsche) → review+
(In reply to Drew Willcoxon :adw from comment #12) > (In reply to Georg Fritzsche [:gfritzsche] from comment #5) > > For manual testing you can: > > * make sure OpenH264 is installed (happens automatically after a minute or > > so, check in Tools -> Addons -> Plugins) > > * start an h264 call: > > http://mozilla.github.io/webrtc-landing/pc_test_h264.html > > * kill the plugin-container process > > I wasn't able to test this manually. I hit this after killing the plugin > container: > > > * thread #25: tid = 0x5df30c, 0x0000000102683119 XUL`mozilla::WebrtcGmpVideoEncoder::SetRates_g(this=0x0000000125de54c0, aNewBitRate=413, aFrameRate=28) + 137 at WebrtcGmpVideoCodec.cpp:312, name = 'GMPThread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT) > > frame #0: 0x0000000102683119 XUL`mozilla::WebrtcGmpVideoEncoder::SetRates_g(this=0x0000000125de54c0, aNewBitRate=413, aFrameRate=28) + 137 at WebrtcGmpVideoCodec.cpp:312 > > 309 WebrtcGmpVideoEncoder::SetRates_g(uint32_t aNewBitRate, uint32_t aFrameRate) > > 310 { > > 311 MOZ_ASSERT(mGMP); > > -> 312 GMPErr err = mGMP->SetRates(aNewBitRate, aFrameRate); > > 313 if (err != GMPNoErr) { > > 314 return WEBRTC_VIDEO_CODEC_ERROR; > > 315 } There were some recent fixups and that method looks familiar. Have you tried testing on the current mozilla-inbound?
(In reply to Georg Fritzsche [:gfritzsche] from comment #6) > (In reply to Georg Fritzsche [:gfritzsche] from comment #5) > > For automated testing of the full path we need to be able to crash the fake > > gmp plugin we're using. I suspect it doesn't have support for that yet, > > maybe jesup knows? Support for crashing the test GMP plugin is coming up in bug 1044408. Maybe base a proper test on that or file a follow-up dependent on it? Again, i recommend using a current mozilla-inbound if you're writing a test now.
Flags: needinfo?(rjesup)
(In reply to Georg Fritzsche [:gfritzsche] from comment #18) > Support for crashing the test GMP plugin is coming up in bug 1044408. Maybe > base a proper test on that or file a follow-up dependent on it? > Again, i recommend using a current mozilla-inbound if you're writing a test > now. Thanks, I don't get the segfault anymore after updating my tree, but neither crashing manually nor bug 1044408 trigger the code path in my patch because they don't leave minidumps. Without a minidump and crash ID, CrashReporterParent::GenerateChildData and CrashReporterParent::NotifyCrashService are bypassed entirely. Bug 1010476 will fix that. Is there a way to crash GMP plugins and leave a minidump like there is for regular plugins? I wonder how crash() on regular plugins works...
(In reply to Drew Willcoxon :adw from comment #19) > Thanks, I don't get the segfault anymore after updating my tree, but neither > crashing manually nor bug 1044408 trigger the code path in my patch because > they don't leave minidumps. Without a minidump and crash ID, > CrashReporterParent::GenerateChildData and > CrashReporterParent::NotifyCrashService are bypassed entirely. Bug 1010476 > will fix that. > > Is there a way to crash GMP plugins and leave a minidump like there is for > regular plugins? I wonder how crash() on regular plugins works... Per bug 1044408 we now have an easy way: every time you flip the pref "media.gmp.plugin.crash" to true it crashes with a signature like bp-f53d6fec-9312-4d1f-9548-804512140804.
OK, I wasn't setting MOZ_CRASHREPORTER. I can now manually verify that the patches are working. I see this in about:healthreport's raw data: "org.mozilla.crashes.crashes": { "_v": 5, "gmplugin-crash": 1 }, This should really have an automated test as part of the landing, so I'll work on one now. I'll probably attach it as a separate patch so it doesn't interfere with gps's pending review of the other patch.
Iteration: 34.1 → 34.2
Comment on attachment 8464325 [details] [diff] [review] CrashManager, CrashService, FHR changes v3 Review of attachment 8464325 [details] [diff] [review]: ----------------------------------------------------------------- Looks good! Please send an email to fhr-dev@mozilla.org when this lands to announce the new fields in FHR's data.
Attachment #8464325 - Flags: review?(gps) → review+
Attached patch test + CrashManager.jsm changes WIP (obsolete) (deleted) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=fd9f5d341f6f The test is in dom/media/tests/mochitest/, where I saw some other related tests. It uses the handy framework in that directory. I had to modify CrashManager.jsm further, by adding CrashManager._promiseStore, so this includes all CrashManager.jsm changes, including the one in the other patch. I'll ask for review when try looks good.
Attached patch test and related changes WIP v2 (obsolete) (deleted) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=58a3558ef78d The previous try had a couple of failures. I disabled the test on e10s since the test is a mochitest, and the CrashManager can only be used in the main process. I also modified GMPChild::RecvCrashPluginNow to call mozilla::NoteIntentionalCrash("gmplugin") and the test to call SimpleTest.expectChildProcessCrash(). Finally, the test clears the CrashManager store before the crash.
Attachment #8467431 - Attachment is obsolete: true
There were a couple of more failures. One I think I've fixed locally. The other failure is "This test did not leave any crash dumps behind, but we were expecting some!" for the test I'm adding. It looks like the problem is due to how the test harness finds dumps. It observes the "plugin-crashed" and "ipc:content-shutdown" notifications here, but GMP crashes don't broadcast either of them: http://mxr.mozilla.org/mozilla-central/source/testing/specialpowers/content/SpecialPowersObserverAPI.js#88 So... I guess we're going to have to make GMP crashes do what nsPluginHost::PluginCrashed does: fill in a property bag with dump ID, etc. and broadcast a notification?
There's a gmp-plugin-crash notification: http://hg.mozilla.org/mozilla-central/annotate/6cbdd4d523a7/content/media/gmp/GMPParent.cpp#l498 It sends a weird space-delimited string that has the dumpID in it. I'm not sure why it's like that instead of a property bag, maybe bsmedberg can answer? There's only one observer for the notifiation so it wouldn't be hard to change if needed: http://hg.mozilla.org/mozilla-central/annotate/6cbdd4d523a7/dom/media/PeerConnection.js#l149
Thanks, Ted. My test is consistently leaking a CondVar, GeckoChildProcessHost, and Mutex... I haven't had time to track it down. Anybody know what that's about? Other than that, this works...
Attachment #8464292 - Attachment is obsolete: true
Attachment #8464325 - Attachment is obsolete: true
Attachment #8467933 - Attachment is obsolete: true
(In reply to Drew Willcoxon :adw from comment #27) > My test is consistently leaking a CondVar, GeckoChildProcessHost, and > Mutex... I haven't had time to track it down. Anybody know what that's > about? jesup, have you seen this before or got an idea?
Flags: needinfo?(rjesup)
Georg, if we're not able to fix the leak, what do you think about landing the two r+'ed patches and pushing the test to a follow-up bug? This bug was carried over from the previous iteration, and I'm feeling pressured to land it ASAP.
Flags: needinfo?(georg.fritzsche)
I wouldn't mind the iteration too much (that book-keeping shouldn't pressure us to skip things), but it would be good to get the FHR data going on Fx 34 & 33. I think landing this now is good as long as getting the test landed stays a priority.
Flags: needinfo?(georg.fritzsche)
OK, thanks. Well I'm definitely feeling pressured. In fact the possibility of doing this was brought up during this week's late iteration meeting. I'll land the r+'ed patches once inbound reopens and then file a follow-up for the test.
Attachment #8464292 - Attachment is obsolete: false
Attachment #8464325 - Attachment is obsolete: false
Attachment #8468914 - Attachment is obsolete: true
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 34
Depends on: 1052141
Follow-up bug 1052141 filed and fhr-dev@mozilla.org message sent.
QA Contact: drno → florin.mezei
QA Contact: florin.mezei → alexandra.lucinet
I tested using latest Nightly on Windows 7 64bit, Mac OS X 10.9.4 and Ubuntu 14.04 32bit. I`ve encountered some strange behavior for the data that`s stored in FHR or not after crashing Firefox using media.gmp.plugin.crash. Here are some detailed results: Windows: https://pastebin.mozilla.org/5960489 Ubuntu: https://pastebin.mozilla.org/5961266 Mac OS X: https://pastebin.mozilla.org/5960961
Flags: needinfo?(adw)
Bogdan, summarizing your pastes, there are several problems: * The main process crashed when you tried to crash the GMP plugin. I see the same thing on Nightly on OS X, and see the same top stack frame, mozilla::gmp::PGMPParent::DeallocShmems. https://crash-stats.mozilla.com/report/index/44966eed-e393-4bea-8c64-41eb52140813 tells me bug 1049501 is related, but I can't access it. * FHR recorded GMP crash submissions as plugin-crash-submission-*, but they should be recorded as gmplugin-crash-submission-*. I'm guessing that's due to passing PROCESS_TYPE_PLUGIN here: http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser-plugins.js?rev=e76d723927e1#508 Could you please file a bug? * Your first GMP crash on Linux was not recorded in FHR. I don't know what that's about. If you can consistently reproduce it, please file a bug. * The plugins on the test page don't restart after the first crash on Linux. I don't know either, but it's probably not related to this bug.
Flags: needinfo?(adw)
Blocks: 1053473
(In reply to Drew Willcoxon :adw from comment #37) > * FHR recorded GMP crash submissions as plugin-crash-submission-*, but they > should be recorded as gmplugin-crash-submission-*. I'm guessing that's due > to passing PROCESS_TYPE_PLUGIN here: > http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser- > plugins.js?rev=e76d723927e1#508 Could you please file a bug? I filed bug 1053473 myself.
Blocks: 1053748
(In reply to Drew Willcoxon :adw from comment #37) > Bogdan, summarizing your pastes, there are several problems: > > * Your first GMP crash on Linux was not recorded in FHR. I don't know what > that's about. If you can consistently reproduce it, please file a bug. I was able to reproduce this but not every time, I logged bug 1053748 for that. It may be possible that the root of all this problems is bug 1049501, I will have to test this again after the dependencies are fixed and bug 1049501 as well so leaving [qa+] on until then.
(moving needinfo to bug 1052141)
Flags: needinfo?(rjesup)
Iteration: 34.2 → 34.3
QA Whiteboard: [qa+]
Flags: qe-verify+
Blocks: 1057484
Retested on Windows 8.1 64bit, Windows 7 64bit, Ubuntu 12.04 32bit and Mac OS X 10.9.4 using latest Nightly 34.0a1 and I can confirm that crashing during a call I get the correct data in FHR: "org.mozilla.crashes.crashes": { "_v": 5, "gmplugin-crash": 1, "gmplugin-crash-submission-succeeded": 1 },
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: