Closed Bug 787319 Opened 12 years ago Closed 5 years ago

[meta] crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so on Sony devices with Adreno 220/225 GPUs or devices with PowerVR SGX 531 GPU running ICS

Categories

(Core :: Audio/Video: Playback, defect)

17 Branch
ARM
Android
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox16 --- unaffected
firefox17 - affected
firefox18 --- affected
firefox19 --- affected
fennec + ---

People

(Reporter: scoobidiver, Unassigned)

References

Details

(Keywords: crash, meta, regression, Whiteboard: [native-crash], [swdecoder])

Crash Data

Attachments

(1 obsolete file)

It first appeared in 17.0a1/20120824. The regression range might be (discontinuous across builds): http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5650196a8c7d&tochange=1c0ac073dc65 Signature libomxplugin.so@0xde3 More Reports Search UUID 55df875c-562d-4caa-8f7a-e39a12120830 Date Processed 2012-08-30 17:54:32 Uptime 12 Last Crash 36 seconds before submission Install Age 51 seconds since version was first installed. Install Time 2012-08-30 17:53:32 Product FennecAndroid Version 18.0a1 Build ID 20120830030531 Release Channel nightly OS Linux OS Version 0.0.0 Linux 3.0.8+1.0.21100-30145-01926-g787c968 #1 SMP PREEMPT Tue May 29 20:15:09 2012 armv7l Build Architecture arm Build Architecture Info Crash Reason SIGSEGV Crash Address 0xc77a04ec App Notes AdapterDescription: 'Qualcomm -- Adreno (TM) 220 -- OpenGL ES 2.0 2184622 -- Model: LT26i, Product: LT26i_1257-9042, Manufacturer: Sony Ericsson, Hardware: semc' EGL? EGL+ GL Context? GL Context+ GL Layers? GL Layers+ Sony Ericsson LT26i SEMC/LT26i_1257-9042/LT26i:4.0.4/6.1.A.0.453/Ov5_zw:user/release-keys EMCheckCompatibility True Adapter Vendor ID Qualcomm Adapter Device ID Adreno (TM) 220 Frame Module Signature Source 0 @0xc77a04ec 1 libomxplugin.so libomxplugin.so@0xde3 2 libstagefright.so libstagefright.so@0x64303 3 libstagefright.so libstagefright.so@0x64317 4 libutils.so libutils.so@0x20c77 5 libstagefright.so libstagefright.so@0x64303 6 libstagefright.so libstagefright.so@0x64303 7 libstagefright.so libstagefright.so@0x64203 8 libstagefright.so libstagefright.so@0x14494e 9 libstagefright.so libstagefright.so@0x13de16 10 libstagefright.so libstagefright.so@0x74b13 11 libomxplugin.so OmxDecoder::Init OmxPlugin.cpp:375 12 @0x7 13 libomxplugin.so CreateDecoder OmxPlugin.cpp:693 14 libxul.so nsMediaPluginHost::CreateDecoder nsMediaPluginHost.cpp:125 15 libxul.so nsMediaPluginReader::ReadMetadata nsMediaPluginReader.cpp:42 16 libxul.so nsBuiltinDecoderStateMachine::DecodeMetadata nsBuiltinDecoderStateMachine.cpp:1720 17 libxul.so nsBuiltinDecoderStateMachine::DecodeThreadRun nsBuiltinDecoderStateMachine.cpp:479 18 libxul.so nsRunnableMethodImpl<void , true>::Run nsThreadUtils.h:349 19 libxul.so nsThread::ProcessNextEvent nsThread.cpp:624 20 libxul.so NS_ProcessNextEvent_P nsThreadUtils.cpp:220 21 libxul.so nsThread::ThreadFunc nsThread.cpp:257 22 libnspr4.so _pt_root nsprpub/pr/src/pthreads/ptthread.c:156 23 libc.so libc.so@0x13496 24 libc.so libc.so@0x12fd2 More reports at: https://crash-stats.mozilla.com/report/list?signature=libomxplugin.so%400xde3
These crashes are all various Sony devices with Adreno 220 or 225 GPUs running ICS 4.0.4. The OmxPlugin.cpp code near line 375 is calling mAudioSource->read(). I wonder if this crash is related to bug 785340, a different mAudioSource->read() crash that also started around 17.0a1/20120824.
Priority: -- → P1
Summary: crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so@0xde3 on ICS → crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so@0xde3 on Sony devices with Adreno 220/225 GPUs running ICS
Crash Signature: [@ libomxplugin.so@0xde3] → [@ libomxplugin.so@0xde3] [@ libomxplugin.so@0xe37]
Summary: crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so@0xde3 on Sony devices with Adreno 220/225 GPUs running ICS → crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so on Sony devices with Adreno 220/225 GPUs running ICS
It's #2 top crasher in 17.0b1 with many dupes. I assume you can't browse with those devices.
Keywords: topcrash
tracking-fennec: --- → ?
Crash Signature: [@ libomxplugin.so@0xde3] [@ libomxplugin.so@0xe37] → [@ libomxplugin.so@0xde3] [@ libomxplugin.so@0xe37] [@ libcutils.so@0x61f4] [@ libcutils.so@0x638c] [@ libcutils.so@0x62a0]
Summary: crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so on Sony devices with Adreno 220/225 GPUs running ICS → crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so on Sony devices with Adreno 220/225 GPUs or devices with PowerVR SGX 531 GPU running ICS
This has moved up into the #1 spot for crashing - handing over to Chris for further investigation.
Assignee: nobody → cpeterson
AFAICT, when Stagefright's MediaExtractor::Create() is sniffing the media file type, it calls back into our OmxPlugin's MediaStreamSource::readAt() with bad data. This is surprising because Stagefright is just peeking at the first few bytes of the media data; we haven't even begun decoding video yet. The Sony Ericsson LT26i (Xperia S aka Xperia NX aka "Nozomi") accounts for about half of the crash reports, so it's either very popular or very crashy.
Status: NEW → ASSIGNED
doublec, when you have a chance, can you please take a look at this Stagefright crash? This bug is our #1 topcrash for Fennec Beta 17. I'm supposed to be focusing 100% on B2G IME bugs right now, so my boss blassey has asked me to reassign my high-priority Stagefright bugs to you.
Assignee: cpeterson → chris.double
Blocks: 755364
Blocks: 782508
I don't have access to this type of phone. Can I get one sent to me?
I'll check the local importers today and see if any of them have the device I can expense.
I've been able to get a crash on a Sony Xperia Ion LT28H playing an mp4: F/libc ( 6063): Fatal signal 11 (SIGSEGV) at 0xc77a04ec (code=1)
If I use DataSource::CreateFromURI to create the datasource, passing a URI to an MP4 file, playback works although it uses the software decoder due to: I/OmxPlugin( 7827): Unknown video color format: 0x7fa30c03 I/OmxPlugin( 7827): Falling back to software decoder So for some reason our custom derived DataSource is causing the crash, much like it was on GingerBread. Does the Xperia have some sort of munged libstagefright?
Crash Signature: [@ libomxplugin.so@0xde3] [@ libomxplugin.so@0xe37] [@ libcutils.so@0x61f4] [@ libcutils.so@0x638c] [@ libcutils.so@0x62a0] → [@ libomxplugin.so@0xde3] [@ libomxplugin.so@0xe37] [@ libomxplugin.so@0xedb] [@ libstagefright.so@0x76974] [@ libcutils.so@0x61f4] [@ libcutils.so@0x638c] [@ libcutils.so@0x62a0]
All the crashing devices seem to be running the same OS version, ICS 4.0.4. I saw a possibly related problem with media sniffing (bug 782830). When trying to sniff the media type on the Samsung Galaxy S III (ICS 4.0.4), the call to MediaExtractor::Create() would never return. Stagefright's AwesomePlayer would start playing our audio track, so we never got a chance to play the video ourselves. Strangely, I only saw this problem on the Korean/international model of the Galaxy S III, not North American model.
(In reply to Chris Double (:doublec) from comment #9) > I/OmxPlugin( 7827): Unknown video color format: 0x7fa30c03 > I/OmxPlugin( 7827): Falling back to software decoder btw, color format 0x7fa30c03 is QOMX_COLOR_FormatYUV420PackedSemiPlanar64x32Tile2m8ka. The vlc source code claims this format is identical to NV12: http://git.videolan.org/gitweb.cgi/vlc.git/?p=vlc.git;a=blob;f=modules/codec/omxil/qcom.c;h=9f530b2d5a0f0396b387ed88e175695530cfc1e1;hb=4a8d38843ad4012e2ea0c75e0c485edd6a41ce07#l36
Whiteboard: [native-crash] → [native-crash], [swdecoder]
It looks like Sony are using a modified DataSource - the wrong virtual functions are being called in places. I'm trying to track down what their class layout is.
I have playback working now with a modified DataSource.h. The virtual functions in DataSource.h seem to have been re-arranged and there's a new one that gets called. If I return 'OK' for that new one and put readAt in the spot in the table that the device seems to expect it works. Remaining work: 1) More testing/minimize the patch 2) Change build to produce a shared library for these devices 3) Detect these devices somehow and load that specific shared library 4) Fix color conversion so hardware decoder works.
tracking-fennec: ? → 17+
I'm going to spin out (4) into a separate bug and just fix the crashing here since getting the color conversion working appears to be complicated. It's not NV12, the source mentioned in comment 11 seems to just refer to them naming the YUV420 portion of the identifier. For (3) I'm unsure how specific the detection should be so I'm appealing to the android people to get an idea of what they'd prefer: 1) Calculate a CRC of the libstagefright shared library and only use our modified code loaded on devices with that specific CRC. This will likely break on minor device updates. But we won't know if our code works on those updates anyway. 2) Check specific device and android versions. This could break if someone installs a 4.0.4 alternate rom with different stagefright libraries. Thoughts? Other options?
I think option 2 sounds safer and easier. Would Sony's busted libstagefright.so likely have the same CRC on different device models (even if they are all running ICS 4.0.4)?
I don't think there is such a think as anything being "likely" in the android world sadly.
Attached patch Fix crash (obsolete) (deleted) — Splinter Review
Fixes crash. Software decoding only due to lack of support for the color conversion format. I'll file a followup bug for this. I've whitelisted the devices and versions I saw in the crashreports. I'm not sure if this is the right approach and I'm open to suggestions from the Android team to decide how they want to enable/switch on this. I've only tested on a Sony Xperia Ion running 4.0,4 which previously crashed but with this patch plays video.
Attachment #673065 - Flags: review?(cpeterson)
Blocks: 803394
Bug 803394 is the followup bug for getting hardware decoding working on these devices.
(In reply to Chris Double (:doublec) from comment #17) > I've whitelisted the devices and versions I saw in the crashreports. Some devices are missing (those with PowerVR SGX 531 GPU and Huawei MediaPad with Adreno 220 GPU) in the code. Indeed, here are correlations per device for each signature in 17.0 Beta extracted from https://crash-analysis.mozilla.com/rkaiser/2012-10-17/2012-10-17.fennecandroid.beta.17.0.devices.weekly.html#sigs: * libomxplugin.so@0xe37 844 Sony Ericsson LT26i 591 Sony Ericsson LT22i 116 Sony ST25i 29 Sony Ericsson LT28h 22 Sony MT27i 21 Sony Ericsson IS12S 15 Sony Ericsson LT26w 15 Sony LT30p 13 Sony Ericsson LT28at 10 Sony SO-05D 6 Sony ST27i 3 Sony SO-04D 3 * libcutils.so@0x61f4 66 Alps GT-I9100 12 GT-I9300 GT-I9300 11 Samsung GT-I9100 9 Micromax A90 8 Alps A9550 5 Alps B79 5 Alps Zyrex_ZA987_OneScribe 5 Iball iball 4 Alps i-mobile i-style Q1 2 Alps i-mobile i-style Q2 2 Samsung GT-P1000 1 Motorola XT910 1 LENOVO Lenovo S880 1 * libstagefright.so@0x76974 35 HUAWEI MediaPad 35 * libcutils.so@0x638c 19 Alps GT-I9100 13 Alps i-mobile i-style Q3 6 * libcutils.so@0x63a0 4 LENOVO Lenovo A789 3 Fly IQ440 1 * libcutils.so@0x62a0 3 Alps I9220 3
(In reply to Scoobidiver from comment #19) > Some devices are missing (those with PowerVR SGX 531 GPU and Huawei MediaPad > with Adreno 220 GPU) in the code. I only have a single Sony device for testing and this fix is only for Sony devices. In fact I can only say the fix works on the Sony Xperia Ion. And fixes the crash I encountered in comment 8. I defer to the Android team to tell me what devices they want to whitelist given that information. Hopefully they have more devices that me to test with. > * libomxplugin.so@0xe37 844 > * libcutils.so@0x61f4 66 > * libstagefright.so@0x76974 35 > * libcutils.so@0x63a0 4 > * libcutils.so@0x62a0 3 Each of these signatures should be a separate bug report imho. Or each device should have a separate bug for it. Otherwise we're going to end up with one long bug constantly being reopened and closed for any crash in video playback.
(In reply to Chris Double (:doublec) from comment #20) > Each of these signatures should be a separate bug report imho. Or each > device should have a separate bug for it. I disagree. A bug is independent from the way it will be fixed or the device where it's tested. > Otherwise we're going to end up with one long bug constantly being reopened > and closed for any crash in video playback. If we want to add later a device currently not in crash stats to the newly created blacklist, we can file a dedicated bug.
Comment on attachment 673065 [details] [diff] [review] Fix crash Review of attachment 673065 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/plugins/nsMediaPluginHost.cpp @@ +95,5 @@ > PRLibrary *lib = PR_LoadLibrary(name); > if (lib) { > Manifest *manifest = static_cast<Manifest *>(PR_FindSymbol(lib, "MPAPI_MANIFEST")); > + if (manifest) { > + ALOG("Loaded omx plugin: %s", name); 1. Does this ALOG() need to be #ifdef'd ANDROID? 2. Would a "Failed to load" error message be helpful? Or is TryLoad() expected to be called with library names that might not be found (so an error message would cause unnecessary alarm)? @@ +105,5 @@ > nsMediaPluginHost::nsMediaPluginHost() { > MOZ_COUNT_CTOR(nsMediaPluginHost); > #if defined(ANDROID) && !defined(MOZ_WIDGET_GONK) > + nsCOMPtr<nsIPropertyBag2> infoService = do_GetService("@mozilla.org/system-info;1"); > + NS_ASSERTION(infoService, "Could not find a system info service"); I prefer the fatal MOZ_ASSERT() over NS_ASSERTION(). We would crash on a null infoService anyways, so we might as well pinpoint the source of the error with an accurate assert. @@ +111,5 @@ > + nsAutoString device; > + nsresult rv = infoService->GetPropertyAsAString(NS_LITERAL_STRING("device"), device); > + if (NS_SUCCEEDED(rv)) { > + ALOG("Android Device is: %s", NS_LossyConvertUTF16toASCII(device).get()); > + } If the "device" property is not found, I think we should log an error message and/or fatally MOZ_ASSERT(). @@ +119,5 @@ > + int32_t version; > + rv = infoService->GetPropertyAsInt32(NS_LITERAL_STRING("version"), &version); > + if (NS_SUCCEEDED(rv)) { > + ALOG("Android Version is: %d", version); > + } If the "version" property is not found, I think we should log an error message and/or fatally MOZ_ASSERT(). @@ +121,5 @@ > + if (NS_SUCCEEDED(rv)) { > + ALOG("Android Version is: %d", version); > + } > + > + if (version ==15 && Need a space between == and 15. @@ +132,5 @@ > + TryLoad("lib/libomxpluginsony.so"); > + } > + else { > + TryLoad("lib/libomxplugin.so"); > + } I think these tricky device and version checks should be extracted into a separate function. If you refactored this code into a function like `const char* GetOmxPluginName()`, you could replace these multiple calls to TryLoad() with just one. Also, GetOmxPluginName() would be a good place to add other whitelist/blacklist logic. The function could return NULL for blacklisted and unsupported devices. ::: media/omx-plugin/include/ics/stagefright/DataSource.h @@ +73,5 @@ > + virtual void foo1() { } > + virtual void foo2() { } > + virtual void foo3() { } > + virtual void foo4() { } > +#endif Since you named these virtual functions "foo", I assume you don't know what Sony expects them to do. Does Sony's libstagefright actually call these functions or do you just need them to pad DataSource's vtable? I think you should mark them private and add fatal MOZ_CRASH() or abort() even in release builds. If Sony's libstagefright calls our foo() functions, it will likely crash mysteriously later. I think a clean MOZ_CRASH() will make crash reports more informative.
Attachment #673065 - Flags: review?(cpeterson) → review-
(In reply to Chris Double (:doublec) from comment #20) > Each of these signatures should be a separate bug report imho. Or each > device should have a separate bug for it. Otherwise we're going to end up > with one long bug constantly being reopened and closed for any crash in > video playback. I agree with doublec that each crash address should probably be a separate bug report. Different addresses suggest that the reason for the crash is different or that the library version is different. Since this bug is about Sony beaking binary compatibility of the libstagefright.so ABI, library version is particularly important for this bug. Maybe we should file a new dependent bug for doublec's Sony libomxplugin.so@0xe37 fix and make this bug into a meta-bug?
Attachment #673065 - Attachment is obsolete: true
(In reply to Chris Peterson (:cpeterson) from comment #23) > Maybe we should file a new dependent bug for doublec's Sony > libomxplugin.so@0xe37 fix and make this bug into a meta-bug? I agree with that.
Keywords: meta
Depends on: 803794
Removing myself as assignee as I don't work on meta bugs. I'll work on any specific bugs raised for single issues.
Assignee: chris.double → nobody
tracking-fennec: 17+ → ?
Not tracking this bug, because it's a meta bug, but we'll track all the breakout bugs for each signature as their importance to 17 dictates.
Crash Signature: [@ libomxplugin.so@0xde3] [@ libomxplugin.so@0xe37] [@ libomxplugin.so@0xedb] [@ libstagefright.so@0x76974] [@ libcutils.so@0x61f4] [@ libcutils.so@0x638c] [@ libcutils.so@0x62a0] → [@ libomxplugin.so@0xde3] [@ libomxplugin.so@0xe37] [@ libomxplugin.so@0xedb] [@ libstagefright.so@0x76974] [@ libcutils.so@0x61f4] [@ libcutils.so@0x638c] [@ libcutils.so@0x62a0] [@ libcutils.so@0x63a0] [@ libcutils.so@0x6770]
Depends on: 804768
tracking-fennec: ? → 17+
Summary: crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so on Sony devices with Adreno 220/225 GPUs or devices with PowerVR SGX 531 GPU running ICS → [meta] crash in nsMediaPluginHost::CreateDecoder @ libomxplugin.so on Sony devices with Adreno 220/225 GPUs or devices with PowerVR SGX 531 GPU running ICS
Keywords: topcrash
no longer tracking 17 since we have a white list in place
tracking-fennec: 17+ → +
Depends on: 817478
Component: Audio/Video → Audio/Video: Playback
No assignee, updating the status.
Status: ASSIGNED → NEW
No assignee, updating the status.

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: