Closed Bug 1353313 Opened 8 years ago Closed 8 years ago

Intermittent PROCESS-CRASH | dom/media/test/test_streams_element_capture_createObjectURL.html | application crashed [@ mozilla::RemoteDataDecoder::Error]

Categories

(Core :: Audio/Video: Playback, defect, P1)

Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox-esr45 --- unaffected
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 + fixed
firefox55 + fixed

People

(Reporter: cbook, Assigned: mchiang)

References

()

Details

(4 keywords)

Attachments

(2 files, 6 obsolete files)

(deleted), text/plain
Details
(deleted), patch
mchiang
: review+
abillings
: sec-approval+
Details | Diff | Splinter Review
https://treeherder.mozilla.org/logviewer.html#?job_id=88432119&repo=mozilla-central filing as sec-bug just in case for the Crash address: 0xe5e5e65d adress PROCESS-CRASH | dom/media/test/test_streams_element_capture_createObjectURL.html | application crashed [@ mozilla::RemoteDataDecoder::Error] Crash dump filename: /tmp/tmpHR_n_R/15e693df-9e1c-42af-04f1-b363e8aa2529.dmp Operating system: Android 0.0.0 Linux 3.10.40-gc16a3c6 #1 SMP PREEMPT Tue Jul 28 17:58:25 UTC 2015 armv7l CPU: arm ARMv1 Qualcomm Krait features: swp,half,thumb,fastmult,vfpv2,edsp,neon,vfpv3,tls,vfpv4,idiva,idivt 4 CPUs GPU: UNKNOWN Crash reason: SIGSEGV Crash address: 0xe5e5e65d Process uptime: not available Thread 21 (crashed) 0 libxul.so!mozilla::RemoteDataDecoder::Error [RefPtr.h:b5d8b27a7537 : 284 + 0x0] r0 = 0xe5e5e5e5 r1 = 0x949df868 r2 = 0x00000009 r3 = 0x9e3c5ea8 r4 = 0x83cfcb10 r5 = 0xe5e5e5e5 r6 = 0x9cdc34ad r7 = 0x00000001 r8 = 0x75c99628 r9 = 0xb4a20000 r10 = 0x00000000 r12 = 0x80000000 fp = 0x00000001 sp = 0x949df840 lr = 0x9cdc189d pc = 0x9cdc32a6 Found by: given as instruction pointer in context 1 libxul.so!mozilla::jni::NativeStub<mozilla::java::CodecProxy::NativeCallbacks::OnError_t, mozilla::JavaCallbacksSupport, mozilla::jni::Args<bool> >::Wrap<&mozilla::JavaCallbacksSupport::OnError> [JavaCallbacksSupport.h:b5d8b27a7537 : 61 + 0x5] r4 = 0x83cfcb10 r5 = 0x949df868 r6 = 0x9cdc34ad r7 = 0x00000001 r8 = 0x75c99628 r9 = 0xb4a20000 r10 = 0x00000000 fp = 0x00000001 sp = 0x949df868 pc = 0x9cdc189d Found by: call frame info 2 data@app@org.mozilla.fennec-1@base.apk@classes.dex + 0xd78063 r4 = 0x00000055 r5 = 0x13139720 r6 = 0x130602e0 r7 = 0x00000001 r8 = 0x75c99628 r9 = 0xb4a20000 r10 = 0x00000000 fp = 0x00000001 sp = 0x949df890 pc = 0xa2bc7065 Found by: call frame info 3 dalvik-non moving space (deleted) + 0x154e sp = 0x949df894 pc = 0x75bf4550 Found by: stack scanning 4 dalvik-main space (deleted) + 0x53971e sp = 0x949df8a0 pc = 0x13139720 Found by: stack scanning 5 dalvik-main space (deleted) + 0x53971e sp = 0x949df8a8 pc = 0x13139720 Found by: stack scanning 6 dalvik-main space (deleted) + 0x4602de sp = 0x949df8ac pc = 0x130602e0 Found by: stack scanning 7 dalvik-non moving space (deleted) + 0xa6626 sp = 0x949df8b4 pc = 0x75c99628 Found by: stack scanning 8 data@app@org.mozilla.fennec-1@base.apk@classes.dex + 0xee6fc7 sp = 0x949df8c0 pc = 0xa2d35fc9 Found by: stack scanning 9 dalvik-non moving space (deleted) + 0xa6626 sp = 0x949df8c4 pc = 0x75c99628 Found by: stack scanning 10 dalvik-main space (deleted) + 0x53971e sp = 0x949df8c8 pc = 0x13139720 Found by: stack scanning 11 dalvik-main space (deleted) + 0x5d3fe sp = 0x949df8d0 pc = 0x12c5d400 Found by: stack scanning 12 system@framework@boot.art + 0x1e63c6 sp = 0x949df8d4 pc = 0x70d013c8 Found by: stack scanning 13 dalvik-main space (deleted) + 0x4602de sp = 0x949df8e0 pc = 0x130602e0 Found by: stack scanning 14 dalvik-non moving space (deleted) + 0xa6696 sp = 0x949df8e8 pc = 0x75c99698 Found by: stack scanning 15 dalvik-main space (deleted) + 0x4602de sp = 0x949df8ec pc = 0x130602e0 Found by: stack scanning 16 data@app@org.mozilla.fennec-1@base.apk@classes.dex + 0xee702b sp = 0x949df8f0 pc = 0xa2d3602d Found by: stack scanning 17 dalvik-non moving space (deleted) + 0xa6696 sp = 0x949df8f4 pc = 0x75c99698 Found by: stack scanning 18 dalvik-zygote space (deleted) + 0x160f3ce sp = 0x949df8f8 pc = 0x75a883d0 Found by: stack scanning 19 dalvik-non moving space (deleted) + 0xa681e sp = 0x949df8fc pc = 0x75c99820 Found by: stack scanning 20 dalvik-non moving space (deleted) + 0xa681e sp = 0x949df900 pc = 0x75c99820 Found by: stack scanning 21 dalvik-non moving space (deleted) + 0xa681e sp = 0x949df904 pc = 0x75c99820 Found by: stack scanning 22 dalvik-main space (deleted) + 0x5d3fe sp = 0x949df90c pc = 0x12c5d400 Found by: stack scanning 23 data@app@org.mozilla.fennec-1@base.apk@classes.dex + 0xee6d89 sp = 0x949df910 pc = 0xa2d35d8b
Attached file more complete stack (deleted) —
dom/media/test/ is playback, not webrtc gcp: any ideas what's going on here? there's very little usable stack to work off of
Component: WebRTC: Audio/Video → Audio/Video: Playback
Flags: needinfo?(gpascutto)
OS: Unspecified → Android
John, this is your area of expertise
Flags: needinfo?(jolin)
Look for an owner for: mozilla::java::CodecProxy mozilla::RemoteDataDecoder because that's all we have to go on.
Flags: needinfo?(gpascutto)
Munro? You've been touching RemoteDataDecoder.cpp and CodecProxy
Flags: needinfo?(mchiang)
Group: core-security → media-core-security
Assignee: nobody → mchiang
Flags: needinfo?(mchiang)
Priority: -- → P1
Assignee: mchiang → nobody
Assignee: nobody → mchiang
Attachment #8856320 - Attachment is obsolete: true
Attachment #8856790 - Flags: review?(jolin)
Comment on attachment 8856790 [details] [diff] [review] Bug1353313-Use-RefPtr-to-hold-RemoteVideoDecoder.patch Review of attachment 8856790 [details] [diff] [review]: ----------------------------------------------------------------- Using RefPtr here doesn't seem helpful to me. This extra ref count will be decreased when |mJavaCallbacks| goes away in |ProcessShutdown()| and won't extend the life of |RemoteDataDecoder|. It looks like the issue is a race condition like the following: - OnError() checks |mCancel| - ProcessShutdown() preempts and resolves promise - RemoteDataDecoder is destructed - OnError() resumes - UAF!
Attachment #8856790 - Flags: review?(jolin) → review-
Comment on attachment 8856791 [details] [diff] [review] Bug1353313-Add-synchronization-mechanism-to-JavaCallbacksSupport.patch Review of attachment 8856791 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/platforms/android/JavaCallbacksSupport.h @@ +19,5 @@ > using Base::AttachNative; > using Base::DisposeNative; > > + JavaCallbacksSupport() : > + mCanceled(false), Coding style: use ':' and ',' to start a new line for initialization list. @@ +77,5 @@ > mCanceled = true; > } > > private: > Atomic<bool> mCanceled; With |mMutex|, |Atomic<>| is no longer needed.
Attachment #8856791 - Flags: review?(jolin) → review+
Flags: needinfo?(jolin)
Comment on attachment 8856791 [details] [diff] [review] Bug1353313-Add-synchronization-mechanism-to-JavaCallbacksSupport.patch Review of attachment 8856791 [details] [diff] [review]: ----------------------------------------------------------------- There's a call to "Cancel" to ensure that the JavaCallBackSupport object is no longer called by the JNI. Adding a mutex to ensure that the OnBlah function may help as it adds some delay, but it doesn't fundamentally resolve the problem that the JNI may call the callback object once it's been destructed. There needs to be another mechanism to ensure that the callback object isn't used *at all*. Not that the functions of the callback do nothing ::: dom/media/platforms/android/JavaCallbacksSupport.h @@ +25,4 @@ > > + virtual ~JavaCallbacksSupport() { > + MutexAutoLock lock(mMutex); > + mCanceled = true; this is either unnecessary or very wrong. After the destructor has been called, this is freed. and access to those members will lead to a UAF. If you rely on the destructor holding on a mutex to make sure the remaining of the code access this safely that can't be, and this is an undefined behaviour.
Attachment #8856791 - Flags: review+
(In reply to Jean-Yves Avenard [:jya] from comment #12) > Comment on attachment 8856791 [details] [diff] [review] > Bug1353313-Add-synchronization-mechanism-to-JavaCallbacksSupport.patch > > Review of attachment 8856791 [details] [diff] [review]: > ----------------------------------------------------------------- > > There's a call to "Cancel" to ensure that the JavaCallBackSupport object is > no longer called by the JNI. > > Adding a mutex to ensure that the OnBlah function may help as it adds some > delay, but it doesn't fundamentally resolve the problem that the JNI may > call the callback object once it's been destructed. > > There needs to be another mechanism to ensure that the callback object isn't > used *at all*. Not that the functions of the callback do nothing > > ::: dom/media/platforms/android/JavaCallbacksSupport.h > @@ +25,4 @@ > > > > + virtual ~JavaCallbacksSupport() { > > + MutexAutoLock lock(mMutex); > > + mCanceled = true; > > this is either unnecessary or very wrong. After the destructor has been > called, this is freed. and access to those members will lead to a UAF. > > If you rely on the destructor holding on a mutex to make sure the remaining > of the code access this safely that can't be, and this is an undefined > behaviour. For the onBlah function call after CallBackSupport dtor, this would cause a UAF indeed. I will find out another way to ensure the callback object isn't used at all. However, for the onBlah function call before dtor, I think this mutex is still needed to hold the CallBackSupport object before the onBlah function finishes. Do you agree?
There are two issues to solve: 1- Ensuring that the callback object isn't used after it's been destructed 2- Ensuring that all functions currently in use (in another thread) in the callback object have completed. The mutex change allows for 2. And calling Cancel() is sufficient. But 1 needs to be resolved too
Comment on attachment 8858185 [details] [diff] [review] Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch Review of attachment 8858185 [details] [diff] [review]: ----------------------------------------------------------------- ::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/CodecProxy.java @@ +101,5 @@ > reportError(fatal); > } > > + private synchronized void reportError(boolean fatal) { > + if (!mCodecProxyReleased) { shouldn't the same test be done for all callbacks?
(In reply to Jean-Yves Avenard [:jya] from comment #16) > Comment on attachment 8858185 [details] [diff] [review] > Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch > > Review of attachment 8858185 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: > mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/CodecProxy. > java > @@ +101,5 @@ > > reportError(fatal); > > } > > > > + private synchronized void reportError(boolean fatal) { > > + if (!mCodecProxyReleased) { > > shouldn't the same test be done for all callbacks? There are quite a few regression bugs if we add the test to all callbacks: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a43dd2aafcbc66380f8ba8bbe93e5bfca9ac8c88 https://treeherder.mozilla.org/#/jobs?repo=try&revision=55cb1e7df136fc0ad91b2beeec74eb332fb9e2a3 I will address these in a following bug.
Attachment #8858252 - Flags: review?(jolin) → review+
Comment on attachment 8859416 [details] [diff] [review] Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch Review of attachment 8859416 [details] [diff] [review]: ----------------------------------------------------------------- ::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/CodecProxy.java @@ +85,4 @@ > } > > @Override > + public synchronized void onOutput(Sample sample) throws RemoteException { Please dispose sample if proxy is dead.
Attachment #8859416 - Flags: review?(jolin) → review+
Comment on attachment 8859470 [details] [diff] [review] Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch [Security approval request comment] How easily could an exploit be constructed based on the patch? Pretty hard Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No, just as the change we made, we made some function synchronized. Which older supported branches are affected by this flaw? back to 54 If not all supported branches, which bug introduced the flaw? bug 1257777 Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Will backport to beta How likely is this patch to cause regressions; how much testing does it need? Low risk to cause regression. We just solve a UAF bug.
Attachment #8859470 - Flags: sec-approval?
sec-approval+. Please request approval on patches for backporting as well.
Attachment #8859470 - Flags: sec-approval? → sec-approval+
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment on attachment 8859470 [details] [diff] [review] Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch Approval Request Comment [Feature/Bug causing the regression]: bug 1257777 [User impact if declined]: intermittent crash [Is this code covered by automated tests?]: yes [Has the fix been verified in Nightly?]: yes [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: no [Why is the change risky/not risky?]: we just solved an UAF bug. [String changes made/needed]: none
Attachment #8859470 - Flags: approval-mozilla-beta?
Comment on attachment 8859470 [details] [diff] [review] Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch Fix a sec-high. Beta54+. Should be in 54 beta 2.
Attachment #8859470 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Group: media-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: