Closed Bug 1068394 Opened 10 years ago Closed 10 years ago

WebRTC H.264 decoder doesn't work on v2.1

Categories

(Core :: WebRTC: Audio/Video, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla35
blocking-b2g 2.1+
Tracking Status
firefox34 --- fixed
firefox35 --- fixed
b2g-v2.0 --- unaffected
b2g-v2.1 --- verified
b2g-v2.2 --- verified

People

(Reporter: jaywang, Assigned: jesup)

References

Details

(Whiteboard: [caf priority: p1][CR 726445])

Attachments

(4 files)

WebRTC with H.264 on v2.1 is completely broken both upstream and downstream doesn't work. The same build with VP8 seems to be working
[Blocking Requested - why for this release]:

WebRTC with H.264 is a key feature for v2.1.
blocking-b2g: --- → 2.1?
QA Wanted - Is this possible to reproduce using the mobile Loop app with Flame KK 2.1 builds?
Keywords: qawanted
Whiteboard: [CR 726445]
Whiteboard: [CR 726445] → [caf priority: p2][CR 726445]
Whiteboard: [caf priority: p2][CR 726445] → [caf priority: p1][CR 726445]
Ni :tchung to help with the qawanted request as its been a few days, and that will help us make a blocking call.

Also Ni :mreavy/:jesup to get their initial input on this bug.
Flags: needinfo?(tchung)
Flags: needinfo?(rjesup)
Flags: needinfo?(mreavy)
(In reply to bhavana bajaj [:bajaj] from comment #4)
> Ni :tchung to help with the qawanted request as its been a few days, and
> that will help us make a blocking call.
> 
> Also Ni :mreavy/:jesup to get their initial input on this bug.

Randall/ Nils, do we have h.264 content that we can test against the loop client?   once we get that, we can try out this test to reproduce
Flags: needinfo?(tchung) → needinfo?(drno)
Tony -- We just need to try an H.264 call between the two phones.  However, we think this is a dupe of Bug 1067437.  We talked about it with some of the QC's folks on our Wednesday call with QC.  The fix for bug 1067437 is being checked in now to trunk. (It's a work around for a bug in the DSP code.)
Flags: needinfo?(rjesup)
Flags: needinfo?(mreavy)
Flags: needinfo?(drno)
(In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #6)
> Tony -- We just need to try an H.264 call between the two phones.  However,
> we think this is a dupe of Bug 1067437.  We talked about it with some of the
> QC's folks on our Wednesday call with QC.  The fix for bug 1067437 is being
> checked in now to trunk. (It's a work around for a bug in the DSP code.)

I believe bug 1067437 happens on v2.0 and you at least see a few seconds of good video before it gets corrupted.

Whereas this bug happens on v2.1 and essentially the video is never established successfully.

Randell, do you believe the dynamic patch in bug 1067437 will fix v2.1?
Flags: needinfo?(rjesup)
(In reply to Diego Wilson [:diego] from comment #7)
> (In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #6)
> > Tony -- We just need to try an H.264 call between the two phones.  However,
> > we think this is a dupe of Bug 1067437.  We talked about it with some of the
> > QC's folks on our Wednesday call with QC.  The fix for bug 1067437 is being
> > checked in now to trunk. (It's a work around for a bug in the DSP code.)
> 
> I believe bug 1067437 happens on v2.0 and you at least see a few seconds of
> good video before it gets corrupted.
> 
> Whereas this bug happens on v2.1 and essentially the video is never
> established successfully.
> 
> Randell, do you believe the dynamic patch in bug 1067437 will fix v2.1?

I just briefly talked to Nils, and he tried it on a 2.1 flame with KK.  It was a quick conversation, but he wasnt able to see any H.264 video output as well.   If you think the patch in bug 1067437 is worth trying out, please comment here if so.
Flags: needinfo?(drno)
(In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #6)
> Tony -- We just need to try an H.264 call between the two phones.  However,
> we think this is a dupe of Bug 1067437.  We talked about it with some of the
> QC's folks on our Wednesday call with QC.  The fix for bug 1067437 is being
> checked in now to trunk. (It's a work around for a bug in the DSP code.)


This issue is different than Bug 1067437. For 1067437, it will at least start the conversation for first few frames until resolution change is issued. But, this bug doesn't send a single frame.
(In reply to Tony Chung [:tchung] from comment #8)
> (In reply to Diego Wilson [:diego] from comment #7)
> > (In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #6)
> > > Tony -- We just need to try an H.264 call between the two phones.  However,
> > > we think this is a dupe of Bug 1067437.  We talked about it with some of the
> > > QC's folks on our Wednesday call with QC.  The fix for bug 1067437 is being
> > > checked in now to trunk. (It's a work around for a bug in the DSP code.)
> > 
> > I believe bug 1067437 happens on v2.0 and you at least see a few seconds of
> > good video before it gets corrupted.
> > 
> > Whereas this bug happens on v2.1 and essentially the video is never
> > established successfully.
> > 
> > Randell, do you believe the dynamic patch in bug 1067437 will fix v2.1?
> 
> I just briefly talked to Nils, and he tried it on a 2.1 flame with KK.  It
> was a quick conversation, but he wasnt able to see any H.264 video output as
> well.   If you think the patch in bug 1067437 is worth trying out, please
> comment here if so.

1067437 is different problem. As you mentioned, in this case, there is no video output, but for 1067437, the video starts as normal until resolution change is issued.
Attached file v2.1_not_working.txt (deleted) —
Attached, please find the CSF log.

I was wrong that the encoder doesn't send any frame. The problem seems to be on the decoding path. When I connect v2.1 to v2.0 device, v2.0 device properly shows the image. However, v2.1 device is completely gray. This proves that the encoder path is ok, but the problem is in decoding path
So far I have not been able to get a Flame phone with a build from Sep 22 to either send me SDP offer with H264 in it, nor to answer a SDP offer with H264.
Jay: can you describe how do you get the phone to place a H264 call (instead of VP8)?
Flags: needinfo?(drno) → needinfo?(jaywang)
Nils,

We enable H264 through pref("media.peerconnection.video.h264_enabled"). This is set to false by default from MOZ release. We enable this through vendor specific pref.
You can change pref in [1] to "true" and it should enable it

Another option that you can try is that adding "pref('media.peerconnection.video.h264_enabled', true);" in /system/b2g/defaults/pref/user.js. You just pull the file from target through adb pull and push back once it is modified. Then, restart the device. 

Here are command:
adb root
adb remount
adb pull  /system/b2g/defaults/pref/user.js
....(after modification) 
adb push user.js /system/b2g/defaults/pref/user.js 
adb shell chmod 777 /system/b2g/defaults/pref/user.js 

This allows you to enable H.264 without re-build the image. However, I am not sure if your build has user.js on target or not. 


[1] http://git.mozilla.org/?p=releases/gecko.git;a=blob;f=modules/libpref/init/all.js;h=4612b631476d5d3999596e73d664b6e9af94d5e1;hb=refs/heads/v2.1#l309
Flags: needinfo?(jaywang)
Thanks Jay for the description, that helped.

I was able to reproduce the problem between a Flame with a build from Sep 22 and a desktop (Mac). I enabled H264 via the user.js preference on the Flame. The desktop was 35 from today with preferred_codec set to H264. I used apprtc to establish the call.
The result is the call gets setup. Audio seems to work. The desktop side sends video according to about:webrtc, but never receives a single video packet from the Flame. Unfortunately I failed to open about:webrtc on the Flame because of bug 1068557.
(In reply to Nils Ohlmeier [:drno] from comment #14)
> I enabled H264 via the user.js preference on the Flame.

We should work with the Flame vendor to ensure that they enable this preference by default. Does anyone know who's the right POC for this?
In v2.0 CAF builds this is done via partner-prefs.js [1]. Presumably flame builds are not using this or are overriding it.

[1] https://www.codeaurora.org/cgit/quic/lf/b2g/build/tree/gaia_settings/distribution/partner-prefs.js?h=v2.0#n31
Jay: your logs show successful encode and decode all the way through.....  This was from the phone that was failing to display decoded data, right?

Perhaps the problem is elsewhere.... (where?)  signaling:6 (7?) should show "DeliverFrame Buffer Size NNNN" logs if the decoded video gets out to the VideoConduit.
Flags: needinfo?(rjesup) → needinfo?(jaywang)
(In reply to Diego Wilson [:diego] from comment #15)
> (In reply to Nils Ohlmeier [:drno] from comment #14)
> > I enabled H264 via the user.js preference on the Flame.
> 
> We should work with the Flame vendor to ensure that they enable this
> preference by default. Does anyone know who's the right POC for this?

Francis Lee would be the mozilla contact to get in touch with Flame vendor.  ni? francis now. 

Francis, it is suggested to see if T2M can add the setting to the user.js pref at buildtime.  see comment 13.
Flags: needinfo?(frlee)
Jay: I tried with an inbound build of gecko, and I don't see it encoding video (not even asking for a frame to be encoded); it did select h.264 mode 1.

So, it's not the packetization code (*if* my test is valid).  gdb informs me we never even sent a video frame to VideoConduit to be encoded.   Perhaps an issue with the test page? (pc_test.html with one-way-call and require h.264 checked). And when I forced vp8 (media.navigator.video.preferred_codec 120), I have the same problem.  Note: not 2.1, this is inbound/2.2
Attached file logs from omx encoder and decoder (deleted) —
attached attachment 8493582 [details]. these are logs from file WebrtcOMXH264VideoCodec.cpp from omx encoder and decoder. I tested flame JB v2.2 and fire-e JB 2.0. Logs are from flame device. I don't see video on fire-e and get a green screen on flame.
More info:

My primary fast minimal test page (pc_test.html) seems to have been broken on b2g by some sipcc changes (doesn't work for vp8 or h.264).  Testing on other platforms does show the problem; I get logs like Oscar's (encode never generates data).

Looking at my logs, I see something oscar's might or might not have (he filtered to prlog statements):

I/PRLog   (15169): 17677184[a820e380]: [ViECaptureThrea|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:877: WebrtcOMXH264VideoEncoder:0xb0fff430 configuring encoder 320x240 @ 30 fps, rate 220 kbps
E/OMX-VENC-720p(  203): set_parameter: metamode is valid for input port only
E/OMXNodeInstance(  203): OMX_SetParameter() failed for StoreMetaDataInBuffers: 0x8000101a
E/ACodec  (15169): [OMX.qcom.video.encoder.avc] storeMetaDataInBuffers (output) failed w/ err -2147483648

Now, I don't know if that's a problem.

I compared WebrtcOMXH264VideoCodec.cpp between b2g32 and inbound - virtually identical.   OMXCodecWrapper.cpp is not (a number of changes to support other formats - if you remember, before video failed on 2.1/2.2 we had chroma issues on KK, which were supposed to be fixed by these changes to OMXCodecWrapper.cpp.  So that would be a primary question (though it wouldn't explain why decode has issues).  

OMXCodecWrapper support for QCOM Venus NV12 landed in Bug 997593 on 9/14

The other thing which may have changed was bug 1009410 "Expose graphic buffer to MediaCodec", or perhaps Bug 1036539 (Add async mode support to GonkNativeWindow)

Needinfoing the people who are responsible for the various landings that might be involved.
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(mhabicher)
Flags: needinfo?(jolin)
(In reply to Randell Jesup [:jesup] from comment #17)
> Jay: your logs show successful encode and decode all the way through..... 
> This was from the phone that was failing to display decoded data, right?
> 
> Perhaps the problem is elsewhere.... (where?)  signaling:6 (7?) should show
> "DeliverFrame Buffer Size NNNN" logs if the decoded video gets out to the
> VideoConduit.
Randell,
Yes, it did go through the encoding and decoding part. I tried to capture the video stream comes to the OMX decoder and I can play the stream with ffmpeg for first 3-5secs and stop after that. So, it looks to me the problem is in the stream coming to OMX decoder.

I had one device with V2.0 and another one with V2.1. I don't know if that causes the different behavior as what you observed.
Flags: needinfo?(jaywang)
One thing that I just noticed is v2.1 provides a single input with NAL 0x67 to the decoder, but v2.0 has multiple calls 

Log from V2.1 -
I PRLog   : 2014-09-23 18:11:03.573309 UTC - 18562648[b0f7e100]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:383: Decoder input: 5242 bytes (NAL 0x67), time 44550527222 (4009547450), flags 0x2

Log from V2.0 -
I PRLog   : 2014-09-23 18:26:47.530999 UTC - 33410712[b10a7d00]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:382: Decoder input: 17 bytes (NAL 0x67), time 46439655800 (4179569022), flags 0x2
I PRLog   : 2014-09-23 18:26:47.548223 UTC - 33410712[b10a7d00]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:382: Decoder input: 8 bytes (NAL 0x68), time 46439655911 (4179569032), flags 0x2
I PRLog   : 2014-09-23 18:26:47.553891 UTC - 33410712[b10a7d00]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:382: Decoder input: 4403 bytes (NAL 0x65), time 46439656022 (4179569042), flags 0x1

I believe the encoder send them in multiple packet but decoder side combines them into one.

Any reason to have them combined? Can we try to process the packet as V2.0 and see if the problem goes away?
Flags: needinfo?(rjesup)
Attachment #8493952 - Flags: review?(pkerr)
Flags: needinfo?(rjesup)
Attachment #8493952 - Flags: review?(pkerr) → review+
(In reply to Randell Jesup [:jesup] from comment #25)
> Created attachment 8493952 [details] [diff] [review]
> Explode H.264 NALs into separate decode buffers for OMX
This patch works for me. With the latest DSP image, I can have both encoder and decoder running ok. With older DSP image 1520, it should work with this patch + resolution change disabled.
Development seems to be able to reproduce this on its own.
Keywords: qawanted
Randell,

For clarification on the decoder requirements to handle SPS/PPS + keyframe, 8x10 decoder requires that
-  when MediaCodec::BUFFER_FLAG_CODECCONFIG is set, only SPS/PPS should be included in the buffer.

so, it is required to split the SPS/PPS from keyframe.
(In reply to Randell Jesup [:jesup] from comment #22)
> More info:
> 
> My primary fast minimal test page (pc_test.html) seems to have been broken
> on b2g by some sipcc changes (doesn't work for vp8 or h.264).  Testing on
> other platforms does show the problem; I get logs like Oscar's (encode never
> generates data).
> 
> Looking at my logs, I see something oscar's might or might not have (he
> filtered to prlog statements):
> 
> I/PRLog   (15169): 17677184[a820e380]:
> [ViECaptureThrea|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:877:
> WebrtcOMXH264VideoEncoder:0xb0fff430 configuring encoder 320x240 @ 30 fps,
> rate 220 kbps
> E/OMX-VENC-720p(  203): set_parameter: metamode is valid for input port only
> E/OMXNodeInstance(  203): OMX_SetParameter() failed for
> StoreMetaDataInBuffers: 0x8000101a
> E/ACodec  (15169): [OMX.qcom.video.encoder.avc] storeMetaDataInBuffers
> (output) failed w/ err -2147483648
> 
> Now, I don't know if that's a problem.

 The error is ignored[1] so I guess this is okay.

 [1] https://www.codeaurora.org/cgit/quic/la/platform/frameworks/av/tree/media/libstagefright/ACodec.cpp?h=mozilla/b2g_kk_3.5#n1139

> 
> I compared WebrtcOMXH264VideoCodec.cpp between b2g32 and inbound - virtually
> identical.   OMXCodecWrapper.cpp is not (a number of changes to support
> other formats - if you remember, before video failed on 2.1/2.2 we had
> chroma issues on KK, which were supposed to be fixed by these changes to
> OMXCodecWrapper.cpp.  So that would be a primary question (though it
> wouldn't explain why decode has issues).  
> 
> OMXCodecWrapper support for QCOM Venus NV12 landed in Bug 997593 on 9/14

 As you mentioned on IRC (many thanks), the patch breaks WebRTC encoding path in 2.2. And it seems to me that other than PlanarYCbCrImage::IsValid(), GetDataSize() doesn't work well with SetDataNoCopy() either[2].

 I'll file a bug to fix them.

 [2] http://dxr.mozilla.org/mozilla-central/source/media/webrtc/signaling/src/mediapipeline/MediaPipeline.cpp#1204
Flags: needinfo?(jolin)
https://hg.mozilla.org/integration/mozilla-inbound/rev/c34ab3cbfec5

The 2.2 encoder bug is bug 1067442

Leaving NI to frlee about build issues
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(mhabicher)
Summary: WebRTC with H.264 doesn't work on v2.1 → WebRTC H.264 decoder doesn't work on v2.1
Whiteboard: [caf priority: p1][CR 726445] → [caf priority: p1][CR 726445][webrtc-uplift]
Target Milestone: --- → mozilla35
hi All, 

can you help me to understand what base image you have used? i supposed its v180 kk v2.0 engineering build, right?

is this issue caused by H.264 DSP version? will 1530.1 fix this?
it's not clear how i and partner can help you, if you could provide exactly what T2M should includes in the v2.0 base image, that will help.

thank you very much
Flags: needinfo?(frlee)
There are a number of bits not in the current v180 image that are needed for 2.0, such as the DSP 1530.1 image, and the AOSP & kernel changes QC has for supporting HW AEC through OpenSLES (bug 1059573), and the all.js config changes.  Also, as soon as we land & uplift bug 1059573 we'll need to add an ac_add_options to 8x10 KK builds with the AOSP/kernel changes to enable HW AEC in gecko.

Diego?  Michael? any more?   Can you provide more details for Francis?
Flags: needinfo?(dwilson)
(In reply to Randell Jesup [:jesup] from comment #33)
> There are a number of bits not in the current v180 image that are needed for
> 2.0, such as the DSP 1530.1 image, and the AOSP & kernel changes QC has for
> supporting HW AEC through OpenSLES (bug 1059573), and the all.js config
> changes.  Also, as soon as we land & uplift bug 1059573 we'll need to add an
> ac_add_options to 8x10 KK builds with the AOSP/kernel changes to enable HW
> AEC in gecko.
> 
> Diego?  Michael? any more?   Can you provide more details for Francis?

OK I think there are two distinct issues here:

1. The DSP image needs an update.

2. The Flame vendor needs to enable webrtc H.264 in their partner prefs.

For (1) Jay can give you more details on the build version what would have the updated DSP image.

For (2) we need someone (frlee?) to coordinate with T2M so they set the pref in their builds.
Flags: needinfo?(jaywang)
Flags: needinfo?(frlee)
Flags: needinfo?(dwilson)
(In reply to Diego Wilson [:diego] from comment #34)
> (In reply to Randell Jesup [:jesup] from comment #33)
> > There are a number of bits not in the current v180 image that are needed for
> > 2.0, such as the DSP 1530.1 image, and the AOSP & kernel changes QC has for
> > supporting HW AEC through OpenSLES (bug 1059573), and the all.js config
> > changes.  Also, as soon as we land & uplift bug 1059573 we'll need to add an
> > ac_add_options to 8x10 KK builds with the AOSP/kernel changes to enable HW
> > AEC in gecko.
> > 
> > Diego?  Michael? any more?   Can you provide more details for Francis?
> 
> OK I think there are two distinct issues here:
> 
> 1. The DSP image needs an update.
> 
> 2. The Flame vendor needs to enable webrtc H.264 in their partner prefs.
> 
> For (1) Jay can give you more details on the build version what would have
> the updated DSP image.
There is a separate issue with resolution change that can cause the encoder failure. The DSP image is not released yet but it will be released by this week. In the meantime, the workaround patch from  https://bugzilla.mozilla.org/show_bug.cgi?id=1067437#c28 removes the resolution change and can be used.

> 
> For (2) we need someone (frlee?) to coordinate with T2M so they set the pref
> in their builds.
Flags: needinfo?(jaywang)
https://hg.mozilla.org/mozilla-central/rev/c34ab3cbfec5
Assignee: nobody → rjesup
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
since its resolved. remove my ni.
Flags: needinfo?(frlee)
Francis, did T2M get notified as Diego mentioned in comment 34?
blocking-b2g: 2.1? → 2.1+
Flags: needinfo?(frlee)
Blocks: 1073025
(In reply to Andrew Overholt [:overholt] from comment #38)
> Francis, did T2M get notified as Diego mentioned in comment 34?

I opened bug 1073025 to follow up on this
Thanks, Diego.  I'm clearing Francis' needinfo request here.
Flags: needinfo?(frlee)
Nils, do you mind retesting when this is fixed on both master and 2.1?   To verify on 2.1, you will need to set flag: status-b2g-2.1=verified.   Thanks.
Flags: needinfo?(drno)
Keywords: verifyme
QA Contact: drno
Also, can we get an automated test for this?  in this case, i would think the easiest thing is to have a local H.264 file wake the decoder on the device, report the output someplace, and then maybe diff that data to a reference file.    This could all be done with a marionette test running on device, and get triggered through the current jenkins setup.    Nils, what do you think?
Still needs to land on Aurora AFAICT. Please add an approval request on the patch when you get a chance :)
Flags: needinfo?(rjesup)
Comment on attachment 8493952 [details] [diff] [review]
Explode H.264 NALs into separate decode buffers for OMX

Approval Request Comment
[Feature/regressing bug #]: mode 0 packetization changes

[User impact if declined]: no video in webrtc with OMX H.264

[Describe test coverage new/current, TBPL]: Manual verification; emulators don't support OMX hardware.

[Risks and why]: low risk - the "find a nal start" function was already there.

[String/UUID change made/needed]: none
Attachment #8493952 - Flags: approval-mozilla-aurora?
Flags: needinfo?(rjesup)
Attachment #8493952 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Flags: needinfo?(drno)
(In reply to Tony Chung [:tchung] from comment #42)
> Also, can we get an automated test for this?  in this case, i would think
> the easiest thing is to have a local H.264 file wake the decoder on the
> device, report the output someplace, and then maybe diff that data to a
> reference file.    This could all be done with a marionette test running on
> device, and get triggered through the current jenkins setup.    Nils, what
> do you think?

Good idea. We could try to pass in a H.264 recording into the decoder and then maybe pass it through the encoder back out again.
I guess it is probably best to start discussion (and work if possible) in another ticket: bug 1073731.
Whiteboard: [caf priority: p1][CR 726445][webrtc-uplift] → [caf priority: p1][CR 726445]
This issue is verified fixed on Flame 2.1.

Result: Based on Comment 6, I made a call between 2.1 devices on the Loop app. The video displayed properly on both devices.

Hello Version: b1e1edb

Device: Flame 2.1 (319mb, KK, Shallow Flash)
BuildID: 20141120001207
Gaia: f8d3bf44029e0afc0124600a4bb34dba8fc1ad21
Gecko: f70a67a7f846
Version: 34.0 (2.1)
Firmware: V188-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: verifyme
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: