Closed Bug 1014360 Opened 10 years ago Closed 10 years ago

Cannot render MP4 video - vdec_open failed

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla32
tracking-b2g backlog

People

(Reporter: ethan, Assigned: sotaro)

References

Details

(Keywords: regression)

Attachments

(3 files)

I encountered this bug while testing RTSP streaming. However I found it's not RTSP-specific. Playing media through HTTP streaming also has this problem. Reproduction Steps: 1. Open B2G browser. 2. Navigate to a page contains a list of MP4 media links. (e.g. http://10.247.24.86) 3. Click one MP4 media link. (e.g. http://10.247.24.86/captain.mp4) 4. Press back to the previous page. 5. Click another MP4 media link (e.g. http://10.247.24.86/lance_burton.mp4) 6. Repeat step 3-5 for several times, until the video is not rendered on the screen. When this bug is occurring, we can see the audio is playing and the scrubber keeps going. There are some clues in the logcat: E/vdec ( 115): vdec: failed to allocate pmem arena (2621440 bytes) E/omx_vdec( 115): ERROR!!! vdec_open failed
Attached file unagi.log (deleted) —
The logcat output while the bug is happening.
Attached image device-2014-05-22-112115.png (deleted) —
A screenshot while the bug is happening.
Hi Bruce, I can reproduce this bug on the latest v2.0 build on both Unagi and Buri. It seems to be a critical issue. Can you kindly help to look into it?
Flags: needinfo?(brsun)
QA Wanted to check 1.4.
blocking-b2g: --- → backlog
Component: General → Video/Audio
Keywords: qawanted
Product: Firefox OS → Core
Version: unspecified → Trunk
Test results of PVT builds: - hamachi-eng-mozilla-central-20140512160204-ril01.02.00.019.102.zip is good. - hamachi-eng-mozilla-central-20140513040202-ril01.02.00.019.102.zip is not good.
...a regression bug?! Thanks Bruce! Have a nice day!
Flags: needinfo?(brsun)
Keywords: qawantedregression
blocking-b2g: backlog → 2.0?
Actually let's first confirm this doesn't happen on 1.4 before getting the window.
blocking-b2g: 2.0? → backlog
20140512: 20ca234fd62b 20140513: 4b6d63b05a0a That gives us a regression range of: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=20ca234fd62b&tochange=4b6d63b05a0a The only video related changes in that range that stuck and that run on B2G are two pushes by jwwang: fbae0dc61cd3 JW Wang — Bug 1004669 - Fix leaks in MediaTaskQueue::Dispatch(). r=cpearce 5ceb693e9733 JW Wang — Bug 1001317 - reset |MediaCacheStream::mDidNotifyDataEnded| so that it can notify data ended correctly upon 2nd download. r=roc You could try reverting them locally to see if that fixes your build.
We have similar logs in bug 920344 comment 3. Do Unagi and Buri both contain Qualcomm chips?
(In reply to Chris Pearce (:cpearce) from comment #8) > 20140512: 20ca234fd62b > 20140513: 4b6d63b05a0a > Quickly try to backout these two commits locally, but this issue seems still reproducible.
(In reply to Bruce Sun [:brsun] from comment #10) > Quickly try to backout these two commits locally, but this issue seems still > reproducible. Wrong info... I mean these two commits: > fbae0dc61cd3 JW Wang — Bug 1004669 - Fix leaks in MediaTaskQueue::Dispatch(). r=cpearce > 5ceb693e9733 JW Wang — Bug 1001317 - reset |MediaCacheStream::mDidNotifyDataEnded| so that it can notify data ended correctly upon 2nd download. r=roc
You could try bisecting the regression range to determine the exact commit in that range where the bug starts occurring.
Hi Sotaro, This issue occurs since this commit 7c62af2b1ac9. Could you help to share some clues of it?
Flags: needinfo?(sotaro.ikeda.g)
Blocks: 1000660
Component: Video/Audio → Graphics: Layers
Keywords: qawanted
For the STR here, how many times do you need to repeat step #6 to reproduce this bug?
Keywords: qawanted
(In reply to Bruce Sun [:brsun] from comment #13) > Hi Sotaro, > > This issue occurs since this commit 7c62af2b1ac9. > Could you help to share some clues of it? I am going to investigate about the problem. Thanks.
Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)
I seems to regenerate the problem on master hamachi by using the following. http://people.mozilla.org/~cpeterson/videos/H264_Baseline_Profile_Level_30_640x360p.mp4 STR -[1] start video playback by browser app -[2] push the browser app to backbround -[3] set the browser app to forground again. continues [2] [3] until the problem happens.
I confirmed that this is actually Bug 1000660's regression. It was memory leak:-(
(In reply to Sotaro Ikeda [:sotaro] from comment #18) > Created attachment 8429402 [details] [diff] [review] > patch - memory leak when DEALLOCATE_CLIENT is set I confirmed the fix of the STR in comment16 on master hamachi.
Attachment #8429402 - Flags: review?(nical.bugzilla)
Comment on attachment 8429402 [details] [diff] [review] patch - memory leak when DEALLOCATE_CLIENT is set Review of attachment 8429402 [details] [diff] [review]: ----------------------------------------------------------------- What would be nice, would be to rename this into ShouldDeallocateInFinalize(), add a pure virtual DeallocateSharedData method (to mirror TextureHost's API), and call that in Finalize, rather than relying on all textures to check this and do the right thing in their destructor. I'll file this as a followup.
Attachment #8429402 - Flags: review?(nical.bugzilla) → review+
Comment on attachment 8429402 [details] [diff] [review] patch - memory leak when DEALLOCATE_CLIENT is set Review of attachment 8429402 [details] [diff] [review]: ----------------------------------------------------------------- Wait I reviewed this a bit too quickly! At some point we had a mechanism in place for DEALLOCATE_CLIENT: when this flag was set, the texture data would be held by the transaction until we had received the notification by the compositor that it is safe to delete the data. This patch will work with gralloc but will break shmem textures for instance, since the Compositor might still bu using it when ~TextureClient happens
Attachment #8429402 - Flags: review+ → review-
The mechanism I am referring to is what we removed in bug 8421991
(In reply to Nicolas Silva [:nical] from comment #23) > The mechanism I am referring to is what we removed in bug 8421991 Nical, is it a correct number?
Flags: needinfo?(nical.bugzilla)
(In reply to Nicolas Silva [:nical] from comment #22) > be held by the transaction until we had received the notification by the > compositor that it is safe to delete the data. This patch will work with > gralloc but will break shmem textures for instance, since the Compositor > might still bu using it when ~TextureClient happens nical, I am not sure about which code is broken by attachment 8429402 [details] [diff] [review]. Can you explain bout it more concretely?
Status: NEW → ASSIGNED
(In reply to Sotaro Ikeda [:sotaro] from comment #24) > (In reply to Nicolas Silva [:nical] from comment #23) > > The mechanism I am referring to is what we removed in bug 8421991 > > Nical, is it a correct number? Woops, sorry: I meant bug 971946
When shared data needs to be deallocated on the client side, we have two options. in both cases the client informs the host that it needs to let go of the shared data before the client deallocates the data. 1) Synchronously: it's the easiest way but it hurst performance so we try to avoid it: in this case we just need to wait for the end of the transaction to deallocate/recycle the shared data. 2) Asynchronously: - The client sends a message to the host side - The host forgets its references to the data, sends back an async reply - The client receives the reply and knows it can safely deallocate the data originally, we had TextureClientData, which was meant to hold the shared data between the moment the client sends the RemoveTexture message and the moment the reply is received from the host, at which point the shared data would be deallocated. We removed this in bug 971946, and I forgot why we decided we could do so, because now that I look back I think we shouldn't have removed it (I was the one reviewing the patch, so I am a bit embarrassed that I can't remember the rational behind it). One way to recover the correct behavior could be to use the AsyncTransactionTracker mechanism you added recently (which IIRC currently only works for ImageBridge) to have an object holding the data and waiting for the async reply to do the destruction. Anyway, deleting shared data in TextureClient's destructor is not safe because the data is still accessible on the other side.
Flags: needinfo?(nical.bugzilla)
> > Anyway, deleting shared data in TextureClient's destructor is not safe > because the data is still accessible on the other side. Nical, I am not sure about this part. "mActor->SendClearTextureHostSync()" in TextureClient::ForceRemove() synchronously remove data access from the other side, if TextureFlags::DEALLOCATE_CLIENT is set. Why is it not enough? void TextureClient::ForceRemove() { if (mValid && mActor) { if (GetFlags() & TextureFlags::DEALLOCATE_CLIENT) { if (mActor->IPCOpen()) { mActor->SendClearTextureHostSync(); mActor->SendRemoveTexture(); } } else { if (mActor->IPCOpen()) { mActor->SendRemoveTexture(); } } } MarkInvalid(); }
Flags: needinfo?(nical.bugzilla)
(In reply to Sotaro Ikeda [:sotaro] from comment #28) > > > > Anyway, deleting shared data in TextureClient's destructor is not safe > > because the data is still accessible on the other side. > > Nical, I am not sure about this part. "mActor->SendClearTextureHostSync()" > in TextureClient::ForceRemove() synchronously remove data access from the > other side, if TextureFlags::DEALLOCATE_CLIENT is set. Why is it not enough? > > > void TextureClient::ForceRemove() > { > if (mValid && mActor) { > if (GetFlags() & TextureFlags::DEALLOCATE_CLIENT) { > if (mActor->IPCOpen()) { > mActor->SendClearTextureHostSync(); > mActor->SendRemoveTexture(); > } > } else { > if (mActor->IPCOpen()) { > mActor->SendRemoveTexture(); > } > } > } > MarkInvalid(); > } Ah! that's the important part I forgot about :) r+
Flags: needinfo?(nical.bugzilla)
Attachment #8429402 - Flags: review- → review+
This means we always do synchronous deallocation when we do it on the client side, though. We can do better (we used to be able to have it async with TextureClientData). But at least your patch is correct.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
So glad to see this bug being fixed! ^o^ Thanks for everyone here!
Keywords: qawanted
Thanks everyone! Cannot reproduce it on the latest V2.0 build. * Build Information: - Gaia 8d865839d932bfbd5e157f376f74d8cb12bfdd51 - Gecko https://hg.mozilla.org/releases/mozilla-aurora/rev/1d4046a8cb6c - BuildID 20140610000223 - Version 32.0a2
Status: RESOLVED → VERIFIED
Blocks: 985772
blocking-b2g: backlog → ---
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: