[vaapi] VA-API does no fallback to SW decode when we fail to create EGLImage
Categories
(Core :: Audio/Video: Playback, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox98 | --- | fixed |
People
(Reporter: emilio, Assigned: stransky)
References
(Blocks 1 open bug)
Details
Crash Data
Attachments
(6 files)
STR:
- Open the attachment in bug 1722662.
ER:
- A video.
AR:
- Just green. On macOS or without the vaapi pref it works as expected.
Assignee | ||
Comment 1•3 years ago
|
||
Works for me, it's h264 clip...run with MOZ_LOG="PlatformDecoderModule:5" for more info please.
Reporter | ||
Comment 2•3 years ago
|
||
Seems to be doing DMABuf stuff, but still all I see on the screen is green.
Reporter | ||
Comment 3•3 years ago
|
||
This is a clean build with vaapi pref flipped and MOZ_ENABLE_WAYLAND=1
Reporter | ||
Updated•3 years ago
|
Assignee | ||
Comment 4•3 years ago
|
||
Looks like the stream is decoded, perhaps dmabuf textures are not processes / exported correctly. Please run with MOZ_LOG="Dmabuf:5" env set to check any issues. Also please try to set widget.dmabuf-textures.enabled to true at about:config and restart browser and check MOZ_LOG="Dmabuf:5" log.
Reporter | ||
Comment 5•3 years ago
|
||
The dmabuf-textures pref doesn't make any difference on the output afaict.
The interesting bits seem to be "EGLImageKHR creation failed" from here.
Reporter | ||
Comment 6•3 years ago
|
||
Reporter | ||
Updated•3 years ago
|
Reporter | ||
Updated•3 years ago
|
Reporter | ||
Updated•3 years ago
|
Assignee | ||
Comment 7•3 years ago
|
||
Yes, we fail to create a texture over dmabuf buffer and that's also reason why you see green screen - it's empty frame.
Can you try to put a breakpoint to DMABufSurface.cpp#1165 (DMABufSurfaceYUV::CreateTexture in chrome process) and attach content of 'this' ?
Also make sure you have disabled widget.dmabuf-textures.enabled to make sure only vaapi dmabuf frames are uploaded as textures.
I expect the dmabuf has some unusual format or so and we fail to create a texture for it.
Thanks.
Reporter | ||
Updated•3 years ago
|
Reporter | ||
Comment 8•3 years ago
|
||
(gdb) p *this
$3 = (DMABufSurfaceYUV) {
<DMABufSurface> = {
_vptr$DMABufSurface = 0x7f7f9e33fb30 <vtable for DMABufSurfaceYUV+16>,
mRefCnt = {
mValue = {
<std::__atomic_base<unsigned long>> = {
_M_i = 2
},
}
},
mSurfaceType = DMABufSurface::SURFACE_NV12,
mBufferModifier = 0,
mBufferPlaneCount = 2,
mDmabufFds = {159, 160, -1, -1},
mDrmFormats = {538982482, 943215175, 0, 0},
mStrides = {1536, 1536, 0, 0},
mOffsets = {0, 909312, 0, 0},
mGbmBufferObject = {0x0, 0x0, 0x0, 0x0},
mMappedRegion = {0x0, 0x0, 0x0, 0x0},
mMappedRegionData = {0xe4e4e4e4e4e4e4e4, 0xe4e4e4e4e4e4e4e4, 0xe4e4e4e4e4e4e4e4, 0xe4e4e4e4e4e4e4e4},
mMappedRegionStride = {0, 0, 0, 0},
mSyncFd = -1,
mSync = 0x0,
mGL = [(mozilla::gl::GLContext *) 0x0],
mGlobalRefCountFd = 161,
mUID = 3,
mSurfaceLock = {
<mozilla::OffTheBooksMutex> = {
<mozilla::detail::MutexImpl> = {
platformData_ = {0x1, 0x1000603fd, 0x2, 0x0, 0x0}
},
<mozilla::BlockingResourceBase> = {
mChainPrev = 0x0,
mName = 0x7f7f93a89b8b "DMABufSurface",
mType = mozilla::BlockingResourceBase::eMutex,
mAcquired = true
},
members of mozilla::OffTheBooksMutex:
mOwningThread = 0x7f7f66965160
}, <No data fields>}
},
members of DMABufSurfaceYUV:
mWidth = {1280, 640, 0, 0},
mHeight = {584, 292, 0, 0},
mEGLImage = {0x0, 0x0, 0x0, 0x0},
mTexture = {0, 0, 0, 0},
mColorSpace = mozilla::gfx::YUVColorSpace::BT709
}
Digging a bit more, we fail to create the texture because this check returns false in the DRI driver, because pitch
is 768
but surf->u.gfx9.surf_pitch
is 640
.
pitch
comes from stride / 2
, and that stride comes from DMABufSurfaceYUV::mStrides
(1536
).
Does that help?
Assignee | ||
Comment 9•3 years ago
|
||
We need to check we can create EGLImage from the dmabuf surface and fallback to SW decode otherwise.
Assignee | ||
Comment 11•3 years ago
|
||
I think the best solution is to create EGLImage right after image decode in media process and pass EGLImage to IPC bridge instead of dmabuf fds. It may need some work to not break WebGL backend and it may consume some memory on media process to create GL context but we get rid of complex dmabuf setup pipelined through IPC bridge.
Assignee | ||
Comment 12•3 years ago
|
||
Hm, looks like EGLImage can't be shared across processes - that works in single process only when address space is shared. So we need to create the EGLImage for decoded images or pass the failure from compositor to decoder.
Assignee | ||
Comment 13•3 years ago
|
||
To verify EGLImage creation we need to create GL context in RDD process. jld, do you think we can create GL context in RDD process? It's blocked by RDD sandbox right now.
Comment 14•3 years ago
|
||
I didn't think we were blocking anything that would be needed to create a GL context (as of the patch that allowed DRI and various Mesa device-detection things), but something may be missing. In particular, we may need to allow creating new connections to the display server (like this, for content processes on X11), but it could also be something else like a config file. If there aren't any seccomp-bpf errors, setting MOZ_SANDBOX_LOGGING=1
and reading the broker logs should help. Alternately, if you have a patch I can try locally, I can try to see what's going wrong.
Assignee | ||
Comment 15•3 years ago
|
||
Thanks, there are the error code I see on RDD process:
Sandbox: Failed errno -2 op open flags 02000000 path /raid/src2/objdir/dist/bin/libEGL.so
Sandbox: Failed errno -2 op open flags 02000000 path /raid/src2/objdir/dist/bin/libGLdispatch.so.0
Sandbox: Failed errno -2 op open flags 02000000 path /raid/src2/objdir/dist/bin/libGL.so
Sandbox: Failed errno -2 op open flags 02000000 path /raid/src2/objdir/dist/bin/libGLX.so.0
Sandbox: SandboxBroker: denied op=open rflags=2204000 perms=0 path=/etc/glvnd/egl_vendor.d for pid=25285
Sandbox: Failed errno -13 op open flags 02204000 path /etc/glvnd/egl_vendor.d
Sandbox: SandboxBroker: denied op=open rflags=2204000 perms=0 path=/usr/share/glvnd/egl_vendor.d for pid=25285
Sandbox: Failed errno -13 op open flags 02204000 path /usr/share/glvnd/egl_vendor.d
Assignee | ||
Comment 16•3 years ago
|
||
Updated•3 years ago
|
Assignee | ||
Comment 17•3 years ago
|
||
I see the sandbox breakage with the patch above.
Thanks.
Reporter | ||
Comment 19•3 years ago
|
||
(ni? me to file comment 8 as a bug when back at my computer)
Comment 20•3 years ago
|
||
There are indeed some things I was forgetting. For one, the content policy allows reading all of /etc
and /usr/share/
, which takes care of the …/glvnd/eglvendor.d/…
files that specify what library to load. It also wants access the Mesa shader cache, and there are various drirc
files. I'll need to spend more time looking into this.
Assignee | ||
Comment 21•3 years ago
|
||
Jed, that would be great. With the checks we can enable va-api by default (on tested hardware) as we can reliably detect decode errors and fallback to SW decode in such case.
Reporter | ||
Comment 22•3 years ago
|
||
Reporter | ||
Comment 23•3 years ago
|
||
Martin, in the mesa issue above it was pointed out to me that these two lines are the ones causing the issue. Do you know why they are needed?
Assignee | ||
Comment 24•3 years ago
|
||
Frankly I don't know, I'll look at it.
As for the patch here, I think if we want to enable va-api by default we need such check anyway to fallback to SW decode in case of any possible issue with GL/drivers.
Assignee | ||
Comment 25•3 years ago
|
||
Keep VADRMPRIMESurfaceDescriptor width/height as is to avoid EGLImage import failures:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5851
Comment 26•3 years ago
|
||
Comment 27•3 years ago
|
||
Comment 28•3 years ago
|
||
bugherder |
Comment 29•3 years ago
|
||
bugherder |
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 31•3 years ago
|
||
It seems that the issue does not reproduce on my end using Ubuntu 18.04 x64, with an affected Nightly build (92.0a1, Build ID 20210806213901). I had the build opened in a wayland session, and the pref media.ffmpeg.vaapi.enabled
set to true.
Emilio, would you mind checking if the fix works for you on latest Beta 98?
Updated•3 years ago
|
Reporter | ||
Comment 32•3 years ago
|
||
Yeah, this reproed only on AMD, and can confirm is fixed now.
Updated•2 years ago
|
Description
•