Closed Bug 1721617 Opened 3 years ago Closed 2 years ago

VAAPI with old i965-va-driver: GetAsSourceSurface: Crash in [@ mozalloc_abort | abort | _iris_batch_flush.cold]

Categories

(Core :: Audio/Video: Playback, defect)

x86_64
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1770560
Tracking Status
firefox-esr78 --- unaffected
firefox90 --- unaffected
firefox91 --- unaffected
firefox92 --- disabled
firefox100 --- disabled
firefox101 --- disabled

People

(Reporter: jan, Unassigned)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, nightly-community, regression)

Crash Data

Attachments

(4 files)

I had multiple YouTube tabs open and suddenly got a tab crash.
GetAsSourceSurface() (bug 1712588) is called in both crash reports.

bp-1265f7e9-e9ba-4593-bcc6-3cc090210721 21.07.21, 15:37 [@ IPCError-browser | ShutDownKill | __ioctl ]
bp-853e5f58-069f-427c-a5d2-4940d0210721 21.07.21, 15:37 [@ mozalloc_abort | abort | _iris_batch_flush.cold ]


Maybe Fission related. (DOMFissionEnabled=1)

Crash report: bp-853e5f58-069f-427c-a5d2-4940d0210721 [@ mozalloc_abort | abort | _iris_batch_flush.cold ]

MOZ_CRASH Reason: MOZ_CRASH()

Top 10 frames of crashing thread:

0 firefox-bin mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:33
1 firefox-bin abort memory/mozalloc/mozalloc_abort.cpp:86
2 libgallium_dri.so _iris_batch_flush.cold 
3 libgallium_dri.so iris_transfer_map src/gallium/drivers/iris/iris_resource.c:1952
4  @0x77fffffffff 
5 libgallium_dri.so st_ReadPixels src/mesa/state_tracker/st_cb_readpixels.c:518
6  @0x7f5b4a8405af 
7 libgallium_dri.so _mesa_ReadnPixelsARB src/mesa/main/readpix.c:1179
8 libgallium_dri.so _mesa_ReadPixels src/mesa/main/readpix.c:1194
9 libxul.so mozilla::gl::GLContext::raw_fReadPixels gfx/gl/GLContext.h:1566

Maybe Fission related. (DOMFissionEnabled=1)

Crash report: bp-1265f7e9-e9ba-4593-bcc6-3cc090210721 [@ IPCError-browser | ShutDownKill | __ioctl ]

MOZ_CRASH Reason: MOZ_CRASH()

Top 10 frames of crashing thread:

0 libc.so.6 __ioctl 
1 libgallium_dri.so _iris_batch_flush 
2 libgallium_dri.so iris_transfer_map src/gallium/drivers/iris/iris_resource.c:1952
3  @0x33bffffffff 
4 libgallium_dri.so st_ReadPixels src/mesa/state_tracker/st_cb_readpixels.c:518
5  @0x7f6cac954b2f 
6 libgallium_dri.so _mesa_ReadnPixelsARB src/mesa/main/readpix.c:1179
7 libgallium_dri.so _mesa_ReadPixels src/mesa/main/readpix.c:1194
8 libxul.so mozilla::gl::GLContext::raw_fReadPixels gfx/gl/GLContext.h:1566
9 libxul.so mozilla::gl::ReadPixelsIntoDataSurface gfx/gl/GLReadTexImageHelper.cpp:379
Severity: -- → S2
Severity: S2 → --
Summary: VAAPI: Crash in [@ mozalloc_abort | abort | _iris_batch_flush.cold] → VAAPI: GetAsSourceSurface: Crash in [@ mozalloc_abort | abort | _iris_batch_flush.cold]

This is a crash in the Mesa/gfx drivers, so it would be better to move it to the Mesa tracker.

EGL / Gnome Xwayland. YouTube video plays the whole time. I click on the bookmark icon to open the edit bookmark panel. Video suddenly gets artifacts and becomes green. Then I click on the bookmark icon again and again and get a tab crash. bp-0212d4be-fe0d-4c06-8a8f-e187f0210725

From the log, this looks like a multi-threading issue where a dmabuf is synced in the MediaPDecoder #12 thread:

0 libc.so.6 __ioctl
1 libdrm.so.2 drmIoctl xf86drm.c:207
2 libdrm_intel.so.1 drm_intel_gem_bo_start_gtt_access intel/intel_bufmgr_gem.c:1902
3 i965_drv_video.so i965_drv_video.so@0x5d996
4 libva.so.2 vaSyncSurface va/va/va.c:1635
5 libxul.so mozilla::FFmpegVideoDecoder<58>::GetVAAPISurfaceDescriptor(_VADRMPRIMESurfaceDescriptor*) dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:701
6 libxul.so mozilla::FFmpegVideoDecoder<58>::CreateImageVAAPI(long, long, long, nsTArray<RefPtr<mozilla::MediaData> >&) dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:720
7 libxul.so mozilla::FFmpegVideoDecoder<58>::DoDecode(mozilla::MediaRawData*, unsigned char*, int, bool*, nsTArray<RefPtr<mozilla::MediaData> >&) dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:490
8 libxul.so mozilla::FFmpegDataDecoder<58>::DoDecode(mozilla::MediaRawData*, bool*, nsTArray<RefPtr<mozilla::MediaData> >&) dom/media/platforms/ffmpeg/FFmpegDataDecoder.cpp:182
9 libxul.so mozilla::FFmpegDataDecoder<58>::ProcessDecode(mozilla::MediaRawData*) dom/media/platforms/ffmpeg/FFmpegDataDecoder.cpp:138

while a dmabuf surface is read by GL in the main thread:

0 firefox-bin mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:33 context
1 firefox-bin abort memory/mozalloc/mozalloc_abort.cpp:86 cfi
2 libgallium_dri.so _iris_batch_flush.cold cfi
3 libgallium_dri.so iris_transfer_map src/gallium/drivers/iris/iris_resource.c:1952 cfi
4 @0x77fffffffff cfi
5 libgallium_dri.so st_ReadPixels src/mesa/state_tracker/st_cb_readpixels.c:518 scan
6 @0x7f184d82ebdf cfi
7 libgallium_dri.so _mesa_ReadnPixelsARB src/mesa/main/readpix.c:1179 scan
8 libgallium_dri.so _mesa_ReadPixels src/mesa/main/readpix.c:1194 cfi
9 libxul.so mozilla::gl::GLContext::raw_fReadPixels(int, int, int, int, unsigned int, unsigned int, void*) gfx/gl/GLContext.h:1566 cfi
10 libxul.so mozilla::gl::ReadPixelsIntoDataSurface(mozilla::gl::GLContext*, mozilla::gfx::DataSourceSurface*) gfx/gl/GLReadTexImageHelper.cpp:379 cfi
11 libxul.so mozilla::layers::DMABUFSurfaceImage::GetAsSourceSurface() gfx/layers/DMABUFSurfaceImage.cpp:96

So it looks like the gallium driver is not thread-safe here.
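To illustrate the suspected hazard, here is a minimal sketch in which a decoder thread and the main thread touch the same surface, with both paths serialized by a mutex. All names (GuardedSurface, SyncFromDecoder, ReadBack, RunGuardedDemo) are hypothetical and are not Firefox or Mesa code. The two methods only stand in for the vaSyncSurface() and ReadPixels paths seen in the stacks above. It shows the access pattern, not a proposed patch.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a dmabuf-backed surface; not a Firefox or Mesa type.
class GuardedSurface {
 public:
  // Decoder-thread side: stands in for the vaSyncSurface() path.
  void SyncFromDecoder() { Access(); }
  // Main-thread side: stands in for the ReadPixels readback path.
  void ReadBack() { Access(); }
  // Number of times two threads were observed inside Access() at once.
  int Overlaps() const { return mOverlaps.load(); }

 private:
  void Access() {
    std::lock_guard<std::mutex> lock(mMutex);  // serialize both paths
    if (mInside.fetch_add(1) != 0) {
      mOverlaps.fetch_add(1);  // would indicate concurrent access
    }
    mInside.fetch_sub(1);
  }

  std::mutex mMutex;
  std::atomic<int> mInside{0};
  std::atomic<int> mOverlaps{0};
};

// Runs both access paths concurrently and returns the observed overlap count.
int RunGuardedDemo(int aIterations) {
  assert(aIterations >= 0);
  GuardedSurface surface;
  std::thread decoder([&] {
    for (int i = 0; i < aIterations; ++i) surface.SyncFromDecoder();
  });
  for (int i = 0; i < aIterations; ++i) surface.ReadBack();
  decoder.join();
  return surface.Overlaps();
}
```

With the lock in place the overlap counter stays at zero. Removing the lock_guard line lets the two paths interleave, which is the situation the two stacks suggest.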

I'm unable to reproduce with amdgpu driver / h264 vaapi decoder.

Jan, I still feel this is an Intel driver bug - can you report it to Mesa, please?

Flags: needinfo?(jan)

I scrolled down on https://www.polywork.com/ and a whole section was greenish pixelated.
I reloaded the page, scrolled down and got a tab crash.
bp-73bcfea7-c8de-450e-aaa1-43eb30210914

FYI looking at the crashes this seems to happen only on Mesa 21.1.8.0 and Mesa 21.2.2.0

I don't know whether I should really report this to Mesa.

From the duplicate bug:
(Martin Stránský [:stransky] (ni? me) from bug 1729902 comment #6)

Yes, you're running out of free vaapi dmabuf surfaces. We should handle that situation by caching frames and free vaapi dmabuf surfaces early.
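A rough sketch of the idea in that quote, assuming a fixed-size pool of decoder surfaces: if frames are copied into a cache and their surface is released immediately, the pool does not run dry. SurfacePool and DecodeFrames are hypothetical names for illustration only. The real surface count and the caching code in Firefox are not shown in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size pool standing in for the decoder's VA-API surfaces.
class SurfacePool {
 public:
  explicit SurfacePool(std::size_t aSize) : mInUse(aSize, false) {}

  // Returns the index of a free surface, or -1 if the pool is exhausted.
  int Acquire() {
    for (std::size_t i = 0; i < mInUse.size(); ++i) {
      if (!mInUse[i]) {
        mInUse[i] = true;
        return static_cast<int>(i);
      }
    }
    return -1;
  }

  // Marks a previously acquired surface as free again.
  void Release(int aIndex) {
    assert(aIndex >= 0 && static_cast<std::size_t>(aIndex) < mInUse.size());
    assert(mInUse[aIndex]);
    mInUse[aIndex] = false;
  }

 private:
  std::vector<bool> mInUse;
};

// Tries to decode aFrames frames. With aReleaseEarly the frame is treated as
// copied into a cache and its surface is returned right away; otherwise every
// surface stays held. Returns how many frames obtained a surface.
std::size_t DecodeFrames(SurfacePool& aPool, std::size_t aFrames,
                         bool aReleaseEarly) {
  std::size_t decoded = 0;
  for (std::size_t i = 0; i < aFrames; ++i) {
    int surface = aPool.Acquire();
    if (surface < 0) break;  // out of free surfaces
    ++decoded;
    if (aReleaseEarly) aPool.Release(surface);
  }
  return decoded;
}
```

With a pool of 4 surfaces, holding every surface stops decoding after 4 frames, while releasing early lets all requested frames through.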

Michel, do you think this may be a Mesa issue (see Comment 3) ?
Thanks.

Flags: needinfo?(michel)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #9)

Michel, do you think this may be a Mesa issue (see Comment 3) ?

Yeah, things seem to be pointing in that direction.

Flags: needinfo?(michel)
Severity: -- → S4

Jan, can you reproduce that with MOZ_LOG="Dmabuf:5, PlatformDecoderModule:5" env variable and attach the log here?
Thanks.

No, the bookmark panel STR (comment 2) doesn't work anymore because the video isn't displayed there anymore, just its background.
Need to run mozregression --find-fix tomorrow.

Attached file log.txt (deleted)

This crash occurred in the content process.
Bug 1743638 occurred in the RDD process.

comment 2 STR doesn't work in the RDD process: bug 1749623

comment 2 STR in content process with last autoland commit (2021-12-16) before bug 1743647:
This doesn't seem to crash, but video decoding fails while audio keeps playing when opening the "add bookmark" panel the second time.
MOZ_LOG="Dmabuf:5, PlatformDecoderModule:5" mozregression --repo autoland --launch b0a4dcad04ca3ae09d11ac39574e148a1894b3fe --pref gfx.x11-egl.force-enabled:true gfx.webrender.all:true media.ffmpeg.vaapi.enabled:true media.rdd-ffmpeg.enabled:false -P stdout -a https://bug1619882.bmoattachments.org/attachment.cgi?id=9149605 > log.txt

Flags: needinfo?(jan)

Back then I used the old Intel VAAPI driver (i965-va-driver) to prevent sandbox violation. Now I'm using intel-media-driver (intel-media-va-driver).

Attached file old-intel-driver.txt (deleted)

comment 2 STR in content process with i965-va-driver with last autoland commit (2021-12-16) before bug 1743647:
Most often video decoding fails while audio keeps playing.
Sometimes the video becomes pixelated, even green.
At the end I got a tab crash.
MOZ_LOG="Dmabuf:5, PlatformDecoderModule:5" mozregression --repo autoland --launch b0a4dcad04ca3ae09d11ac39574e148a1894b3fe --pref gfx.x11-egl.force-enabled:true gfx.webrender.all:true media.ffmpeg.vaapi.enabled:true media.rdd-ffmpeg.enabled:false -P stdout -a https://bug1619882.bmoattachments.org/attachment.cgi?id=9149605 > old-intel-driver.txt


I assume I ran into bug 1743638 when I was still using i965-va-driver.

Attached image comment16_green.png (deleted)
Has Regression Range: --- → yes

Jan, do you still see that? I expect we should restart without HW decode when there's any VA-API issue.
Thanks.
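
The expected behaviour described above (drop hardware decoding once VA-API reports any problem) could look roughly like the following sketch. FallbackDecoder and DecodeWithFallback are hypothetical names, and the two callbacks only simulate a hardware and a software decode path. This is not how the Firefox decoder is actually structured.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical decoder wrapper; names are illustrative, not Firefox APIs.
class FallbackDecoder {
 public:
  using DecodeFn = std::function<bool(int)>;  // returns false on failure

  FallbackDecoder(DecodeFn aHardware, DecodeFn aSoftware)
      : mHardware(std::move(aHardware)), mSoftware(std::move(aSoftware)) {}

  // Decodes one frame. On the first hardware failure it switches to the
  // software path for good and retries the same frame there.
  bool Decode(int aFrame) {
    if (mUseHardware) {
      if (mHardware(aFrame)) return true;
      mUseHardware = false;
    }
    return mSoftware(aFrame);
  }

  bool UsingHardware() const { return mUseHardware; }

 private:
  DecodeFn mHardware;
  DecodeFn mSoftware;
  bool mUseHardware = true;
};

// Decodes frames 0..aFrames-1 with a simulated hardware path that fails from
// frame aFailAt onwards. Returns the number of frames decoded.
int DecodeWithFallback(int aFrames, int aFailAt) {
  assert(aFrames >= 0);
  FallbackDecoder decoder(
      [aFailAt](int aFrame) { return aFrame < aFailAt; },
      [](int) { return true; });
  int decoded = 0;
  for (int i = 0; i < aFrames; ++i) {
    if (decoder.Decode(i)) ++decoded;
  }
  return decoded;
}
```

The point of retrying the failed frame on the software path is that playback continues without a gap instead of video stalling while audio keeps playing.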

Flags: needinfo?(jan)

I can reproduce this one now.

Flags: needinfo?(jan)
Summary: VAAPI: GetAsSourceSurface: Crash in [@ mozalloc_abort | abort | _iris_batch_flush.cold] → VAAPI with old i965-va-driver: GetAsSourceSurface: Crash in [@ mozalloc_abort | abort | _iris_batch_flush.cold]

The primary cause of this bug (Bug 1770407) was fixed. The remaining issue is multiple access to the dmabuf surface, which is covered by Bug 1770560; that bug switches to a single GL context for dmabuf surfaces.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE