Closed Bug 1692848 Opened 4 years ago Closed 4 years ago

Crash playing video on twitter on Mali-G72/76 Android 11 devices

Categories

(Core :: Graphics, defect)

ARM64
Android
defect

RESOLVED FIXED
87 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox86 --- wontfix
firefox87 --- fixed

People

(Reporter: jnicol, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Attachments

(1 file)

In bugs 1688017 and 1688705 we saw high crash rates on Mali-G72 and Mali-G76 devices running Android 11. Users reported that this was reliably reproducible when playing video on Twitter.

In those bugs we disabled WebRender as a temporary solution. This bug tracks finding the actual cause of the crashes, fixing it, and re-enabling WebRender.

I now have my hands on a Samsung Galaxy S10 and can indeed reproduce the crash on Twitter, but not on YouTube. The difference appears to be that on Twitter we render the video to an offscreen target (maybe because we clip the "cards" or something), so we use the brush_image_TEXTURE_EXTERNAL shader. On YouTube we composite the video directly to the main framebuffer, using the composite_TEXTURE_EXTERNAL shader.

If I make can_promote_to_surface() always return SurfacePromotionResult::Failed, then we render the YouTube video to each picture cache tile using brush_image, rather than directly to the main framebuffer using composite. This does reproduce the crash.
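The experiment above can be sketched roughly as follows. This is a simplified model, not WebRender's actual code: the enum variants other than Failed, the `force_failure` flag, and the `shader_for_video` helper are all hypothetical stand-ins used only to show how forcing promotion to fail moves the video onto the brush_image path.

```rust
// Simplified stand-ins for illustration only; WebRender's real types
// and signatures differ and carry more data.
#[derive(Debug, PartialEq)]
enum SurfacePromotionResult {
    Failed,
    Success,
}

#[derive(Debug, PartialEq)]
enum VideoShader {
    // Video composited directly to the main framebuffer.
    CompositeTextureExternal,
    // Video drawn into picture cache tiles.
    BrushImageTextureExternal,
}

// Hypothetical signature: `force_failure` models the debugging hack of
// always returning Failed, regardless of whether the video is eligible.
fn can_promote_to_surface(eligible: bool, force_failure: bool) -> SurfacePromotionResult {
    if force_failure || !eligible {
        SurfacePromotionResult::Failed
    } else {
        SurfacePromotionResult::Success
    }
}

// Hypothetical helper: maps the promotion result to the shader that
// ends up drawing the video.
fn shader_for_video(result: &SurfacePromotionResult) -> VideoShader {
    match result {
        SurfacePromotionResult::Success => VideoShader::CompositeTextureExternal,
        SurfacePromotionResult::Failed => VideoShader::BrushImageTextureExternal,
    }
}
```

With `force_failure` set, an otherwise promotable video is routed through the brush_image variant, which is the path that crashed.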

By reducing the brush_image shader, I found that the textureSize(samplerExternalOES) call in the vertex shader causes the crash. If I remove it and hard-code the texture size, there is no crash. Likewise, if I move the call to the fragment shader and do the UV normalization there, there is no crash either.
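For context, the size query is only needed to convert pixel-space texture coordinates into the 0..1 range. The arithmetic is sketched here in Rust rather than GLSL; the function name and the example dimensions are assumptions for illustration, not taken from the shader.

```rust
// Divides pixel-space UVs by the texture dimensions, which is the
// normalization step that relies on textureSize() in the shader.
fn normalize_uv(uv_pixels: [f32; 2], texture_size: [u32; 2]) -> [f32; 2] {
    [
        uv_pixels[0] / texture_size[0] as f32,
        uv_pixels[1] / texture_size[1] as f32,
    ]
}
```

Passing a fixed size such as `[1920, 1080]` corresponds to the hard-coded variant, which did not crash. It would not be a real fix, since the actual video dimensions vary.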

I received a response from ARM about the crash. If I understand correctly, the theory is that the driver expects some data to be present because the sampler is used (by textureSize). However, the shader compiler does not emit that data because the sampler is never actually sampled from; only its size is queried. So the driver crashes due to the missing data. The suggested workaround is to sample from the texture in the vertex shader (in a dynamic branch which is never actually taken) to force this data to be emitted. This appears to work, but the driver is actually quite good at optimizing out unused code, so it must be done carefully. Having a think now about how best to implement this.

On some Mali devices we have encountered driver crashes caused by
calling textureSize(samplerExternalOES) in a shader without also
potentially sampling from the texture in the shader. ARM's suggested
workaround was to trick the driver into thinking that the texture may
be sampled from (i.e. by sampling in a branch which is never
dynamically taken).

This is done by checking the value of a dummy uniform, and sampling
the texture if the value is non-default. Using a constant expression
did not work because the compiler would optimize the condition (and
therefore the sample) away.

Also re-enable WebRender on Mali-G72 and G76 devices, as it was blocked
due to this bug.
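
A minimal sketch of the dummy-uniform workaround described above, written as a GLSL ES 3.0 vertex shader embedded in a Rust string constant. It is not the landed patch: the names `VERTEX_SHADER_SKETCH`, `sColor0`, `uMaliWorkaroundDummy`, `aPosition`, and `aUvPixels` are assumptions chosen for illustration, and the shader has been reduced to the parts relevant to the crash.

```rust
// Illustrative only: a reduced vertex shader showing the workaround.
// All identifiers inside the shader are hypothetical.
pub const VERTEX_SHADER_SKETCH: &str = r#"#version 300 es
#extension GL_OES_EGL_image_external_essl3 : require

uniform mediump samplerExternalOES sColor0;
// Never set by the application, so it keeps its default value of 0.
uniform int uMaliWorkaroundDummy;

in vec2 aPosition;
in vec2 aUvPixels;
out vec2 vUv;

void main() {
    // The size query that crashed when the shader contained no
    // sampling code for the same sampler.
    vec2 size = vec2(textureSize(sColor0, 0));
    vUv = aUvPixels / size;
    gl_Position = vec4(aPosition, 0.0, 1.0);

    // Never taken at run time. Because the condition depends on a
    // uniform rather than a constant expression, the compiler cannot
    // fold it away, so the sampling code is kept.
    if (uMaliWorkaroundDummy != 0) {
        gl_Position = texture(sColor0, vec2(0.0));
    }
}
"#;
```

The sampled value is written to gl_Position so that the branch has an observable effect and the sample cannot be discarded as unused. A guard such as `if (false)` would be removed at compile time, which is the failure mode the commit message describes for constant expressions.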

Pushed by jnicol@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/13aea562b9c2 Work around Mali driver crash caused by textureSize(samplerExternalOES). r=kvark
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87 Branch