Closed Bug 1779425 Opened 2 years ago Closed 2 years ago

UI freezes and the GPU process crashes when enabled

Tracking

Status: VERIFIED FIXED

Milestone: 104 Branch

Tracking Flags:

                 Tracking  Status
firefox-esr91    ---       unaffected
firefox-esr102   ---       unaffected
firefox102       ---       unaffected
firefox103       ---       disabled
firefox104       ---       fixed

People

(Reporter: tgnff242, Assigned: rmader)

References

(Regression)

Details

(Keywords: nightly-community, regression)

Attachments

(1 file)

Bug 1779425 - Check for GbmDevice before using it, r=stransky (2 years ago, Robert Mader [:rmader], text/x-phabricator-request)

tgn-ff

Reporter

Description

•

2 years ago

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:104.0) Gecko/20100101 Firefox/104.0

Steps to reproduce:

Enable the GPU process (layers.gpu-process.enabled:true).
Visit a site with WebGL, like https://webglsamples.org/aquarium/aquarium.html.

Actual results:

The GPU process crashes, the browser UI and content freeze and WebGL fails.

Expected results:

GPU crash report: https://crash-stats.mozilla.org/report/index/bb81a472-c961-4dbd-b200-7c65c0220713

Sometimes the browser does not resume quickly enough (or at all). Here's a crash report after sending a SIGABRT to the main process in that case: https://crash-stats.mozilla.org/report/index/286eac31-97b0-431a-8c09-f49bb0220713

mozregression result:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=f745c54f85264261404d0be6dd14627788ff6485&tochange=ea595572b3922c6ae4e034b6af82c05750b692f4

tgn-ff

Reporter

Updated

•

2 years ago

Has STR: --- → yes

Keywords: nightly-community, regression

Regressed by: 1778114

BugBot [:suhaib / :marco/ :calixte]

Comment 1

•

2 years ago

Set release status flags based on info from the regressing bug 1778114

status-firefox102: --- → unaffected

status-firefox103: --- → affected

status-firefox104: --- → affected

status-firefox-esr102: --- → unaffected

status-firefox-esr91: --- → unaffected

BugBot [:suhaib / :marco/ :calixte]

Comment 2

•

2 years ago

:stransky, since you are the author of the regressor, bug 1778114, could you take a look?
For more information, please visit auto_nag documentation.

Flags: needinfo?(stransky)

Martin Stránský [:stransky] (ni? me)

Comment 3

•

2 years ago

This is a regression from Bug 1776563 - we don't create the dmabuf device in GetDMABufDevice()->GetGbmDevice() but in gfxPlatform, which is apparently not initialized in the GPU process. Robert, any idea?

Flags: needinfo?(stransky) → needinfo?(robert.mader)

Regressed by: 1776563
No longer regressed by: 1778114

Ashley Hale [:ahale]

Comment 4

•

2 years ago

Triage - setting severity to S3, since this is only enabled on nightly currently.

Severity: -- → S3

Donal Meehan [:dmeehan]

Updated

•

2 years ago

status-firefox103: affected → disabled

BugBot [:suhaib / :marco/ :calixte]

Comment 5

•

2 years ago

The bug has a release status flag that shows some version of Firefox is affected, thus it will be considered confirmed.

Status: UNCONFIRMED → NEW

Ever confirmed: true

Martin Stránský [:stransky] (ni? me)

Comment 6

•

2 years ago

(In reply to Ashley Hale from comment #4)

Triage - setting severity to S3, since this is only enabled on nightly currently.

Is that really so? I see the GPU process disabled by default on Nightly (clean profile). It needs to be enabled in about:config.

Robert Mader [:rmader]

Assignee

Comment 7

•

2 years ago

(In reply to Martin Stránský [:stransky] (ni? me) from comment #3)

This is a regression from Bug 1776563 - we don't create the dmabuf device in GetDMABufDevice()->GetGbmDevice() but in gfxPlatform, which is apparently not initialized in the GPU process. Robert, any idea?

Ah yes, makes sense.

After bug 1776563 we only configure the dmabuf device in gfxPlatformGtk::InitDmabufConfig(), which is only called on the parent process.

Maybe we should add something like

if (XRE_IsParentProcess()) {
  // (leave as it is)
} else if (gfxVars::UseDMABuf()) {
  // Configure the dmabuf device in non-parent processes as well.
  nsCString failureId;
  GetDMABufDevice()->Configure(failureId);
}

Maybe even add a check for the GPU or RDD process - IIUC those are the only processes where we want DRM device access, right?

Flags: needinfo?(robert.mader)

Ashley Hale [:ahale]

Comment 8

•

2 years ago

(In reply to Martin Stránský [:stransky] (ni? me) from comment #6)

(In reply to Ashley Hale from comment #4)

Triage - setting severity to S3, since this is only enabled on nightly currently.

Is that really so? I see the GPU process disabled by default on Nightly (clean profile). It needs to be enabled in about:config.

Oh sorry for the confusion, I didn't mean to imply it was definitely enabled, only that it could be enabled on nightly.

Robert Mader [:rmader]

Assignee

Comment 9

•

2 years ago

Attached file Bug 1779425 - Check for GbmDevice before using it, r=stransky

In some non-standard configurations we unexpectedly end up in these paths
without a GBM device - one example being the GPU process. Fail cleanly
instead of crashing in those cases, triggering fallback paths.

Context: in the past DMABuf usage was tightly coupled to GBM. Since the
introduction of the surfaceless and device EGL platforms that is no
longer the case, thus we can't make checks like IsDMABufWebGLEnabled()
depend on the presence of a GBM device.

Optimally all affected cases get fixed eventually. Until then and also
for future cases it makes sense to fail softly.
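
The "fail softly" idea described above can be illustrated with a minimal, self-contained sketch. This is not the actual patch: GbmDevice, DMABufDeviceSketch, SurfacePath, SurfaceResult and CreateSurfaceSketch are hypothetical stand-ins invented for illustration, and only the general pattern (check the GBM device pointer before use, report a soft failure so the caller can take a fallback path) is taken from the commit message.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a GBM device handle; not the Gecko type.
struct GbmDevice {};

// Hypothetical stand-in for the dmabuf device wrapper. The GBM device may be
// absent, e.g. in a process where it was never configured.
class DMABufDeviceSketch {
 public:
  explicit DMABufDeviceSketch(bool hasGbm)
      : mGbm(hasGbm ? std::make_unique<GbmDevice>() : nullptr) {}

  // Returns nullptr when no GBM device is available.
  GbmDevice* GetGbmDevice() const { return mGbm.get(); }

 private:
  std::unique_ptr<GbmDevice> mGbm;
};

enum class SurfacePath { DMABuf, Fallback };

struct SurfaceResult {
  SurfacePath path;
  std::string note;
};

// Check for the GBM device before using it. When it is missing, return a
// soft failure instead of dereferencing a null pointer, so the caller can
// switch to a fallback path.
SurfaceResult CreateSurfaceSketch(const DMABufDeviceSketch& device) {
  GbmDevice* gbm = device.GetGbmDevice();
  if (!gbm) {
    return {SurfacePath::Fallback, "no GBM device, using fallback"};
  }
  // From here on the pointer is known to be valid.
  assert(gbm != nullptr);
  // A real implementation would allocate a buffer from the device here.
  return {SurfacePath::DMABuf, "allocated via GBM"};
}
```

Calling CreateSurfaceSketch with a device constructed without GBM returns SurfacePath::Fallback rather than crashing, which mirrors the intended behavior in the GPU process case.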

Phabricator Automation

Updated

•

2 years ago

Assignee: nobody → robert.mader

Status: NEW → ASSIGNED

Pulsebot

Comment 10

•

2 years ago

Pushed by robert.mader@posteo.de: https://hg.mozilla.org/integration/autoland/rev/c29ee933bf30 Check for GbmDevice before using it, r=stransky,jgilbert

Robert Mader [:rmader]

Assignee

Comment 11

•

2 years ago

Regarding comment 7: that approach doesn't work, as the GPU and RDD processes don't seem to initialize the platform. So the patch just adds some error handling in the affected cases, ensuring we take fallback paths.

Natalia Csoregi [:nataliaCs]

Comment 12

•

2 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/c29ee933bf30

Status: ASSIGNED → RESOLVED

Closed: 2 years ago

status-firefox104: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → 104 Branch

tgn-ff

Reporter

Updated

•

2 years ago

Status: RESOLVED → VERIFIED
