Closed Bug 1275798 Opened 8 years ago Closed 3 years ago

Graphics driver crash sometimes causing permanently black tabs

Categories

(Core :: Graphics, defect, P3)

RESOLVED INCOMPLETE

People

(Reporter: markh, Assigned: jerry)

References

Details

(Whiteboard: [gfx-noted])

Attachments

(2 files)

Attached file about-support-graphics.txt (deleted)
(In reply to David Anderson [:dvander] from bug 1232087 comment #21)
> Great! I think we should be good now, with bug 1255711 and friends fixed -
> if it happens again please file a new bug.

It is happening again on current Nightlies :( What I see now is that after the crash the visible tabs are repainted, but then very quickly go back to being black. All non-visible tabs are also black once I switch to them. I need to restart to resolve.
Of the last 4 crashes I had, 2 recovered and 2 did not.
Summary: Graphics driver crash again causing permanently black tabs → Graphics driver crash sometimes causing permanently black tabs
Whiteboard: [gfx-noted]
Is the above about:support from a clean session, or from a session that had a driver crash?
A clean session following the crash. If the next crash does recover such that I can see about:support, I'll attach a new version from that session.
Mark, next time it happens, can you try using CTRL+N and see if the new window draws at all? CTRL+L to grab the location bar might also have some success.
Attached file about-support-graphics.txt (deleted)
I had a few crashes since I last commented, but all recovered. This time was a little strange - the current tab redrew correctly, but other windows remained black for a number of seconds before Firefox reported a couple of tabs had crashed, generating https://crash-stats.mozilla.com/report/index/8859468f-b196-4a5b-9332-68ee12160602. This appears to be flash related and the 2 tabs that I saw had crashed do use flash. I grabbed this data from about:support immediately after that happened - I guess you are after:

Failure Log
(#0) Error: Detected rendering device reset on refresh
(#1) Error: Detected rendering device reset on refresh
(#2) Assert: [D2D1.1] 4CreateBitmap failure Size(66,28) Code: 0x8899000c format 0
(#3): CP+[GFX1-]: Detected rendering device reset on refresh
(#4): CP+[GFX1-]: Detected rendering device reset on refresh
(#5): CP+[GFX1-]: Detected rendering device reset on refresh
(#6): CP+[GFX1]: [D3D11] 2 CreateTexture2D failure Size(1808,1152) Code: 0x887a0005
(#7): CP+[GFX1-]: Detected rendering device reset on refresh
(#8): CP+[GFX1-]: Detected rendering device reset on refresh
(#9): CP+[GFX1-]: Detected rendering device reset on refresh
(#10): CP+[GFX1-]: Detected rendering device reset on refresh
(#11): CP+[GFX1-]: Detected rendering device reset on refresh
(#12): CP+[GFX1-]: Detected rendering device reset on refresh
(#13): CP+[GFX1 6]: Timeout on the D3D11 sync lock

(but given the tabs without flash did seem to recover OK, I'm not sure it's actually useful - I'll still update this bug if I see all tabs remain black after a reset)
Jeff can reproduce on a local machine, and David should know this code.
Assignee: nobody → jgilbert
Flags: needinfo?(dvander)
You probably won't need it since you can reproduce locally, but gfx.logging.crash.length controls the number of error messages that are kept around and shown in about:support.
All of the messages are from the child process (the CP+ prefix.)
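To make the prefix convention concrete, here is a minimal sketch that splits one failure-log entry into its index, a flag for the CP+ (child process) prefix, and the message text. The entry format is assumed from the entries pasted in this bug, and `LogEntry`, `ParseFailureLogLine` and `Demo` are hypothetical names, not Gecko code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type for illustration; not part of Gecko.
struct LogEntry {
  int index;            // the N in "(#N)"
  bool fromChild;       // true when the text starts with "CP+"
  std::string message;  // everything after "(#N)" and an optional ':'
};

// Parses entries shaped like "(#13): CP+[GFX1 6]: Timeout ..." or
// "(#0) Error: ...". Returns false if the line is not a log entry.
bool ParseFailureLogLine(const std::string& line, LogEntry& out) {
  if (line.rfind("(#", 0) != 0) return false;  // must start with "(#"
  std::size_t close = line.find(')');
  if (close == std::string::npos || close == 2) return false;

  int index = 0;
  for (std::size_t i = 2; i < close; ++i) {
    if (line[i] < '0' || line[i] > '9') return false;
    index = index * 10 + (line[i] - '0');
  }

  // Skip the optional ':' and any spaces after the closing parenthesis.
  std::size_t pos = close + 1;
  if (pos < line.size() && line[pos] == ':') ++pos;
  while (pos < line.size() && line[pos] == ' ') ++pos;

  out.index = index;
  out.message = line.substr(pos);
  out.fromChild = out.message.rfind("CP+", 0) == 0;
  return true;
}

// Usage example on an entry copied from this bug.
void Demo() {
  LogEntry entry;
  bool ok = ParseFailureLogLine(
      "(#6): CP+[GFX1]: [D3D11] 2 CreateTexture2D failure "
      "Size(1808,1152) Code: 0x887a0005",
      entry);
  assert(ok && entry.index == 6 && entry.fromChild);
}
```

Entries without the CP+ prefix, such as "(#0) Error: ...", come back with `fromChild` set to false.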
(In reply to Mark Hammond [:markh] from comment #5)
> ...
> (#13): CP+[GFX1 6]: Timeout on the D3D11 sync lock
>
> (but given the tabs without flash did seem to recover OK, I'm not sure it's
> actually useful - I'll still update this bug if I see all tabs remain black
> after a reset)

This (sync lock timeout) crashes nightly and dev edition, otherwise we would have kept going. Not sure how bad things would have gotten though.
Note, however, that this is somewhat unexpected, I believe:

(#12): CP+[GFX1-]: Detected rendering device reset on refresh
(#13): CP+[GFX1 6]: Timeout on the D3D11 sync lock

#13 should only happen if gfxWindowsPlatform::GetPlatform()->DidRenderingDeviceReset() returns false. So, somewhere along the way, we detect the driver reset, we call gfxWindowsPlatform::HandleDeviceReset(), which resets the flag, and then we call the SyncObjectD3D11::FinalizeFrame() which hits this timeout.

That looks a lot like an equivalent to bug 1133623. So, there is a non-crash and a crash scenario, perhaps different bugs.
The duplicate bugs may have more examples and information.
With a debug build, I have caught this twice in DrawTargetD2D1::OptimizeSourceSurface, where CreateBitmap returns 0x8899000c, which is D2DERR_RECREATE_TARGET. I tried calling Init() again to recreate the device context. CreateDeviceContext seems to work fine (S_OK, updates mDC), but the CreateBitmap call below that fails again with D2DERR_RECREATE_TARGET. It sounds like the whole device needs to be torn down? Or I'm getting device loss again immediately. (or maybe the driver's bad! AMD beta drivers) Notably, content isn't crashing, just losing its device and staying that way.
Reproduced on the GDC2015 machine temporarily using my R9 390. The error this time was from d3d: 0x887a0005 "DXGI_ERROR_DEVICE_REMOVED". The MSDN notes say to destroy and recreate the device when this happens, which is reasonable.

Other notable errors that are specified but *not* hit (so far):
DXGI_ERROR_DEVICE_RESET (badly formed commands sent by application)
DXGI_ERROR_DEVICE_HUNG (same as DEVICE_RESET, but "design-time" not "run-time"??)
DXGI_ERROR_DRIVER_INTERNAL_ERROR (driver gave up the ghost)
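For reading the hex codes in these logs, a small lookup like the sketch below may help. The two codes seen in this bug (0x887a0005 and 0x8899000c) are taken from the comments above; the other three values are the ones I believe the Windows SDK headers define for those names, so treat them as an assumption and check them against the headers. `DescribeDeviceError` and `Demo` are hypothetical names, not Gecko functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Maps the HRESULT values discussed in this bug to their symbolic names.
// Hypothetical helper for illustration; not part of Gecko.
std::string DescribeDeviceError(std::uint32_t hr) {
  switch (hr) {
    case 0x887A0005u:  // seen in the [D3D11] CreateTexture2D failure
      return "DXGI_ERROR_DEVICE_REMOVED";
    case 0x887A0006u:  // assumed value, not seen in this bug
      return "DXGI_ERROR_DEVICE_HUNG";
    case 0x887A0007u:  // assumed value, not seen in this bug
      return "DXGI_ERROR_DEVICE_RESET";
    case 0x887A0020u:  // assumed value, not seen in this bug
      return "DXGI_ERROR_DRIVER_INTERNAL_ERROR";
    case 0x8899000Cu:  // seen in the [D2D1.1] CreateBitmap failure
      return "D2DERR_RECREATE_TARGET";
    default:
      return "unknown";
  }
}

// Usage example with the two codes that appear in the failure logs.
void Demo() {
  assert(DescribeDeviceError(0x887a0005u) == "DXGI_ERROR_DEVICE_REMOVED");
  assert(DescribeDeviceError(0x8899000cu) == "D2DERR_RECREATE_TARGET");
}
```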
It's notable that this GDC machine is hooked up to a pretty boring monitor (dell 1920x1200 60Hz), as opposed to my complicated system at home.
(In reply to Milan Sreckovic [:milan] from comment #11)
> Note, however, that this is somewhat unexpected, I believe:
>
> (#12): CP+[GFX1-]: Detected rendering device reset on refresh
> (#13): CP+[GFX1 6]: Timeout on the D3D11 sync lock
>
> #13 should only happen if
> gfxWindowsPlatform::GetPlatform()->DidRenderingDeviceReset() returns false.
> So, somewhere along the way, we detect the driver reset, we call
> gfxWindowsPlatform::HandleDeviceReset(), which resets the flag, and then we
> call the SyncObjectD3D11::FinalizeFrame() which hits this timeout.
>
> That looks a lot like an equivalent to bug 1133623. So, there is a
> non-crash and a crash scenario, perhaps different bugs.

(#12): CP+[GFX1-]: Detected rendering device reset on refresh
then
(#13): CP+[GFX1 6]: Timeout on the D3D11 sync lock

That could happen:
1. The driver removal is detected.
2. nsWindow::OnPaint()
   -> gfxWindowsPlatform::GetPlatform()->UpdateRenderMode() // clear the driver reset flag
3. SyncObjectD3D11::FinalizeFrame()
   -> gfxWindowsPlatform::GetPlatform()->DidRenderingDeviceReset() // return false here
   -> gfxDevCrash(LogReason::D3D11FinalizeFrame) << "Without device reset: " << hexa(hr); // hit the crash

So I think we should get the driver-removed status from the device directly instead of from our cached value.
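The ordering problem in the three steps above can be modelled with a toy sketch. Everything here is a hypothetical stand-in (`FakeDevice`, `FakePlatform`, `FinalizeFrame`, `Demo`), not the real Gecko or D3D11 types. It assumes the sync object still refers to the removed device after the cached flag has been cleared, and that a real implementation would ask the device itself (for example through ID3D11Device::GetDeviceRemovedReason) where the sketch calls `IsRemoved()`.

```cpp
#include <cassert>

// Hypothetical stand-in for a D3D11 device; not a real API type.
struct FakeDevice {
  bool removed = false;
  // Models querying the device directly for its removed status.
  bool IsRemoved() const { return removed; }
};

// Hypothetical stand-in for the platform object holding the cached flag.
struct FakePlatform {
  FakeDevice device;
  bool cachedResetFlag = false;

  // Step 1: the driver removal is detected and the flag is set.
  void DetectReset() {
    device.removed = true;
    cachedResetFlag = true;
  }
  // Step 2: the paint path clears the cached flag.
  void UpdateRenderMode() { cachedResetFlag = false; }
};

enum class FinalizeResult { Ok, SkippedAfterReset, Crash };

// Step 3: on a sync-lock timeout, crash only if no reset is known.
// queryDevice selects between the cached flag and the direct device query.
FinalizeResult FinalizeFrame(const FakePlatform& platform, bool lockTimedOut,
                             bool queryDevice) {
  if (!lockTimedOut) return FinalizeResult::Ok;
  bool reset = queryDevice ? platform.device.IsRemoved()
                           : platform.cachedResetFlag;
  return reset ? FinalizeResult::SkippedAfterReset : FinalizeResult::Crash;
}

// Usage example: the same sequence with and without the direct query.
void Demo() {
  FakePlatform platform;
  platform.DetectReset();
  platform.UpdateRenderMode();
  // The cached flag was already cleared, so the timeout is treated as fatal.
  assert(FinalizeFrame(platform, true, false) == FinalizeResult::Crash);
  // Asking the device directly still reports the removal.
  assert(FinalizeFrame(platform, true, true) ==
         FinalizeResult::SkippedAfterReset);
}
```

In this model the direct query avoids the crash path because the device's own state cannot be cleared by an unrelated paint; whether that holds in Gecko depends on which device the sync object actually holds at that point.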
Jerry, can you follow up since you're already dealing with the other timeout bug? Jeff can help test any patches, as he can reproduce this quickly.
Assignee: jgilbert → hshih
Are there any updates on this issue? I'd really like to get back to poking at e10s.
I only seem to get black screens when I maximize my Fx window, which is how I operate.
I can pretty much get a black screen by going to the gasbuddy map and zooming in and out until I get it: http://www.gasbuddy.com/GasPriceMap
Disabling d2d seems to have eliminated these failures. I can double-check next week.
Flags: needinfo?(jgilbert)
While disabling d2d might prevent the black screens it is not the answer as I'm sure you will agree.
(In reply to Gary [:streetwolf] from comment #23)
> While disabling d2d might prevent the black screens it is not the answer as
> I'm sure you will agree.

Of course. I'm just adding to the data here.
(In reply to Gary [:streetwolf] from comment #21)
Just tried that too, and I got the same issue! Recently I get the black screen issue on a ton of pages, and after the latest nightly it seems to be even more aggressive. After I got the black screen this is what's in my log:
>(#0) CP+[GFX1-]: Detected rendering device reset on refresh: 4
>(#89) CP+[GFX1]: Failed to create software bitmap: Size(32,32) Code: 0x8899000c
>(#90) CP+[GFX1]: Failed to create software bitmap: Size(32,32) Code: 0x8899000c
>(#91) CP+[GFX1]: Failed to create software bitmap: Size(32,32) Code: 0x8899000c
>(#92) CP+[GFX1-]: Detected rendering device reset on refresh: 4
>(#93) CP+[GFX1]: Failed to create software bitmap: Size(32,32) Code: 0x8899000c
>(#195) CP+[GFX1]: [D2D1.1] 4CreateBitmap failure Size(340,453) Code: 0x8899000c format 1
I get a lot of "Failed 2 buffer" messages on a 16k x 8k buffer request, but no driver reset or black screen on Nvidia Quadro 600 under Windows 7. Anyone having this problem that is *not* using Intel graphics? Looks like the black screen comes from driver reset, but let's see if we can figure out what's causing the driver resets. For those that see this problem - can you set layers.enable-tiles to true, and restart Firefox, and see if the driver resets are any fewer?
(In reply to Milan Sreckovic [:milan] from comment #26)
> I get a lot of "Failed 2 buffer" messages on a 16k x 8k buffer request, but
> no driver reset or black screen on Nvidia Quadro 600 under Windows 7.
>
> Anyone having this problem that is *not* using Intel graphics?
>
> Looks like the black screen comes from driver reset, but lets see if we can
> figure out what's causing the driver resets.
>
> For those that see this problem - can you set layers.enable-tiles to true,
> and restart Firefox, and see if the driver resets are any fewer?

Wow, 16kx8k we should basically expect to fail, but it sounds like we're handling that fine. My repro of black content was on AMD, not Intel though.
Hi Jeff, what's your STR for this bug? Is it a crash, or just a hang for a long time? (We have a lot of sync_object waiting code.)
Status: NEW → ASSIGNED
(In reply to Jerry Shih[:jerry] (PTO:6/8-6/12) (UTC+8) from comment #28)
> Hi Jeff,
> What's your STR for this bug?
> Is it crash or just hanging for a long time(We have a lot of sync_object
> waiting code)?

Browse amazon.com products, particularly having it load a bunch of the product images. It doesn't crash, but the Device is Lost and I get black content rendering after that. Occasionally, an image will draw properly. (mousing over a product image on Amazon will show the zoomed image sometimes) This only happens on my R9 390, not on any of my other systems.
Flags: needinfo?(jgilbert)
This is pretty important to fix, as it made Firefox basically unusable for me. (until I disabled direct2d, after which I haven't seen it repro) I will re-enable direct2d and check if it still repros if you cannot repro on an affected GPU after about 5m of Amazon browsing.
Severity: normal → major
I also have an r9 390.
Not sure if I should report this here or in a different bug, but my driver just crashed again and took Nightly down with it. There seem to be 3 crash reports for it:
https://crash-stats.mozilla.com/report/index/bp-04b51929-03b2-4f97-b196-013002160701
https://crash-stats.mozilla.com/report/index/14694082-e337-469b-820d-91cfe2160701
https://crash-stats.mozilla.com/report/index/dc2ed126-6a7a-42c2-a305-a7cf62160701
(In reply to Mark Hammond [:markh] from comment #32)
> Not sure if I should report this here or in a different bug, but my driver
> just crashed again and took Nightly down with it. There seem to be 3 crash
> reports for it
>
> https://crash-stats.mozilla.com/report/index/bp-04b51929-03b2-4f97-b196-013002160701
> https://crash-stats.mozilla.com/report/index/14694082-e337-469b-820d-91cfe2160701
> https://crash-stats.mozilla.com/report/index/dc2ed126-6a7a-42c2-a305-a7cf62160701

I think those are a different error.
Hi Jeff, could you please post your browser's failure log, like:
(#0) Error: Detected rendering device reset on refresh
(#1) Error: Detected rendering device reset on refresh

Currently, I have no idea about this black content. If that content is a video, gecko can't recover if the device changes from hardware to software. But I think gecko could recover for other content rendering.
Flags: needinfo?(jgilbert)
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #34)
> Hi Jeff,
> Could you please show your browser's failure log like:
> (#0) Error: Detected rendering device reset on refresh
> (#1) Error: Detected rendering device reset on refresh
>
> Currently, I have no idea about this black content. If that content is a
> video, gecko can't recover if the device change from hardware to software.
> But I think gecko could recover for other content rendering.

My logs are posted in bug 1277751.
Flags: needinfo?(jgilbert)

Hey Mark,
Can you still reproduce this or should we close it?

Flags: needinfo?(markh)

I've lost access to the crashing card, so :shrug

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(markh)
Resolution: --- → INCOMPLETE