GPU hangs on ivybridge and sandybridge with backdrop filter blur
Categories
(Core :: Graphics, defect, P2)
Tracking
Release | Tracking | Status |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox-esr102 | --- | unaffected |
firefox103 | --- | wontfix |
firefox104 | + | wontfix |
firefox105 | + | wontfix |
firefox106 | + | affected |
People
(Reporter: kml, Assigned: bradwerth, NeedInfo)
Attachments
(7 files, 5 obsolete files)
User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0
Steps to reproduce:
After the last Firefox update (103.0.2) Intel Graphics driver 9.17.10.2932 crashes constantly on some sites. If between restarts it's possible to close the tab with the site causing problems, the crashes stop.
User agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0
Driver Version: 9.17.10.2932 (latest from laptop manufacturer)
Computer: laptop ASUS K56CB (Intel Core-i5 3317U + HD Graphics 4000)
Windows 7 64 bit (Windows_NT 6.1 7601)
Example site that causes crash:
https://www.asus.com/bt/SupportOnly/K56CB/HelpDesk_Knowledge/
and scroll down
(Some sites cause a crash instantly, some after scrolling down.)
This didn't happen until the latest Firefox update.
Comment 1•2 years ago
The Bugbug bot thinks this bug should belong to the 'Core::Graphics' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Assignee | ||
Comment 2•2 years ago
Thank you for filing. Would you please post your "about:support" to this Bug? We'll try to correlate this to a crash report and figure out where the crash is occurring.
https://www.intel.com/content/www/us/en/download/18606/intel-graphics-driver-for-windows-15-33.html
https://www.intel.com/content/www/us/en/support/articles/000005654/graphics.html
Try updating to the last Win7 driver version released by Intel, 10.18.10.5161. 9.17.10.2932 is a very old version.
(In reply to Brad Werth [:bradwerth] from comment #2)
Thank you for filing. Would you please post your "about:support" to this Bug? We'll try to correlate this to a crash report and figure out where the crash is occurring.
I attached the "about:support" info.
(In reply to GMA from comment #3)
https://www.intel.com/content/www/us/en/download/18606/intel-graphics-driver-for-windows-15-33.html
https://www.intel.com/content/www/us/en/support/articles/000005654/graphics.html
Try updating to the last Win7 driver version released by Intel, 10.18.10.5161. 9.17.10.2932 is a very old version.
When I try to install driver version 15.33, I can't do this because the following error appears: "The driver being installed is not validated for this computer. Please obtain the appropriate driver from the computer manufacturer."
Of course, I can try to find workarounds, but still this is the last official release of the manufacturer, and the driver did not crash until the last update.
Assignee | ||
Comment 6•2 years ago
I'll add the old driver to the blocklist.
Assignee | ||
Comment 7•2 years ago
I'll add something like the blocking of old nvidia drivers, but for intel. This will activate software WebRender for users in a similar situation, which should solve this problem for this class of users.
Assignee | ||
Comment 8•2 years ago
I can confirm this error.
After updating to version 103.0.1, FF has started to cause an Intel video driver (v 9.17.10.4229) error on some sites
on Lenovo monoblocks with Win 7 x64 OS in our office
Comment 10•2 years ago
We should try reproducing this locally to get a regression range.
Assignee | ||
Comment 11•2 years ago
(In reply to Alex AC from comment #9)
I can confirm this error.
After updating to version 103.0.1, FF has started to cause an Intel video driver (v 9.17.10.4229) error on some sites
on Lenovo monoblocks with Win 7 x64 OS in our office
I've updated the patch to include this version in the blocklist. Obviously an imperfect solution, but if the problem is only occurring with a 10-year-old driver, we can decide if we want to draw the line there.
Comment 12•2 years ago
Alex AC, can you attach the graphics section of your about:support to the bug as well?
Comment 13•2 years ago
kml,
Are you able to run mozregression to find out what change introduced the problem?
Reporter | ||
Comment 14•2 years ago
Reporter | ||
Comment 15•2 years ago
Reporter | ||
Comment 16•2 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #13)
kml,
Are you able to run mozregression to find out what change introduced the problem?
I have attached two new files - log and buildinfo text files from mozregression. I got this from there:
At the end of the log was the following message:
2022-08-16T01:48:13.070000: DEBUG : Found commit message:
Bug 1578503 - Enable backdrop-filter by default r=gfx-reviewers,jrmuizel
There are still a few remaining issues with the updated backdrop
filter implementation, specifically:
- We don't use reflectMode yet for blurs (quality issue in some cases)
- Performance may not be optimal in all use cases
However, we can try enabling by default now and work on these as
follow ups.
Differential Revision: https://phabricator.services.mozilla.com/D148684
2022-08-16T01:48:13.070000: DEBUG : Did not find a branch, checking all integration branches
2022-08-16T01:48:13.072000: INFO : The bisection is done.
2022-08-16T01:48:13.074000: INFO : Stopped
In addition, when the driver crashed, the following errors appeared in the log:
2022-08-16T01:45:18.359000: INFO : b'[Parent 16720, IPC I/O Parent] WARNING: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_win.cc:167'
2022-08-16T01:45:34.849000: INFO : b'[GFX1-]: Internal D3D11 error: HRESULT: 0x887A0005: Error allocating VertexShader'
2022-08-16T01:45:34.879000: INFO : b'[GFX1-]: Context has been lost.'
2022-08-16T01:45:34.879000: INFO : b'[GFX1-]: Failed to link shader program: cs_scale'
2022-08-16T01:45:34.889000: INFO : b''
2022-08-16T01:45:34.889000: INFO : b'[2022-08-15T22:45:34Z ERROR webrender::device::gl] Failed to link shader program: cs_scale'
2022-08-16T01:45:34.889000: INFO : b''
2022-08-16T01:45:34.889000: INFO : b'[GFX1-]: Failed to compile vertex shader: cs_scale_TEXTURE_2D'
2022-08-16T01:45:34.889000: INFO : b''
2022-08-16T01:45:34.889000: INFO : b'[2022-08-15T22:45:34Z ERROR webrender::device::gl] Failed to compile vertex shader: cs_scale_TEXTURE_2D'
2022-08-16T01:45:34.889000: INFO : b''
2022-08-16T01:45:34.889000: INFO : b'[GFX1-]: wr_renderer_render: Shader(Link("cs_scale", ""))'
2022-08-16T01:45:34.889000: INFO : b'[GFX1-]: wr_renderer_render: Shader(Compilation("cs_scale_TEXTURE_2D", ""))'
2022-08-16T01:45:34.889000: INFO : b'[GFX1]: Device reset due to WR device: 0x887a0006'
2022-08-16T01:45:34.889000: INFO : b'[GFX1-]: GFX: RenderThread detected a device reset in PostUpdate'
2022-08-16T01:45:35.905000: INFO : b'[GFX1-]: Fallback WR to SW-WR + D3D11'
2022-08-16T01:45:35.931000: INFO : b'[GFX1-]: Failed to make render context current during destroying.'
2022-08-16T01:45:44.258000: INFO : b'[GFX1-]: Receive IPC close with reason=AbnormalShutdown'
2022-08-16T01:45:44.261000: INFO : b'[GFX1-]: Receive IPC close with reason=AbnormalShutdown'
2022-08-16T01:45:44.262000: INFO : b'[GFX1-]: Receive IPC close with reason=AbnormalShutdown'
2022-08-16T01:45:44.262000: INFO : b'[GFX1-]: Receive IPC close with reason=AbnormalShutdown'
2022-08-16T01:45:44.273000: INFO : b'Exiting due to channel error.'
Hope this helps.
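For anyone triaging a similar mozregression log, the graphics errors can be pulled out mechanically. The following is a minimal Python sketch, assuming the line layout seen in the excerpt above (`timestamp: LEVEL : b'message'`); the helper names `gfx_messages` and `hresults` are hypothetical and not part of mozregression.

```python
import re

# Matches lines such as:
#   2022-08-16T01:45:34.849000: INFO : b'[GFX1-]: Internal D3D11 error: ...'
LINE_RE = re.compile(r"^(?P<ts>\S+): (?P<level>\w+) : b'(?P<msg>.*)'$")
HRESULT_RE = re.compile(r"0x[0-9A-Fa-f]{8}")


def gfx_messages(lines):
    """Yield (timestamp, message) for every [GFX1...] line in the log."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("msg").startswith("[GFX1"):
            yield m.group("ts"), m.group("msg")


def hresults(lines):
    """Return the distinct HRESULT codes found in GFX messages, in log order."""
    seen = []
    for _, msg in gfx_messages(lines):
        for code in HRESULT_RE.findall(msg):
            code = code.lower()
            if code not in seen:
                seen.append(code)
    return seen


# Three lines copied from the log excerpt above.
sample = [
    "2022-08-16T01:45:34.849000: INFO : b'[GFX1-]: Internal D3D11 error: "
    "HRESULT: 0x887A0005: Error allocating VertexShader'",
    "2022-08-16T01:45:34.889000: INFO : b''",
    "2022-08-16T01:45:34.889000: INFO : b'[GFX1]: Device reset due to WR device: 0x887a0006'",
]
print(hresults(sample))
```

Run against the full log, this would list every distinct error code the GPU process reported before the fallback to software WebRender.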
Comment 17•2 years ago
Yes, that helps a lot.
Reporter | ||
Comment 18•2 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #17)
Yes, that helps a lot.
I set the preference "layout.css.backdrop-filter.enabled" to "false" in "about:config" and video driver crashes stopped.
It might be useful for you to know this.
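The same workaround can be made persistent outside of about:config by adding a line to a `user.js` file in the Firefox profile directory. The sketch below only formats that line; it is an illustration rather than anything proposed in this bug, and the helper name `user_pref_line` is hypothetical.

```python
def user_pref_line(name, value):
    """Format one preference in the user_pref("name", value); syntax used by user.js."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        rendered = "true" if value else "false"
    elif isinstance(value, int):
        rendered = str(value)
    else:
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    return f'user_pref("{name}", {rendered});'


# The preference and value reported in the comment above.
print(user_pref_line("layout.css.backdrop-filter.enabled", False))
```

The printed line would be appended to `user.js` in the affected profile; where that profile lives depends on the installation.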
Comment 19•2 years ago
Glenn, any guesses as to how backdrop filters would cause the vertex shader not to build?
Comment 20•2 years ago
No, that doesn't make any sense to me at all - there are no shaders that are specific to backdrop-filter, so I can't imagine why it would cause a link failure in cs_scale.
From '[GFX1-]: Internal D3D11 error: HRESULT: 0x887A0005: Error allocating VertexShader', maybe some kind of coincidental corruption or other bug coming from ANGLE or the driver?
Comment 21•2 years ago
FWIW, that HRESULT is DXGI_ERROR_DEVICE_REMOVED.
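Both HRESULT values that appear in the reporter's log can be decoded the same way. Here is a small Python lookup, assuming only the standard DXGI error constants from the Windows SDK; the function name `describe_hresult` is hypothetical and the table is deliberately not exhaustive.

```python
# A few DXGI device-loss error codes as defined in the Windows SDK.
DXGI_ERRORS = {
    0x887A0005: "DXGI_ERROR_DEVICE_REMOVED",
    0x887A0006: "DXGI_ERROR_DEVICE_HUNG",
    0x887A0007: "DXGI_ERROR_DEVICE_RESET",
    0x887A0020: "DXGI_ERROR_DRIVER_INTERNAL_ERROR",
}


def describe_hresult(text):
    """Map a hex string such as '0x887A0005' to its DXGI name, if known."""
    code = int(text, 16)
    return DXGI_ERRORS.get(code, f"unknown HRESULT 0x{code:08X}")


print(describe_hresult("0x887A0005"))  # from "Error allocating VertexShader"
print(describe_hresult("0x887a0006"))  # from "Device reset due to WR device"
```

The second code in the log, 0x887a0006, maps to DXGI_ERROR_DEVICE_HUNG.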
Comment 22•2 years ago
It looks like there might be other instances of this happening: https://www.reddit.com/r/firefox/comments/wd7z5h/why_do_i_keep_getting_white_screen_flashes/
Comment 23•2 years ago
Comment 24•2 years ago
I believe I can reproduce this locally
Comment 25•2 years ago
I see it on Win10 with 9.17.10.4459
Comment 26•2 years ago
9.17.10.4459 is the newest driver available to me on Windows update
Comment 27•2 years ago
Comment 28•2 years ago
It seems the 'background-color' is important for reproducing the problem
Comment 29•2 years ago
Recording the problem in GPUview suggests that it's a GPU hang. I see packet submitted by Firefox taking 14-15 seconds.
Assignee | ||
Comment 30•2 years ago
I can't reproduce this, and we're not going to handle it as a driver blocklist issue, so I'll take myself off the bug.
Comment 31•2 years ago
I can not reproduce the problem with the 10.18.10.5161 driver
Comment 32•2 years ago
doesn't reproduce with 10.18.10.4425
Comment 33•2 years ago
(In reply to kml from comment #0)
User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0
Steps to reproduce:
After the last Firefox update (103.0.2) Intel Graphics driver 9.17.10.2932 crashes constantly on some sites. If between restarts it's possible to close the tab with the site causing problems, the crashes stop.
User agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0
Driver Version: 9.17.10.2932 (latest from laptop manufacturer)
Computer: laptop ASUS K56CB (Intel Core-i5 3317U + HD Graphics 4000)
Windows 7 64 bit (Windows_NT 6.1 7601)
Example site that causes crash:
https://www.asus.com/bt/SupportOnly/K56CB/HelpDesk_Knowledge/
and scroll down
(Some sites cause a crash instantly, some after scrolling down.)
This didn't happen until the latest Firefox update.
You need to update your video driver to 15.33.53.5161 because it's likely a driver bug.
Please download the latest video driver from here https://www.intel.com/content/www/us/en/products/sku/65707/intel-core-i53317u-processor-3m-cache-up-to-2-60-ghz/downloads.html and see if your issue goes away.
Comment 34•2 years ago
Just FYI that driver may not be available for Windows 7?
Comment 35•2 years ago
(In reply to Ashley Hale from comment #34)
Just FYI that driver may not be available for Windows 7?
It explicitly states "Windows 7, 32-bit*,Windows 8.1, 32-bit*,Windows 7, 64-bit* 3 More"
Comment 36•2 years ago
Oh cool, thanks for the correction. I was going by another bug where a Windows 7 laptop could not update, but that may have just been OEM restrictions.
Comment 37•2 years ago
(In reply to Ashley Hale from comment #36)
Oh cool, thanks for the correction. I was going by another bug where a Windows 7 laptop could not update, but that may have just been OEM restrictions.
Lenovo is notorious for OEM lock. Not sure if an older Asus with Ivy Bridge will be too. I hope not.
Reporter | ||
Comment 38•2 years ago
(In reply to Arthur K. [He/Him] from comment #33)
You need to update your video driver to 15.33.53.5161 because it's likely a driver bug.
Please download the latest video driver from here https://www.intel.com/content/www/us/en/products/sku/65707/intel-core-i53317u-processor-3m-cache-up-to-2-60-ghz/downloads.html and see if your issue goes away.
Well, as I wrote in comment 5, when I try to install driver version 15.33, I can't do this because the following error appears: "The driver being installed is not validated for this computer. Please obtain the appropriate driver from the computer manufacturer."
I have the latest available driver from the manufacturer installed (as well as through Windows Update). The driver at your link is in the form of an exe-file, so I can't install it manually.
Comment 39•2 years ago
(In reply to kml from comment #38)
(In reply to Arthur K. [He/Him] from comment #33)
You need to update your video driver to 15.33.53.5161 because it's likely a driver bug.
Please download the latest video driver from here https://www.intel.com/content/www/us/en/products/sku/65707/intel-core-i53317u-processor-3m-cache-up-to-2-60-ghz/downloads.html and see if your issue goes away.
Well, as I wrote in comment 5, when I try to install driver version 15.33, I can't do this because the following error appears: "The driver being installed is not validated for this computer. Please obtain the appropriate driver from the computer manufacturer."
I have the latest available driver from the manufacturer installed (as well as through Windows Update). The driver at your link is in the form of an exe-file, so I can't install it manually.
That's what I get for not scrolling through the discussion. The ZIP is located here: https://www.intel.com/content/www/us/en/download/18606/intel-graphics-driver-for-windows-15-33.html If you have 7zip or some other freebie extractor, you can extract the .ZIP to some temp folder.
When updating your driver, you should be able to bypass this warning by using the "Have Disk" method and point it to the .inf for the newer driver and just force it to use the newer driver when it complains about it not being "from the computer manufacturer". This method has worked for me for eons. Up to you if you want to go the extra mile.
Comment 40•2 years ago
To replace the OEM driver with Intel's release, we may need to manually remove the OEM driver first. Below are the detailed steps:
1. Disconnect the internet connection so Windows Update won't automatically reinstall a previous OEM driver.
2. Open Device Manager > Display Adapters > right-click [Intel Graphics] > Uninstall Device
   Important: Check-mark "delete the driver software for this device"
3. Right-click anywhere in Device Manager > select Scan for Hardware Changes
   Note: Many older versions can be stored on the system to roll back to
4. If another Intel Graphics driver is reinstalled, repeat steps 2 & 3 until Basic Display Adapter is shown, not the Intel driver.
Comment 41•2 years ago
The bug is marked as tracked for firefox104 (beta) and tracked for firefox105 (nightly). We have limited time to fix this, the soft freeze is in a day. However, the bug still isn't assigned.
:bhood, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit auto_nag documentation.
Comment 42•2 years ago
Reducing the size of the Firefox window to 1920/2 prevents the problem from happening
Comment 43•2 years ago
and reducing the blur radius to 1px doesn't help
Comment 44•2 years ago
Does reproduce with:
9.17.10.4229 5/25/2015
9.17.10.2867 9/26/2012
9.17.10.2843 8/21/2012
Does not reproduce with:
8.15.10.2351 4/10/2011
8.15.10.2401 5/21/2011
8.15.10.2559 10/21/2011
8.15.10.2778 6/6/2012
8.15.10.2879 10/30/2012
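The version data collected in this bug so far fits a simple pattern: every reproducing driver is in the 9.17.10.x series, while the 8.15.10.x and 10.18.10.x drivers tested do not reproduce. The Python sketch below checks that pattern against the versions mentioned in the comments. Treating the whole 9.17.10.x series as affected is an assumption drawn from these data points, not a confirmed boundary, and `is_suspect` is a hypothetical helper.

```python
def parse_version(text):
    """Turn '9.17.10.4229' into a comparable tuple (9, 17, 10, 4229)."""
    return tuple(int(part) for part in text.split("."))


# Versions reported in this bug as reproducing the hang.
REPRODUCES = [
    "9.17.10.2932", "9.17.10.4229", "9.17.10.4459",
    "9.17.10.2867", "9.17.10.2843",
]

# Versions reported in this bug as not reproducing it.
DOES_NOT_REPRODUCE = [
    "8.15.10.2351", "8.15.10.2401", "8.15.10.2559",
    "8.15.10.2778", "8.15.10.2879",
    "10.18.10.4425", "10.18.10.5161",
]

# Assumption: the whole 9.17.10.x series is suspect.
SUSPECT_SERIES = (9, 17, 10)


def is_suspect(version_text):
    """True if the driver version falls in the assumed affected series."""
    return parse_version(version_text)[:3] == SUSPECT_SERIES


for version in REPRODUCES + DOES_NOT_REPRODUCE:
    print(version, "suspect" if is_suspect(version) else "ok")
```

Note that the build number alone does not separate the two groups: 8.15.10.2879 does not reproduce while the lower-numbered 9.17.10.2843 does, so any check would need to look at the series prefix.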
Comment 45•2 years ago
I haven't been able to reproduce this with a capture even after updating mozangle to the same version of ANGLE as is in Firefox.
Comment 46•2 years ago
I was able to get an apitrace recording. The hang happens when executing an ID3D11DeviceContext4::Draw call. I believe this Draw is coming from here: https://searchfox.org/mozilla-central/rev/14fd7ed50b087ca4d46d33e0f818360c32294afa/gfx/angle/checkout/src/libANGLE/renderer/d3d/d3d11/Clear11.cpp#797 when we try to clear a depth buffer.
Comment 47•2 years ago
If I change the preceding RSSetScissorRects to set 0 rects instead of 1, the draw call doesn't hang.
Comment 48•2 years ago
The clear seems to be coming from here: https://searchfox.org/mozilla-central/rev/14fd7ed50b087ca4d46d33e0f818360c32294afa/gfx/wr/webrender/src/renderer/mod.rs#3274
Comment 49•2 years ago
Disabling enable_clear_scissor seems to prevent the crash
Comment 50•2 years ago
During the daily I noticed that ANGLE's Clear fallback is still using a scissor when drawing to the depth:
https://searchfox.org/mozilla-central/source/gfx/angle/checkout/src/libANGLE/renderer/d3d/d3d11/Clear11.cpp#790
But the clear fallback that we use elsewhere does not.
Comment 51•2 years ago
It seems plausible that the hangs are related to this problem: https://gitlab.freedesktop.org/mesa/mesa/-/commit/714b4f6184db84a738cf2d063980f0e19ab03b4b
Comment 52•2 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #51)
It seems plausible that the hangs are related to this problem: https://gitlab.freedesktop.org/mesa/mesa/-/commit/714b4f6184db84a738cf2d063980f0e19ab03b4b
I take it we're current with ANGLE version such that there's nothing there that would work around or fix it?
Comment 53•2 years ago
This is generated from the apitrace of Firefox. It hasn't been reduced that much yet.
Comment 54•2 years ago
Comment 55•2 years ago
This version of the program is somewhat readable.
There are two draws. The first one uses dual source blending. The second one does not. The second one hangs. I suspect this bug has the same underlying cause as bug 1633628
Comment 56•2 years ago
This is still present in 104?
Comment 57•2 years ago
The underlying problem is still in 104 but bug 1785366 which is in 104 tries to avoid hitting it.
Comment 59•2 years ago
So it turns out I misdiagnosed bug 1633628. The cause of that was not the ClearView call hanging but the depth only draw that happened afterward. I confirmed this by getting a new apitrace recording of that hang and replaying successfully past ClearView. The reason the fix there helped is that by avoiding ClearView we cleared the color and depth targets together thus avoiding doing a depth only draw.
I'm not sure why we're doing a depth only clear in this case but avoiding that is a temporary option for avoiding this hang.
I'm not sure what a better fix for ANGLE is at this point.
Comment 60•2 years ago
Glenn, what would be a good way to measure the performance gain from scissoring during clear?
Comment 61•2 years ago
kvark wrote a small gl-benchmarking harness [1]. We could probably extend that slightly to support scissored clears, and run that along with the fill benchmark on a variety of intel GPUs?
[1] https://github.com/kvark/gl-bench/blob/master/src/main.rs
Assignee | ||
Comment 62•1 year ago
(In reply to Brad Werth [:bradwerth] from comment #7)
I'll add something like the blocking of old nvidia drivers, but for intel. This will activate software WebRender for users in a similar situation, which should solve this problem for this class of users.
Given we don't have a better solution, we will handle this by adding to the blocklist.
Assignee | ||
Comment 63•1 year ago
Assignee | ||
Comment 64•1 year ago
(In reply to Brad Werth [:bradwerth] from comment #62)
Given we don't have a better solution, we will handle this by adding to the blocklist.
I misunderstood. We still have hope of affecting this calling pattern either in Angle or within WebRender.
Comment 65•1 year ago
(In reply to Glenn Watson [:gw] from comment #61)
kvark wrote a small gl-benchmarking harness [1]. We could probably extend that slightly to support scissored clears, and run that along with the fill benchmark on a variety of intel GPUs?
[1] https://github.com/kvark/gl-bench/blob/master/src/main.rs
so something like this gives me full clear:
1| windows | "4.6.0 - Build 31.0.101.4502" | "Intel(R) Iris(R) Xe Graphics" | 1920x1200 | 1 | 0.50 ms | 0 mcs | 217 mcs | 34 mcs |
Scissored:
1| windows | "4.6.0 - Build 31.0.101.4502" | "Intel(R) Iris(R) Xe Graphics" | 1920x1200 | 1 | 0.17 ms | 2 mcs | 74 mcs | 31 mcs |
So scissored seems faster, but also I'm scissoring to only a quarter of the screen so not sure how representative that is.
Glenn, does the commit above seem reasonable? Do the results match your expectations? Thanks
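To put the two timings in perspective, here is the arithmetic as a short Python sketch. It assumes the millisecond column (0.50 ms and 0.17 ms) is the comparable clear cost in both runs, which the rows above do not state explicitly, and that the scissor rect covered exactly a quarter of the 1920x1200 target.

```python
full_ms = 0.50        # full clear, from the first row
scissored_ms = 0.17   # scissored clear, from the second row
area_fraction = 0.25  # scissor rect covers a quarter of the target

# Fraction of the full-clear time that the scissored clear took.
time_fraction = scissored_ms / full_ms

# What the scissored clear would cost if time scaled linearly with area.
linear_estimate_ms = full_ms * area_fraction

# Time not explained by area alone (fixed overhead or non-linear scaling).
overhead_ms = scissored_ms - linear_estimate_ms

print(f"time fraction:   {time_fraction:.2f}")
print(f"linear estimate: {linear_estimate_ms:.3f} ms")
print(f"unexplained:     {overhead_ms:.3f} ms")
```

So clearing 25% of the pixels took about 34% of the time: a saving, but smaller than pure area scaling would predict, which is consistent with the caveat about how representative a quarter-screen scissor is.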
Assignee | ||
Comment 66•1 year ago
It looks like there's only one way that the ColorRenderTarget clear_color is ever set to None. That was done as part of Bug 1764005. If that target also will return true for needs_depth(), then we're setting up a depth-only clear. And of course if the clear_color is set to None here and then later the conditions change to make needs_depth() true, then that would lead to the same problem. Tricky.
Glenn, should we be doing something more complicated here to ensure we don't attempt a depth-only clear?
Assignee | ||
Comment 67•1 year ago
This adds a capabilities boolean to note whether or not the device can
successfully depth-only clear. It is set false for Sandybridge and
Ivybridge hardware; true for others. At the point of clearing, it panics
if a depth-only clear is attempted. A later part will need to detect when
we are about to submit a depth-only clear and supply a color when required
by the device.
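The "later part" described above could look roughly like the following. This is a language-neutral Python model, not the WebRender code: the names `Capabilities`, `supports_depth_only_clear`, `ClearRequest`, and `resolve_clear` are all hypothetical, and the choice of transparent black as the supplied color is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Color = Tuple[float, float, float, float]


@dataclass
class Capabilities:
    # Hypothetical flag: False on Sandybridge and Ivybridge, True elsewhere.
    supports_depth_only_clear: bool


@dataclass
class ClearRequest:
    color: Optional[Color]   # None means "do not clear color"
    depth: Optional[float]   # None means "do not clear depth"


def resolve_clear(caps, request, fallback_color=(0.0, 0.0, 0.0, 0.0)):
    """Return a clear request that is safe to submit on this device.

    If the request would clear depth without color and the device cannot
    do that safely, a color is supplied so both targets clear together.
    """
    depth_only = request.color is None and request.depth is not None
    if depth_only and not caps.supports_depth_only_clear:
        return ClearRequest(color=fallback_color, depth=request.depth)
    return request


ivybridge = Capabilities(supports_depth_only_clear=False)
print(resolve_clear(ivybridge, ClearRequest(color=None, depth=1.0)))
```

Supplying a color overwrites the color target, so a real fix would have to confirm that this is acceptable for the targets where the clear color was deliberately left unset.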
Assignee | ||
Comment 68•1 year ago
Jeff, in your reproduction case, does a build with attachment 9350178 [details] applied hit the panic instead of the driver crash?
Comment 69•1 year ago
If I'm reading that correctly, it suggests that scissored clears are still likely to be a clear performance win on Xe, at least for that specific rectangle size (the overall % I guess depends on what our average / typical clear region within targets is).