Closed
Bug 1373578
Opened 7 years ago
Closed 7 years ago
Intermittent Assertion failure: [GFX1]: Failed to create DrawTarget, Type: 3 Size: Size(800,1000), at z:\build\build\src\obj-firefox\dist\include\mozilla/gfx/Logging.h:520
Categories
(Core :: Layout, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla57
People
(Reporter: aryx, Assigned: jmaher)
References
(Blocks 2 open bugs)
Details
(Keywords: assertion, intermittent-failure, Whiteboard: [stockwell disabled])
Attachments
(2 files)
(deleted), patch
(deleted), patch | gbrown: review+
https://treeherder.mozilla.org/logviewer.html#?job_id=107575203&repo=autoland
23:40:06 INFO - REFTEST TEST-START | file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/w3c-css/submitted/shapes1/shape-outside-polygon-018.html == file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/w3c-css/submitted/shapes1/shape-outside-polygon-018-ref.html
23:40:06 INFO - REFTEST INFO | RESTORE PREFERENCE pref(layout.css.shape-outside.enabled,false)
23:40:06 INFO - REFTEST INFO | SET PREFERENCE pref(layout.css.shape-outside.enabled,true)
23:40:06 INFO - REFTEST TEST-LOAD | file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/w3c-css/submitted/shapes1/shape-outside-polygon-018.html | 1631 / 7783 (20%)
23:40:06 INFO - ++DOMWINDOW == 287 (12A05C00) [pid = 3940] [serial = 7065] [outer = 1B2A5400]
23:40:06 INFO - [GFX1-]: Failed 2 buffer db=00000000 dw=00000000 for 0, 0, 800, 1000
23:40:06 INFO - [GFX1]: Failed to create DrawTarget, Type: 3 Size: Size(800,1000)
23:40:06 INFO - Assertion failure: [GFX1]: Failed to create DrawTarget, Type: 3 Size: Size(800,1000), at c:\builds\moz2_slave\autoland-w32-d-000000000000000\build\src\obj-firefox\dist\include\mozilla/gfx/Logging.h:519
23:40:24 INFO - #01: mozilla::gfx::Log<1,mozilla::gfx::CriticalLogger>::Flush() [obj-firefox/dist/include/mozilla/gfx/Logging.h:283]
23:40:24 INFO -
23:40:24 INFO - #02: mozilla::gfx::Factory::CreateDrawTarget(mozilla::gfx::BackendType,mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const &,mozilla::gfx::SurfaceFormat) [gfx/2d/Factory.cpp:397]
23:40:24 INFO -
23:40:24 INFO - #03: gfxPlatform::CreateDrawTargetForBackend(mozilla::gfx::BackendType,mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const &,mozilla::gfx::SurfaceFormat) [gfx/thebes/gfxPlatform.cpp:1454]
23:40:24 INFO -
23:40:24 INFO - #04: mozilla::layers::PersistentBufferProviderBasic::Create(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits>,mozilla::gfx::SurfaceFormat,mozilla::gfx::BackendType) [gfx/layers/PersistentBufferProvider.cpp:73]
23:40:24 INFO -
23:40:24 INFO - #05: mozilla::layers::LayerManager::CreatePersistentBufferProvider(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const &,mozilla::gfx::SurfaceFormat) [gfx/layers/Layers.cpp:149]
23:40:24 INFO -
23:40:24 INFO - #06: mozilla::layers::ClientLayerManager::CreatePersistentBufferProvider(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const &,mozilla::gfx::SurfaceFormat) [gfx/layers/client/ClientLayerManager.cpp:920]
23:40:24 INFO -
23:40:24 INFO - #07: mozilla::dom::CanvasRenderingContext2D::TrySharedTarget(RefPtr<mozilla::gfx::DrawTarget> &,RefPtr<mozilla::layers::PersistentBufferProvider> &) [dom/canvas/CanvasRenderingContext2D.cpp:1892]
23:40:24 INFO -
23:40:24 INFO - #08: mozilla::dom::CanvasRenderingContext2D::EnsureTarget(mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits,float> const *,mozilla::dom::CanvasRenderingContext2D::RenderingMode) [dom/canvas/CanvasRenderingContext2D.cpp:1697]
23:40:24 INFO -
23:40:24 INFO - #09: mozilla::dom::CanvasRenderingContext2D::DrawWindow(nsGlobalWindow &,double,double,double,double,nsAString const &,unsigned int,mozilla::ErrorResult &) [dom/canvas/CanvasRenderingContext2D.cpp:5601]
23:40:24 INFO -
23:40:24 INFO - #10: mozilla::dom::CanvasRenderingContext2DBinding::drawWindow [obj-firefox/dom/bindings/CanvasRenderingContext2DBinding.cpp:2311]
23:40:24 INFO -
23:40:24 INFO - #11: mozilla::dom::GenericBindingMethod(JSContext *,unsigned int,JS::Value *) [dom/bindings/BindingUtils.cpp:2960]
23:40:24 INFO -
....
23:40:24 ERROR - TEST-UNEXPECTED-FAIL | file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/w3c-css/submitted/shapes1/shape-outside-polygon-018.html | application terminated with exit code 1
23:40:24 INFO - REFTEST INFO | Copy/paste: C:\slave\test\build\win32-minidump_stackwalk.exe c:\users\cltbld\appdata\local\temp\tmpubwyc8.mozrunner\minidumps\48e261cc-30b6-4ed0-aa5d-ceb0bed311a0.dmp C:\slave\test\build\symbols
23:40:26 INFO - REFTEST INFO | Saved minidump as C:\slave\test\build\blobber_upload_dir\48e261cc-30b6-4ed0-aa5d-ceb0bed311a0.dmp
23:40:26 INFO - REFTEST INFO | Saved app info as C:\slave\test\build\blobber_upload_dir\48e261cc-30b6-4ed0-aa5d-ceb0bed311a0.extra
23:40:26 INFO - REFTEST PROCESS-CRASH | file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/w3c-css/submitted/shapes1/shape-outside-polygon-018.html | application crashed [@ xul.dll + 0x8f2fe3]
23:40:26 INFO - Crash dump filename: c:\users\cltbld\appdata\local\temp\tmpubwyc8.mozrunner\minidumps\48e261cc-30b6-4ed0-aa5d-ceb0bed311a0.dmp
23:40:26 INFO - Operating system: Windows NT
23:40:26 INFO - 6.1.7601 Service Pack 1
23:40:26 INFO - CPU: x86
23:40:26 INFO - GenuineIntel family 6 model 45 stepping 7
23:40:26 INFO - 8 CPUs
23:40:26 INFO -
23:40:26 INFO - GPU: UNKNOWN
23:40:26 INFO -
23:40:26 INFO - Crash reason: EXCEPTION_BREAKPOINT
23:40:26 INFO - Crash address: 0x5d922fe3
23:40:26 INFO - Process uptime: 767 seconds
23:40:26 INFO -
23:40:26 INFO - Thread 0 (crashed)
23:40:26 INFO - 0 xul.dll + 0x8f2fe3
23:40:26 INFO - eip = 0x5d922fe3 esp = 0x001da418 ebp = 0x001da420 ebx = 0x001da598
23:40:26 INFO - esi = 0x6098f950 edi = 0x00000208 eax = 0x00000000 ecx = 0x70e906ef
23:40:26 INFO - edx = 0x00000060 efl = 0x00000206
23:40:26 INFO - Found by: given as instruction pointer in context
Assignee | ||
Comment 3•7 years ago
|
||
This really started up on July 13th and was failing quite often by July 15th. I think this is on track for a high frequency failure (24 failures since July 13th). This is the same failure as posted above.
:jet, could you help find someone to look at fixing this crash in the next 2 weeks?
Flags: needinfo?(bugs)
Whiteboard: [stockwell needswork]
Reporter | ||
Updated•7 years ago
|
Summary: Intermittent layout/reftests/w3c-css/submitted/shapes1/shape-outside-polygon-018.html | application crashed [@ xul.dll + 0x8f2fe3] → Intermittent Assertion failure: [GFX1]: Failed to create DrawTarget, Type: 3 Size: Size(800,1000), at z:\build\build\src\obj-firefox\dist\include\mozilla/gfx/Logging.h:519
Comment 9•7 years ago
|
||
This is happening often enough that we should disable tests, but there are a lot of different tests failing this way currently. (Many, but not all in css-break.)
Comment 10•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) (UTC-9) (PTO: back August 2nd) from comment #3)
> This really started up on July 13th and was failing quite often by July
> 15th. I think this is on track for a high frequency failure (24 failures
> since July 13th). This is the same failure as posted above.
This failure was seen much earlier, but reported in other bugs. For instance, July 3, https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=56cb20e086a2d67a467809d406d5fed673267aca&filter-searchStr=windows+debug+reftest.
AFAIK, all of these failures are Windows 7, non-e10s, so perhaps this is another case where we can wait for non-e10s testing to end.
Comment 11•7 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #10)
> AFAIK, all of these failures are Windows 7, non-e10s, so perhaps this is
> another case where we can wait for non-e10s testing to end.
I'm not sure that all the "Failed to create DrawTarget" errors really are constrained to non-e10s Win32. +cc: Milan who may know more about that.
Flags: needinfo?(bugs) → needinfo?(milan)
This is a failure with backend type Skia - which means non-accelerated Windows. I think the rest of the Windows testing has acceleration, right? So, it's a coincidence that we're only seeing it on those configurations, although I guess e10s could play part of this if it turns out it is "just" running out of memory.
Mason, was there any OMTP that landed mid July that could explain some of these problems increasing in frequency? I guess we don't usually have a recorded, so there shouldn't be, but just checking.
Flags: needinfo?(milan) → needinfo?(mchang)
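For a sense of scale on the "running out of memory" theory, here is a minimal sketch that parses the assertion message from the log and estimates the size of the requested buffer. The 4 bytes per pixel figure is an assumption (the log does not state the surface format), and the helper names are hypothetical, not part of any Mozilla tooling.

```python
import re

# Pattern for the gfx critical-log line quoted in this bug.
DRAW_TARGET_FAILURE = re.compile(
    r"Failed to create DrawTarget, Type: (\d+) Size: Size\((\d+),(\d+)\)"
)

# Assumption: a 32-bit surface format, i.e. 4 bytes per pixel.
ASSUMED_BYTES_PER_PIXEL = 4


def parse_failure(line):
    """Return (backend_type, width, height), or None if the line does not match."""
    match = DRAW_TARGET_FAILURE.search(line)
    if match is None:
        return None
    backend_type, width, height = (int(group) for group in match.groups())
    return backend_type, width, height


def estimated_bytes(width, height, bytes_per_pixel=ASSUMED_BYTES_PER_PIXEL):
    """Rough buffer size for one surface of the given dimensions."""
    return width * height * bytes_per_pixel


line = ("23:40:06 INFO - [GFX1]: Failed to create DrawTarget, "
        "Type: 3 Size: Size(800,1000)")
backend_type, width, height = parse_failure(line)
print(backend_type, width, height, estimated_bytes(width, height))
# prints: 3 800 1000 3200000
```

Under that assumption each 800x1000 buffer is roughly 3.2 MB, so a single allocation is modest. If memory is the culprit, it would more plausibly be cumulative pressure over a long session (the crashed process in the log had been up for 767 seconds) than one oversized request.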
Comment 13•7 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #12)
> This is a failure with backend type Skia - which means non-accelerated
> Windows. I think the rest of the Windows testing has acceleration, right?
> So, it's a coincidence that we're only seeing it on those configurations,
> although I guess e10s could play part of this if it turns out it is "just"
> running out of memory.
>
> Mason, was there any OMTP that landed mid July that could explain some of
> these problems increasing in frequency? I guess we don't usually have a
> recorded, so there shouldn't be, but just checking.
I scanned a couple of the crash signatures and all of them are coming from Canvas. All the OMTP stuff only changed in ClientPaintedLayer, which isn't in the callstack at all here. From comment 3, this started on July 13th. I did a bugzilla search to see what OMTP patches landed between July 13-15 and came up with bug 1380493, and bug 1380483. Bug 1380483 might be the most suspicious in that we start checking that content client exists before recording, but this doesn't actually create a DrawTarget and would've happened by default since we don't enable OMTP on inbound yet.
Flags: needinfo?(mchang)
Week over week, 61/888 -> 1/901, beta only.
Comment 19•7 years ago
|
||
This was happening on non-e10s. We are only running e10s windows reftests now. This might return if we start running non-e10s windows reftests again.
Whiteboard: [stockwell needswork] → [stockwell disabled]
Updated•7 years ago
|
Summary: Intermittent Assertion failure: [GFX1]: Failed to create DrawTarget, Type: 3 Size: Size(800,1000), at z:\build\build\src\obj-firefox\dist\include\mozilla/gfx/Logging.h:519 → Intermittent Assertion failure: [GFX1]: Failed to create DrawTarget, Type: 3 Size: Size(800,1000), at z:\build\build\src\obj-firefox\dist\include\mozilla/gfx/Logging.h:520
Assignee | ||
Comment 26•7 years ago
|
||
we are looking at 160+ failures in the last week, all on windows 7 debug non-e10s:
https://brasstacks.mozilla.com/orangefactor/index.html?display=Bug&bugid=1373578
there is no test to turn off as this seems to happen inside of reftests (as pointed out earlier css-break)
Is this a graphics specific crash? I see comments regarding non accelerated graphics from :milan and :mchang.
:jet, I see you are the triage owner- can you help find the right person to fix this in the next week?
Flags: needinfo?(bugs)
Whiteboard: [stockwell disabled] → [stockwell needswork]
Comment 30•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #26)
> :jet, I see you are the triage owner- can you help find the right person to
> fix this in the next week?
As noted above, we're running into resource contention on the Win32 Debug variants. This will only get worse on those platforms as more reftests get added. I'm willing to take a cheap fix here (e.g., restart the browser after n tests, or shrink the test bucket sizes) but it seems unlikely that we'll see a fix in Skia for a use case (thousands of sequential large canvas bitmap snapshots in debug builds) that only our test harness has.
Flags: needinfo?(bugs)
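The two cheap fixes floated in comment 30 can be sketched as follows. This is only an illustration under stated assumptions: the function names and the test list are hypothetical, and this is not the reftest harness's actual chunking code. `batches_of` models "restart the browser after n tests" (each batch would get a fresh session), while `split_into_chunks` models shrinking the bucket size by raising the number of chunks.

```python
from typing import Iterator, List, Sequence


def batches_of(tests: Sequence[str], n: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most n tests (one browser session each)."""
    if n <= 0:
        raise ValueError("n must be positive")
    for start in range(0, len(tests), n):
        yield list(tests[start:start + n])


def split_into_chunks(tests: Sequence[str], total_chunks: int) -> List[List[str]]:
    """Split tests into contiguous chunks whose sizes differ by at most one."""
    if total_chunks <= 0:
        raise ValueError("total_chunks must be positive")
    base, extra = divmod(len(tests), total_chunks)
    chunks = []
    start = 0
    for index in range(total_chunks):
        # The first `extra` chunks absorb the remainder, one test each.
        size = base + (1 if index < extra else 0)
        chunks.append(list(tests[start:start + size]))
        start += size
    return chunks


# Hypothetical test names, for illustration only.
tests = [f"test-{i:03d}.html" for i in range(10)]
print([len(batch) for batch in batches_of(tests, 4)])         # [4, 4, 2]
print([len(chunk) for chunk in split_into_chunks(tests, 3)])  # [4, 3, 3]
```

Either way the goal is the same: each browser process takes fewer canvas snapshots before it exits, so whatever resource is being exhausted has less time to run out.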
Assignee | ||
Comment 31•7 years ago
|
||
bug 1302203 is to run reftests per manifest, which should fix this; we could disable this on non-e10s win7-debug
Depends on: 1302203
Assignee | ||
Comment 33•7 years ago
|
||
I don't think there is a test to disable; this leaves us with either disabling all the tests on win7/debug (non-e10s) or possibly running in more chunks.
this patch would disable the reftests on win7-debug for non-e10s. As a note, this is the configuration that we specifically turned back on for non-e10s; our coverage of non-e10s reftests will be android only.
I am happy to do more chunks instead, or to hear other ideas.
Attachment #8903174 -
Flags: review?(bugs)
Comment 34•7 years ago
|
||
I tried running with 16 chunks, but still hit this failure: I don't think more chunks is feasible.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=326923e740a99e12e9280d3798ed6cdd77d85b35
Assignee | ||
Comment 36•7 years ago
|
||
it seems that just disabling reftests wholesale is our only option right now. We have been working on making the harness restart between manifests and that is a lot harder than it would seem. Possibly we could run a really small subset of the tests if we find there is important value on non-e10s for a specific feature or two?
Comment 39•7 years ago
|
||
https://treeherder.mozilla.org/logviewer.html#?job_id=128115082&repo=mozilla-inbound - non-e10s Win7 debug devtools, so good luck with just not running reftests.
Assignee | ||
Comment 40•7 years ago
|
||
if 98% of the failures go away with reftest, then we will win.
Updated•7 years ago
|
Attachment #8903174 -
Flags: review?(bugs) → review+
Comment 44•7 years ago
|
||
This patch disables reftests on win32/debug/non-e10s. I gave it an r+ given the frequency of this failure on this build config, and after receiving a release-drivers notice today that e10s-multi is shipping to 100% of eligible users on the 55 Release.
mrbkap: please comment if you're concerned that we're losing too much test coverage too soon here. Thx!
Flags: needinfo?(mrbkap)
Comment 45•7 years ago
|
||
What is the definition of "eligible users"? What fraction of our Windows user base is _not_ getting e10s-multi?
Also, do we have non-debug reftest coverage on win32/e10s?
Flags: needinfo?(bugs)
Assignee | ||
Comment 46•7 years ago
|
||
this would only be disabled on win7-non-e10s (which is only run on debug for win7). We have opt/debug/pgo reftests on windows for e10s coverage still.
Comment 47•7 years ago
|
||
Sorry, I misspoke. Do we still have non-debug reftest coverage on win32/non-e10s?
Assignee | ||
Comment 48•7 years ago
|
||
no, this is it for win32. We do have android reftest non-e10s coverage, and some on linux when profiling with jsdcov.
Comment 49•7 years ago
|
||
OK. Then my main question remains: are we still shipping win32/non-e10s to users?
Comment 50•7 years ago
|
||
I just had the same conversation with Jet. Unfortunately, "eligible users" is still not 100% of all users. In particular, we have users on old-style extensions that disable e10s on 55 and 56 (release and beta currently) as well as users with a11y enabled that disable e10s everywhere.
I don't know what the breakdown of those users is wrt win32 vs win64. Jim, would you know?
Flags: needinfo?(mrbkap) → needinfo?(jmathies)
Updated•7 years ago
|
Flags: needinfo?(bugs)
Attachment #8903174 -
Flags: review+ → review?(bugs)
Comment 51•7 years ago
|
||
I spoke with dbolter on irc about this. A11y users on win32 still get non-e10s and will continue to do so at least through FF56. A go/no-go decision for A11y+e10s on FF57 is expected this Friday. Per dbolter: "Basically we'd like a week before maybe saying yeah turn off single process tests. I'm pushing hard to ship."
Let's revisit this one next week. Resetting the r? on the patch.
Assignee | ||
Comment 52•7 years ago
|
||
sounds good, thanks for the discussion here
Comment 54•7 years ago
|
||
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #49)
> OK. Then my main question remains: are we still shipping win32/non-e10s to
> users?
Sure, accessibility users and incompat add-on users, plus anyone who disables e10s. Our 64-bit distro #s are still really small (2% of release I think?) but this is changing (starting on beta this week) as we've started migrating 32-bit users on 64-bit machines over to 64-bit builds.
Flags: needinfo?(jmathies)
Updated•7 years ago
|
Priority: -- → P3
Assignee | ||
Comment 58•7 years ago
|
||
this continues to be our #1 intermittent, who is working on fixing this?
Flags: needinfo?(bugs)
Comment 59•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #58)
> this continues to be our #1 intermittent, who is working on fixing this?
@dbolter asked to have until Friday (9/15) for the decision status on shipping a11y+e10s which will reduce the need for these tests on this configuration.
Flags: needinfo?(bugs) → needinfo?(dbolter)
Comment 60•7 years ago
|
||
Correct, agreed. This NI should make me come back Friday, but if I don't please feel free to ping. Note while there is a chance we'll ride 57 to Beta, it doesn't mean we'll stick.
Flags: needinfo?(dbolter)
Updated•7 years ago
|
Flags: needinfo?(dbolter)
Assignee | ||
Comment 64•7 years ago
|
||
:davidb, checking in with you here- I assume all is well?
Assignee | ||
Comment 65•7 years ago
|
||
chatted with :davidb online, he would like to see us keep reftests in non-e10s mode around until we ship with a11y+e10s (ideally 57 release in November).
I thought of solving this problem another way, what if we run the tests in many chunks- this has the browser session doing a lot less, and in the end I have ~1000 reftest jobs and 1 instance of this:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1ce7c0b8206e16566c625d03279dd2e276dbd22b
We typically see 400-500 jobs/week for reftest-non-e10s, so this appears to be a significant win for this bug and bug 1393934.
:gbrown, what do you think of this approach?
Flags: needinfo?(gbrown)
Comment 66•7 years ago
|
||
That result is slightly inconsistent with my experience in comment 34...but you used more chunks, so maybe it will work? Let's try it!
Flags: needinfo?(gbrown)
Assignee | ||
Comment 67•7 years ago
|
||
yeah, I looked back at that try push, 16 chunks with 5 data points/chunk (80 jobs total) and I see 2 instances of this error.
Now to figure out how to chunk 32 times on non-e10s only!
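To put the two try pushes side by side, here is a small worked calculation of the observed failure rates. The job counts come from the comments above (16 chunks x 5 runs with 2 failures, and roughly 1000 jobs with 1 failure on the many-chunk push); "~1000" is approximate, so treat the second rate as a rough estimate. The function name is hypothetical.

```python
def failure_rate(failures, jobs):
    """Fraction of jobs that hit the DrawTarget assertion."""
    if jobs <= 0:
        raise ValueError("jobs must be positive")
    return failures / jobs


# 16-chunk try push: 16 chunks x 5 data points per chunk, 2 failures.
sixteen_chunk_jobs = 16 * 5
rate_16 = failure_rate(2, sixteen_chunk_jobs)

# Many-chunk try push: about 1000 jobs (approximate), 1 failure.
rate_many = failure_rate(1, 1000)

print(sixteen_chunk_jobs)    # 80
print(f"{rate_16:.1%}")      # 2.5%
print(f"{rate_many:.1%}")    # 0.1%
```

Both samples are small, so the exact ratio should not be read too closely, but the per-job rate dropping from about 2.5% to about 0.1% is the basis for trying more chunks.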
Assignee | ||
Comment 68•7 years ago
|
||
This is the simplest way I could think of solving this bug- I am open to other ideas if you have them.
Attachment #8909459 -
Flags: review?(gbrown)
Comment 69•7 years ago
|
||
Comment on attachment 8909459 [details] [diff] [review]
win7/debug non-e10s reftests at 32 chunks
Review of attachment 8909459 [details] [diff] [review]:
-----------------------------------------------------------------
I don't have a better suggestion. Thanks for calling it out as a hack.
How can we help ensure this gets cleaned up when we stop running non-e10s? Maybe a note in tests.yml?
Attachment #8909459 -
Flags: review?(gbrown) → review+
Comment 70•7 years ago
|
||
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ac60362e1cc
split reftest non-e10s into 32 chunks. r=gbrown
Comment 71•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox57:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Updated•7 years ago
|
Assignee: nobody → jmaher
status-firefox55:
--- → wontfix
status-firefox56:
--- → wontfix
status-firefox-esr52:
--- → wontfix
Updated•7 years ago
|
Flags: needinfo?(dbolter)
Assignee | ||
Updated•7 years ago
|
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Updated•7 years ago
|
Attachment #8903174 -
Flags: review?(bugs)