Closed
Bug 614772
Opened 14 years ago
Closed 12 years ago
crash with taskbar preview [@ gfxContext::NewPath() ] mainly with Firefox Showcase
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
FIXED
mozilla22
People
(Reporter: scoobidiver, Assigned: bbondy)
References
Details
(Keywords: crash)
Crash Data
Attachments
(4 files)
(deleted),
patch
|
jrmuizel
:
review-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jrmuizel
:
review-
|
Details | Diff | Splinter Review |
(deleted),
application/octet-stream
|
Details | |
(deleted),
patch
|
jrmuizel
:
review+
|
Details | Diff | Splinter Review |
It is a residual crash signature that exist in 3.0, 3.6 and trunk builds.
It is #225 top crasher in 4.0b7 for the last week.
Two comments say:
"clicked a close (X) button on the windows 7 taskbar tab preview of the leftmost tab"
"something with windows 7 taskbar preview and multiple tabs. it just quit unexpectedly."
Signature gfxContext::NewPath()
UUID ad59b146-9156-41b7-ba7d-7a2352101124
Time 2010-11-24 18:13:05.623157
Uptime 2410
Install Age 2410 seconds (40.2 minutes) since version was first installed.
Product Firefox
Version 4.0b8pre
Build ID 20101124042634
Branch 2.0
OS Windows NT
OS Version 6.1.7600
CPU x86
CPU Info GenuineIntel family 6 model 23 stepping 7
Crash Reason EXCEPTION_ACCESS_VIOLATION_READ
Crash Address 0x4
App Notes AdapterVendorID: 10de, AdapterDeviceID: 06e4
Crashing Thread
Frame Module Signature [Expand] Source
0 xul.dll gfxContext::NewPath gfx/thebes/gfxContext.cpp:113
1 xul.dll PresShell::RenderDocument layout/base/nsPresShell.cpp:5301
2 xul.dll nsCanvasRenderingContext2D::DrawWindow content/canvas/src/nsCanvasRenderingContext2D.cpp:3636
3 xul.dll nsIDOMCanvasRenderingContext2D_DrawWindow obj-firefox/js/src/xpconnect/src/dom_quickstubs.cpp:3566
4 mozjs.dll js::Interpret js/src/jsinterp.cpp:4748
5 mozjs.dll js::RunScript js/src/jsinterp.cpp:657
6 mozjs.dll js::Invoke js/src/jsinterp.cpp:737
7 mozjs.dll js::ExternalInvoke js/src/jsinterp.cpp:858
8 mozjs.dll JS_CallFunctionValue js/src/jsapi.cpp:4973
9 xul.dll nsXPCWrappedJSClass::CallMethod js/src/xpconnect/src/xpcwrappedjsclass.cpp:1694
10 xul.dll nsXPCWrappedJS::CallMethod js/src/xpconnect/src/xpcwrappedjs.cpp:588
11 xul.dll PrepareAndDispatch xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:114
12 xul.dll SharedStub xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:141
13 xul.dll mozilla::widget::TaskbarPreview::DrawBitmap widget/src/windows/TaskbarPreview.cpp:404
More reports at:
http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=exact&range_value=4&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=gfxContext%3A%3ANewPath%28%29
Comment 1•14 years ago
|
||
It looks like the thebes surface on the canvas is null. This is probably due to bug 559613 but I'm not sure exactly how.
Comment 3•14 years ago
|
||
See the newly duped bug 644124 for STR :-)
Updated•13 years ago
|
Crash Signature: [@ gfxContext::NewPath() ]
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → netzen
Assignee | ||
Comment 4•13 years ago
|
||
This crash happened to me as I was mouse hovering over each of the tab previews back and forth. I can't reproduce anymore though.
This happens in thebes context when mCairo is not created correctly.
I.e. in the constructor of gfxContext we call:
> mCairo = cairo_create(surface->CairoSurface());
Trying currently to find a way in code to reproduce without using the debugger to manually set NULL on mCairo. Perhaps by continually creating a surface.
Comment 5•13 years ago
|
||
cairo_create should always be giving back a valid cairo object which either performs no-ops or correct behaviour.
Assignee | ||
Comment 6•13 years ago
|
||
Ya I seen the comment in cairo_create and I seen the calls to cairo_create_in_error on error.
Currently debugging to see if there is another way to get it to crash at the same location other than having a NULL mCairo.
Assignee | ||
Comment 7•13 years ago
|
||
The only way I can reproduce (manually with the debugger) is with a NULL or invalid mCairo. Not with a NULL surface nor context.
The noop object we return is one per error code from 0 to CAIRO_STATUS_LAST_STATUS. I think perhaps we're offsetting that by too much and getting passed the array and returning an invalid cairo noop pointer.
Still debugging though.
Assignee | ||
Comment 8•13 years ago
|
||
As per Comment 7:
I couldn't find a case where the cairo status is less than 0 or more than CAIRO_STATUS_LAST_STATUS but I think this is the cause of the crash. Perhaps because of an invalid cairo surface passed to cairo_create or perhaps -1 is being returned as a status somewhere.
In any case I think the fix is worth it and we can see if the crashes go away on Nightly or not for new builds.
The crashes happen and are reported every day a few times, and happen in FF6 dozens of times per day.
It's better than what we currently do in cairo_create_in_error: blindly index into _cairo_nil__objects with the status error code.
Attachment #557582 -
Flags: review?(joe)
Comment 9•13 years ago
|
||
Comment on attachment 557582 [details] [diff] [review]
Possible crash fix for taskbar previews
Jeff's a better person for this, I think. At minimum, you're going to want to put a patch in cairo/ and add it to our list of patches in the README.
Attachment #557582 -
Flags: review?(joe) → review?(jmuizelaar)
Assignee | ||
Comment 10•13 years ago
|
||
As per Joe's comment 9.
Attachment #558463 -
Flags: review?(jmuizelaar)
Comment 11•13 years ago
|
||
Comment on attachment 557582 [details] [diff] [review]
Possible crash fix for taskbar previews
I thought I had reviewed this already, but I guess I forgot to submit.
I don't think we should do this it feels too much like a band-aid. Can you find the caller that's setting this and fix it there?
Attachment #557582 -
Flags: review?(jmuizelaar) → review-
Assignee | ||
Comment 12•13 years ago
|
||
I'll try to find a root cause but if found, I think in addition to that fix we should still do this check since it's better than a memory violation if ever a bad error code is returned from anything in cairo.
Comment 13•13 years ago
|
||
(In reply to Brian R. Bondy [:bbondy] from comment #12)
> I'll try to find a root cause but if found, I think in addition to that fix
> we should still do this check since it's better than a memory violation if
> ever a bad error code is returned from anything in cairo.
It seems like that is more likely to postpone problems. If we do add a check, I think that it ensure that we crash rather than avoid it.
Updated•13 years ago
|
Attachment #558463 -
Flags: review?(jmuizelaar) → review-
Assignee | ||
Comment 14•13 years ago
|
||
I can't reproduce this crash ever anymore for the past few weeks. Could we do something like add telemetry to notify us if the status code I suspect is out of range is indeed out of range? It would basically just be a patch we put on nightly temporary and revert it out after we have the data?
Comment 15•13 years ago
|
||
(In reply to Brian R. Bondy [:bbondy] from comment #14)
> I can't reproduce this crash ever anymore for the past few weeks. Could we
> do something like add telemetry to notify us if the status code I suspect is
> out of range is indeed out of range? It would basically just be a patch we
> put on nightly temporary and revert it out after we have the data?
Definitely. I've done something like this in the past:
http://mxr.mozilla.org/mozilla-central/ident?i=CrashSpline
Assignee | ||
Comment 16•13 years ago
|
||
Great, I'll approach it that way and re-submit a patch. Will probably be next week as I'm on several other things this week. Thanks!
Updated•13 years ago
|
Crash Signature: [@ gfxContext::NewPath() ] → [@ gfxContext::NewPath() ]
[@ gfxContext::NewPath]
Reporter | ||
Comment 18•13 years ago
|
||
It's correlated in 9.0.1 with Firefox Showcase:
99% (134/136) vs. 0% (220/101230) {89506680-e3f4-484c-a2c0-ed711d481eda} (Firefox Showcase, https://addons.mozilla.org/addon/1810)
Comment 19•13 years ago
|
||
(In reply to Scoobidiver from comment #18)
> It's correlated in 9.0.1 with Firefox Showcase:
> 99% (134/136) vs. 0% (220/101230) {89506680-e3f4-484c-a2c0-ed711d481eda}
> (Firefox Showcase, https://addons.mozilla.org/addon/1810)
I just had a user confirm that disabling the Showcase extension fixed their crash problem - https://support.mozilla.org/en-US/questions/913408
Reporter | ||
Updated•13 years ago
|
Summary: crash with taskbar preview [@ gfxContext::NewPath() ] → crash with taskbar preview [@ gfxContext::NewPath() ] mainly with Firefox Showcase
Assignee | ||
Comment 20•13 years ago
|
||
Awesome, I'll try to fix it with this new info this week or next.
Assignee | ||
Comment 21•13 years ago
|
||
I downloaded the extension, opened over 100 tabs and tried using each of the buttons available with that extension, but unfortunately could still not reproduce.
Likely the crashes are seen with this extension because it provides much more previews than is available via the taskbar previews option which is capped by Windows for the number of previews that are shown.
It anyone has more information on how to reproduce please let me know.
Comment 22•13 years ago
|
||
(In reply to Brian R. Bondy [:bbondy] from comment #21)
> I downloaded the extension, opened over 100 tabs and tried using each of the
> buttons available with that extension, but unfortunately could still not
> reproduce.
>
> Likely the crashes are seen with this extension because it provides much
> more previews than is available via the taskbar previews option which is
> capped by Windows for the number of previews that are shown.
>
> It anyone has more information on how to reproduce please let me know.
I'm Showcase's author; the users that have contacted me with this issue reported that it was triggered by enabling the "Thumbnail caching" option; this option can be found in the "Advanced" panel, and is disabled by default.
Assignee | ||
Comment 23•13 years ago
|
||
Thanks for the info, I tried that option but cannot reproduce still.
Comment 24•13 years ago
|
||
It instantly crashed for me :)
This is what I did:
- Have some tabs loaded with real websites (blank thumbnails are special cases).
- Press F12 to bring Showcase.
- Close it
- Go to the Showcase settings panel, "Advanced" tab.
- Toggle the cache option.
At this point, it crashed; if it doesn't in your computer, try reloading a tab, it appears to be what triggers the crash.
I can provide any information you need; to start with, the user agent is:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0) Gecko/20100101 Firefox/10.0
browser.startup.homepage_override.buildID 20120111092507
Comment 25•13 years ago
|
||
OK, that's odd... now it doesn't crash following the same instructions. I'll be trying different things, see if I can reliably reproduce the issue.
Comment 26•13 years ago
|
||
After trying different things, it appears that the crashing happens, but it's random in most cases; however, opening http://www.terra.es/ seemed to reliably crash Firefox when the "thumbnail cache" in Showcase was enabled.
My feeling is that the problem happens when a tab is painted with "drawWindow" multiple times at once; a testcase like that should be trivial to implement.
Assignee | ||
Comment 27•13 years ago
|
||
I tried with that site and with thumbnail cache enabled, but no crashing.
I tried reloading several times as well as opening over 100 other tabs.
I'll investigate further through code since I can't reproduce.
Comment 28•13 years ago
|
||
Yes, I just tried the website I referred earlier, and now it doesn't crash; perhaps it was something that was present in that website at that moment.
I also tried to modify the code to paint the thumbnail multiple times, but is not crashing either.
I'll try to create a proper test case that causes the crash, but it appears to be tricky to reproduce.
Comment 29•13 years ago
|
||
It appears to crash quite consistently when opening flash games; for instance, try this one (which is very basic and small):
http://armorgames.com/play/21/phit
Assignee | ||
Comment 30•13 years ago
|
||
No crash yet, I have about 20 tabs open with that page.
I noticed that the extension has an export settings option. Could you export your settings to me so I can import them?
Comment 31•13 years ago
|
||
I was quite sure I was using the default configuration, but just in case I reset to defaults, and only change the thumbnail caching option; the crash still happens.
Are you using any Flash/ad blocker?
Assignee | ||
Comment 32•13 years ago
|
||
OK so we have the same config, no I don't have a flash / ad blocker.
Comment 33•13 years ago
|
||
Just tried the "Enable cache" option with 10.0 beta 5, with hardware acceleration enabled and disabled, and the crash still happens very often.
It's not about opening a lot of windows, it seems to be about browsing; also, it happens to crash a lot on "session restore" with GMail.
If there's any information or test I can provide that would help you, let me know.
Comment 34•13 years ago
|
||
One more piece of information; I had that thumbnail caching option enabled and running without problems when toggling the following option:
extensions.showcase.updateThumbnailWhenScroll (from the default value, enabled, to disabled)
In the code, this will disable the thumbnail refreshing when scrolling; this is done through a nsITimer instance, to prevent a large number of thumbnail refreshing; maybe it would be a good idea to correlate this code with the one that is failing in the Firefox side to see if they've anything in common.
Comment 35•13 years ago
|
||
I was trying different things to try to fix the problem, and I've come up with a change in my extension that seems to increase quite a lot the crashing frequency; should I upload it here?
Assignee | ||
Comment 36•13 years ago
|
||
Yes please, more of the same crashes would be great
Comment 37•13 years ago
|
||
Comment 38•13 years ago
|
||
You still need to set the "Enable thumbnail caching" from the "Advanced" tab of this extension's settings panel.
The difference between this version and the previous one is that between creating the canvas and setting its size, it rendered the window using a "setTimeout" with 0ms delay; doing it immediately increases the amount of crashes in a noticeable way; it even crashed when trying to bring the "Add-ons" panel.
Things that I also tried, but without any effect, is to set safe parameters in the drawWindow call, ensure the size of the canvas is valid, and use a "setTimeout" instead of a nsITimer.
Also, I confirmed that the crash happens in the drawWindow call; commenting that operation gets rid of the crashes.
Let me know if you're able to reproduce the problem with this version.
Assignee | ||
Comment 39•13 years ago
|
||
Sorry I can't reproduce even with the new extension. Perhaps I can do a remote login to your machine with gotomeeting, logmein, or something similar to get a call stack?
Assignee | ||
Comment 40•13 years ago
|
||
OK I can reproduce now with the new extension attached above.
I can reproduce when is set to gfx.canvas.azure.enabled; false and/or when I disable hardware acceleration using this tool:
"C:\Program Files (x86)\Microsoft DirectX SDK (June 2010)\Utilities\bin\x86\dxcpl.exe"
It is most easy to reproduce with many tabs open and when you turn on session restore. When you click the restore button it crashes right then.
Strangely enough I can reproduce with a conditional breakpoint that
nsCanvasRenderingContext2D::DrawWindow's mThebes is non null but once it is inside RenderDocument the parameter is NULL.
> rv = presShell->RenderDocument(r, renderDocFlags, bgColor, mThebes);
I'm not quite sure what's going on but it sounds like stack corruption of some sort.
I do notice that nsCanvasRenderingContext2D::Reset() is called after the call to nsCanvasRenderingContext2D::DrawWindow only when it crashes from a SetDimensions call.
I'm not sure why the value is non null before the call and null once inside the call though.
I'm going to unassign myself in hopes that someone who knows this code better will take this bug.
If anyone has any ideas for me I can re-take the bug.
Assignee: netzen → nobody
Assignee | ||
Comment 41•12 years ago
|
||
Comment on attachment 557582 [details] [diff] [review]
Possible crash fix for taskbar previews
Review of attachment 557582 [details] [diff] [review]:
-----------------------------------------------------------------
While investigating bug 777703 I found several dozen instances where we store cairo_int_status_t which has max value 106 inside of cairo_status_t which has max value 40.
The nil object array we index in cairo_create_in_error has a max size of CAIRO_STATUS_LAST_STATUS which is 40.
Would you reconsider landing this patch? I think we're causing memory corruption that could have been fixed for almost a year since this patch was suggested.
How about if I add an assert(status < CAIRO_STATUS_LAST_STATUS) before the changed lines?
I will file a follow up bug and I'll take it to actually fix the references I can find. But I think that bug 777703 proves that having this fix is a really good idea. C enums are not typesafe and so errors like this are very common. I'd rather have a debug assertion than some random stack corruption. And users would rather not crash.
Attachment #557582 -
Flags: review- → review?(jmuizelaar)
Assignee | ||
Comment 42•12 years ago
|
||
Preferring stack corruption over simply adding a debug assertion and fixing the safety of the code seems bad from a security standpoint as well. Especially now that we have proof that cairo_status_t does store values that are out of bound, it's not just based on my suspicion as it was earlier.
Comment 43•12 years ago
|
||
Given comment #41 and #42 I'm nominating this for tracking on all active branches.
It sounds to me as if Brian is pretty sure there's memory corruption happening because of this, and we've had topcrashers all the time that have some memory corruption happening, so any potential source of corruptions should be fixed where possible (with memory/stack corruption crashes, it's usually impossible to know from our minidumps what caused the corruption, we only know we crashed because something is corrupted).
tracking-firefox-esr10:
--- → ?
tracking-firefox15:
--- → ?
tracking-firefox16:
--- → ?
tracking-firefox17:
--- → ?
Assignee | ||
Comment 44•12 years ago
|
||
I'm not sure there is stack corruption but I suspected that values out of bound were passed into cairo_create_in_error before. If such values are passed then an index is blindly offset and a cairo object is returned from it.
I recently found a few dozen cases of such errors. Whether any of them ever make it up to cairo_create_in_error or not I'm not sure, but the cairo_status_t and cairo_int_status_t are used a lot interchangeably when they shouldn't. The cairo_int_sattus_t values start at 100 and the array I mentioned earlier only goes up to 40.
Comment 45•12 years ago
|
||
Not new to recent or upcoming releases, and not directly suspected of causing any recent top crashers. Tracking flags mean something very specific - an effort around fixing a long tail of possible memory corruption issues would need to be handled like crit-smash, crash-kill, or snappy.
If this was likely to be causing significant issues, we'd treat it differently. Please re-nominate if that's the case.
Comment 46•12 years ago
|
||
Comment on attachment 557582 [details] [diff] [review]
Possible crash fix for taskbar previews
Review of attachment 557582 [details] [diff] [review]:
-----------------------------------------------------------------
I'd rather we just crash in _cairo_create_in_error and try to track down where these things are coming from. That should help us separate out the different cases and perhaps fix them.
Attachment #557582 -
Flags: review?(jmuizelaar) → review-
Assignee | ||
Comment 47•12 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #46)
> Comment on attachment 557582 [details] [diff] [review]
> Possible crash fix for taskbar previews
>
> Review of attachment 557582 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I'd rather we just crash in _cairo_create_in_error and try to track down
> where these things are coming from. That should help us separate out the
> different cases and perhaps fix them.
I don't follow. assert (status != CAIRO_STATUS_SUCCESS); will only be hit on debug builds. And I don't see a check there to abort if the value is out of range.
Would you accept a patch that did something like this instead?
+ /* Sanity check */
+ if (status < 0 || status > CAIRO_STATUS_LAST_STATUS) {
+ abort();
+ }
Comment 48•12 years ago
|
||
This crash fix seems to have stalled waiting for a reply from Jeff.
Assignee: nobody → netzen
Flags: needinfo?(jmuizelaar)
Comment 49•12 years ago
|
||
(In reply to Brian R. Bondy [:bbondy] from comment #47)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #46)
> > Comment on attachment 557582 [details] [diff] [review]
> > Possible crash fix for taskbar previews
> >
> > Review of attachment 557582 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >
> > I'd rather we just crash in _cairo_create_in_error and try to track down
> > where these things are coming from. That should help us separate out the
> > different cases and perhaps fix them.
>
> I don't follow. assert (status != CAIRO_STATUS_SUCCESS); will only be hit
> on debug builds. And I don't see a check there to abort if the value is out
> of range.
>
> Would you accept a patch that did something like this instead?
>
> + /* Sanity check */
> + if (status < 0 || status > CAIRO_STATUS_LAST_STATUS) {
> + abort();
> + }
Yes that would be better. Sorry about the delay.
-Jeff
Flags: needinfo?(jmuizelaar)
Assignee | ||
Comment 50•12 years ago
|
||
> Yes that would be better. Sorry about the delay.
No problem, at least it was within a year ;) New patch with abort() in the place I suspect stack corruption happens.
Attachment #723426 -
Flags: review?(jmuizelaar)
Comment 51•12 years ago
|
||
Comment on attachment 723426 [details] [diff] [review]
Patch v2.
Review of attachment 723426 [details] [diff] [review]:
-----------------------------------------------------------------
We'll need to be careful with this as it could cause unexpected aborts but those will still be interesting to see.
Attachment #723426 -
Flags: review?(jmuizelaar) → review+
Assignee | ||
Comment 52•12 years ago
|
||
If it does that's better than unexpected stack corruptions, and if we are in that case then I think we should land the first submitted patch after we know it happens.
Do I need a documentation patch similar to this one before landing? ( https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=614772&attachment=558463 )
Assignee | ||
Updated•12 years ago
|
Flags: needinfo?(jmuizelaar)
Assignee | ||
Comment 54•12 years ago
|
||
Target Milestone: --- → mozilla22
Comment 55•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•