Closed
Bug 1291190
Opened 8 years ago
Closed 8 years ago
Crash in RefCounted<T>::Release from DrawTargetD2D1::PopClip when scrolling PDF
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
People
(Reporter: mathieu.marquer, Assigned: nical)
References
Details
(Keywords: sec-other, Whiteboard: [gfx-noted][fixed by disabling patch for Windows in bug 1285271][adv-main49-])
Crash Data
Attachments
(1 file)
(deleted), patch | bas.schouten: review+
This bug was filed from the Socorro interface and is report bp-b9bd34ab-5834-4aa0-bc99-9c1c92160802.
=============================================================
Firefox crashes when opening a PDF file and scrolling through it. Example: http://www.stif.org/IMG/pdf/STIF_-_CA_02072014_CP_Renfort_d_offre_bus.pdf
I'm not able to reproduce on Win7 x64 with FF47. Does it crash for you: 1) in safe mode: https://support.mozilla.org/fr/kb/resoudre-problemes-firefox-mode-sans-echec 2) with a fresh profile: https://support.mozilla.org/fr/kb/utiliser-gestionnaire-profils-creer-supprimer-profils
Flags: needinfo?(mathieu.marquer)
Reporter
Comment 2•8 years ago
Surprising results to me:
- Doesn't crash in safe mode
- Crashes in "normal mode" after manually disabling all my add-ons and restarting Firefox
Another crash signature: https://crash-stats.mozilla.com/report/index/93d5c12c-34ae-47e2-bb7a-c3fea2160802
Flags: needinfo?(mathieu.marquer)
It would be useful to test with a fresh profile, sometimes changes in prefs (in about:config e.g.) can make Firefox crash. In addition, try in "normal" mode with HWA disabled (you need to restart FF to apply). https://support.mozilla.org/fr/kb/desactiver-acceleration-materielle
Reporter
Comment 4•8 years ago
It does crash with a fresh profile as well, with two new signatures:
- https://crash-stats.mozilla.com/report/index/4eb9cda6-c737-4537-8ec1-622052160803 leads to https://bugzilla.mozilla.org/show_bug.cgi?id=1291536
- https://crash-stats.mozilla.com/report/index/d163d90a-40b0-424a-803d-9c1292160803 leads to https://bugzilla.mozilla.org/show_bug.cgi?id=1291535
Reporter
Comment 5•8 years ago
Does *not* crash with HWA disabled.
(In reply to Mathieu Marquer from comment #5)
> Does *not* crash with HWA disabled.
That's why it doesn't crash in safe mode: HWA is off in safe mode. Could you open about:support and copy the "Graphics" section here?
Oh, you reported the crash for FF51, not 48! I'm able to reproduce the crash too. :)
Reporter
Comment 8•8 years ago
No problem :) The graphics section is the following:

Graphics
Features
Compositing: Direct3D 11
Asynchronous Pan/Zoom: wheel input enabled; touch input enabled
WebGL Renderer: Google Inc. -- ANGLE (Intel(R) HD Graphics Family Direct3D11 vs_5_0 ps_5_0)
WebGL2 Renderer: WebGL creation failed:
* Refused to create native OpenGL context because of blacklist entry: WEBGL_NATIVE_GL_OLD_INTEL
* Exhausted GL driver options.
Hardware H264 Decoding: Yes; Using D3D9 API
Audio Backend: wasapi
Direct2D: true
DirectWrite: true (6.2.9200.17568)
GPU #1
Active: Yes
Description: Intel(R) HD Graphics Family
Vendor ID: 0x8086
Device ID: 0x0a16
Driver Version: 10.18.14.4029
Driver Date: 11-18-2014
Drivers: igdumdim64 igd10iumd64 igd10iumd64 igdumdim32 igd10iumd32 igd10iumd32
Subsys ID: 503c17aa
RAM: Unknown
Diagnostics
AzureCanvasAccelerated: 0
AzureCanvasBackend: direct2d 1
[Tracking Requested - why for this release]: I tested and I'm able to reproduce with Nightly only when e10s and HWA are both enabled. Regression range: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=d0cb076a4e736e89f8393ba5658148ae5be4171d&tochange=e5db12322fd393fe7970e726cd1f4b64845f6d23 Probably related to bug 1289816.
Blocks: 1289816
Status: UNCONFIRMED → NEW
Has Regression Range: --- → yes
Has STR: --- → yes
status-firefox48:
--- → unaffected
status-firefox49:
--- → unaffected
status-firefox50:
--- → affected
status-firefox51:
--- → affected
tracking-firefox50:
--- → ?
tracking-firefox51:
--- → ?
Component: DOM → Graphics: Layers
Ever confirmed: true
Flags: needinfo?(nical.bugzilla)
Hardware: Unspecified → x86_64
Version: unspecified → 50 Branch
Blocks: e10s
tracking-e10s:
--- → ?
Summary: Crash in nsPIDOMWindow<T>::GetDoc → [e10s] Crash in nsPIDOMWindow<T>::GetDoc
Comment 10•8 years ago
This bug has various crash signatures (from the reporter's crash reports and from my own):
[@ nsPIDOMWindow<T>::GetDoc ]
[@ nsPIDOMWindowInner::HasActiveDocument ]
[@ xpc::NativeGlobal ]
[@ IPCError-browser | ShutDownKill ]
[@ mozilla::detail::RefCounted<T>::Release ]
The last one is probably the most relevant to a crash in Graphics: Layers: https://crash-stats.mozilla.com/report/index/f43b3b28-7526-4aaa-b8f1-80bfc2160803
Crash Signature: [@ nsPIDOMWindow<T>::GetDoc] → [@ nsPIDOMWindow<T>::GetDoc]
[@ mozilla::detail::RefCounted<T>::Release ]
Summary: [e10s] Crash in nsPIDOMWindow<T>::GetDoc → [e10s] Crash in nsPIDOMWindow<T>::GetDoc when scrolling PDF
Updated•8 years ago
|
Whiteboard: [gfx-noted]
Assignee
Comment 11•8 years ago
The nsPIDOMWindow signatures are very different from the DrawTarget::PopClip crashes like bug 1291531 and other canvas signatures that spiked recently. It should be investigated separately.
Flags: needinfo?(nical.bugzilla)
Updated•8 years ago
|
Group: gfx-core-security
Comment 12•8 years ago
This sounds like it could be memory corruption, so I'm hiding it. Somebody on Linux should try the STR from comment 0 with a Nightly Linux ASan build. Kamil, could you do that? Thanks.
Flags: needinfo?(kjozwiak)
Comment 14•8 years ago
I'm going to conservatively mark this sec-critical, as there's a big spike in Nightly crashes that look like they are hitting garbage memory.
Keywords: regression,
sec-critical
:nical, I'm assuming this is a partial fallout from the canvas stuff (e.g., https://crash-stats.mozilla.com/report/index/f4e8c512-6171-4d7c-9a50-80b5a2160805 is noisy about persistent buffer provider related issues.)
Assignee: nobody → nical.bugzilla
Flags: needinfo?(nical.bugzilla)
Lots of these are high uptime runs, in the actor destroy - are these computers going to sleep and we crash?
Assignee
Comment 17•8 years ago
(In reply to Milan Sreckovic [:milan] from comment #15)
> :nical, I'm assuming this is a partial fallout from the canvas stuff
I can't see the original crash stack from comment 0 (https://crash-stats.mozilla.com/report/index/b9bd34ab-5834-4aa0-bc99-9c1c92160802) being related to the canvas stuff. There are canvas crashes, one of which has the same STR as comment 0, but we have to be clear about which stack we are investigating here; I don't think they are the same thing. I am currently focusing on whatever has a trace of gfx in the stack, so only the PopClip stuff. Someone from the DOM team should have a look at the nsPIDOMWindow crashes. Changing the title since I am assigned to this bug.
Crash Signature: [@ nsPIDOMWindow<T>::GetDoc]
[@ mozilla::detail::RefCounted<T>::Release ] → [@ mozilla::detail::RefCounted<T>::Release ]
Flags: needinfo?(nical.bugzilla)
Summary: [e10s] Crash in nsPIDOMWindow<T>::GetDoc when scrolling PDF → Crash in RefCounted<T>::Release from DrawTargetD2D1::PopClip when scrolling PDF
Comment 18•8 years ago
(In reply to Andrew McCreight [:mccr8] from comment #12)
> This sounds like it could be memory corruption, so I'm hiding it.
>
> Somebody on Linux should try the STR from comment 0 with a Nightly Linux
> ASan build. Kamil, could you do that? Thanks.
I tried but cannot reproduce :/
Andrew, comment 17.
Flags: needinfo?(overholt)
Updated•8 years ago
|
Flags: needinfo?(kjozwiak)
Comment 20•8 years ago
Just to close the loop from my end: I've been trying to reproduce the crash using an ASan build under an Ubuntu VM, but I couldn't. I'm assuming it's due to fx blocking the VM gfx adapters? The only machine on which I managed to reproduce the crash was my Win 10 desktop machine at home.
Comment 21•8 years ago
We were looking at the related bug 1291536. Michael, can you confirm whether the nsPIDOMWindow crashes noted here in comment 17 are the same as bug 1291536?
Flags: needinfo?(overholt) → needinfo?(michael)
Comment 22•8 years ago
Those failures appear very similar. They both appear to occur when trying to get a document off of a nsPIDOMWindow, and they both have the same strange address (-1). I'm inclined to believe that they are related.
Flags: needinfo?(michael)
Reporter
Comment 23•8 years ago
Can't reproduce on my Linux machine either.
Assignee
Comment 24•8 years ago
PersistentBufferProviderShared was disabled again on Windows (landed in central last Saturday, 2016-08-06). We'll uplift that to aurora if the crashes stop. If this crash persists, the good news is there aren't a lot of changes left that could have caused it. In any case, please leave this bug open so that I have a place to investigate until I can re-enable PersistentBufferProviderShared.
Reporter
Comment 25•8 years ago
Confirm: not crashing anymore.
Comment 26•8 years ago
Yup, it works fine on my end too now.
Chances are this is not a sec-critical anymore, and probably not a security bug at all...
Updated•8 years ago
|
Keywords: regression
Comment 29•8 years ago
(In reply to Milan Sreckovic [:milan] from comment #28)
> Chances are this is not a sec-critical anymore, and probably not a security
> bug at all...
I believe the bad code is still in aurora, as the backout hasn't been uplifted, but I'm not confident.
Comment 30•8 years ago
It looks like the disabling was in bug 1285271. That has Aurora approval, but has not been landed yet, so I'm going to leave 50 as "affected". Once that has landed and somebody has confirmed the crash has gone away we can unhide this bug and remove the sec-critical flag.
Comment 31•8 years ago
I guess you are tracking these in the other direction.
Comment 34•8 years ago
Backout just landed on Aurora.
Assignee
Comment 35•8 years ago
I think that for at least a subset of the crashes, the DrawTarget is not dangling; rather, the clip stack isn't properly maintained, leading us to pop more clips than we push, which explodes when we call std::vector::back on the empty clip stack. There have been some improvements to the clip management code since the backout, so we may be in a better spot, but I'd like to replace the scary crash with a gfxCriticalError so that we can more easily verify the problem and identify it again in the future. At a glance, it looks like Skia does not crash release builds when saves and restores are unbalanced, which would explain this happening only on Windows+D2D.
Attachment #8782460 -
Flags: review?(bas)
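For readers following along, the failure mode described in comment 35 can be sketched roughly as below. This is only an illustration under stated assumptions, not the attached patch: `DrawTargetSketch`, `ClipSketch`, and the counters are hypothetical names, and the `std::cerr` line merely stands in for a gfxCriticalError-style report.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical stand-in for a pushed clip; a real backend stores more state.
struct ClipSketch {
  int id;
};

// Hypothetical, simplified draw target that only models the clip stack.
class DrawTargetSketch {
 public:
  void PushClip(int aId) { mPushedClips.push_back(ClipSketch{aId}); }

  // Returns false and reports the problem instead of touching an empty stack.
  // Calling std::vector::back() (or pop_back()) on an empty vector is
  // undefined behavior, which is the "explosion" described in the comment.
  bool PopClip() {
    if (mPushedClips.empty()) {
      // Stand-in for a gfxCriticalError-style report (assumption).
      std::cerr << "PopClip called with an empty clip stack\n";
      ++mUnbalancedPops;
      return false;
    }
    mPushedClips.pop_back();
    return true;
  }

  std::size_t Depth() const { return mPushedClips.size(); }
  std::size_t UnbalancedPops() const { return mUnbalancedPops; }

 private:
  std::vector<ClipSketch> mPushedClips;
  std::size_t mUnbalancedPops = 0;
};
```

With this guard, one PushClip followed by two PopClip calls leaves the stack empty and records a single unbalanced pop rather than reading past the end of the vector. Whether to log, assert, or crash deliberately at that point is a policy choice; the sketch only logs.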
Comment 36•8 years ago
I confirmed that this crash seems to have been fixed by the backout on Nightly and Aurora, so I'm changing this to sec-other.
Keywords: sec-critical → sec-other
Comment 37•8 years ago
Comment on attachment 8782460
Don't crash in DrawTargetD2D if PopClip is called once too many.

Review of attachment 8782460:
-----------------------------------------------------------------
Can we make this a gfxDevCrash? I'd like to confirm this for our aurora and beta builds.
Attachment #8782460 -
Flags: review?(bas) → review+
Assignee
Comment 38•8 years ago
landed as a gfxDevCrash https://hg.mozilla.org/integration/mozilla-inbound/rev/eb6b3562334e
Comment 39•8 years ago
https://hg.mozilla.org/mozilla-central/rev/eb6b3562334e
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Comment 40•8 years ago
Crash volume for signature 'mozilla::detail::RefCounted<T>::Release':
- nightly (version 51): 755 crashes from 2016-08-01.
- aurora (version 50): 7564 crashes from 2016-08-01.
- beta (version 49): 48 crashes from 2016-08-02.
- release (version 48): 120 crashes from 2016-07-25.
- esr (version 45): 9 crashes from 2016-05-02.

Crash volume on the last weeks (Week N is from 08-22 to 08-28):
            W. N-1  W. N-2  W. N-3
- nightly       35     108     604
- aurora      2729    3649     846
- beta           5      16      11
- release       33      37      29
- esr            1       0       2

Affected platform: Windows

Crash rank on the last 7 days (columns: Browser, Content, Plugin):
- nightly #36
- aurora #1539 #2
- beta #856 #2872
- release #645
- esr #2459
This is not the same crash? https://crash-stats.mozilla.com/report/index/01cb18ff-54cb-40f8-92ae-e0a472160822 We fail in BasicLayerManager::PushGroupForLayer, then things get worse.
Flags: needinfo?(nical.bugzilla)
Assignee
Comment 42•8 years ago
(In reply to Milan Sreckovic [:milan] from comment #41)
> This is not the same crash?
> https://crash-stats.mozilla.com/report/index/01cb18ff-54cb-40f8-92ae-e0a472160822
> We fail in BasicLayerManager::PushGroupForLayer, then things get worse.
I think that this is the same type of bug: we improperly track the number of clips pushed and end up popping more than we pushed. But it's a different part of the code.
Status: RESOLVED → REOPENED
Flags: needinfo?(nical.bugzilla)
Resolution: FIXED → ---
Assignee
Comment 43•8 years ago
I have landed a whole lot of things on various bugs which should improve the situation, could you download this build https://archive.mozilla.org/pub/firefox/try-builds/nsilva@mozilla.com-2f7ed76b0208490b2438260b1f5f396e0d72fcf8/try-win32/firefox-51.0a1.en-US.win32.zip and tell me if you can still reproduce the issue? It's the latest nightly build with the copy-on-write canvas pref enabled. You can also just update your installation of firefox nightly (I don't know if all of the patches are already in today's build and if some will make it in tomorrow's nightly though) and enable "layers.shared-buffer-provider.enabled" in about:config.
Flags: needinfo?(thesweetlilycake)
Flags: needinfo?(mathieu.marquer)
Reporter
Comment 44•8 years ago
Not crashing for me with the build and the STR I provided in comment 0.
Flags: needinfo?(mathieu.marquer)
Comment 45•8 years ago
(In reply to Nicolas Silva [:nical] from comment #43)
> I have landed a whole lot of things on various bugs which should improve the
> situation, could you download this build
> https://archive.mozilla.org/pub/firefox/try-builds/nsilva@mozilla.com-2f7ed76b0208490b2438260b1f5f396e0d72fcf8/try-win32/firefox-51.0a1.en-US.win32.zip
> and tell me if you can still reproduce the issue?
> It's the latest nightly build with the copy-on-write canvas pref enabled.
> You can also just update your installation of firefox nightly (I don't know
> if all of the patches are already in today's build and if some will make it
> in tomorrow's nightly though) and enable
> "layers.shared-buffer-provider.enabled" in about:config.
No crash for me with this build when scrolling the PDF, but I got some invalidation areas (black background for 1 or 2 pages of the PDF) when switching from another tab to the PDF tab. Is there a bug open about this issue? NB: e10s+HWA are turned on.
Assignee
Comment 46•8 years ago
Excellent, thanks! I'll switch the pref back on soon, then.
> No crash for me with this build when scrolling the PDF, but I got some
> invalidation areas (black background for 1 or 2 pages of the PDF) when
> switching from another tab to the PDF tab. Is there a bug open about this
> issue?
Not that I know of, I filed bug 1300121. If you can reproduce this issue, could you attach a screenshot to the bug?
Comment 47•8 years ago
(In reply to Nicolas Silva [:nical] from comment #43) It doesn't crash for me either.
Flags: needinfo?(thesweetlilycake)
Comment 48•8 years ago
(In reply to Nicolas Silva [:nical] from comment #46)
> Not that I know of, I filed bug 1300121. If you can reproduce this issue,
> could you attach a screenshot to the bug?
I added 2 screenshots.
Assignee
Comment 49•8 years ago
49 and 50 were fixed by backing out the copy-on-write canvas pref, and the issue this bug is focused on seems to have been fixed since, so closing.
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Updated•8 years ago
|
Whiteboard: [gfx-noted][fixed by disabling patch for Windows in bug 1285271] → [gfx-noted][fixed by disabling patch for Windows in bug 1285271][adv-main49-]
Updated•8 years ago
|
Group: gfx-core-security → core-security-release
So, we do hit the assert that was added in this bug; not that often, but non-zero. https://crash-stats.mozilla.com/report/index/85eea596-e4c9-4883-99cf-59b672170130 They appear to come from PopClip called by gfxContext::~gfxContext. Nical, does this help any?
Flags: needinfo?(nical.bugzilla)
Assignee
Comment 51•7 years ago
(In reply to Milan Sreckovic [:milan] from comment #50)
> So, we do hit the assert that was added in this bug; not that often, but
> non-zero.
> https://crash-stats.mozilla.com/report/index/85eea596-e4c9-4883-99cf-59b672170130
> They appear to come from PopClip called by gfxContext::~gfxContext.
> Nical, does this help any?
It mostly indicates that we have a bug somewhere where we pop more clips than we push on a canvas (probably related to the logic that tries to track the pushed clips between frames). Considering the volume it's not something I would spend time on in the near future.
Flags: needinfo?(nical.bugzilla)
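As a rough illustration of how a destructor can end up over-popping when clip bookkeeping is carried over incorrectly, consider the sketch below. It is not the real gfxContext code: `TargetSketch`, `ContextSketch`, and the `aAssumedClips` parameter are hypothetical, and the assumption that stale per-frame bookkeeping is the trigger is only the guess voiced in comment 51.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal target: tracks clip depth and counts over-pops
// instead of crashing.
class TargetSketch {
 public:
  void PushClip() { ++mDepth; }
  void PopClip() {
    if (mDepth == 0) {
      ++mOverPops;  // more pops than pushes
      return;
    }
    --mDepth;
  }
  std::size_t Depth() const { return mDepth; }
  std::size_t OverPops() const { return mOverPops; }

 private:
  std::size_t mDepth = 0;
  std::size_t mOverPops = 0;
};

// Hypothetical context that, on destruction, pops every clip it believes
// it owns on the target.
class ContextSketch {
 public:
  // aAssumedClips models bookkeeping inherited from an earlier frame. If it
  // overstates what is really on the target, the destructor pops too often.
  ContextSketch(TargetSketch& aTarget, std::size_t aAssumedClips)
      : mTarget(aTarget), mClipCount(aAssumedClips) {}

  ~ContextSketch() {
    for (std::size_t i = 0; i < mClipCount; ++i) {
      mTarget.PopClip();
    }
  }

  void Clip() {
    mTarget.PushClip();
    ++mClipCount;
  }

 private:
  TargetSketch& mTarget;
  std::size_t mClipCount;
};
```

A context constructed with an accurate count of zero leaves the target balanced after it goes out of scope. One constructed with a stale count of one, on a target that actually has no clips, pops once too often in its destructor.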
Updated•7 years ago
|
Group: core-security-release
This could also be the cause behind the Cairo-specific bug 1339762: the imbalance in cairo_save/cairo_restore could happen because of unbalanced clip push/pop. And we have a reproducible case for that. Worth following up on that one.
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(nical.bugzilla)