Closed Bug 788436 Opened 12 years ago Closed 12 years ago

Firefox 17 crash in nsGfxScrollFrameInner::ScrollToImpl

Categories

(Core :: Layout, defect)

17 Branch
x86
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla18
Tracking Status
firefox17 + verified
firefox18 + verified

People

(Reporter: scoobidiver, Assigned: tnikkel)

References

Details

(4 keywords)

Crash Data

Attachments

(1 file)

There's a spike in crashes from 18.0a1/20120831 making it #1 top crasher on Mac with 35% of all crashes. The regression range for the spike is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=706174d31a02&tochange=fcc533f691e9 It might be a regression from bug 786672 or bug 674477. Some comments say: "Ateempt two-finger scroll" "HuffingtonPost front page!" Signature nsGfxScrollFrameInner::ScrollToImpl More Reports Search UUID df4e7dcd-3a84-461d-ac4b-be07c2120905 Date Processed 2012-09-05 00:58:13 Uptime 1702 Last Crash 28.4 minutes before submission Install Age 30.3 minutes since version was first installed. Install Time 2012-09-05 00:27:45 Product Firefox Version 18.0a1 Build ID 20120904030512 Release Channel nightly OS Mac OS X OS Version 10.6.8 10K549 Build Architecture x86 Build Architecture Info GenuineIntel family 6 model 14 stepping 8 Crash Reason EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE Crash Address 0x0 App Notes AdapterVendorID: 0x1002, AdapterDeviceID: 0x71c5 EMCheckCompatibility True Adapter Vendor ID 0x1002 Adapter Device ID 0x71c5 Frame Module Signature Source 0 XUL nsGfxScrollFrameInner::ScrollToImpl nsGfxScrollFrame.cpp:2098 1 XUL nsGfxScrollFrameInner::AsyncScrollCallback nsGfxScrollFrame.cpp:1676 More reports at: https://crash-stats.mozilla.com/report/list?signature=nsGfxScrollFrameInner%3A%3AScrollToImpl https://crash-stats.mozilla.com/report/list?signature=nsGfxScrollFrameInner%3A%3AAsyncScrollCallback
Severity: normal → critical
Assignee: nobody → dzbarsky
Blocks: 780692
(In reply to Scoobidiver from comment #0) > More reports at: > https://crash-stats.mozilla.com/report/ > list?signature=nsGfxScrollFrameInner%3A%3AScrollToImpl > https://crash-stats.mozilla.com/report/ > list?signature=nsGfxScrollFrameInner%3A%3AAsyncScrollCallback Also, these links include a lot of other bugs. I'd suggest: https://crash-stats.mozilla.com/report/list?signature=nsGfxScrollFrameInner%3A%3AScrollToImpl&version=Firefox:18.0a1 https://crash-stats.mozilla.com/report/list?signature=nsGfxScrollFrameInner%3A%3AAsyncScrollCallback&version=Firefox:18.0a1
Crash Signature: [@ nsGfxScrollFrameInner::ScrollToImpl] [@ nsGfxScrollFrameInner::AsyncScrollCallback] → [@ nsGfxScrollFrameInner::ScrollToImpl] [@ nsGfxScrollFrameInner::AsyncScrollCallback] [@ @0x0 | nsGfxScrollFrameInner::AsyncScrollCallback] [@ nsGfxScrollFrameInner::ScrollToWithOrigin] [@ @0x0 | nsGfxScrollFrameInner::ScrollToWithOrigin]
Any chance this bug is related to (dupe of) bug 753251? If so, we've been seeing this for a while and it looks like there's a new avenue of investigation here.
Yeah, bug 753251 look like possible, since the suspicious code is plugin stuff here.
I have no idea. I believe that my patch for bug 674477 touches only safe objects...
Yeah, I think so too. This is something else.
David, I doubt that https://hg.mozilla.org/mozilla-central/rev/1bc40cb5b27c is responsible for this. That patch only added code in the case of touch events, while these crashes are occurring on OSX.
Assignee: dzbarsky → nobody
Keywords: reproducible
I tried scrolling around on that page in the 9/10 nightly on my 10.6.8 machine and I didn't get a crash.
(In reply to Bill McCloskey (:billm) from comment #9) > I'm running a 9/10 nightly on MacOS 10.6.8. I get a reproducible crash when > doing any scrolling on this page: Does it happen in Safe Mode?
It stops happening if I disable Flash. I'm running Flash version 11.4.402.265.
Hmm, I have the same version of Flash too (and it's enabled).
We AddScrollPositionListener/RemoveScrollPositionListener in nsPluginInstanceOwner::SetFrame. It looks like the nsPluginInstanceOwner is dieing after calling AddScrollPositionListener without calling RemoveScrollPositionListener for some reason.
If you need to test out a hypothesis or something, I can run a try build. This bug is super annoying. I basically crash when scrolling any page with Flash content on it.
Bill, what other plugins do you have installed?
There's a fairly strong correlation with the "DivX Plus Web Player HTML5" extension: 43% (6/14) vs. 20% (16/80) {23fcfd51-4958-4f00-80a3-ae97e717ed8b} http://www.systemlookup.com/FF_Extensions/15-DivX_Plus_Web_Player_HTML5.html
The nsGfxScrollFrameInner::ScrollToImpl crashes occur in large numbers (over the last week) even in FF 14.0.1! So if they're all the same crash, the spike isn't due to a change in our code.
I made a fresh profile and disabled every plugin except Flash. The crash still happens. Can having a plugin installed but disabled affect anything? Here's another page where I crash every time when scrolling: http://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode
That page doesn't even have plugins. Do you have accessibility enabled in about:support?
(In reply to Timothy Nikkel (:tn) from comment #20) > That page doesn't even have plugins. Do you have accessibility enabled in > about:support? Weird. I have zeroes for the accessibility fields. Here's the about:support data from my fresh profile: Application Basics Name Firefox Version 18.0a1 User Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:18.0) Gecko/18.0 Firefox/18.0 Profile Folder Show in Finder Enabled Plugins about:plugins Build Configuration about:buildconfig Crash Reports about:crashes Memory Use about:memory Extensions Name Version Enabled ID Important Modified Preferences Name Value browser.cache.disk.capacity 1048576 browser.cache.disk.smart_size.first_run false browser.cache.disk.smart_size.use_old_max false browser.cache.disk.smart_size_cached_value 358400 browser.places.smartBookmarksVersion 4 browser.startup.homepage_override.buildID 20120911140351 browser.startup.homepage_override.mstone 18.0a1 extensions.lastAppVersion 18.0a1 network.cookie.prefsMigrated true places.history.expiration.transient_current_max_pages 53688 privacy.sanitize.migrateFx3Prefs true Graphics Vendor ID 0x1002 Device ID 0x71c5 WebGL Renderer ATI Technologies Inc. -- ATI Radeon X1600 OpenGL Engine -- 2.0 ATI-1.6.36 GPU Accelerated Windows 0. Blocked for your graphics card because of unresolved driver issues. AzureCanvasBackend quartz AzureFallbackCanvasBackend none AzureContentBackend none JavaScript Incremental GC 1 Accessibility Activated 0 Prevent Accessibility 0 Library Versions Expected minimum version Version in use NSPR 4.9.2 4.9.2 NSS 3.13.6.0 Basic ECC 3.13.6.0 Basic ECC NSS Util 3.13.6.0 3.13.6.0 NSS SSL 3.13.6.0 Basic ECC 3.13.6.0 Basic ECC NSS S/MIME 3.13.6.0 Basic ECC 3.13.6.0 Basic ECC
(In reply to Steven Michaud from comment #18) > The nsGfxScrollFrameInner::ScrollToImpl crashes occur in large numbers (over > the last week) even in FF 14.0.1! So if they're all the same crash, the > spike isn't due to a change in our code. I compared their ratios in each channel: * 15.0.1: 1.5% * 16.0b2: 1.0% * 17.0a2: 32.5% <-- drop support for OS X 10.5 * 18.0a1: 20.0% The spike was uplifted to Aurora. Based on comment 10, I don't think dropping the support for OS X 10.5 is fully responsible for the spikes. There's another cause.
No longer blocks: 780692
Summary: Firefox 18 crash in nsGfxScrollFrameInner::ScrollToImpl → Firefox 17 crash in nsGfxScrollFrameInner::ScrollToImpl
Version: 18 Branch → 17 Branch
I ripped out the plugin scroll position listener code to try to narrow down why Bill is crashing and made this try build http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tnikkel@gmail.com-0db479a5eedb/try-macosx64/
(In reply to Timothy Nikkel (:tn) from comment #23) > I ripped out the plugin scroll position listener code to try to narrow down > why Bill is crashing and made this try build > http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tnikkel@gmail.com- > 0db479a5eedb/try-macosx64/ I tried both pages with the build and didn't get any crashes.
Ok thanks for testing. I made a new try build which makes the scroll position listeners noops (but they are still registered). It will be available in about an hour at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tnikkel@gmail.com-c932bd93b2e8/
It occurred to me that I have a pretty old Intel Core Duo Mac that can only run 32-bit apps. Tim, are you able to reproduce if you run FF in 32-bit mode?
Yes! I can reproduce in 32 bit mode!
Well at least a crash, it had a different signature.
(In reply to Timothy Nikkel (:tn) from comment #28) > Well at least a crash, it had a different signature. Scratch that, it has the right signature.
nsPluginHost::InstantiateEmbeddedPluginInstance creates an nsPluginInstanceOwner, its event model is carbon by default. InstantiateEmbeddedPluginInstance then calls nsPluginInstanceOwner::SetFrame which registers the scroll position listener because the event model is carbon. InstantiateEmbeddedPluginInstance then calls SetUpPluginInstance which eventually results in us getting a setEventModel from the plugin which sets us to cocoa. Now that the event model is cocoa we never unregister the scroll position listener.
Attached patch patch (deleted) — Splinter Review
Assignee: nobody → tnikkel
Attachment #661352 - Flags: review?(joshmoz)
Comment on attachment 661352 [details] [diff] [review] patch Review of attachment 661352 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for the patch! Happy to have this fixed. BTW - I'm planning to remove all this code (scroll bar listening in plugins) pretty soon, when I remove Carbon NPAPI support entirely. Probably in the next release cycle. ::: dom/plugins/base/nsPluginInstanceOwner.cpp @@ +779,5 @@ > NS_IMETHODIMP nsPluginInstanceOwner::SetEventModel(int32_t eventModel) > { > #ifdef XP_MACOSX > +#ifndef NP_NO_CARBON > + bool eventModelChange = (mEventModel != static_cast<NPEventModel>(eventModel)); Maybe just save the statically-casted event model variable at the top of the method and use it all three times instead of having three casts. @@ +3698,5 @@ > +{ > + // Our frame is changing or going away, unregister for a scroll position listening. > + // It's OK to unregister when we didn't register, so don't be strict about unregistering. > + // Better to unregister when we didn't have to than to not unregister when we should. > + if (mRegisteredScrollPositionListener && GetEventModel() == NPEventModelCarbon) { Do you really need to check the event model here? If you know that you already registered a scroll position listener then you should probably unregister regardless of what the current event model is.
Attachment #661352 - Flags: review?(joshmoz) → review+
(In reply to Josh Aas (Mozilla Corporation) from comment #32) > BTW - I'm planning to remove all this code (scroll bar listening in plugins) > pretty soon, when I remove Carbon NPAPI support entirely. Probably in the > next release cycle. Yeah, I was aware (and looking forward to it), just didn't know how soon. This patch didn't take much time so still worth it. > Maybe just save the statically-casted event model variable at the top of the > method and use it all three times instead of having three casts. Ok. > Do you really need to check the event model here? If you know that you > already registered a scroll position listener then you should probably > unregister regardless of what the current event model is. Good point. I changed the patch.
Had to back this out for more bustage. https://hg.mozilla.org/integration/mozilla-inbound/rev/645486a66baf PluginProcessChild.cpp ../../../../dom/plugins/base/nsPluginInstanceOwner.cpp:782:16: error: unused variable 'newEventModel' [-Werror,-Wunused-variable] NPEventModel newEventModel = static_cast<NPEventModel>(eventModel); ^ 1 error generated. make[6]: *** [nsPluginInstanceOwner.o] Error 1 make[5]: *** [plugins/base_libs] Error 2 make[5]: *** Waiting for unfinished jobs....
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Comment on attachment 661352 [details] [diff] [review] patch [Approval Request Comment] Bug caused by (feature/regressing bug #): not sure, there are crashes for this on all current Firefox versions, not sure if the volume changed User impact if declined: crashes on common pages with plugins when using 32-bit version of Firefox on Mac. Testing completed (on m-c, etc.): will be in the next nightly Risk to taking this patch (and alternatives if risky): this patch should be safe, if there are any regressions we should notice them quickly on nightly String or UUID changes made by this patch: none requesting this now, although I'd like this to survive at least a day on nightlies before landing on aurora
Attachment #661352 - Flags: approval-mozilla-aurora?
Comment on attachment 661352 [details] [diff] [review] patch [Triage Comment] Top crash fix, I can confirm that 15/16 are unaffected.
Attachment #661352 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
When I went to apply this patch to aurora I found out that the code had changed. Turns out that bug 598401 introduced the bug that this patch fixed by checking the event model before unregistering instead of just unregistering unconditionally. Bug 598401 is only on nightly so it makes no sense to uplift my patch anywhere. The crashes on <= 17 must be a different issue.
Blocks: 598401
(In reply to Timothy Nikkel (:tn) from comment #41) > The crashes on <= 17 must be a different issue. The spike is gone after 17.0a2/20120920. The working range is: http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=eb69897f7d24&tochange=c039da3793f7 I think bug 790856 may have fixed it.
(In reply to Scoobidiver from comment #42) > The spike is gone after 17.0a2/20120920. I was mislead because of bug 793088. There are still crashes in 17.0a2/20120923 and above. If the level stays lower than previously, it will be considered as unaffected because bug 753251 tracks remaining crashes.
It still accounts for 35% of crashes in 17.0a2 over the last 3 days, so it's not bug 753251 that accounts for 1% in 15.0.1 and 1.5% in 16.0b4 for the same period. Here are some comment in Aurora: "Repro: https://twitter.com/MarsCuriosity Expand and collapse tweets until crash (happened twice on those with videos but also non-video tweets)." "Trying to read an article here http://techland.time.com/2012/09/18/50-best-websites-2012/?iid=tl-main-lede#introduction-2 Has already crashed 5 or 6 times in a matter of less than 10 minutes! :~(" "trying to browse Amazon--all Amazon pages are slow to load & allow scrolling"
(In reply to Scoobidiver from comment #44) > "Repro: https://twitter.com/MarsCuriosity Expand and collapse tweets until > crash (happened twice on those with videos but also non-video tweets)." It's pretty easy to reproduce following these steps, thanks! The problem is that the plugin is in an iframe that gets removed from the document. When it gets removed we first disconnect the root view of the iframe (thus disconnecting the document) and then later we destroy all the frames in the iframe. When we destroy the frame for the plugin we get to nsPluginInstanceOwner::SetFrame with a new null frame and we remove scroll position listeners up the frame tree. But we've been disconnected from our parent document already so we can't unregister the scroll position listeners we have on scroll frames in the parent document. The "disconnect root view then destroy frames" behaviour was introduced by bug 775965. Which explains the volume on the different versions. Porting the patch from this bug would fix this type of crash for all plugins except Carbon plugins (but I think Carbon plugins are very rare nowadays). This is because we would unregister all scroll position listeners during initialization of the plugins when we change the event model away from Carbon. I'm not sure what a proper direct fix for this problem would be, so I think I'll just port this patch and then the Carbon crashes will be fixed when we drop support for Carbon plugins shortly.
Blocks: 775965
Crash Signature: [@ nsGfxScrollFrameInner::ScrollToImpl] [@ nsGfxScrollFrameInner::AsyncScrollCallback] [@ @0x0 | nsGfxScrollFrameInner::AsyncScrollCallback] [@ nsGfxScrollFrameInner::ScrollToWithOrigin] [@ @0x0 | nsGfxScrollFrameInner::ScrollToWithOrigin] → [@ nsGfxScrollFrameInner::ScrollToImpl] [@ @0x0 | nsGfxScrollFrameInner::ScrollToImpl] [@ nsGfxScrollFrameInner::AsyncScrollCallback] [@ @0x0 | nsGfxScrollFrameInner::AsyncScrollCallback] [@ nsGfxScrollFrameInner::ScrollToWithOrigin] [@ @0x0 | nsGfxS…
Please verify by checking Socorro.
Keywords: verifyme
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:17.0) Gecko/17.0 Firefox/17.0 I verified the Socorro reports and I found 94 crash reports that are related with 2 of the signatures of this bug on Firefox 17 beta 1, Firefox 17 beta 2 and Firefox 17 beta 3. The crash reports on Firefox 17 beta are: [@ nsGfxScrollFrameInner::ScrollToImpl] http://bit.ly/Svq4Rc - 17 Beta 1 http://bit.ly/Rsv3RI - 17 Beta 2 http://bit.ly/TWUsWr - 17 Beta 3 [@ @0x0 | nsGfxScrollFrameInner::ScrollToImpl] http://bit.ly/Rvt0L8 - 17 Beta 1 http://bit.ly/SwMJvw - 17 Beta 2 http://bit.ly/PlBztE - 17 Beta 3 Is this expected in any way or should I file a separate bug?
(In reply to Simona B [QA] from comment #48) > Is this expected in any way or should I file a separate bug? It's bug 753251.
Verified the crash reports in Socorro for all the signatures mentioned in this bug and there are no other crash reports after the fix landed (except for the ones from Comment 48 - Bug 753251 is filed for that matter). Based on Comment 49 - setting the tracking flag for Firefox 17 to VERIFIED.
QA Contact: simona.marcu
Verified in Socorro and there are no crashes on Firefox 18 beta for the signature mentioned in this bug. Setting the tracking flag for Firefox 18 to VERIFIED.
(In reply to Simona B [QA] from comment #51) > Verified in Socorro and there are no crashes on Firefox 18 beta for the > signature mentioned in this bug. There are still crashes in 18.0 Beta for this crash signature (see https://crash-stats.mozilla.com/report/list?version=Firefox:18.0&signature=nsGfxScrollFrameInner%3A%3AScrollToImpl) but it's bug 753251.
mass remove verifyme requests greater than 4 months old
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: