Closed Bug 1097094 Opened 10 years ago Closed 10 years ago

crash in DispatchScrollViewChangeEvent : Crash when viewing google.com search results.

Categories

(Core :: Layout, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla36
Tracking Status
b2g-v2.1 --- unaffected
b2g-v2.2 --- verified

People

(Reporter: Marty, Assigned: TYLin)

References

Details

(Keywords: crash, regression, reproducible, Whiteboard: [2.2-Daily-Testing][b2g-crash])

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is report bp-63479a1c-2d3c-4ea0-97a2-f64292141111. ============================================================= Description: I experienced this crash on a google.com search result page for the search term 'polygon.' I believe I was scrolling the page as this happened, and may have just pressed the Home Button, but I am not sure. This occurred after I performed an OTA from a previous build, and I had the Messages and Settings apps open at the time. I have not been able to reproduce this crash. Repro Steps: 1) Update a Flame device to BuildID: 20141110040206 via Shallow Flash 2) Connect to WiFi and perform an OTA to BuildID: 20141111040202 3) Open the Messaging app and send an SMS 4) Open the Settings app 5) Open the Browser app, navigate to 'google.com' and search for 'polygon' 6) Scroll through the search results Actual: The Browser app crashed, presenting the user with the option to submit a crash report. Expected: The Browser app does not crash. Environmental Variables: Device: Flame 2.2 Master BuildID: 20141111040202 Gaia: 6af3a8a833eb8bb651e8b188cb3f3c3a43bb4184 Gecko: 76b85b90a8cb Gonk: Version: 36.0a1 (2.2 Master) Firmware: V188 User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0 Repro frequency: once
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Whiteboard: [2.2-Daily-Testing] → [2.2-Daily-Testing][b2g-crash]
Adding steps-wanted to see if we can get more reliable STR's.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Keywords: steps-wanted
Able to get the browser to crash just by scrolling through any search engine results pages. STR: 1. Flash to the build below. 2. Launch Browser and search for something on any search engine. (google, yahoo, bing) 3. Scroll down to the bottom of the page. 4. Tap to go to the next page of results. 5. Repeat steps 3 and 4 and crash will occur. Results: Crash occurs. Repro Rate: 7/10 Tested with Shallow Flash on 319mb using Engineering builds. Also happens with mem set to 512mb. Environmental Variables: Device: Flame 2.2 KK BuildID: 20141111042605 Gaia: 6af3a8a833eb8bb651e8b188cb3f3c3a43bb4184 Gecko: c60fc2c11c0e Version: 36.0a1 (2.2) Firmware: V188-1 User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Keywords: steps-wantedregression
QA Contact: croesch
Crash does not occur on Flame 2.1 KK
nomming for 2.2 - repeatable crash
blocking-b2g: --- → 2.2?
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
QA Contact: croesch
QA Contact: jmitchell
I can repro as well on the latest Master build. Adding Boris since the crash report points to a change in http://hg.mozilla.org/mozilla-central/annotate/c0d559389a5c/layout/base/nsIPresShell.h#l294. Assuming also this bug needs to move to the Core Gecko component and out of Gaia:Browser.
Flags: needinfo?(bzbarsky)
Keywords: reproducible
Here is the initial / Central window (in case it's useful to anyone looking) I'll finish up the Inbound window tomorrow Central Regression Window: Last Working: Device: Flame 2.2 Build ID: 20141110041655 Gaia: e02facadb0bc3fe32227b52b31ca47822f7f4322 Gecko: 776f65ae74c6 Gonk: Could not pull gonk. Did you shallow Flash? Version: 36.0a1 (2.2) Firmware Version: v188-1 User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0 First Broken: Device: Flame 2.2 Build ID: 20141110052655 Gaia: e02facadb0bc3fe32227b52b31ca47822f7f4322 Gecko: 1a09297482e1 Gonk: Could not pull gonk. Did you shallow Flash? Version: 36.0a1 (2.2) Firmware Version: v188-1 User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0 Gaia/Gecko Swap Last Working Gaia First Broken Gecko: Issue DOES reproduce Gaia: e02facadb0bc3fe32227b52b31ca47822f7f4322 Gecko: 1a09297482e1 First Broken Gaia Last Working Gecko: Issue does NOT reproduce Gaia: e02facadb0bc3fe32227b52b31ca47822f7f4322 Gecko: 776f65ae74c6 Gecko pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=776f65ae74c6&tochange=1a09297482e1
> Adding Boris since the crash report points to a change in The crash is at address 0x4. The crashing line is "return mDocument;". mDocument is the first member of nsIPresShell, and since the class has a vtable it's at 4 bytes into the object. In other words, someone called GetDocument() on a null presshell pointer. Looking at DispatchScrollViewChangeEvent it does: 996 nsCOMPtr<nsIDocument> doc = aPresShell->GetDocument(); which presumably means aPresShell is null. The caller here is, I assume, SelectionCarets::AsyncPanZoomStarted, which does: 1024 DispatchScrollViewChangeEvent(mPresShell, dom::ScrollState::Started, aScrollPos); How did mPresShell get to be null? Presumably someone called SelectionCarets::Terminate but after the mContainer on the nsPresContext was nulled out, so we didn't remove ourselves from the docshell observer list but did null out the presshell. In any case, given the regression range and the above analysis, this is clearly fallout from bug 1094607.
Blocks: 1094607
Component: Gaia::Browser → Layout
Flags: needinfo?(bzbarsky)
Product: Firefox OS → Core
Oh, and the point is that Terminate is called from PresShell::Destroy, which is called typically by nsDocumentViewer::DestroyPresShell. The mContainer of the prescontext can be nulled out by nsPresContext::Detach. This is called by either nsDocumentViewer::DestroyPresContext (which always comes after DestroyPresShell) or by nsDocumentViewer::Destroy, which can come _before_ DestroyPresShell if we're going into bfcache. In that situation we do the Detach() when going into bfcache, but don't destroy the presentation, and can destroy it sometime later. We need to decide what exactly should be happening with this SelectionCarets stuff while the presshell is in bfcache. Presumably it should not continue to observe teh docshell.
I'll take this bug. We need to check aPresShell against null in DispatchScrollViewChangeEvent(). Since we cannot get docShell, dispatch the event is not needed. By the way, we don't need to worry about SelectionCarets cannot be remove as a scroll observer in SelectionCarets::Terminate due to docShell isn't available. Once SelectionCarets is destroyed, it will be removed by nsDocShell::NotifyScrollObservers() automatically.
Assignee: nobody → tlin
Status: NEW → ASSIGNED
Other modification in this patch are minor coding style changes. Try result: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=ce1bebfd6514
Attachment #8521237 - Flags: review?(bzbarsky)
> Once SelectionCarets is destroyed, it will be removed by > nsDocShell::NotifyScrollObservers() automatically. Yes, but it might not be destroyed for a while, and in the meantime will keep using up memory in the list and CPU being notified, no?
Comment on attachment 8521237 [details] [diff] [review] Check aPresShell against null before calling GetDocument(). r=bz (v1) roc, can you handle this? I'm swamped with other review right now, and you've looked at other stuff around this before...
Attachment #8521237 - Flags: review?(bzbarsky) → review?(roc)
Perhaps we can keep a copy of weakptr of docshell obtained in SelectionCarets::Init() so that we can always have docShell in SelectionCarets::Terminate(). roc, does this make sense to you?
I'm going to strike regression-window for this one, it seems like the Central window was sufficient as comment 7 states "this is clearly fallout from bug 1094607." AND there is already a patch in the works.
QA Whiteboard: [QAnalyst-Triage+]
QA Contact: jmitchell
When SelectionCarets::Terminate() is called, it's not guaranteed that we can get nsDocShell from PresContext. It causes that SelectionCarets cannot remove itself as an observer. To fix this, we keep a member WeakPtr<nsDocShell> so that we can always have nsDocShell in SelectionCarets::Terminate().
Attachment #8521237 - Attachment is obsolete: true
Attachment #8521237 - Flags: review?(roc)
Attachment #8522056 - Flags: review?(roc)
Comment on attachment 8522056 [details] [diff] [review] Keep a WeakPtr to nsDocShell in SectionCarets. r=roc (v1) help fix it
Add #include "nsDocShell.h" in SelectionCarets.cpp.
Attachment #8522056 - Attachment is obsolete: true
Attachment #8522295 - Flags: review+
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
This issue is verified fixed on 2.2. Result: Scrolling on a search result page works properly without causing any crash. Repro Rate: 0/10 Device: Flame 2.2 (319mb, KK, Shallow Flash) BuildID: 20141118040205 Gaia: 4aee256937afe9db2520752650685ba61ce6097d Gecko: 7913c9392c5f Version: 36.0a1 (2.2) Firmware Version: v188-1 User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Clearing the blocking nom for 2.2? as this is already fixed/verified on that branch per the status flag
blocking-b2g: 2.2? → ---
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: