Closed Bug 1080674 Opened 10 years ago Closed 7 years ago

[Browser][Memory] Browser experiences OOM when opening a www.polygon.com article, often freezing the device at a blank white screen.

Categories

(Firefox OS Graveyard :: Gaia::System::Window Mgmt, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(tracking-b2g:+, b2g-v2.0 unaffected, b2g-v2.1 affected, b2g-v2.2 affected, b2g-master affected)

RESOLVED WONTFIX
tracking-b2g +
Tracking Status
b2g-v2.0 --- unaffected
b2g-v2.1 --- affected
b2g-v2.2 --- affected
b2g-master --- affected

People

(Reporter: Marty, Unassigned, NeedInfo)

References

Details

(Keywords: regression, Whiteboard: [systemsfe][319MB-Flame-Support])

Attachments

(4 files)

Attached file logcat-Crash.txt (deleted) —
Description:
I've had this issue happen every time I navigate to this URL: 'http://www.polygon.com/2014/10/9/6951329/hatsune-miku-letterman-tonight-show-watch'

The webpage begins to load but slows down greatly, and eventually the phone returns to the Homescreen. About half of the time, the phone fails to return to the Homescreen properly: the device displays only a blank white screen and becomes totally unresponsive. This seems more likely to happen if the user is actively scrolling the webpage at the time of the app kill.

When the device displays the white screen, if the device is plugged into a computer via USB, the computer will suddenly recognize about 6 different removable disks all at once, exposing internal system files from the device.

Note: This issue happens regardless of whether 'Async Pan/Zoom' is enabled or disabled.

Repro Steps:
1) Update a Flame device to BuildID: 20141009040206
2) Open the Browser app.
3) Navigate to this URL: 'http://www.polygon.com/2014/10/9/6951329/hatsune-miku-letterman-tonight-show-watch'
4) Scroll back and forth on the webpage as it loads.

Actual:
Browser app is closed and user is returned to the Homescreen, possibly freezing the phone.

Expected:
Browser displays the website properly.

Environmental Variables:
Device: Flame 2.2 Master (319MB)
BuildID: 20141009040206 (Full Flash)
Gaia: 7b92615bdc97e5c675cd385ec68bc5e47e0c5288
Gecko: f0bb13ef0ee4
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 35.0a1 (2.2 Master)
Firmware: v184
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Notes: This issue occurs on both v180 and v184 Firmware
Repro frequency: 10/10
See attached: logcat
Attached file logcat-Crash-2.txt (deleted) —
Attached file logcat--OOM.txt (deleted) —
Attached file logcat-OOM-2.txt (deleted) —
I've attached 4 separate logcats. 2 for an OOM kill where the device successfully recovers and returns to the Homescreen, and another 2 where the device freezes at the white screen.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
This issue DOES occur on Flame 2.1. Browser app crashes and device becomes unresponsive at white screen after navigating to a Polygon.com article.

Environmental Variables:
Device: Flame 2.1 (319MB)
BuildID: 20141009000203 (Full Flash)
Gaia: 7e2ef41d3ac98757acaf490b5413fb42061ad3e6
Gecko: 75ebb70f8b38
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 34.0a2 (2.1)
Firmware: v184
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
QAWanted for branch checks.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Keywords: qawanted
QA Contact: ckreinbring
The bug repros on Flame 2.2 engineering, Flame 2.2 nightly and Flame 2.1 nightly. The nightly builds were on full flash and the engineering build was on shallow flash.

Actual result: While scrolling up and down the website in step 3 as it was loading, the device will eventually close the browser app and return to the homescreen.

Flame 2.2 engineering
BuildID: 20141011031924
Gaia: 95f580a1522ffd0f09302372b78200dab9b6f322
Gecko: 3f6a51950eb5
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Platform Version: 35.0a1
Firmware Version: V180
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Flame 2.2 nightly
BuildID: 20141011040204
Gaia: 95f580a1522ffd0f09302372b78200dab9b6f322
Gecko: 3f6a51950eb5
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Platform Version: 35.0a1
Firmware Version: V180
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Flame 2.1 nightly
BuildID: 20141011000201
Gaia: f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1
Gecko: d813d79d3eae
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Platform Version: 34.0a2
Firmware Version: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

--------------------------------------------------------------------------------------------------------

The bug does not repro on Flame 2.1 engineering using shallow flash.

Actual result: While scrolling up and down the website in step 3 as it was loading, the device will still load the site with no errors.

BuildID: 20141011000201
Gaia: f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1
Gecko: d813d79d3eae
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Platform Version: 34.0a2
Firmware Version: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

--------------------------------------------------------------------------------------------------------

When attempting to repro on Flame 2.0 nightly on full flash, the browser app will eventually show a "Well this is embarrassing" error page, but the browser app will not close and return to the homescreen. It will show this error even if the user does not scroll the page. I am considering this a non-repro.

BuildID: 20141011000202
Gaia: 6effca669c5baaf6cd7a63c91b71a02c6bd953b3
Gecko: 54ec9cb26b59
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Platform Version: 32.0
Firmware Version: V180
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Keywords: qawanted → regression
[Blocking Requested - why for this release]: OOM

Just to reiterate the highlights:
When attempting to repro on Flame 2.0 nightly on full flash, the browser app will eventually show a "Well this is embarrassing" error page, but the browser app will not close and return to the homescreen.

We are considering this a no-repro (but welcome discussion on the matter); however, this will present a blocking issue should we try to find a regression window.
blocking-b2g: --- → 2.1?
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmitchell)
Ben, can you take a look here? Is there something we can do or is this behavior expected?
Flags: needinfo?(bfrancis)
I haven't been able to reproduce this with 1024MB RAM, but I can reproduce it on a Flame with a 2.2 engineering build if I reduce the RAM to 319MB: the browser window exits and the window manager returns to the homescreen.

It looks like an OOM: the background homescreen process is getting killed, then the foreground process shortly afterwards:

> E/QCALOG ( 326): [MessageQ] ProcessNewMessage: [XTWiFi-PE] unknown deliver target [OS-Agent]
> D/charger_monitor( 395): AICL: start
> E/QCALOG ( 326): [MessageQ] ProcessNewMessage: [XTWWAN-PE] unknown deliver target [OS-Agent]
> E/QCALOG ( 326): [MessageQ] ProcessNewMessage: [XT-CS] unknown deliver target [OS-Agent]
> D/charger_monitor( 395): AICL: start
> I/cat ( 290): <6>[ 623.710782] send sigkill to 2429 (Homescreen), adj 534, size 1993
> D/charger_monitor( 395): AICL: start
> V/WLAN_PSA( 214): NL MSG, len[048], NL type[0x11] WNI type[0x5050] len[028]
> I/cat ( 290): <6>[ 624.346914] send sigkill to 3153 (Browser), adj 134, size 5377
> D/charger_monitor( 395): AICL: start
> E/OMXNodeInstance( 2339): !!! Observer died. Quickly, do something, ... anything...
> D/charger_monitor( 395): AICL: start
> E/ ( 2339): Destroy C2D instance
> E/ ( 2339): Destroy C2D instance
> I/cat ( 290): <6>[ 624.583856] msm_vidc: 4: Closed video instance: c6345000

I haven't been able to reproduce the blank white screen, but on one occasion it did reboot the entire device!

Sam, in the browser app we used to display an "oops" error screen if the foreground tab got killed. Do you know what the expected behaviour for the system browser is? Is it expected that the window will close?

I wouldn't have thought that one browser window crashing should be able to bring down the whole device. Could this be a memory leak?
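The two "send sigkill" lines in the excerpt above carry the process ID, process name, adj value and size of each killed process. A minimal sketch for pulling those fields out of a saved logcat, assuming the message format seen in this comment; the helper name `parseLmkKills` is hypothetical.

```javascript
// Hypothetical helper: extract low-memory-killer events from logcat text.
// Assumes the "send sigkill to <pid> (<name>), adj <n>, size <n>" format
// seen in the excerpt above.
function parseLmkKills(logcat) {
  const pattern = /send sigkill to (\d+) \(([^)]+)\), adj (\d+), size (\d+)/;
  const kills = [];
  for (const line of logcat.split('\n')) {
    const m = pattern.exec(line);
    if (m) {
      kills.push({
        pid: Number(m[1]),
        name: m[2],
        adj: Number(m[3]),
        size: Number(m[4]),
      });
    }
  }
  return kills;
}

const sample = [
  'I/cat ( 290): <6>[ 623.710782] send sigkill to 2429 (Homescreen), adj 534, size 1993',
  'D/charger_monitor( 395): AICL: start',
  'I/cat ( 290): <6>[ 624.346914] send sigkill to 3153 (Browser), adj 134, size 5377',
].join('\n');

console.log(parseLmkKills(sample).map((k) => k.name)); // [ 'Homescreen', 'Browser' ]
```

Listing the kills in order makes the sequence described above easy to confirm: the background Homescreen goes first, then the foreground Browser.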
Flags: needinfo?(bfrancis) → needinfo?(sfoster)
Whiteboard: [systemsfe]
> Sam, in the browser app we used to display an "oops" error screen if the
> foreground tab got killed. Do you know what the expected behaviour for the
> system browser is? Is it expected that the window will close?
>
> I wouldn't have thought that one browser window crashing should be able to
> bring down the whole device, could this be a memory leak?

I'm sorry, I don't know any more than you do.
Flags: needinfo?(sfoster)
Francis, Rob said he was going to ask you what the expected behaviour is here. It seems like if browser windows are meant to stick around when they crash this might be a feature that wasn't implemented.
Flags: needinfo?(fdjabri)
(In reply to Ben Francis [:benfrancis] from comment #11)
> Francis, Rob said he was going to ask you what the expected behaviour is
> here.
>
> It seems like if browser windows are meant to stick around when they crash
> this might be a feature that wasn't implemented.

For the case of foreground tabs, I see no reason why the behavior of crashed tabs should be any different in 2.1 vs 2.0. So a crashed tab should show the "oops" screen.

For the task manager, please see my comments in Bug #1047143. From comment #13:

"The task manager should show a task strip of historical apps/browser windows, whether they are active, in the background, or killed/quiesced. If the user selects a killed/quiesced app, then it should be revived."

The interaction for reviving an app is shown in the 2.1 Task manager specs.
Flags: needinfo?(fdjabri)
Digging through the original requirements I filed for feature parity with the browser app, I've found that dealing with crashed browser windows was covered by two user stories:

Bug 941238 - [User Story] Sheets Manager: Crashed window
Bug 941243 - [User Story] Sheets Manager: Reload killed window

Those bugs reference the task manager UX spec, but it looks like that spec doesn't cover the behaviour for foreground browser windows crashing. It looks like at some point during the implementation of bug 941238 it was decided (it's not clear by whom) that foreground browser windows which crash are just closed. Bug 941243 was closed assuming the rest of the functionality would be covered by remaining task manager work.

Peter, it looks like maybe we just didn't have a spec for this behaviour. How do you think we should proceed?
Flags: needinfo?(pdolanjski)
Is there a difference between crash and OOM? Alive noted in bug 941238, comment #13 that a crash shows a full screen dialog. Is this not being shown?
Flags: needinfo?(pdolanjski) → needinfo?(bfrancis)
That's a good question. Alive or Etienne, from the event handler in AppWindow it looks like this logic is only applied to mozbrowsererror events with evt.detail.type of "fatal". IIRC (this isn't documented on MDN) an OOM error is not "fatal", that's just for crashes. So is there different behaviour somewhere for non-"fatal" errors? I can't find any. I'm not sure if there's an easy way to manually trigger an OOM error? I guess killing the b2g process for the browser window would send a "fatal" error?
Flags: needinfo?(etienne)
Flags: needinfo?(bfrancis)
Flags: needinfo?(alive)
Hmm, actually looking over the old browser source code it looks like we only ever handled error events with type "fatal". So maybe we can't actually tell the difference between a crash and an OOM, as Alive seems to hint at in comment 14 https://bugzilla.mozilla.org/show_bug.cgi?id=941238#c14 I'm not sure why the crash reporter doesn't show, but at first glance AppWindow fires a "crashed" event whereas the crash reporter seems to listen for an "appcrashed" event.
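If the mismatch described above holds, it would explain the missing crash reporter: a listener registered under one event name never sees an event dispatched under another, and nothing reports the mismatch. A small sketch using the standard EventTarget API; the `bus` object and the flag are stand-ins, not Gaia code.

```javascript
// Stand-in for the object the events are dispatched on; not Gaia code.
const bus = new EventTarget();
let reporterNotified = false;

// The listener waits for 'appcrashed'...
bus.addEventListener('appcrashed', () => {
  reporterNotified = true;
});

// ...but the sender dispatches 'crashed', so the listener never runs.
bus.dispatchEvent(new Event('crashed'));
console.log(reporterNotified); // false

// With matching names, the listener runs.
bus.dispatchEvent(new Event('appcrashed'));
console.log(reporterNotified); // true
```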
Francis, I talked with Ben this morning, and we won't have access to the previous strings. So it sounds like using an alternate dialog might be one of our few options. Also, does anyone have a screen grab of the OOM error as a quick reference?
Flags: needinfo?(fdjabri)
(In reply to Ben Francis [:benfrancis] from comment #15)
> That's a good question. Alive or Etienne, from the event handler in
> AppWindow it looks like this logic is only applied to mozbrowsererror events
> with evt.detail.type of "fatal".
>
> IIRC (this isn't documented on MDN) an OOM error is not "fatal", that's just
> for crashes. So is there different behaviour somewhere for non-"fatal"
> errors? I can't find any.
>
> I'm not sure if there's an easy way to manually trigger an OOM error? I
> guess killing the b2g process for the browser window would send a "fatal"
> error?

I am afraid we cannot distinguish between an OOM and a real crash.

Maybe the problem is that we should not check isActive, because a crashed/killed appWindow will try to close itself and will not be isActive anymore.
https://github.com/mozilla-b2g/gaia/blob/master/apps/system/js/crash_reporter.js#L152
Flags: needinfo?(alive)
(In reply to Ben Francis [:benfrancis] from comment #15)
> IIRC (this isn't documented on MDN) an OOM error is not "fatal", that's just
> for crashes. So is there different behaviour somewhere for non-"fatal"
> errors? I can't find any.

Another type is 'error', which means a network error such as 404 page not found, but it's now handled by Gecko (with the system app's support), so we don't need to care about it.

> I'm not sure if there's an easy way to manually trigger an OOM error? I
> guess killing the b2g process for the browser window would send a "fatal"
> error?

This is also my question. Apparently it is difficult to trigger an OOM/crash, so it's impossible to do integration tests... Cervantes, do you know some tricks?
Flags: needinfo?(cyu)
I remember there is an app named 'Membuster' :D
Flags: needinfo?(cyu)
Not sure if there's any other information needed. But my suggestion would be to
* change the condition in the AppWindow mozbrowsererror handling [1] so that we call |.destroyBrowser()| even if the app |isActive()|
* add a bit of logic in AppWindow#destroyBrowser so that, if the app |isActive()| we display a sad face (with no text so we don't hit l10n issues) until the next |_closed| event on |this.element|

[1] https://github.com/mozilla-b2g/gaia/blob/master/apps/system/js/app_window.js#L844-852
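One way to read the two suggestions above, written as a simplified model rather than the actual app_window.js code (the existing condition is not quoted in this thread, so only the proposed behaviour is modelled). `FakeAppWindow`, `handleBrowserError`, `onClosed` and the two boolean fields are hypothetical; only `isActive`, `destroyBrowser`, `mozbrowsererror` and the "fatal" type come from the discussion.

```javascript
// Simplified, hypothetical model of the proposed behaviour; not Gaia source.
class FakeAppWindow {
  constructor(active) {
    this.active = active;
    this.browserDestroyed = false;
    this.sadFaceVisible = false;
  }

  isActive() {
    return this.active;
  }

  // Proposed: on a fatal mozbrowsererror, destroy the browser whether or
  // not the window is active.
  handleBrowserError(evt) {
    if (evt.detail.type !== 'fatal') {
      return;
    }
    this.destroyBrowser();
  }

  // Proposed: if the window is active, show a text-free placeholder.
  destroyBrowser() {
    this.browserDestroyed = true;
    if (this.isActive()) {
      this.sadFaceVisible = true; // stays until the window is closed
    }
  }

  // Stand-in for the '_closed' event on the window element.
  onClosed() {
    this.sadFaceVisible = false;
  }
}

const foreground = new FakeAppWindow(true);
foreground.handleBrowserError({ detail: { type: 'fatal' } });
console.log(foreground.browserDestroyed, foreground.sadFaceVisible); // true true
```

A background window going through the same path would be destroyed without showing the placeholder, since `isActive()` returns false for it.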
Flags: needinfo?(etienne)
(In reply to Cervantes Yu from comment #21)
> I remember there is an app named 'Membuster' :D

For integration tests running on b2g-desktop on a lot of different machines this probably won't help too much :)
(In reply to Etienne Segonzac (:etienne) from comment #23)
> (In reply to Cervantes Yu from comment #21)
> > I remember there is an app named 'Membuster' :D
>
> for integration tests running on b2g-desktop on a lot of different machines
> this probably won't help too much :)

Let's implement PCMembuster! (joking)
Maybe we just need to send mozbrowsererror type=fatal in an integration test to see what happens?
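A rough sketch of that idea: dispatch a synthetic event named `mozbrowsererror` with `detail.type` set to "fatal" and check that a handler reacts. The `DetailEvent` class and the plain `EventTarget` are stand-ins for the real browser frame and its events, so this only illustrates the shape of such a test, not how Gaia's integration tests are written.

```javascript
// Minimal event carrying a `detail` payload (stand-in for a CustomEvent).
class DetailEvent extends Event {
  constructor(type, detail) {
    super(type);
    this.detail = detail;
  }
}

// Stand-in for the browser frame element the system app listens on.
const frame = new EventTarget();
const seen = [];
frame.addEventListener('mozbrowsererror', (evt) => {
  // Only "fatal" errors are handled, as described earlier in the thread.
  if (evt.detail.type === 'fatal') {
    seen.push('fatal');
  }
});

frame.dispatchEvent(new DetailEvent('mozbrowsererror', { type: 'fatal' }));
frame.dispatchEvent(new DetailEvent('mozbrowsererror', { type: 'other' }));
console.log(seen); // [ 'fatal' ]
```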
If you really want to emulate apps being killed due to OOM (especially on desktop), just kill -9 the app. This is exactly what the lowmemory killer does.
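A minimal sketch of that approach from a Node.js script on a POSIX system (an assumption; the thread does not say how a test would issue the kill). It starts a throwaway child process and sends it SIGKILL, the signal that `kill -9` delivers. Finding the PID of the real app process on a device or on b2g-desktop is outside this sketch.

```javascript
const { spawn } = require('child_process');

// Throwaway child that would otherwise idle for a minute.
const child = spawn(process.execPath, ['-e', 'setTimeout(() => {}, 60000)']);

child.on('exit', (code, signal) => {
  // A process ended by SIGKILL reports no exit code, only the signal.
  console.log(code, signal); // null SIGKILL
});

// Same signal as `kill -9 <pid>`; the child gets no chance to clean up.
const sent = child.kill('SIGKILL');
console.log(sent); // true
```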
I recommend not blocking on this as it was unspecified behaviour, and adding it to the 2.2 backlog with a UX spec.
Component: Gaia::System::Browser Chrome → Gaia::System::Window Mgmt
Priority bug, but not a blocker, as it is the same behavior as in the previous release.
blocking-b2g: 2.1? → backlog
tracking-b2g: --- → +
[Blocking Requested - why for this release]: Starting last week, I have seen this issue twice on a Flame without restricted memory.
blocking-b2g: backlog → 3.0?
Can QA reproduce on a non-memory-restricted Flame?
Keywords: qawanted
I am able to reproduce the OOM app closure on a 319MB Flame device on a 3.0 build.

Environmental Variables:
Device: Flame 3.0 (319MB)(Full Flash)
Build ID: 20150424010200
Gaia: 0c5e2ee1173f3c53379ef3cd10de714836258fe8
Gecko: 22a157f7feb7
Gonk: b83fc73de7b64594cd74b33e498bf08332b5d87b
Version: 40.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

I was NOT able to reproduce this issue on a 512MB Flame device on the same 3.0 build.

Environmental Variables:
Device: Flame 3.0 (512MB)(Full Flash)
Build ID: 20150424010200
Gaia: 0c5e2ee1173f3c53379ef3cd10de714836258fe8
Gecko: 22a157f7feb7
Gonk: b83fc73de7b64594cd74b33e498bf08332b5d87b
Version: 40.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

Adding the [319MB-Flame-Support] tag.
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: qawanted
Whiteboard: [systemsfe] → [systemsfe][319MB-Flame-Support]
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Depends on: 1162535
Not blocking on low-mem regressions.
blocking-b2g: 2.5? → ---
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX