
Updated

•

11 years ago

Blocks: b2g-central-dogfood

Comment 1

•

11 years ago

This is a tough bug to action. Background processes can be killed by the OOM killer if we run out of application memory. That might mean we could have a max of 3 cards in some cases, depending on what apps run and how much memory they consume. Can you give more information on the gaia ui tests failing here? What's the regression range?

No longer blocks: b2g-central-dogfood

Keywords: regression, regressionwindow-wanted

Updated

•

11 years ago

Blocks: b2g-central-dogfood

Comment 2

•

11 years ago

OK, that seems to correlate well with the test failure we are seeing. We have a test that opens 3 apps (Clock, Gallery and Calendar), then opens card view and closes them in the reverse of the order listed above. When closing the Calendar we intermittently get the Gallery app closed at the same time and our test fails. Jason thinks this might mean a memory spike/regression in one of the three apps (or even in the Cards View code) we're using in the test. The regression range is since the nightly Unagi build noted in comment #0.

Comment 3

•

11 years ago

Okay, that implies there's a memory regression on startup of the gallery app.

Component: Gaia::System → Gaia::Gallery

Keywords: regressionwindow-wanted

Whiteboard: [fromAutomation] → [fromAutomation][MemShrink]

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Updated

•

11 years ago

blocking-b2g: --- → koi?

Comment 4

•

11 years ago

Do we have clearer steps to reproduce (what sequence of apps to launch, for instance) or a regression range here? There's not much to go on for investigating this yet.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 5

•

11 years ago

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #4)
> Do we have clearer steps to reproduce (what sequence of apps to launch, for
> instance) or a regression range here? There's not much to go on for
> investigating this yet.

Regression window is in comment 0 with the first failure in a nightly build on:

Gecko http://hg.mozilla.org/mozilla-central/rev/a468b2e34b04
Gaia 753bed59566ad14c5e032e45d2b320ef9529ca9a
BuildID 20130909195156
Version 26.0a1

STR can be followed using the test in question in Gaia UI Automation.

Zac - Which gaia ui test was used to reproduce this bug?

Flags: needinfo?(zcampbell)

Comment 6

•

11 years ago

A single data point is not a window. What is the last known good revision?

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 7

•

11 years ago

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #6)
> A single data point is not a window. What is the last known good revision?

We can't do that easily - B2G does not support per changeset builds unless it's done via manual generation of builds. We can only narrow regression windows down to the nightly builds available. Bisections should only be requested if there's absolutely no other way to figure out the issue. Right now you've got a reproducible automated test + regression window.

Note - That implies that the regression occurred over the weekend between Sept 6th - Sept 9th.

Comment 8

•

11 years ago

Again, a regression window involves two points, a known good revision and a known bad revision. You've provided a bad revision. We need a good revision too. I'm asking for the last *known* good revision, not the last good revision. There's no need to build anything, just provide the cset ids for the last nightly that works.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 9

•

11 years ago

Looks like we first failed with build:
https://pvtbuilds.mozilla.org/pub/mozilla.org/b2g/nightly/mozilla-central-unagi-eng/2013/09/2013-09-09-19-51-56/

Which implies we then last passed with build:
https://pvtbuilds.mozilla.org/pub/mozilla.org/b2g/nightly/mozilla-central-unagi-eng/2013/09/2013-09-06-04-02-04/

Which means the last known good revision is:

Gecko http://hg.mozilla.org/mozilla-central/rev/ab5f29823236
Gaia 94e5f269874b02ac0ea796b64ab995fce9efa4b3
Version 26.0a1

Comment 10

•

11 years ago

Great, thanks. The gecko change range is http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ab5f29823236&tochange=a468b2e34b04 There's nothing in that range that really jumps out at me. Bug 817700 might be the culprit, though it has been backed out on trunk so this would be WFM then. Bug 907745 is a little suspicious, although I don't really understand what it changes.
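For anyone repeating this with a different pair of nightlies, the change range link above can be rebuilt from the two revisions. This is only a small sketch: the `pushlog_url` helper name is made up for illustration, and the `fromchange`/`tochange` query parameters are taken from the URL quoted in this comment.

```python
from urllib.parse import urlencode

# Base URL as it appears in the change range link quoted in this bug.
PUSHLOG = "http://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(last_good, first_bad):
    """Build the pushlog link covering last known good -> first known bad."""
    query = urlencode({"fromchange": last_good, "tochange": first_bad})
    return PUSHLOG + "?" + query


# Revisions from comments 5 (first failing) and 9 (last passing).
print(pushlog_url("ab5f29823236", "a468b2e34b04"))
```

Both ends are needed: with only the bad revision there is nothing to put in `fromchange`, which is the point made in comment 8.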

Nick Cameron [:nrc]

Comment 11

•

11 years ago

I think you meant to cc Nical, not me. Unless I'm missing something.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 12

•

11 years ago

FWIW - Looking at the Gaia::Gallery patches in that regression range at https://github.com/mozilla-b2g/gaia/commits/master/apps/gallery, there are only three commits that were pushed on 9/6:

https://github.com/mozilla-b2g/gaia/commit/946a506e9d312b82822e9593591e0993c8cb0943
https://github.com/mozilla-b2g/gaia/commit/bfe763146301349e30fea0bd5265db28a78b2be2
https://github.com/mozilla-b2g/gaia/commit/3628d9f5829f087782c708981c4863a7d885a96a

Not sure if any of those look suspicious. djf would probably know if any of them are.

Comment 13

•

11 years ago

Bug 907965 might be relevant.

David Flanagan [:djf]

Comment 14

•

11 years ago

None of those commits look suspicious to me. The first is just tests. The third is trivial. You could try reverting the second, but I doubt it will make a difference.

I'm also not convinced by the argument in comment #3 that there is a gallery regression here, but maybe I'm just not following the logic.

Bug 914412 landed 6 days ago and modified the window management code. I don't understand it but there are comments in that bug about the OOM killer and the danger that the browser app would get killed. Probably not related to this because it was only supposed to affect cases where inline activities were launched. I'm mentioning it only because I don't have any other ideas. It was a gaia patch so probably easy to revert and test without, if you want to try.

Nicolas Silva [:nical]

Comment 15

•

11 years ago

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #10)
> Bug 907745 is a little suspicious, although I don't really understand
> what it changes.

It switches the code paths related to compositing video and non-accelerated canvas 2d (most canvas 2D are now accelerated on b2g since skia-GL canvas landed) on b2g to a new architecture that doesn't have all the problems we often face with using Gralloc. I have been instrumenting that code a lot for the last few weeks and I would be very surprised if it made any difference as far as memory consumption is concerned.

David Flanagan [:djf]

Comment 16

•

11 years ago

Actually, it is the third commit above that is more substantial than the second. But neither one would even have any effect unless your gallery app is scanning photos when you launch it. (And if it is, that is a memory intensive process, and 3 apps is probably actually pretty good.)

Be sure, when you test, that the gallery is in a stable state and is not finding and parsing metadata for new photos when you launch it, or it really isn't a fair test. If you're not scanning and are just leaving the gallery at its list of thumbnails screen, then the commits listed above shouldn't even be running.

And those commits are from the bug that Kyle links to in comment 13, so same thing there.

Gallery does use a lot of memory for images, so it should always be suspect for OOMs. But in this case it feels like a red herring to me.

Comment 17

•

11 years ago

(In reply to David Flanagan [:djf] from comment #14)
> None of those commits look suspicious to me. The first is just tests. The
> third is trivial. You could try reverting the second, but I doubt it will
> make a difference.
>
> I'm also not convinced in comment #3 that there is a gallery regression here
> but maybe I'm just not following the logic.

I thought this was related to the gallery because that was the app always getting killed in the test, intermittently, on 9/9/2013 and later. We also know that scanning is a memory intensive operation.

Note that this test has not failed like this for quite some time, so I do think there's a regression present here. If it had always been intermittent, then I'd agree that this wasn't a valid test to use as an assessment.

> Bug 914412 landed 6 days ago and modified the window management code. I
> don't understand it but there are comments in that bug about the OOM killer
> and the danger that the browser app would get kiiled. Probably not related
> to this because it was only supposed to affect cases where inline activities
> were launched. I'm mentioning it only because I don't have any other ideas.
> It was a gaia patch so probably easy to revert and test without, if you want
> to try.

Bug 914412, however, doesn't appear to fall in the regression range.

(In reply to David Flanagan [:djf] from comment #16)
> Actually, it is the third commit above that is more substantial than the
> second. But neither one would even have any effect until your gallery app is
> scanning photos when you launch it. (And if it is, that is a memory
> intensive process, and 3 apps is probably actually pretty good.)

If scanning photos is memory intensive, then why couldn't that be a reason why this issue is happening? Could the memory cost of scanning photos have increased, leading to the intermittent test failure?

> Be sure, when you test, that the gallery is a stable state and is not
> finding and parsing metadata for new photos when you launch it, or it really
> isn't a fair test. If you're not scanning and are just leaving the gallery
> in at its list of thumbnails screen, then the commits listed above shouldn't
> even be running.

Why wouldn't a test that includes scanning metadata in the background be a valid test?

> And those commits are from the bug that Kyle links to in comment 13, so same
> thing there.
>
> Gallery does use a lot of memory for images, so it should always be suspect
> for OOMs. But in this case it feels like a red herring to me.

The problem I have here is that this test hasn't intermittently failed like this for some time, but now it is. The discussion above seems to imply that there may be value in studying memory consumption over time during startup of the gallery app with metadata parsing happening, and comparing the two target builds mentioned above to see if there's an increase in memory using the same target dataset the automation uses.

Comment 18

•

11 years ago

DJF, when we run this test (automated) the SD card is wiped before the test so there are no photos at all in the Gallery. So scanning should not be a factor, unless it takes some time despite finding nothing. However, Jason is correct that this test was very stable for a long time before this intermittent failure.

Flags: needinfo?(zcampbell)

Comment 19

•

11 years ago

Pardon me, I didn't fill in the needinfo properly! The test to replicate it is here:
https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/tests/functional/cards_view/test_cards_view_with_three_apps.py

The initial bug didn't mention that this was replicated on an Unagi device. If it is definitely a memory usage bug it could be sensitive to the device used.

Comment 20

•

11 years ago

Talking with rwood, the approach we're considering next is: run the Gaia UI Test on the last working build and the first affected build while running adb shell b2g-ps and B2G/tools/get_about_memory.py, especially in cases where the test fails in the first affected build. With that information, we'll be able to better identify where the problem is.

Comment 21

•

11 years ago

Zac - Can someone on your team do the following with the last working build & first affected build:

1. Run adb shell b2g-ps & get_about_memory.py from https://github.com/mozilla-b2g/B2G/blob/master/tools/get_about_memory.py before you run the test & dump the results into log files to keep around
2. Run the Gaia UI Test in question
3. Run adb shell b2g-ps & get_about_memory.py from https://github.com/mozilla-b2g/B2G/blob/master/tools/get_about_memory.py after you run the test & dump the results into log files to keep around

Note - for the first affected build, make sure you can reproduce the test failure when doing this analysis.

After you do this, include the results as attachments to the bug. That will give Kyle enough information to go on to diagnose this bug.
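The before/after capture in steps 1 and 3 could be scripted roughly as follows. This is a sketch under assumptions, not an existing tool: the `capture` and `snapshot` helpers and the `memlogs` directory are invented for illustration, and only the `adb shell b2g-ps` command comes from this bug. It does not call get_about_memory.py, because the right arguments for that script are not given here; run it by hand alongside.

```python
import subprocess
from pathlib import Path


def capture(cmd, log_path):
    """Run cmd, write its combined stdout/stderr to log_path, return the exit code."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    Path(log_path).write_text(result.stdout)
    return result.returncode


def snapshot(label, out_dir="memlogs"):
    """Save a b2g-ps listing tagged with label, e.g. 'before' or 'after'.

    Assumes adb is on PATH and a device is attached (not checked here).
    """
    Path(out_dir).mkdir(exist_ok=True)
    return capture(["adb", "shell", "b2g-ps"], Path(out_dir) / f"b2g-ps-{label}.log")
```

Usage would be `snapshot("before")`, then the Gaia UI test, then `snapshot("after")`, once per build, keeping the two log files for each build to attach to the bug.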

Flags: needinfo?(zcampbell)

Comment 22

•

11 years ago

We can do that tomorrow!

Comment 23

•

11 years ago

OK first thing is that the changeset you guys are looking at is wrong. The correct Gecko regression range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=aa9ec17cf912&tochange=ab5f29823236 Last passing build: https://pvtbuilds.mozilla.org/pub/mozilla.org/b2g/nightly/mozilla-central-unagi-eng/2013/09/2013-09-05-14-36-08/sources.xml First failing build: https://pvtbuilds.mozilla.org/pub/mozilla.org/b2g/nightly/mozilla-central-unagi-eng/2013/09/2013-09-06-04-02-04/sources.xml I'll bring the memory/debug data soon.

Flags: needinfo?(zcampbell)

Comment 24

•

11 years ago

Attached file about:memory for passing build (obsolete) (deleted) — Details

In comment #2 I had the test steps wrong, my apologies. The test merely opens the 3 apps and asserts the order they appear in the Cards View is the inverse order they were opened (so most recent first). Which is good, the test is simpler than I thought.

Now for data:

Passing build
=============
b2g-ps:

APPLICATION      USER      PID   PPID  VSIZE   RSS    WCHAN     PC        NAME
b2g              root      2895  1     174448  67820  ffffffff  40064330  S /system/b2g/b2g
Usage            app_2932  2932  2895  65560   26576  ffffffff  400cd330  S /system/b2g/plugin-container
Homescreen       app_2951  2951  2895  77144   30040  ffffffff  400f2330  S /system/b2g/plugin-container
Clock            app_3007  3007  2895  67144   27384  ffffffff  4005b330  S /system/b2g/plugin-container
Gallery          app_3029  3029  2895  68176   27368  ffffffff  400d8330  S /system/b2g/plugin-container
Calendar         app_3042  3042  2895  71240   29556  ffffffff  4006d330  S /system/b2g/plugin-container
(Preallocated a  root      3043  2895  62924   22612  ffffffff  40106330  S /system/b2g/plugin-container

Comment 25

•

11 years ago

Attached file about:memory for failing build (obsolete) (deleted) — Details

Failing build (2013-09-09)
=============

APPLICATION      USER     PID  PPID  VSIZE   RSS    WCHAN     PC        NAME
b2g              root     743  1     190728  68608  ffffffff  4010c330  S /system/b2g/b2g
Usage            app_781  781  743   66708   23944  ffffffff  40127330  S /system/b2g/plugin-container
Homescreen       app_800  800  743   80660   27124  ffffffff  40100330  S /system/b2g/plugin-container
Clock            app_856  856  743   74860   27948  ffffffff  400ba330  S /system/b2g/plugin-container
Calendar         app_890  890  743   77740   30508  ffffffff  4010f330  S /system/b2g/plugin-container
(Preallocated a  root     892  743   64008   22564  ffffffff  40049330  S /system/b2g/plugin-container
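To compare the two b2g-ps listings (comments 24 and 25) without reading them by eye, something like the following could be used. It is a rough sketch: `parse_b2g_ps` is a hypothetical helper, and it assumes every data row ends with the nine fields USER, PID, PPID, VSIZE, RSS, WCHAN, PC, state and process path, as the rows above do. The sample data is limited to the rows for the three apps under test.

```python
def parse_b2g_ps(text):
    """Map each application name to its RSS column value from b2g-ps output."""
    apps = {}
    for line in text.splitlines():
        parts = line.split()
        # A data row ends with USER PID PPID VSIZE RSS WCHAN PC STATE NAME (9 fields);
        # whatever precedes them is the application name, which may contain spaces.
        if len(parts) < 10 or not parts[-8].isdigit():
            continue  # skip the header row and blank lines
        apps[" ".join(parts[:-9])] = int(parts[-5])
    return apps


# Rows for the apps under test, copied from comments 24 and 25.
PASSING = """\
Clock app_3007 3007 2895 67144 27384 ffffffff 4005b330 S /system/b2g/plugin-container
Gallery app_3029 3029 2895 68176 27368 ffffffff 400d8330 S /system/b2g/plugin-container
Calendar app_3042 3042 2895 71240 29556 ffffffff 4006d330 S /system/b2g/plugin-container
"""
FAILING = """\
Clock app_856 856 743 74860 27948 ffffffff 400ba330 S /system/b2g/plugin-container
Calendar app_890 890 743 77740 30508 ffffffff 4010f330 S /system/b2g/plugin-container
"""

passing, failing = parse_b2g_ps(PASSING), parse_b2g_ps(FAILING)
missing = sorted(set(passing) - set(failing))
print("missing from failing build:", missing)
for app in sorted(set(passing) & set(failing)):
    print(app, "RSS change:", failing[app] - passing[app])
```

On this data the only app missing from the failing listing is Gallery, which matches the failure described in comment 2. The RSS differences come from single snapshots on different builds, so they are a hint at best.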

Comment 26

•

11 years ago

Although I have replicated this on the 2013-09-06 build, it is harder to replicate there (it fails less often). On the 2013-09-09 build it gets worse and is easier to replicate. Perhaps it is the cumulative effect of more than one commit in this range?
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=aa9ec17cf912&tochange=740094c07328

Florin Strugariu [:Bebe]

Comment 27

•

11 years ago

Attached file Pointer to Github pull request: https://github.com/mozilla-b2g/gaia/pull/12334 (obsolete) (deleted) — Details

Pointer to Github pull-request

Florin Strugariu [:Bebe]

Comment 28

•

11 years ago

Comment on attachment 807706 [details]
Pointer to Github pull request: https://github.com/mozilla-b2g/gaia/pull/12334

Sorry all, wrong bug.

Attachment #807706 - Attachment is obsolete: true

Comment 30

•

11 years ago

The dupe here implies that this isn't a gallery regression - it's likely a System regression now. Moving to Gaia::System as such. Note - that also confirms this is possible to reproduce outside of automation.

Component: Gaia::Gallery → Gaia::System

gbennett

Updated

•

11 years ago

status-b2g18: --- → unaffected

status-b2g-v1.2: --- → affected

Keywords: regressionwindow-wanted, smoketest

Bob Silverberg [:bsilverberg]

Comment 31

•

11 years ago

This is not a smoketest blocker - this just so happened to be caught in today's smoketest. There's already a regression window included in the above comments.

Keywords: regressionwindow-wanted, smoketest

Updated

•

11 years ago

Blocks: 922708

Updated

•

11 years ago

Keywords: perf

Gregor Wagner [:gwagner]

Comment 32

•

11 years ago

Mike, can your team take a look here and do the koi? triage?

Flags: needinfo?(mlee)

Updated

•

11 years ago

No longer blocks: b2g-central-dogfood

Dave Huseby [:huseby]

Comment 34

•

11 years ago

Since this is only in master, we switched this to 1.3? Kyle, what's your take?

blocking-b2g: koi? → 1.3?

Flags: needinfo?(khuey)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 35

•

11 years ago

(In reply to Dave Huseby [:huseby] from comment #34)
> Since this is only in master, we switched this to 1.3? Kyle, what's your
> take?

That's incorrect. The bug as filed was filed when master was 1.2 (9/11/2013). See the affected flag which indicates this reproduces on 1.2.

blocking-b2g: 1.3? → koi?

Comment 36

•

11 years ago

The about:memory logs here are not useful. It appears that someone ran get_about_memory.py, and then opened about:memory inside the desktop Firefox browser that popped up and copied that? Please zip and attach the entire folder that get_about_memory.py creates. It should be something like $CURDIR/about-memory-N/ IIRC.

Flags: needinfo?(khuey)

Comment 37

•

11 years ago

Zac - Can you help address Kyle's comment in comment 36 when getting about:memory logs?

Flags: needinfo?(zcampbell)

Updated

•

11 years ago

Flags: needinfo?(mlee)

Whiteboard: [fromAutomation][MemShrink] → [c=memory p= s= u=] [fromAutomation] [MemShrink]

Comment 38

•

11 years ago

I can provide the info but it's very easy to do yourself. Since we first raised this bug we have had to disable more tests as the mem problem got a bit worse. Kyle, which build do you want the info for?

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 39

•

11 years ago

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #36)
> The about:memory logs here are not useful. It appears that someone ran
> get_about_memory.py, and then opened about:memory inside the desktop Firefox
> browser that popped up and copied that?

Actually, the get_about_memory.py script loads that file in Firefox automatically and, pardon me, that is what led me to believe it was the correct information.

Comment 40

•

11 years ago

Yeah, I can understand why you got confused by it. Can you get a single report from a trunk build? I suspect this is the same underlying issue as bug 919864. Seeing high heap-unclassified on trunk would make me more confident in that suspicion.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 41

•

11 years ago

Attached file about-memory-test_cards_view.zip (obsolete) (deleted) — Details

Here is a zip of the about-memory directory created after running the script.

I've run it on yesterday's Hamachi build out of convenience but if you want to dig deeper with a bit more time I can go back to the original test cases of 1.2/Unagi device.

See how this looks!

Device Hamachi
Gecko http://hg.mozilla.org/mozilla-central/rev/64b497e6f593
Gaia 122ff8c6363227501f4121e5a3892ba41d4c0417
BuildID 20131008064334
Version 27.0a1

Attachment #802970 - Attachment is obsolete: true

Attachment #807129 - Attachment is obsolete: true

Attachment #807144 - Attachment is obsolete: true

Flags: needinfo?(zcampbell)

Comment 42

•

11 years ago

(In reply to Zac C (:zac) from comment #41)
> Created attachment 815312 [details]
> about-memory-test_cards_view.zip
>
> Here is a zip of the about-memory directory created after running the script.
>
> I've run it on yesterday's Hamachi build out of convenience but if you want
> to dig deeper with a bit more time I can go back to the original test cases
> of 1.2/Unagi device.
>
> See how this looks!
>
> Device Hamachi
> Gecko http://hg.mozilla.org/mozilla-central/rev/64b497e6f593
> Gaia 122ff8c6363227501f4121e5a3892ba41d4c0417
> BuildID 20131008064334
> Version 27.0a1

Perfect, that's the right stuff. I'll take a look at it once I'm fully awake.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 43

•

11 years ago

Ok, this looks like something different from bug 919864. We have zombie content parents floating around too. I'll have to reproduce this and dig in here.

Assignee: nobody → khuey

Updated

•

11 years ago

Blocks: 801898

Updated

•

11 years ago

No longer blocks: 801898

Updated

•

11 years ago

Whiteboard: [c=memory p= s= u=] [fromAutomation] [MemShrink] → [c=memory p= s= u=] [fromAutomation] [MemShrink] [xfail]

Dietrich Ayala (:dietrich)

Updated

•

11 years ago

Status: NEW → ASSIGNED

Updated

•

11 years ago

Target Milestone: --- → 1.2 C3(Oct25)

Nicholas Nethercote [inactive]

Updated

•

11 years ago

Whiteboard: [c=memory p= s= u=] [fromAutomation] [MemShrink] [xfail] → [c=memory p= s= u=] [fromAutomation] [MemShrink:P1] [xfail]

Updated

•

11 years ago

Keywords: qablocker

Updated

•

11 years ago

blocking-b2g: koi? → koi+

Priority: -- → P1

Whiteboard: [c=memory p= s= u=] [fromAutomation] [MemShrink:P1] [xfail] → [c=memory p= s= u=1.2] [fromAutomation] [MemShrink:P1] [xfail]

Montoya Clemmons (MClemmons)

Comment 44

•

11 years ago

Updating Target Milestone for FxOS Perf koi+'s.

Target Milestone: 1.2 C3(Oct25) → 1.2 C4(Nov8)

Updated

•

11 years ago

Whiteboard: [c=memory p= s= u=1.2] [fromAutomation] [MemShrink:P1] [xfail] → [c=memory p= s= u=1.2] [fromAutomation] [MemShrink:P1] [xfail] burirun3

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 46

•

11 years ago

I've been looking into this but haven't managed to reproduce it yet. Going to try with a lower memory limit in the emulator tomorrow, and if that doesn't work I'll try an actual device.

Comment 47

•

11 years ago

Hi Kyle, that does not surprise me! For the test automation that caught this bug we have it enabled for desktopb2g and expected failure for on device testing. I've never been able to work out how much RAM the desktopb2g uses; I'd still like to know. No doubt lowering the memory limit on the emulator will achieve the same effect.

Comment 48

•

11 years ago

Actually Kyle this test has started passing again on Hamachi devices. Maybe some memory improvements were made in the last couple of days. We'll enable the test but I still think it's worth looking into this using an older known buggy build so we can close this bug and know why we're closing it.

Comment 49

•

11 years ago

QA Wanted - Can we confirm this no longer reproduces? See the dupes for example STR to use to test this.

Keywords: qawanted

Comment 50

•

11 years ago

I'm still seeing this issue when opening what I'd assume are more memory intensive apps like Music or Gallery. I've had as many as 5 or 6 apps open when it's only things like Settings, and an empty Dialer, Messages, and Contacts on a fresh flash.
Opening something like Gallery or Music will knock the task manager down to only 2 or 3 open apps when more have been opened previously.

I get the same results in 1.2 and 1.3.

Environmental Variables:
Device: Buri v1.2 Mozilla RIL
BuildID: 20131107004003
Gaia: 590eb598aacf1e2136b2b6aca5c3124557a365ca
Gecko: 26f1e160e696
Base Image: 20131104

and

Environmental Variables:
Device: Buri v1.3 Mozilla RIL
BuildID: 20131107040200
Gaia: 42bbe26a72e030faf07a6fc297f61a3a8ccda25b
Gecko: 70de5e24d79b
Version: 28.0a1
Base Image: 20131104

Keywords: qawanted

QA Contact: jzimbrick

Comment 51

•

11 years ago

A contributing factor here could be a change that was merged into gaiatest today aimed at reducing the memory used by the test framework during perf/endurance tests. In hindsight these test cases were probably suffering from the same problem, memory sucked by the test framework!

Comment 52

•

11 years ago

(In reply to Zac C (:zac) from comment #51)
> A contributing factor here could be a change that was merged into gaiatest
> today aimed at reducing the memory used by the test framework during
> perf/endurance tests. In hindsight these test cases were probably suffering
> from the same problem, memory sucked by the test framework!

I think you are referring to the patch from bug 924565. That might help here, but those objects only would have started truly leaking when bug 915598 landed. That was around 10/5 and it looks like this bug was reported a month earlier.

Comment 53

•

11 years ago

Yes, I was talking about bug 924565, thanks Ben. Sounds like there is more than one contributing factor. Anyway, this test case was never intended to catch a memory issue - it just lucked into it.

Comment 54

•

11 years ago

(In reply to J Zimbrick from comment #50)
> I'm still seeing this issue when opening what I'd assume are more memory
> intensive apps like Music or Gallery, I've had as many as 5 or 6 apps open
> when it's only things like Settings, and an empty Dialer, Messages, and
> Contacts on a fresh flash.
> Opening something like Gallery or Music will knock the task manager down to
> only 2 or 3 open apps when more have been opened previously.
>
> I get the same results in 1.2 and 1.3.
>
> Environmental Variables:
> Device: Buri v1.2 Mozilla RIL
> BuildID: 20131107004003
> Gaia: 590eb598aacf1e2136b2b6aca5c3124557a365ca
> Gecko: 26f1e160e696
> Base Image: 20131104
>
> and
>
> Environmental Variables:
> Device: Buri v1.3 Mozilla RIL
> BuildID: 20131107040200
> Gaia: 42bbe26a72e030faf07a6fc297f61a3a8ccda25b
> Gecko: 70de5e24d79b
> Version: 28.0a1
> Base Image: 20131104

Can you specifically test this by opening the Clock app, Gallery app, and Calendar and then closing the apps in reverse order? That's the bug here - there's always going to be cases with apps getting killed in the background.

Keywords: qawanted

Comment 55

•

11 years ago

Actually, let me be really specific - test the following cases:

1. Launch the Clock app, Gallery app, and Calendar app and then close those apps in reverse order.
2. Launch the browser app, contacts app, and settings app and then close those apps in reverse order.

If any of the apps get killed in the background during testing the above scenarios, indicate which one gets killed. Repeat the above test cases 3 times.

Comment 56

•

11 years ago

Removing xfail as the relevant automated test is now passing.

Whiteboard: [c=memory p= s= u=1.2] [fromAutomation] [MemShrink:P1] [xfail] burirun3 → [c=memory p= s= u=1.2] [fromAutomation] [MemShrink:P1] burirun3

Comment 57

•

11 years ago

Repeating procedures 1 and 2 stated in Comment 55 produced the following results:

On 1.1:

1. Only Calendar and Clock were displayed all three times.
2. All apps were displayed on all three tries.

On 1.2:

1. All three apps displayed one time, the other two tries only displayed Calendar and Clock.
2. All apps were displayed on all three tries.

On 1.3:

1. All three apps displayed one time, the other two tries only displayed Calendar and Clock.
2. All apps were displayed on all three tries.

1.1's environmental variables are as follows:

Environmental Variables:
Device: Buri v1.1 Mozilla RIL
BuildID: 20131107041203
Gaia: 39b0203fa9809052c8c4d4332fef03bbaf0426fc
Gecko: 31fa87bfba88
Version: 18.0
Base Image: 20131104

The environmental variables for the 1.2 and 1.3 builds are the same as stated in Comment 50.

Updated

•

11 years ago

Keywords: qawanted

Comment 58

•

11 years ago

Thanks for the detailed analysis. I'll go talk with Sandip about this to find out if we need to block on this still knowing comment 57.

Comment 59

•

11 years ago

(In reply to J Zimbrick from comment #57)
> On 1.2:
>
> 1. All three apps displayed one time, the other two tries only displayed
> Calendar and Clock.
> 2. All apps were displayed on all three tries.
>
> On 1.3:
>
> 1. All three apps displayed one time, the other two tries only displayed
> Calendar and Clock.
> 2. All apps were displayed on all three tries.

Was there any pattern here? For example, was it the first try that all the apps survived on both v1.2 and v1.3? Or always the last try?

Comment 60

•

11 years ago

If I remember correctly, the first try on 1.2 displayed all three, and the next two tries only displayed two apps. And on 1.3 it was the opposite, where the first two tries were the ones to only show two apps, and the third showed all three.

Comment 61

•

11 years ago

(In reply to J Zimbrick from comment #60)
> If I remember correctly, the first try on 1.2 displayed all three, and the
> next two tries only displayed two apps.
>
> And on 1.3 it was the opposite, where the first two tries were the ones to
> only show two apps, and the third showed all three.

Darn. I thought there might be a clue there. :-) Thanks for the info!

Updated

•

11 years ago

Target Milestone: 1.2 C4(Nov8) → 1.2 C5(Nov22)

Peter Dolanjski [:pdol]

Updated

•

11 years ago

Flags: needinfo?(ffos-product)

Comment 62

•

11 years ago

Discussed in triage, but this isn't going to happen for 1.2. Product can comment here on what we can do for a future release, but I don't see this happening in 1.2.

blocking-b2g: koi+ → ---

Sandip Kamat

Comment 63

•

11 years ago

Since our performance commitment for v1.2 is not to degrade it from 1.1 to 1.2, based on comment #57, this should be a blocker for v1.2. Existing 1.1 users would notice this degradation once their devices get v1.2 software updates. By the way, can QA say how many cards you could open without OOM issues in 1.1? Was that 4, or more? However, I do realize that memory pressure may have increased in v1.2 with newer features. Is it only the gallery app (with its memory intensive operations) that is most impacted? If we can zero in on the overall degradation here, is there a way to educate the user about this (either via UI, or with a call out in the release notes)?

Flags: needinfo?(ffos-product)

Comment 64

•

11 years ago

I think the patches in bug 924565 might help here since it will allow DOMRequestHelper objects to get cleaned up under memory pressure.

Depends on: 924565

Comment 65

•

11 years ago

Alright - moving back to the koi+ then per comment 63

blocking-b2g: --- → koi+

Updated

•

11 years ago

Whiteboard: [c=memory p= s= u=1.2] [fromAutomation] [MemShrink:P1] burirun3 → [c=memory p= s= u=1.2] [fromAutomation] [MemShrink:P1] burirun3 [xfail]

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ab5f29823236&tochange=a468b2e34b04

Comment 66

•

11 years ago

Comment 67

•

11 years ago

Wait a second, the results in comment 57 don't make sense. That would indicate the gallery was never present in the background on 1.1, but the test automation seems to indicate otherwise. We never had this test fail on 1.1.

Comment 68

•

11 years ago

See comment 67 - I think you need to recheck your 1.1 test results here.

Flags: needinfo?(jzimbrick)

Comment 69

•

11 years ago

Got the same results as comment 67 if the gallery was still searching for pictures: only Clock and Calendar would be shown in the task manager every time.

If the gallery is opened and allowed to sit for a minute or so and finish loading all of the pictures, all three apps will display in the task manager.

Environmental Variables:
Device: Buri v1.1 Mozilla RIL
BuildID: 20131115041203
Gaia: 4fa6e6362b6a35fd18c7a631dfdaca748cc22c18
Gecko: 7c3cfc0936ca
Version: 18.0
Base Image: 20131104

Flags: needinfo?(jzimbrick)

Comment 70

•

11 years ago

(In reply to J Zimbrick from comment #69)
> Got the same results as comment 67 if the gallery was still searching for
> pictures, only Clock and Calendar would be shown in the task manager every
> time.
>
> If the gallery is opened and allowed to sit for a minute or so and finish
> loading all of the pictures, all three apps will display in the task manager.
>
> Environmental Variables:
> Device: Buri v1.1 Mozilla RIL
> BuildID: 20131115041203
> Gaia: 4fa6e6362b6a35fd18c7a631dfdaca748cc22c18
> Gecko: 7c3cfc0936ca
> Version: 18.0
> Base Image: 20131104

That's not the right way to reproduce the bug. The test here requires that there are no pictures loaded in the sdcard. Retest w/o any pictures in the SD card.

Flags: needinfo?(jzimbrick)

Comment 71

•

11 years ago

I must have missed comment 16. All apps stay open when there is nothing on the SD across 1.1, 1.2, and 1.3.

Environmental Variables:
Device: Buri v1.1 Mozilla RIL
BuildID: 20131115041203
Gaia: 4fa6e6362b6a35fd18c7a631dfdaca748cc22c18
Gecko: 7c3cfc0936ca
Version: 18.0
Base Image: 20131104

Device: Buri v1.2 Mozilla RIL
BuildID: 20131115004003
Gaia: a6484b1e6fc07cf6bd8d6fcf9aeebb14b7e8869d
Gecko: ff2c7c9d01d6
Version: 26.0
Base Image: 20131104

Device: Buri v1.3 Mozilla RIL
BuildID: 20131115040200
Gaia: ac42cb33f21b3f13595432c965f44615daae2225
Gecko: b2fab608772f
Version: 28.0a1
Base Image: 20131104

Flags: needinfo?(jzimbrick)

Blake Kaplan (:mrbkap) (inactive)

Comment 72

•

11 years ago

Okay - based on the above comment, that means this is an automation only bug, likely since we're using additional memory with marionette present. We don't need to block on this.

blocking-b2g: koi+ → ---

Updated

•

11 years ago

Assignee: khuey → mrbkap

Comment 73

•

11 years ago

(In reply to Jason Smith [:jsmith] from comment #72)
> Okay - based on the above comment, that means this is an automation only
> bug, likely since we're using additional memory with marionette present. We
> don't need to block on this.

Yes I can 'fix' this on the automation side by loading the apps more slowly. We'll patch it to get the functional test back.
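For illustration only, a minimal sketch of what "loading the apps more slowly" could look like on the automation side. This is not the actual patch: `launch_apps_slowly`, the `launch` callable, and the `settle_seconds` default are hypothetical names and values, and the real test would pass in whatever app-launch helper the Gaia UI tests already use.

```python
import time
from typing import Callable, Iterable, List


def launch_apps_slowly(
    launch: Callable[[str], None],
    app_names: Iterable[str],
    settle_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Launch each app in turn, pausing after every launch.

    `launch` is a hypothetical stand-in for the test suite's own
    app-launch helper. The pause gives each app time to finish its
    startup work before the next one competes for memory.
    """
    launched = []
    for name in app_names:
        launch(name)
        launched.append(name)
        # The delay is an assumed value, not a measured one.
        sleep(settle_seconds)
    return launched
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays; the trade-off of this approach is that it hides the rapid-launch memory spike rather than fixing it.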

Updated

•

11 years ago

Assignee: mrbkap → zcampbell

Comment 74

•

11 years ago

Attached file github pr (obsolete) (deleted) — Details

Attachment #815312 - Attachment is obsolete: true

Attachment #833164 - Flags: review?(florin.strugariu)

Attachment #833164 - Flags: review?(bob.silverberg)

Comment 75

•

11 years ago

(In reply to Zac C (:zac) from comment #74)
> Created attachment 833164 [details]
> github pr

Shouldn't we do this patch on bug 922708?

Comment 76

•

11 years ago

Spoke with Aus & Blake in IRC - there's a suspicion that this might be because of a marionette client update within the regression range, implying that this regression actually is on the marionette side. The regression would also be related to memory spikes during launching of apps in rapid succession. Dave - Do you know if there was a marionette client update within the target regression range (9/6/2013 - 9/9/2013)?

Flags: needinfo?(dave.hunt)

Updated

•

11 years ago

Attachment #833164 - Attachment is obsolete: true

Attachment #833164 - Flags: review?(florin.strugariu)

Attachment #833164 - Flags: review?(bob.silverberg)

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

11 years ago

Assignee: zcampbell → mrbkap

Comment 77

•

11 years ago

Here are the dates the Python marionette_client package versions were published to PyPI:

v0.5.36: 2013-08-01
v0.5.37: 2013-09-05
v0.6.0: 2013-10-16

Flags: needinfo?(dave.hunt)

Comment 78

•

11 years ago

(In reply to Dave Hunt (:davehunt) from comment #77)
> Here are the dates the Python marionette_client package versions were
> published to PyPI:
>
> v0.5.36: 2013-08-01
> v0.5.37: 2013-09-05
> v0.6.0: 2013-10-16

Hmm...so it's plausible this could have been regressed by the Marionette client changes in v0.5.37, since it was published just before the target regression range. Given that the above analysis indicates this is likely a regression coming from marionette & there's plausibility that v0.5.37 caused this, I'm moving this over to the Marionette component to have someone from the ateam investigate this.
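The date reasoning can be checked with a short sketch. It uses the PyPI publish dates from comment 77 and the regression range from comment 76 (9/6/2013 - 9/9/2013). The helper name `newest_release_on` is made up for this example, and it only tells us which release was the newest one available on a given day, not which one the automation actually had installed.

```python
from datetime import date

# Publish dates as listed in comment 77.
RELEASES = {
    "v0.5.36": date(2013, 8, 1),
    "v0.5.37": date(2013, 9, 5),
    "v0.6.0": date(2013, 10, 16),
}

# Target regression range from comment 76.
RANGE_START = date(2013, 9, 6)
RANGE_END = date(2013, 9, 9)


def newest_release_on(releases, day):
    """Return the newest release published on or before `day`, or None."""
    published = [(when, version) for version, when in releases.items() if when <= day]
    return max(published)[1] if published else None


if __name__ == "__main__":
    print(newest_release_on(RELEASES, RANGE_START))
    print(newest_release_on(RELEASES, RANGE_END))
```

Both ends of the range resolve to v0.5.37, which is consistent with the suspicion above but does not prove the client update is the cause.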

Assignee: mrbkap → nobody

Component: Gaia::System → Marionette

No longer depends on: 924565

Priority: P1 → --

Product: Firefox OS → Testing

QA Contact: jzimbrick

Whiteboard: [c=memory p= s= u=1.2] [fromAutomation] [MemShrink:P1] burirun3 [xfail] → [c=memory p= s= u=1.2] [fromAutomation] burirun3 [xfail]

Target Milestone: 1.2 C5(Nov22) → ---

Comment 79

•

11 years ago

I can replicate this on an engineering build just by loading the apps really quickly. I'm going to switch back to a user build and see if I get the same thing.

Jonathan Griffin (:jgriffin)

Comment 80

•

11 years ago

It's pretty unscientific but the Engineering build can definitely not load as many apps:

Base: V1.2_US_20131115.cfg
Gaia: 71063dd91bc8cbb15ba335236ed67a1c5058bd58
Gecko: http://hg.mozilla.org/mozilla-central/rev/cf378dddfac8
BuildID 20131121040202
Version 28.0a1

My STR:
1. Flash the build
2. Complete the FTU
3. Pan to the next screen
4. tap calendar, tap home, tap clock, tap home, tap settings. All this should take less than 2 seconds
5. Wait 4-5 seconds
6. Open cards view and swipe across to and enter the Clock app
7. Open cards view again and check the number of apps open

Kill all apps and repeat again to get a few samples.

The user build will handle 3 apps and the engineering build will not.

Comment 81

•

11 years ago

I really don't think this has anything to do with Marionette, given the manual STR given by Zac. There are several differences between user and eng builds; Marionette isn't the only one.

Comment 82

•

11 years ago

(In reply to Jonathan Griffin (:jgriffin) from comment #81)
> I really don't think this has anything to do with Marionette, given the
> manual STR given by Zac. There are several differences between user and eng
> builds; Marionette isn't the only one.

Okay, I'll move it back to general. At this point, I think the best debugging strategy we have is to do the same manual STR on a user build & eng build, get the memory report during the STR, and compare the results.
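As a rough sketch of the comparison step, assuming each memory report has already been reduced to a simple mapping of process name to bytes used (that reduction, and the name `compare_reports`, are assumptions for this example and not part of any existing tool):

```python
from typing import Dict, List, Tuple


def compare_reports(user: Dict[str, int], eng: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (process, eng_bytes - user_bytes) pairs, largest increase first.

    A process missing from one report is treated as using 0 bytes there,
    so processes that only exist on one build still show up in the diff.
    """
    names = set(user) | set(eng)
    deltas = [(name, eng.get(name, 0) - user.get(name, 0)) for name in names]
    # Sort by descending delta, then by name so the order is stable.
    return sorted(deltas, key=lambda item: (-item[1], item[0]))
```

Sorting by the size of the difference puts the processes that cost the most extra memory on the eng build at the top, which is where to start looking.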

Component: Marionette → General

Product: Testing → Firefox OS

Updated

•

11 years ago

Whiteboard: [c=memory p= s= u=1.2] [fromAutomation] burirun3 [xfail] → [c=memory p= s= u=1.2] [fromAutomation] burirun3

Updated

•

11 years ago

Whiteboard: [c=memory p= s= u=1.2] [fromAutomation] burirun3 → [c=memory p= s= u=1.2] [fromAutomation] burirun3 [xfail]

Brogan Zumwalt [Inactive]

Updated

•

11 years ago

Assignee: nobody → dhuseby

Priority: -- → P1

Whiteboard: [c=memory p= s= u=1.2] [fromAutomation] burirun3 [xfail] → [c=memory p= s= u=1.3] [fromAutomation] burirun3 [xfail]

Dave Huseby [:huseby]

Updated

•

11 years ago

Whiteboard: [c=memory p= s= u=1.3] [fromAutomation] burirun3 [xfail] → [c=memory p=2 s= u=1.3] [fromAutomation] burirun3 [xfail]

Updated

•

11 years ago

Whiteboard: [c=memory p=2 s= u=1.3] [fromAutomation] burirun3 [xfail] → [c=memory p=2 s= u=1.3] [fromAutomation] burirun3 [xfail] burirun1.3-1

Mason Chang [Inactive] [:mchang]

Updated

•

11 years ago

Whiteboard: [c=memory p=2 s= u=1.3] [fromAutomation] burirun3 [xfail] burirun1.3-1 → [c=memory p=2 s= u=1.3] [fromAutomation] burirun3 burirun1.3-1

Updated

•

11 years ago

Assignee: dhuseby → mchang

Updated

•

11 years ago

Whiteboard: [c=memory p=2 s= u=1.3] [fromAutomation] burirun3 burirun1.3-1 → [c=memory p=2 s= u=] [fromAutomation] burirun3 burirun1.3-1

Updated

•

11 years ago

Assignee: mchang → bkelly

Comment 83

•

11 years ago

I'm dropping this in favor of bug 951806 which is a 1.3+ blocker.

Assignee: bkelly → nobody

Status: ASSIGNED → NEW

Assignee

Updated

•

11 years ago

Assignee: nobody → jhylands

Status: NEW → UNCONFIRMED

Ever confirmed: false

Assignee

Comment 84

•

11 years ago

Attached file Memory Report 1 (deleted) — Details

Memory Report 1 is a user-build Hamachi, running 1.3 (built from source), on Thursday Feb 6, 2014. It is a baseline with only the settings app open.

Assignee

Comment 85

•

11 years ago

Attached file Memory Report 3 (deleted) — Details

Memory Report 3 is a user-build Hamachi, running 1.3 (built from source), on Thursday Feb 6, 2014. It was taken after the Calendar, Clock, and Settings apps were opened per Comment 80.

Assignee

Comment 86

•

11 years ago

Attached file Memory Report 4 (deleted) — Details

Memory Report 4 is an eng-build Hamachi, running 1.3 (built from source), on Thursday Feb 6, 2014. It was taken after the Calendar, Clock, and Settings apps were opened per Comment 80, but before the Calendar app was killed.

Assignee

Comment 87

•

11 years ago

Attached file Memory Report 5 (deleted) — Details

Memory Report 5 is an eng-build Hamachi, running 1.3 (built from source), on Thursday Feb 6, 2014. It was taken after the Calendar, Clock, and Settings apps were opened per Comment 80, after the Calendar app was killed.

Updated

•

11 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=968297

Gabriele Svelto [:gsvelto]

Updated

•

11 years ago

Keywords: qablocker

Comment 88

•

11 years ago

Bug 968297 has now landed and I've re-tested this opening the dialer, messages, clock and settings app without issues. I suggest re-testing and closing as a dup of bug 968297 if the problem went away as this was very likely caused by that issue.