Closed
Bug 797189
(slim-fast)
Opened 12 years ago
Closed 7 years ago
Tracking: Operation Slim Fast
Categories
(Core :: General, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: cjones, Unassigned)
References
(Depends on 3 open bugs)
Details
Attachments
(7 files)
The two biggest areas I'd like to get gecko optimized for b2g v1 are
- memory usage (Slim)
- app startup time (Fast)
Faster startup time means that memory usage is less of a problem, because restarting apps is quick. Low memory usage means that startup time is less problematic, because we'll restart apps less frequently. Doing well on both means epic win.
We need to choose our targets very carefully and optimize against concrete data. Priority #1 is to measure measure measure. (Then start cutting).
The top targets initially are
- memory usage of the homescreen app. This needs to be as low as possible that it's most infrequently lowmem killed.
- cold startup time for the "homescreen app" (without the prelaunch process). When memory is tight, the homescreen will be the most frequently started-up application. It will be started cold in that case.
- warm startup of dialer app (with prelaunch process). This happens on incoming phone calls; we want need it to be fast.
- warm startup of settings app (with prelaunch process). The settings app gets a lot of use.
From here we can proceed to other targets on down the list.
Reporter | ||
Updated•12 years ago
|
Alias: slim-fast
Reporter | ||
Comment 1•12 years ago
|
||
Current known wins
- Slim, bug 775997: lets us use SMS from subprocesses, which enables moving the "sms service" and "cost control services" out of process, where they can be launched on demand. Last I heard, this should save 5-10MB RAM that's locked up in the main b2g unprocess, where it can't be reclaimed.
- Fast, bug 780692: this has the main b2g process chew far fewer CPU cycles uselessly during app startup. Current estimates are shaving 300-400ms off startup time.
I have still never seen an about:memory dump from a b2g process tree, so no idea what wins are there.
Reporter | ||
Comment 2•12 years ago
|
||
- Slim, bug 797203: reports are that the keyboard app uses 10MB (in-process). It's frequently open, so this is a big overhead.
Reporter | ||
Comment 3•12 years ago
|
||
This will let IPC calls show up in profiles. It's great!
Comment 4•12 years ago
|
||
Do we have any concrete about:memory measurements? Looking at the dependent bugs I'm seeing a lot of speculation about what's taking up memory and little data.
Bug 788021 is very close to landing. The data from that should guide the "slim" side of this whole exercise.
Reporter | ||
Comment 5•12 years ago
|
||
As I said in comment 1, I've never seen an about:memory dump from a b2g process (let alone tree).
I have tried to argue for nothing other than data-guided optimization. The numbers I mention are USS; without knowing what comprises that USS, we have no way to know what wins can be had. However, USS lets us prioritize what to measure.
Reporter | ||
Comment 6•12 years ago
|
||
bjacob, can you add some of the WebGL pages you've been testing as deps here? Thanks!
Comment 7•12 years ago
|
||
Reporter | ||
Comment 8•12 years ago
|
||
Hell yeah!
Reporter | ||
Comment 9•12 years ago
|
||
\o/
Comment 10•12 years ago
|
||
Two notable things from that example.
- heap-unclassified is 33.57%, which is unusually high. (E.g. on my desktop build, it's usually less than half that, even on start-up.) Can you do trace-malloc builds on B2G? If so, my experimental rewrite of DMD (bug 717853) might be able to help, though I haven't touched it for a while.
- The JS heap is 10.3 MiB but almost half of it is unused due to fragmentation, which is due to there being 114 compartments. Note that this is not the kind of fragmentation that builds up over time, but rather, it's the fragmentation we get from having ~20 different kinds of GC things and only being able to put one kind of thing from one compartment in each 4KB arena. See bug 764220 and bug 759585 for discussion on how to fix this (but little in the way of progress).
Could the number of compartments be reduced? The author of AdBlock Plus somehow merged a bunch of separate .jsm files to overcome this problem when CPG landed. Is that possible here? I realize it would be a pain, but it might be quicker than waiting for other solutions.
Comment 11•12 years ago
|
||
> - heap-unclassified is 33.57%, which is unusually high.
Yes, but Firefox uses more memory on startup, right? Is it high in absolute terms?
I'd still like to get it down, in any case. One thing that you can test, Nick, is whether you see the same high heap-unclassified on B2G desktop builds.
See https://wiki.mozilla.org/Gaia/Hacking#Building_B2G and http://blog.margaretleibovic.com/post/32836884540/challenges-getting-started-with-gaia -- it's a matter of cloning Gaia (git, have fun), setting a mozconfig flag, and modifying your invocation of Firefox.
Comment 12•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #11)
> > - heap-unclassified is 33.57%, which is unusually high.
>
> Yes, but Firefox uses more memory on startup, right? Is it high in absolute
> terms?
On a linux64 desktop build I get around 15% heap-unclassified on start-up.
> I'd still like to get it down, in any case. One thing that you can test,
> Nick, is whether you see the same high heap-unclassified on B2G desktop
> builds.
After going to the home screen, I see this for the b2g process:
97.29 MB (100.0%) -- explicit
├──56.69 MB (58.27%) ── heap-unclassified
Yikes!
I'd like to do a DMD run. I need to be able to invoke a "javascript:DMD()" URI. Can I do that in B2G?
Comment 13•12 years ago
|
||
Actually, I can just modify the code to trigger the DMD() call when the signal hits.
Comment 14•12 years ago
|
||
> After going to the home screen, I see this for the b2g process:
>
> 97.29 MB (100.0%) -- explicit
> ├──56.69 MB (58.27%) ── heap-unclassified
It's bloated by vast amounts of gfx driver type stuff, see the attachment. I wonder why there's so much more of that stuff that I get on desktop.
Then I remembered reading that I might need to disable hwaccel on Linux. I did that and I got this:
52.87 MB (100.0%) -- explicit
...
├──12.58 MB (23.80%) ── heap-unclassified
We don't have any reporting on gfx things like mesa. I wonder if that's why the on-device heap-unclassified is so high.
Comment 15•12 years ago
|
||
> > > - heap-unclassified is 33.57%, which is unusually high.
> >
> > Yes, but Firefox uses more memory on startup, right? Is it high in absolute
> > terms?
>
> On a linux64 desktop build I get around 15% heap-unclassified on start-up.
What I meant was, on Linux 64 desktop you get 15% heap-unclassified, but do you have higher overall explicit? That is, is the heap-unclassified on b2g higher in megabytes than the heap unclassified on desktop FF?
Comment 16•12 years ago
|
||
Comment 17•12 years ago
|
||
heap-unclassified is basically unchanged in root process between |after reboot| and |after unlocking|. 13.1mb vs 13.8mb (27.7% in both cases).
We should figure out if, with the right tweaks (such as disabling hw accel) desktop memory usage is a good model for the device. I would expect it it should be a decent mode at the very least for heap-unclassified, but I could be wrong.
Comment 18•12 years ago
|
||
On Linux64 desktop at start-up I get about 70 MiB explicit, 10 MiB heap-unclassified, and 95 MiB resident.
So if cjones gets 17 MiB heap-unclassified after start-up (not sure if the attachment was taken just after start-up) then that's significantly higher, esp. considering that B2G is 32-bit.
> We should figure out if, with the right tweaks (such as disabling hw accel)
> desktop memory usage is a good model for the device. I would expect it it
> should be a decent mode at the very least for heap-unclassified, but I could
> be wrong.
Hopefully, because it'll be a lot easier to work with. Is --trace-malloc usable on B2G?
Reporter | ||
Comment 19•12 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #18)
> On Linux64 desktop at start-up I get about 70 MiB explicit, 10 MiB
> heap-unclassified, and 95 MiB resident.
>
> So if cjones gets 17 MiB heap-unclassified after start-up (not sure if the
> attachment was taken just after start-up) then that's significantly higher,
> esp. considering that B2G is 32-bit.
>
It wasn't taken immediately after startup, but it was taken before loading any other pages/apps.
> > We should figure out if, with the right tweaks (such as disabling hw accel)
> > desktop memory usage is a good model for the device. I would expect it it
> > should be a decent mode at the very least for heap-unclassified, but I could
> > be wrong.
>
> Hopefully, because it'll be a lot easier to work with. Is --trace-malloc
> usable on B2G?
Yes. (I assume you mean on desktop b2g; it would also work on-device, but I've never tried so don't know the perf impact.)
Also, if at all possible, we should get you a b2g device to test with. The desktop build is good for quick iteration on web code, but the gecko code that runs in desktop builds is pretty different than what runs on device.
Updated•12 years ago
|
Depends on: B2GDarkMatter
Comment 20•12 years ago
|
||
(In reply to Chris Jones [:cjones] [:warhammer] from comment #6)
> bjacob, can you add some of the WebGL pages you've been testing as deps
> here? Thanks!
I looked for some WebGL demos that could be good as "this should run on B2G phones" tests. But it's not easy to find one:
- WebGL Aquarium uses 100 M of WebGL textures alone
- CloudGlobe (see https://github.com/mozilla-b2g/B2G/issues/125) has a 63 M compartment including 52 M gc-heap and 8 M objects, plus 17 M WebGL textures, 7 M WebGL buffers....
In short, I don't have evidence yet that some WebGL pages that currently OOM, really shouldn't. Maybe JS experts would find ways to improve JS memory usage in CloudGlobe, I don't know.
I'll leave WebGL demos aside for now as far as B2G OOM issues are concerned.
Reporter | ||
Comment 21•12 years ago
|
||
Does CloudGlobe use a lot of workers?
Comment 22•12 years ago
|
||
What's the quick way to tell?
Reporter | ||
Comment 23•12 years ago
|
||
about:memory will show workers separately.
Comment 24•12 years ago
|
||
So no, there are no workers here.
Reporter | ||
Comment 25•12 years ago
|
||
OK, just too fat then :(.
Reporter | ||
Comment 26•12 years ago
|
||
Although, if the app itself uses 87MB, we should /just/ be able to load it. (Gecko overhead is in the 10MB region.) Have you tried running it as an "app"?
Comment 27•12 years ago
|
||
I haven't. Is this easy to try?
Reporter | ||
Comment 28•12 years ago
|
||
Yep. Cargo-cult from here https://github.com/mozilla-b2g/gaia/tree/master/external-apps/maps ; copy the directory and edit the URLs. Upload gaia and the entry should appear on the homescreen.
Depends on: 799519
Reporter | ||
Comment 29•12 years ago
|
||
philikon is going to start looking at things from the gaia side. Vivien, it probably makes sense for us to sync up so that we don't duplicate work :).
Comment 30•12 years ago
|
||
Just to get this down somewhere: atm we do a bad job gc'ing screenshot strings/blobs. Even if we don't leak them, they sit around for a long time before we collect them. So if we can reduce the number of screenshots we take, that's likely a memory win.
Right now, it seems like we take many identical screenshots (I'm using in-progress patches in bug 801780) under some circumstances. So that would be something I'd look closely at on the Gaia side. See e.g. bug 801306.
Reporter | ||
Comment 32•12 years ago
|
||
Here's an attempt to make the goal of this work more formal
https://wiki.mozilla.org/B2G/Memory_acceptance_criteria
It needs to be fleshed out based on the set of user stories we have, and we need to pick some extreme targets we want to support.
It's a wiki, so if something bugs you feel free to change it.
Attachment #675745 -
Flags: feedback?(justin.lebar+bug)
Attachment #675745 -
Flags: feedback?(gal)
Reporter | ||
Updated•12 years ago
|
Attachment #675745 -
Flags: feedback?(clee)
Comment 34•12 years ago
|
||
> MANDATORY: Every foreground application X does not die within the constraints
> defined below.
>
> MANDATORY: While any perceivable application X is running, every foreground
> application Y does not die within the constraints defined below.
>
> MANDATORY: While any background application X is running, every perceivable
> application Y does not die within the constraints defined below.
This is difficult for me to parse. Is the following equivalent to what you're
trying to express?
* The foreground application must not die due to OOM while a background or
perceivable application is running.
* A perceivable application must not die due to OOM while a background
application is running.
> QoS: While meeting the constraints above, maximize the number of perceivable
> and background applications that stay alive over the workloads defined below.
This isn't an acceptance criterion (you'll never know when we meet it), so I
don't think it belongs in that section (and maybe it doesn't belong in the
document at all).
An acceptance criterion might be "never regess the number of perceivable and
background apps which stay alive under workloads X, Y, and Z." That's
something QA could test for.
Updated•12 years ago
|
Attachment #675745 -
Flags: feedback?(justin.lebar+bug) → feedback+
Reporter | ||
Comment 35•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #34)
> > MANDATORY: Every foreground application X does not die within the constraints
> > defined below.
> >
> > MANDATORY: While any perceivable application X is running, every foreground
> > application Y does not die within the constraints defined below.
> >
> > MANDATORY: While any background application X is running, every perceivable
> > application Y does not die within the constraints defined below.
>
> This is difficult for me to parse. Is the following equivalent to what
> you're
> trying to express?
>
> * The foreground application must not die due to OOM while a background or
> perceivable application is running.
>
"A" foreground application. s/OOM/memory pressure/.
> * A perceivable application must not die due to OOM while a background
> application is running.
>
Yes, this is another formulation.
> > QoS: While meeting the constraints above, maximize the number of perceivable
> > and background applications that stay alive over the workloads defined below.
>
> This isn't an acceptance criterion (you'll never know when we meet it), so I
> don't think it belongs in that section (and maybe it doesn't belong in the
> document at all).
>
> An acceptance criterion might be "never regess the number of perceivable and
> background apps which stay alive under workloads X, Y, and Z." That's
> something QA could test for.
It's not mandatory, it's quality of service. Maximize implies never regress, no?
But these are English semantic quibbles, I don't care.
Comment 36•12 years ago
|
||
> Maximize implies never regress, no?
Absolutely. But never regress is a weaker constraint.
As I understand the current formulation, the product is unacceptable if the number of perceivable and background applications which stay alive under the defined workloads is not maximal. But of course we'll never know if we get to maximality. And indeed, we may need to make trade-offs between the different defined workloads.
It's a "QoS acceptance criterion", not a "mandatory" one, but QoS is never defined, and anyway I don't understand how to accept or reject the build based on the degree to which we maximize the number of background apps which don't get killed, except to say that regressions are not acceptable.
> But these are English semantic quibbles, I don't care.
I don't mean to nitpick here; I just want unambiguous and easily-understandable criteria. :)
Reporter | ||
Comment 37•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #36)
> > Maximize implies never regress, no?
>
> As I understand the current formulation, the product is unacceptable if the
> number of perceivable and background applications which stay alive under the
> defined workloads is not maximal. But of course we'll never know if we get
> to maximality. And indeed, we may need to make trade-offs between the
> different defined workloads.
>
Ah, I see where our difference is. I didn't intend "maximize" as "formal proof of maximality", since as you note that's hard. (Though probably doable.) Rather, "best-effort maximize". Sort of like compiler "optimizations", which don't produce optimal code, rather the best code that they can over time.
So I think we're saying the same thing. I like your phrasing better, let's go with it.
Depends on: 805114
Comment 38•12 years ago
|
||
This is a testcase you can use to cause the browser to OOM.
Comment 39•12 years ago
|
||
[Blocking-basecamp request]
I would like this metabug to block basecamp. I would also like explicit approval to land low-risk blockers to this bug without further approval.
Improving memory usage and speed is a high priority for us. Right now, all memory usage and speed improvements must be explicitly approved through either the approval-aurora process or the b-b process.
Most speed and memory usage improvements are not true blockers -- we can back them out and still ship the phone. That means that the b-b argument for most relevant bugs is tenuous at best. So instead we have to nom these bugs for Aurora. This slows us down because we can't ask for Aurora approval until we have an r+'ed patch, and because Aurora triage happens less-often than basecamp triage. It also slows us down because release drivers are taking the lack of blocking-basecamp+ as a signal that a patch is not important to B2G (e.g. bug 805301 comment 26), so we run the risk of more patches being approval minus'ed, and then having to argue.
Note that this solution is not entirely satisfactory to us because we have queries that look for approval-aurora+ or blocking-basecamp+ bugs which haven't been checked in to Aurora. Under this scheme, those queries will not find these bugs. We run the risk of forgetting to check some bugs in to Aurora, and developers will have to bite extra overhead keeping track of bugs, which slows us down.
I would prefer a situation where I could nom bugs which block this metabug for soft-blocking-basecamp, but akeybl has indicated in e-mail that creating this flag is not acceptable to him. In the absence of adding that new flag, I feel that marking this bug as blocking-bascamp is the best solution.
Note that we will not "fix" this bug before we ship B2G v1. But this bug should not be denied blocking-basecamp+ on the basis of its being unfixable; if you find it unpalatable that we should have a blocker bug that won't be fixed by v1, please add a soft-blocking flag.
Reporter | ||
Comment 40•12 years ago
|
||
How would dependencies of this bug become bb+ in this scheme? Carte-blanche approval isn't right, because I think you'd agree we wouldn't land a patch for e.g. bug 786818 without a significant amount more work and analysis. If we end up nom'ing deps here bb?, then I'm not sure the bb+/- status of this bug matters.
If we nom'd deps here as bb? but noted them as "soft" when nom'ing, would that solve the problem you want to solve?
Comment 41•12 years ago
|
||
> Carte-blanche approval isn't right
I've become frustrated with the fact that release drivers are not willing to accommodate this kind of bug in their processes. So in the absence of any changes to the process -- Jonas and I have proposed many that we'd be happy with, but what we're really looking for is a soft-blocking flag -- I'm asking that release drivers stop trying to manage this bug's dependencies using a process that we all agree is inapplicable here and instead let us manage risk ourselves. So yes, I'm asking for carte-blanche approval from release drivers.
It's the release drivers' contention that very few bugs actually need soft-blocker status. If this is true, then I'm asking for carte blanche approval on very few bugs. If it's not, then that speaks to the need for a soft-blocking flag.
> I think you'd agree we wouldn't land a patch for e.g. bug 786818 without a significant
> amount more work and analysis.
If a bug is "too risky", then we shouldn't land it until we believe that the reward is worth the risk, same as usual. Carte blanche approval isn't the same as right to land irrespective of the risk. It's just approval from release drivers to use our own judgement.
In all but one case I've ever observed, release drivers have (eventually) deferred to engineers' judgement. So I don't think this is actually a big change I'm asking for here.
> If we nom'd deps here as bb? but noted them as "soft" when nom'ing, would that solve the
> problem you want to solve?
Yes, that would be fine. In fact, Jonas and I suggested this to akeybl. I hope I'm not misrepresenting his position to say that he did not find it acceptable. His counter-proposal was that we'd blocking-minus such bugs and indicate "but we'll consider an uplift if nominated for Aurora" in a comment.
This is basically the status quo. I don't like this for a few reasons, all of which I've explained at length on the release-drivers mailing list. In summary:
* "blocking minus" means "this is not important", which makes it less likely that my pach will get aurora approval without requiring me to argue my case.
We've already seen instances (e.g. bug 805301 comment 26) where a patch for a bug which was merely lacking blocking+ was denied aurora uplift because the lack of the flag was considered a sign that the patch was "not important for B2G". I don't want to justify every patch at length to release drivers every time I nom a patch for Aurora uplift. It's a waste of time and attention, and it's predicated on the false premise that the work I'm doing is not important.
* Requiring every patch here to go through the Aurora uplift process adds an extra step to landing, which takes time and adds multiple days before people can use these patches. Again, this is an incredible waste of time and attention.
I've been making this case against the status quo on the release-drivers mailing list for two weeks now. That discussion has gone nowhere -- Alex's newest proposal is exactly the status quo -- which is why I'm asking for this bug to be exempt from this process. I feel like that would be more productive than continuing to debate in circles.
Reporter | ||
Comment 42•12 years ago
|
||
OK, so what you really want is triage/approval powers for this bug subtree. I'm all in favor of that.
Alex, b2g drivers, any objections?
Comment 43•12 years ago
|
||
I now have approval powers for this subtree. Thanks, Alex and Chris!
Updated•12 years ago
|
Depends on: butter-delivery
Reporter | ||
Comment 44•12 years ago
|
||
I'm seeing stopwatch-measured start times of the "Template" app on otoro taking somewhere in the neighborhood of 1.7s to go from tap icon -> fully rendered content visible. We need to re-analyze where that time is going.
Reporter | ||
Updated•12 years ago
|
Attachment #675745 -
Flags: feedback?(gal)
Attachment #675745 -
Flags: feedback?(clee)
Reporter | ||
Comment 45•12 years ago
|
||
With bug 810453 applied, b2g can run Membuster up to 122MB PSS (111MB allocated explicitly by Membuster) before it dies a fiery death. That's about 20% above what we were estimating.
Cheers, all :).
Reporter | ||
Comment 46•12 years ago
|
||
s/PSS/USS/
Comment 47•12 years ago
|
||
> That's about 20% above what we were estimating.
What estimate was that? Does this mean we've exceeded some kind of goal? I just want to understand exactly what the status is...
Reporter | ||
Comment 48•12 years ago
|
||
The goal we promised to partners was 100MB available to apps. We've exceeded that.
Updated•12 years ago
|
Depends on: 835809
Rolling up the dependencies from Bug 864943 - [meta] Reduce the amount of chrome JS we run in B2G, directly into this bug. 864943 isn't really adding much at this point, and looks like it's probably not up to date.
It does has a few 8-month-old memory reports, so if those are useful please look there.
Depends on: less-js-at-startup, 870676
Depends on: 950268
Depends on: unsalted-butter
No longer depends on: 825137
No longer depends on: butter-delivery
Comment 51•7 years ago
|
||
This old B2G bug is not seeing activity, and ongoing efforts relating to speed and memory use are being tracked elsewhere.
Comment 52•7 years ago
|
||
You didn't actually close this bug, but I assume you meant to.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•