Closed Bug 815355 Opened 12 years ago Closed 12 years ago

Failure in "MW1: Music stays alive"; Music doesn't stay alive

Categories

(Core :: DOM: Core & HTML, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED DUPLICATE of bug 793105
B2G C3 (12dec-1jan)
blocking-basecamp +

People

(Reporter: cjones, Assigned: cjones)

References

Details

See https://wiki.mozilla.org/B2G/Memory_acceptance_criteria#W1:_Music_stays_alive for STR and symptoms.

The problem here is that we don't elevate the "perceivable" Music player into a "perceivable" OOM class.  To fix that, we should
 1. create a "perceivable" OOM class in between background and foreground in priority
 2. when an app transitions visible:true -> visible:false, *AND* it holds a CPU wake lock, move it into the "perceivable" class.
 3. [optional] when a perceivable app drops its CPU wake lock, move it into the background class

This should be implementable entirely within gecko.
To prevent apps from artificially bumping their OOM classes, the action in (2) above should only happen for distinguished apps with a special "can-hold-CPU-wakelock-while-in-background" permission or something to that effect.  Is that already implemented?
Flags: needinfo?(jonas)
Summary: Failure in "W1: Music stays alive" → Failure in "W1: Music stays alive"; Music doesn't stay alive
Currently all pages could use the "cpu" wakelock.
Even while in the background?
Yes, even while in the background. But the wakelock will be in "locked-background" state so it can be distinguished.
Is that how this interface was designed?  Letting background apps destroy battery life without any way for the user to assign blame doesn't seem like a good idea to me.

I guess it could be argued that that's an orthogonal problem, but extending this out to allow apps to prioritize their RAM pages *on top of* destroying battery life makes me squeamish.
> Is that how this interface was designed?

I think we all agree that not all apps should have CPU wake lock permissions.

The way the wake lock interface is designed, Gecko does not attach any semantics to the wake lock names.  When a page requests wake lock "foo", Gecko sends a message to Gaia saying "foo is now locked", and indicating whether anyone who has a lock on foo is in the foreground.

We can fix the interface for this by sending the origin of all the pages holding the foo wake lock (and also indicating which of those pages are in the foreground).  Again, it'd be up to Gaia to check permissions.
Or create a new visibility state "perceivable" that maps apps that have this permission to be in "perceivable" instead of "background" state; the wake lock states will become "locked-foreground", "locked-perceivable", "locked-background".
I'm not sure that I'm that concerned by the fact that apps can use "cpu" locks to stay awake in the background.

First of all, all apps are currently kept running in the background so we're already using an optimally battery-consuming policy. I'd really prefer to only let non-visible apps run if they are holding the "cpu" wake-lock, but that's not a v1 feature at this point.

But it's always been the case that apps have freedom to waste user's battery life. Things like the alarm API and push API allows an app to wake up as often as it wants which gives broad battery wasting powers.

Rather than trying to limit the amount of battery usage that an app can cause, I think it's better to expose battery usage to users. Trying to limit battery usage will just result in hurdles for developers that are trying to do the right thing. Either they'll have to go through annoying review processes (which also limits their ability to go through non-mozilla-marketplaces stores for now), or they'll run into annoying limitations such as you can only have X number of alarms per day.

I think this would result in a worse end result for users since after all, most apps that get installed are written by the "good guys".

Instead we should try to make it possible for developers and users to track how much battery consumption various apps use. That's obviously not a v1 feature though.
Flags: needinfo?(jonas)
(In reply to Jonas Sicking (:sicking) from comment #8)
> I'm not sure that I'm that concerned by the fact that apps can use "cpu"
> locks to stay awake in the background.
> 

It's not so much that.  It's that they can keep the CPU from *suspending*, whether in the background or not.  And users have no idea they're doing that.  Their phones' batteries will just die within a few hours.

Although maybe gaia or b2g chrome prevents that somehow?  Based on earlier discussion, it doesn't seem like it, but I don't know for sure.

> First of all, all apps are currently kept running in the background so we're
> already using an optimally battery-consuming policy. I'd really prefer to
> only let non-visible apps run if they are holding the "cpu" wake-lock, but
> that's not a v1 feature at this point.

Apps in the background that don't hold get a CPU wake-lock can't prevent the CPU from suspending.

While the screen is on, the backlight and display machinery consume around an order of magnitude more power than the CPU, so background apps can only affect battery life by 10% or so while the screen is on.  (My numbers may be a little dated, though.)

It's true that if a privileged background app holds a CPU wake lock, then non-privileged background apps will incidentally be allowed to use the CPU even when the screen is off.  They can have a significant impact on power consumption in that case.  However, I'm not too concerned about that for v1.

> But it's always been the case that apps have freedom to waste user's battery
> life. Things like the alarm API and push API allows an app to wake up as
> often as it wants which gives broad battery wasting powers.

Those are privileged APIs, right?

> Rather than trying to limit the amount of battery usage that an app can
> cause, I think it's better to expose battery usage to users. Trying to limit
> battery usage will just result in hurdles for developers that are trying to
> do the right thing. Either they'll have to go through annoying review
> processes (which also limits their ability to go through
> non-mozilla-marketplaces stores for now), or they'll run into annoying
> limitations such as you can only have X number of alarms per day.

Again, the problem here isn't CPU usage per se (although that is a problem), but rather preventing the CPU from suspending.

How could we charge energy usage among
 - background apps that keep the CPU from suspending, but don't use CPU cycles
 - background apps that don't keep the CPU from suspending, but do consume CPU cycles in the background
?

I very much agree with your higher-level goal, but I think the use cases for holding a CPU wake lock in a background process are exceedingly few and far between.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #9)
> (In reply to Jonas Sicking (:sicking) from comment #8)
> > I'm not sure that I'm that concerned by the fact that apps can use "cpu"
> > locks to stay awake in the background.
> 
> It's not so much that.  It's that they can keep the CPU from *suspending*,
> whether in the background or not.  And users have no idea they're doing
> that.  Their phones' batteries will just die within a few hours.

Do we have to make the "cpu" wakelock prevent the CPU from suspending? Rather than just preventing the app from getting "frozen". I.e. rather than preventing us from firing timers and other DOM Events.

Is there a use-case for exposing a wakelock that prevents the CPU from suspending to any app, certified or 3rd party.

> Although maybe gaia or b2g chrome prevents that somehow?  Based on earlier
> discussion, it doesn't seem like it, but I don't know for sure.

Given that we don't have APIs exposed to JS for preventing CPU suspending right now (and wakelocks are implemented in JS), I don't think that the cpu wakelock currently prevents CPU suspension.

> > But it's always been the case that apps have freedom to waste user's battery
> > life. Things like the alarm API and push API allows an app to wake up as
> > often as it wants which gives broad battery wasting powers.
> 
> Those are privileged APIs, right?

They are not. Any app has the permission to use them if they are enumerated in the app manifest. So any app that the user installs from any website or 3rd party store can use these APIs.

> I very much agree with your higher-level goal, but I think the use cases for
> holding a CPU wake lock in a background process are exceedingly few and far
> between.

Agreed. If we make the "cpu" lock just prevent us from stopping timers and other DOM Events, as well as put the app in a OOM class in between background and foreground, does that satisfy your concern?

Though right now we'd only do the OOM class thing since we never stop timers and DOM Events.

I don't care terribly much if we call the lock something other than "cpu".

We had a similar discussion in some other bug about the meaning of the "cpu" lock, but I forget what bug that was now.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #9)
> It's not so much that.  It's that they can keep the CPU from *suspending*,
> whether in the background or not.  And users have no idea they're doing
> that.  Their phones' batteries will just die within a few hours.
> 
> Although maybe gaia or b2g chrome prevents that somehow?  Based on earlier
> discussion, it doesn't seem like it, but I don't know for sure.

The wake lock here is not the low level wake-lock defined by Android hal. The meaning of a "cpu" wake-lock is defined by Gaia. Gaia can decide whether to keep CPU from suspending or not when "cpu" is "locked-background".

We do need to know if a app has the permission to keep CPU from suspending, either by limiting the "cpu" wake-lock permission or providing more information for Gaia to be able to decide that.
(In reply to Jonas Sicking (:sicking) from comment #10)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #9)
> > (In reply to Jonas Sicking (:sicking) from comment #8)
> > > I'm not sure that I'm that concerned by the fact that apps can use "cpu"
> > > locks to stay awake in the background.
> > 
> > It's not so much that.  It's that they can keep the CPU from *suspending*,
> > whether in the background or not.  And users have no idea they're doing
> > that.  Their phones' batteries will just die within a few hours.
> 
> Do we have to make the "cpu" wakelock prevent the CPU from suspending?
> Rather than just preventing the app from getting "frozen". I.e. rather than
> preventing us from firing timers and other DOM Events.
> 

No, but that additional interface would be the use case you agreed to punt for v1 of denying background apps CPU entirely, which I mostly agree with.  The goal, that is; I 100% agree with punting for v1 ;).

> Is there a use-case for exposing a wakelock that prevents the CPU from
> suspending to any app, certified or 3rd party.

Yes, a telephony app, for example.  A music player is another.

> > Although maybe gaia or b2g chrome prevents that somehow?  Based on earlier
> > discussion, it doesn't seem like it, but I don't know for sure.
> 
> Given that we don't have APIs exposed to JS for preventing CPU suspending
> right now (and wakelocks are implemented in JS), I don't think that the cpu
> wakelock currently prevents CPU suspension.

I sure hope that's not true, or we won't hit our certification requirements.  You would be unhappy if your phone suspended in the middle of a 911 call ;).

> > I very much agree with your higher-level goal, but I think the use cases for
> > holding a CPU wake lock in a background process are exceedingly few and far
> > between.
> 
> Agreed. If we make the "cpu" lock just prevent us from stopping timers and
> other DOM Events, as well as put the app in a OOM class in between
> background and foreground, does that satisfy your concern?

Well, one goal is to keep music playing while the screen is on, and some apps are in the foreground, and other apps are in the background but aren't doing anything the user perceives.  That's the specific goal here, in fact.

A related goal (another user story, I think) is that music keeps playing while the phone's screen is off.  To do that we need to prevent the CPU from suspending.  Not suspending the CPU is required to not throttle DOM timers and events.  But, we would not want those throttled either in both cases, too.

I think I've been treating those as the same thing, but I think you're seeing them differently.

> We had a similar discussion in some other bug about the meaning of the "cpu"
> lock, but I forget what bug that was now.

My recollection is that we settled on "prevent the CPU from suspending", but I may be misremembering.
(In reply to Kan-Ru Chen [:kanru] from comment #11)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #9)
> > It's not so much that.  It's that they can keep the CPU from *suspending*,
> > whether in the background or not.  And users have no idea they're doing
> > that.  Their phones' batteries will just die within a few hours.
> > 
> > Although maybe gaia or b2g chrome prevents that somehow?  Based on earlier
> > discussion, it doesn't seem like it, but I don't know for sure.
> 
> The wake lock here is not the low level wake-lock defined by Android hal.
> The meaning of a "cpu" wake-lock is defined by Gaia. Gaia can decide whether
> to keep CPU from suspending or not when "cpu" is "locked-background".


My question was, what does gaia do now?
CPU is allowed to suspend only if state is not locked.

          case 'cpu':
            power.cpuSleepAllowed = (state != 'locked-foreground' &&
                                     state != 'locked-background');
That code means that gaia allows arbitrary content in the background to lock the CPU  out of suspend, right?
Yes.
Summary: Failure in "W1: Music stays alive"; Music doesn't stay alive → Failure in "MW1: Music stays alive"; Music doesn't stay alive
We discussed this at length during triage today.  The telephony use case alone (mentioned by cjones in comment #3) warrants this being a blocker.
blocking-basecamp: ? → +
So there are several issues being discussed here:

* The music app is just as likely as any other app to get killed in OOM situations.

One potential solution is to change the OOM class of apps that are holding the "audio" wakelock.

I'm not convinced that this is a blocker. Though would be very nice to fix.


* The telephony app is just as likely as any other app to get killed in OOM situations.

We most likely should fix this by giving a *very* generous OOM class to apps that are using the telephony API. Or institute some sort of "telephony" wakelock which is only accessible to apps which have the telephony permission. Using whatever solution we'll use for the music player sounds wrong since the music player should be killed before the telephony app if I'm on a call.

This definitely sounds like a blocker.


* The "cpu" lock allows any app to prevent the CPU from suspending.

I'm still not convinced that it's a problem that apps can prevent the CPU from suspending since there are lots of use-cases for doing so: Music players, apps that need to keep the CPU running for a short period of time to save data, apps that do number-crunching, GPS-navigation apps.

But I could agree that maybe simply holding a "cpu" wakelock makes this too easy to do by mistake.

This should probably be a totally separate bug.
> So there are several issues being discussed here:

That's just one way of looking at it.

An alternative way of looking at it is that there are two issues:

* Pages holding the CPU wake lock should be in the "background perceivable" OOM class.
* Pages should need a permission in order to hold the CPU wake lock in the background.

I much prefer this approach.
(In reply to Justin Lebar [:jlebar] from comment #19)
> > So there are several issues being discussed here:
> 
> That's just one way of looking at it.
> 
> An alternative way of looking at it is that there are two issues:
> 
> * Pages holding the CPU wake lock should be in the "background perceivable"
> OOM class.

That will mean that if the music player running when you're on a call, we're as likely to kill the music player app as we are the telephony app.

Is that ok?

> * Pages should need a permission in order to hold the CPU wake lock in the
> background.

I'm not convinced that this will benefit users more than it will hurt them. I.e. I think the risk is bigger that the user will install an app which would have benefited from holding a CPU lock but isn't being installed through the mozilla marketplace, than that the user will install an evil app that intentionally drains the user's battery.

Either way though, I still think this is a separate bug.
> That will mean that if the music player running when you're on a call, we're as likely 
> to kill the music player app as we are the telephony app.
>
> Is that ok?

I guess the telephony app is a special case; if I'm on a call, telephony should always be the last app to be killed, period.  For example, if I take a call and then background the telephony app and open the browser and run out of memory, we should kill the browser, not the telephony app.

> I.e. I think the risk is bigger that the user will install an app which would have 
> benefited from holding a CPU lock but isn't being installed through the mozilla 
> marketplace, than that the user will install an evil app that intentionally drains the 
> user's battery.

Only apps installed through the Mozilla Marketplace can have permissions?  That's news to me.

I think it's more likely that a stupid app will unintentionally hold the CPU lock, personally.  Halorn's razor and all that.  I think your idea of having a notification that X app is keeping itself alive is worth considering too, although that's harder to implement for V1.
The telephony app always has visible UI during calls so it should always stay "foreground".
(In reply to Jonas Sicking (:sicking) from comment #20)
> (In reply to Justin Lebar [:jlebar] from comment #19)
> > * Pages should need a permission in order to hold the CPU wake lock in the
> > background.
> 
> I'm not convinced that this will benefit users more than it will hurt them.
> I.e. I think the risk is bigger that the user will install an app which
> would have benefited from holding a CPU lock but isn't being installed
> through the mozilla marketplace, than that the user will install an evil app
> that intentionally drains the user's battery.
> 
> Either way though, I still think this is a separate bug.

I agree with jlebar, the issue isn't so much evil apps as just long-tail apps.

The situation is much like desktop FF and localstorage.  We built a footgun API that we can't support without a degraded UX.  Pages set off the footgun and FF gets blamed, not the pages.
(In reply to Justin Lebar [:jlebar] from comment #21)
> > That will mean that if the music player running when you're on a call, we're as likely 
> > to kill the music player app as we are the telephony app.
> >
> > Is that ok?
> 
> I guess the telephony app is a special case; if I'm on a call, telephony
> should always be the last app to be killed, period.  For example, if I take
> a call and then background the telephony app and open the browser and run
> out of memory, we should kill the browser, not the telephony app.

Agreed. Maybe worthy of a separate bug? :-)

> > I.e. I think the risk is bigger that the user will install an app which would have 
> > benefited from holding a CPU lock but isn't being installed through the mozilla 
> > marketplace, than that the user will install an evil app that intentionally drains the 
> > user's battery.
> 
> Only apps installed through the Mozilla Marketplace can have permissions? 
> That's news to me.

Depends on what you mean by "can have a permission".

We can, and even should, require that wakelocks are enumerated in the app manifest as a permission.

The real question is which types of apps are allowed to enumerate the permission. We have three categories:

* Permissions that any app can enumerate. For example the alarm API. These permissions are effectively available to any app.
* Permissions that only privileged apps can enumerate. For example Contacts API. These permissions are only accessible to apps installed through the mozilla marketplace since review and signing is required to be a privileged app.
* Permissions only available to certified apps. For example telephony. Only available to mozilla-developed apps. This category is obviously not really relevant here.

So I'm totally fine with putting cpu-wakelock in the first category. That effectively doesn't limit access to cpu-wakelocks in any meaningful way. Any random person can write an app and put up on their website and use cpu-wakelocks to their hearts content.

The only thing that it does do is give the mozilla marketplace team a chance to look for wakelock abuses in the apps that are put in the marketplace.


The second category is the only (apart from the third) one which would actually limit evil-doer's ability to abuse cpu-wakelocks. It's the only category that guarantees that users won't end up with apps installed which use (intentionally or accidentally) cpu-wakelocks to drain battery. At least assuming perfect reviews by the marketplace team.


Which category were you referring to?

> I think it's more likely that a stupid app will unintentionally hold the CPU
> lock, personally.  Halorn's razor and all that.  I think your idea of having
> a notification that X app is keeping itself alive is worth considering too,
> although that's harder to implement for V1.

I agree. I think the accidental usage is the main concern. Especially given that the incentive to write apps that drain battery power is pretty low.

It would be great if we could think of a way to change the API such that there's less risk that people do this by accident.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #22)
> The telephony app always has visible UI during calls so it should always
> stay "foreground".

See the example in comment 21. You can still switch to another app while you're on a call. But we should still prioritize killing the telephony app last.

Though maybe that case isn't common enough that it needs to be a blocker? I don't really feel strongly either way.
I'm not sure how to phrase it differently than, "the telephony app always has visible UI during a call" ;).  Try making a call on a b2g phone to see what I'm referring to.
*making a call and switching to another app.
However, bug 815856 :).
Chris, are you the best owner here?
Assignee: nobody → jones.chris.g
(In reply to Andrew Overholt [:overholt] from comment #17)
> We discussed this at length during triage today.  The telephony use case
> alone (mentioned by cjones in comment #3) warrants this being a blocker.

As we discussed in later comments, the dialer isn't affected by this bug because it always has visible UI during calls.  Because of that, it must stay "foreground".  (Except bug 815856, which is a plain-jane bug.)

I want the memory-usage acceptance criteria to serve the same kind of function as product requirements / user stories.  That is, we accept the codebase if it meets the criteria, reject if not.  This implies bb+.

So the discussion to have here is whether this should be an acceptance criterion.  I argued for it based on feedback from partners and competitive parity.  I don't know of a product requirement or user story that specifically covers this yet; if there were, this would be automatically bb+.
Flags: needinfo?(clee)
Setting priority based on triage discussions.  Feel free to decrease priority if you disagree.
Priority: -- → P1
Trying to get the discussion moving again. With bug 815856 filed, there's now two issues that have been discussed in this bug.

* The music app needs to get a higher OOM class while playing music somehow.

* Applications holding the "cpu" wakelock can keep the cpu from going into suspend
  mode, and can do so without enumerating anything permissions in the manifest.

I see two possible solutions to each of these issues.

For the music-app OOM class problem, we can either solve this by giving apps with an "audio" wakelock a higher OOM class. Or we can do it by giving apps with a "cpu" wakelock a higher OOM class. Or both.

For the "cpu" wakelock problem we can either require that the app lists wake-lock-cpu in the permissions list in the manifest, or we can do that *and* limit access to that permission to privileged apps.


I'm personally of the opinion that both "cpu" and "audio" locks should give an elevated OOM class. Probably "cpu" even higher than "audio". That will minimize the number of apps that will want to hold a "cpu" lock.

And I'm of the opinion that we should not restrict neither "cpu" or "audio" locks to privileged apps. But that we should require them enumerated in the permissions list.
Mass Modify: All un-milestoned, unresolved blocking-basecamp+ bugs are being moved into the C3 milestone. Note that the target milestone does not mean that these bugs can't be resolved prior to 12/10, rather C2 bugs should be prioritized ahead of C3 bugs.
Target Milestone: --- → B2G C3 (12dec-1jan)
(In reply to Chris Jones [:cjones] [:warhammer] from comment #30)
> (In reply to Andrew Overholt [:overholt] from comment #17)
> > We discussed this at length during triage today.  The telephony use case
> > alone (mentioned by cjones in comment #3) warrants this being a blocker.
> 
> As we discussed in later comments, the dialer isn't affected by this bug
> because it always has visible UI during calls.  Because of that, it must
> stay "foreground".  (Except bug 815856, which is a plain-jane bug.)
> 
> I want the memory-usage acceptance criteria to serve the same kind of
> function as product requirements / user stories.  That is, we accept the
> codebase if it meets the criteria, reject if not.  This implies bb+.

Agree, assuming we have defined memory-usage acceptance criteria (which we should), this should be bb+ if the results are below the threshold. 

Who owns creating this criteria?

> 
> So the discussion to have here is whether this should be an acceptance
> criterion.  I argued for it based on feedback from partners and competitive
> parity.  I don't know of a product requirement or user story that
> specifically covers this yet; if there were, this would be automatically bb+.

Today, the user stories do not call out memory/perf usage goals so this would need to be tracked separately.
Flags: needinfo?(clee)
(In reply to Chris Lee [:clee] from comment #34)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #30)
> > (In reply to Andrew Overholt [:overholt] from comment #17)
> > > We discussed this at length during triage today.  The telephony use case
> > > alone (mentioned by cjones in comment #3) warrants this being a blocker.
> > 
> > As we discussed in later comments, the dialer isn't affected by this bug
> > because it always has visible UI during calls.  Because of that, it must
> > stay "foreground".  (Except bug 815856, which is a plain-jane bug.)
> > 
> > I want the memory-usage acceptance criteria to serve the same kind of
> > function as product requirements / user stories.  That is, we accept the
> > codebase if it meets the criteria, reject if not.  This implies bb+.
> 
> Agree, assuming we have defined memory-usage acceptance criteria (which we
> should), this should be bb+ if the results are below the threshold. 
> 
> Who owns creating this criteria?

Right now, me.

> > 
> > So the discussion to have here is whether this should be an acceptance
> > criterion.  I argued for it based on feedback from partners and competitive
> > parity.  I don't know of a product requirement or user story that
> > specifically covers this yet; if there were, this would be automatically bb+.
> 
> Today, the user stories do not call out memory/perf usage goals so this
> would need to be tracked separately.

I meant specifically for this bug: is there a user story that music stays playing in the background above all other background apps?  Or does that fall under "memory/perf goals" in your categorization?
This will be fixed by bug 793105, but I'm going to leave this open to verify.
Verified that the patch in bug 793105 gets us passing this test.

\o/
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.