Closed
Bug 808975
Opened 12 years ago
Closed 12 years ago
GonkHal doesn't set control group for scheduler
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(blocking-basecamp:-)
RESOLVED
DUPLICATE
of bug 861441
blocking-basecamp | - |
People
(Reporter: alan.yenlin.huang, Assigned: alan.yenlin.huang)
Details
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
justin.lebar+bug
:
feedback+
|
Details | Diff | Splinter Review |
SetProcessPriority() in GonkHal set process' oomScoreAdj. But we should also define control group for target process in ProcessPriorityManager, and set control group for scheduler in GonkHal.
It seems using androidSetThreadSchedulingGroup in Threads.cpp or direct using set_sched_policy in sched_policy.c could accomplish our purpose in GonkHal.
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → ahuang
Assignee | ||
Comment 2•12 years ago
|
||
Set threads' control group for scheduler in SetProcessPriority.
Even though there's higher control group "ANDROID_TGROUP_FG_BOOST" defined in threads.h, ICS doesn't support it yet. (As I know since JB) So we set both "PROCESS_PRIORITY_FOREGROUND" and "PROCESS_PRIORITY_MASTER" to ANDROID_TGROUP_DEFAULT.
Attachment #679546 -
Flags: review?(dhylands)
Comment 3•12 years ago
|
||
> +// This is referenced from android_util_Process.cpp with little modification.
We can't copy code from Android "with little modification". That's a likely copyright violation. Recall that Oracle's huge lawsuit against Android was over a tiny snippet of code such as this.
Can you explain how this is different from setting the processes' nice values, as we already do? Do we need to do both?
Assignee | ||
Comment 4•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #3)
> > +// This is referenced from android_util_Process.cpp with little modification.
>
> We can't copy code from Android "with little modification". That's a likely
> copyright violation. Recall that Oracle's huge lawsuit against Android was
> over a tiny snippet of code such as this.
Thanks! I'll remove that code and prepare another patch.
> Can you explain how this is different from setting the processes' nice
> values, as we already do? Do we need to do both?
Control group is different from setting nice value. All sub-group/tasks/threads share system cpu resource fairly according to their weight. I think we could consider that nice value is a scheduling priority when in same control group.
If we got all tasks/threads as cgroup=default, then they are all under /dev/cpuctl/tasks and using equally weight defined under /dev/cpuctl/cpu.shares, which is 1024 on my device.
If we got 10 tasks (f0~f9) as cgroup=default, X tasks (b0~bx) as as cgroup=bg_non_interactive. /dev/cpuctl/cpu.shares is 1024, and /dev/cpuctl/bg_non_interactive/cpu.shares is 52 (on my android device). Then system CPU resource is divided as below when all tasks are 100% loaded:
Each fn's bandwidth = 1024/(10*1024+52) = 9.9%
Sum of all bn's bandwidth = 52/(10*1024+52) = 0.5%
Note that I count "Each fn" and "Sum of all bn" because "bg_non_interactive" is a group under /dev/cpuctl.
Assignee | ||
Updated•12 years ago
|
Attachment #679546 -
Attachment is obsolete: true
Attachment #679546 -
Flags: review?(dhylands)
Comment 5•12 years ago
|
||
Apache code from Android can be copied into Gecko. If you have any concerns re: licensing please email Jishnu from legal or myself.
Comment 6•12 years ago
|
||
> Apache code from Android can be copied into Gecko.
Indeed. I suppose the lesson is that we need to be conscious of the license and call it out, instead of just saying "I copied this code".
Comment 7•12 years ago
|
||
> Control group is different from setting nice value.
So if I understand you correctly, Control groups here are used to set hard scheduling priorities, whereas nice is a soft scheduling priority. Is that right? Judging from your description, it sounds to me that we may still want to nice the root process under other processes (because they all share the same cgroup), but we could leave background processes' nice at 0.
It's still not clear to me why control groups would be better than nice; can you describe a situation where nice would do the wrong thing but control groups would do the right thing?
When you redo the patch, I think you want to add code to modify /dev/cpuctl/bg_non_interactive/cpu.shares and /dev/cpuctl/cpu.shares, just like we now do for the low-memory killer's config values.
Also, the code in your patch re-prioritizes all the threads currently in the process, but what about threads created after this code runs?
Assignee | ||
Comment 8•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #7)
> > Control group is different from setting nice value.
>
> So if I understand you correctly, Control groups here are used to set hard
> scheduling priorities, whereas nice is a soft scheduling priority. Is that
> right? Judging from your description, it sounds to me that we may still
> want to nice the root process under other processes (because they all share
> the same cgroup), but we could leave background processes' nice at 0.
>
> It's still not clear to me why control groups would be better than nice; can
> you describe a situation where nice would do the wrong thing but control
> groups would do the right thing?
>
Yes, I think we need both nice and control groups. Considering the case we have massive background threads, setting different control groups could provide much more help than nice only. As you mentioned, I think we should also use nice in the same control group, especially for cgroup=default.
> When you redo the patch, I think you want to add code to modify
> /dev/cpuctl/bg_non_interactive/cpu.shares and /dev/cpuctl/cpu.shares, just
> like we now do for the low-memory killer's config values.
>
Why do we need to modify cpu.shares? We already tried to change threads' control groups, so it seems no need to modify each control groups cpu.shares also.
> Also, the code in your patch re-prioritizes all the threads currently in the
> process, but what about threads created after this code runs?
threads created after this code runs should inherit (clone) the control group from parent.
Comment 9•12 years ago
|
||
> Why do we need to modify cpu.shares?
You said earlier
> /dev/cpuctl/cpu.shares is 1024, and /dev/cpuctl/bg_non_interactive/cpu.shares is 52 (on
> my android device).
How do you know those are the right settings for our B2G device? How do you know those are the right settings for all B2G devices?
Assignee | ||
Comment 10•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #9)
> > Why do we need to modify cpu.shares?
>
> You said earlier
>
> > /dev/cpuctl/cpu.shares is 1024, and /dev/cpuctl/bg_non_interactive/cpu.shares is 52 (on
> > my android device).
>
> How do you know those are the right settings for our B2G device? How do you
> know those are the right settings for all B2G devices?
Oh, I didn't say it clearly. I mean, do we need to modify those value at run time? For each device, we could set those just once, maybe at build time.
Comment 11•12 years ago
|
||
Ah, I understand.
We don't need to set them in Gecko (at runtime), but I had a long discussion with cjones about why I think it's good to set them there. See bug 771195 comment 41 and on. And on and on... :)
Assignee | ||
Comment 12•12 years ago
|
||
Hi Justin,
I made this revised patch. Just as EnsureKernelLowMemKillerParamsSet, I made EnsureCgroupCPUSharesParamsSet but currently lack the calculation of /dev/cpuctl/bg_non_interactive/cpu.shares. I have no idea what bases should we consider to decide this value yet.
Also, I checked android kernel and found it seems didn't define fair group schedule. So how much cpu resource bg tasks could get could be expressed in a quite simple formulation:
(/dev/cpuctl/bg_non_interactive/cpu.shares)/(/dev/cpuctl/cpu.shares), which is
52/1024 = 5% in my otoro.
Attachment #682352 -
Flags: feedback?(justin.lebar+bug)
Comment 13•12 years ago
|
||
> So how much cpu resource bg tasks could get could be expressed in a quite simple
> formulation:
> (/dev/cpuctl/bg_non_interactive/cpu.shares)/(/dev/cpuctl/cpu.shares), which is
> 52/1024 = 5% in my otoro.
What happens if the foreground processes do not want any CPU? Are the bg processes still limited to 5%, or do they get 100%?
Comment 14•12 years ago
|
||
Comment on attachment 682352 [details] [diff] [review]
Set threads' control group for scheduler in SetProcessPriority, v2
This looks like the right idea, but we need to figure out how to tweak these numbers. At the very least you'll need to set the nice value of background processes to 0.
Attachment #682352 -
Flags: feedback?(justin.lebar+bug) → feedback+
Assignee | ||
Comment 15•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #13)
> > So how much cpu resource bg tasks could get could be expressed in a quite simple
> > formulation:
> > (/dev/cpuctl/bg_non_interactive/cpu.shares)/(/dev/cpuctl/cpu.shares), which is
> > 52/1024 = 5% in my otoro.
>
> What happens if the foreground processes do not want any CPU? Are the bg
> processes still limited to 5%, or do they get 100%?
It could get 100% in this case.
(In reply to Justin Lebar [:jlebar] from comment #13)
> > So how much cpu resource bg tasks could get could be expressed in a quite simple
> > formulation:
> > (/dev/cpuctl/bg_non_interactive/cpu.shares)/(/dev/cpuctl/cpu.shares), which is
> > 52/1024 = 5% in my otoro.
>
> What happens if the foreground processes do not want any CPU? Are the bg
> processes still limited to 5%, or do they get 100%?
It could get 100% in this case.(In reply to Justin Lebar [:jlebar] from comment #14)
> Comment on attachment 682352 [details] [diff] [review]
> Set threads' control group for scheduler in SetProcessPriority, v2
>
> This looks like the right idea, but we need to figure out how to tweak these
> numbers. At the very least you'll need to set the nice value of background
> processes to 0.
I still keep set nice value here:
https://bugzilla.mozilla.org/attachment.cgi?id=682352&action=diff#a/hal/gonk/GonkHal.cpp_sec5
I'm considering could we have this bug to apply setting control group first (using partner default numbers), and another bug about tweak those numbers?
Comment 16•12 years ago
|
||
> I still keep set nice value here:
Right. But the nice value for background-group processes should be 0, right?
Comment 17•12 years ago
|
||
> I'm considering could we have this bug to apply setting control group first (using partner default
> numbers), and another bug about tweak those numbers?
You'll need to do some testing to show that, for example, the radio and music player apps work fine with cgroups while they're in the background. But if you can show that the device works fine with your parameters, we can "tweak" them in a separate bug.
To be honest, I'm really not sure why we're spending time on this right now. It's not a blocker, and it's not clear to me that there's a lot of benefit to be had over nice.
Assignee | ||
Comment 18•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #16)
> > I still keep set nice value here:
> Right. But the nice value for background-group processes should be 0, right?
I think the nice value for background-group processes should be just as we set currently, 6. We set -1/0/6 for master/fg/bg now, may I know why you think we need to change that here?
(In reply to Justin Lebar [:jlebar] from comment #17)
> > I'm considering could we have this bug to apply setting control group first (using partner default
> > numbers), and another bug about tweak those numbers?
>
> You'll need to do some testing to show that, for example, the radio and
> music player apps work fine with cgroups while they're in the background.
> But if you can show that the device works fine with your parameters, we can
> "tweak" them in a separate bug.
>
I've been testing this on my phone for more than one day. Apps like music work fine even they were set to bg_non_interactive. I think this also came from that we never treat mediaserver as background process. They sometimes killed by LMK, but this patch would not change oom_adj, so I think this would still happens even without this patch.
> To be honest, I'm really not sure why we're spending time on this right now.
> It's not a blocker, and it's not clear to me that there's a lot of benefit
> to be had over nice.
I agree that this is not a blocker. Without setting scheduling policy, there won't be additional stability or memory consumption defects. But in our experience, setting correct scheduling policy to tasks could make user experience better.
Comment 19•12 years ago
|
||
> may I know why you think we need to change that here?
I thought that was what we agreed to do in comments 7 and 8? I guess it doesn't matter much; if all of the tasks in the background cgroup have the same nice, it doesn't matter what that nice is. But setting it to 0 seems cleaner than setting it to 6.
Unless I'm misunderstanding something?
Assignee | ||
Comment 20•12 years ago
|
||
Bug 861441 - "GonkHal::SetPriority does not set CPU priorities properly" would contains what we want to do here.
Since Bug 861441 is tef+ and this is blocking-basecamp-, I mark this one as duplicate with bug 861441.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Comment 21•12 years ago
|
||
Turns out this was an important thing to do. Just not for the reasons we thought! :)
You need to log in
before you can comment on or make changes to this bug.
Description
•