Closed Bug 808975 Opened 12 years ago Closed 12 years ago

GonkHal doesn't set control group for scheduler

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-basecamp:-)

RESOLVED DUPLICATE of bug 861441
blocking-basecamp -

People

(Reporter: alan.yenlin.huang, Assigned: alan.yenlin.huang)

Details

Attachments

(1 file, 1 obsolete file)

SetProcessPriority() in GonkHal set process' oomScoreAdj. But we should also define control group for target process in ProcessPriorityManager, and set control group for scheduler in GonkHal. It seems using androidSetThreadSchedulingGroup in Threads.cpp or direct using set_sched_policy in sched_policy.c could accomplish our purpose in GonkHal.
Assignee: nobody → ahuang
not blocking, but we really want!
blocking-basecamp: ? → -
Set threads' control group for scheduler in SetProcessPriority. Even though there's higher control group "ANDROID_TGROUP_FG_BOOST" defined in threads.h, ICS doesn't support it yet. (As I know since JB) So we set both "PROCESS_PRIORITY_FOREGROUND" and "PROCESS_PRIORITY_MASTER" to ANDROID_TGROUP_DEFAULT.
Attachment #679546 - Flags: review?(dhylands)
> +// This is referenced from android_util_Process.cpp with little modification. We can't copy code from Android "with little modification". That's a likely copyright violation. Recall that Oracle's huge lawsuit against Android was over a tiny snippet of code such as this. Can you explain how this is different from setting the processes' nice values, as we already do? Do we need to do both?
(In reply to Justin Lebar [:jlebar] from comment #3) > > +// This is referenced from android_util_Process.cpp with little modification. > > We can't copy code from Android "with little modification". That's a likely > copyright violation. Recall that Oracle's huge lawsuit against Android was > over a tiny snippet of code such as this. Thanks! I'll remove that code and prepare another patch. > Can you explain how this is different from setting the processes' nice > values, as we already do? Do we need to do both? Control group is different from setting nice value. All sub-group/tasks/threads share system cpu resource fairly according to their weight. I think we could consider that nice value is a scheduling priority when in same control group. If we got all tasks/threads as cgroup=default, then they are all under /dev/cpuctl/tasks and using equally weight defined under /dev/cpuctl/cpu.shares, which is 1024 on my device. If we got 10 tasks (f0~f9) as cgroup=default, X tasks (b0~bx) as as cgroup=bg_non_interactive. /dev/cpuctl/cpu.shares is 1024, and /dev/cpuctl/bg_non_interactive/cpu.shares is 52 (on my android device). Then system CPU resource is divided as below when all tasks are 100% loaded: Each fn's bandwidth = 1024/(10*1024+52) = 9.9% Sum of all bn's bandwidth = 52/(10*1024+52) = 0.5% Note that I count "Each fn" and "Sum of all bn" because "bg_non_interactive" is a group under /dev/cpuctl.
Attachment #679546 - Attachment is obsolete: true
Attachment #679546 - Flags: review?(dhylands)
Apache code from Android can be copied into Gecko. If you have any concerns re: licensing please email Jishnu from legal or myself.
> Apache code from Android can be copied into Gecko. Indeed. I suppose the lesson is that we need to be conscious of the license and call it out, instead of just saying "I copied this code".
> Control group is different from setting nice value. So if I understand you correctly, Control groups here are used to set hard scheduling priorities, whereas nice is a soft scheduling priority. Is that right? Judging from your description, it sounds to me that we may still want to nice the root process under other processes (because they all share the same cgroup), but we could leave background processes' nice at 0. It's still not clear to me why control groups would be better than nice; can you describe a situation where nice would do the wrong thing but control groups would do the right thing? When you redo the patch, I think you want to add code to modify /dev/cpuctl/bg_non_interactive/cpu.shares and /dev/cpuctl/cpu.shares, just like we now do for the low-memory killer's config values. Also, the code in your patch re-prioritizes all the threads currently in the process, but what about threads created after this code runs?
(In reply to Justin Lebar [:jlebar] from comment #7) > > Control group is different from setting nice value. > > So if I understand you correctly, Control groups here are used to set hard > scheduling priorities, whereas nice is a soft scheduling priority. Is that > right? Judging from your description, it sounds to me that we may still > want to nice the root process under other processes (because they all share > the same cgroup), but we could leave background processes' nice at 0. > > It's still not clear to me why control groups would be better than nice; can > you describe a situation where nice would do the wrong thing but control > groups would do the right thing? > Yes, I think we need both nice and control groups. Considering the case we have massive background threads, setting different control groups could provide much more help than nice only. As you mentioned, I think we should also use nice in the same control group, especially for cgroup=default. > When you redo the patch, I think you want to add code to modify > /dev/cpuctl/bg_non_interactive/cpu.shares and /dev/cpuctl/cpu.shares, just > like we now do for the low-memory killer's config values. > Why do we need to modify cpu.shares? We already tried to change threads' control groups, so it seems no need to modify each control groups cpu.shares also. > Also, the code in your patch re-prioritizes all the threads currently in the > process, but what about threads created after this code runs? threads created after this code runs should inherit (clone) the control group from parent.
> Why do we need to modify cpu.shares? You said earlier > /dev/cpuctl/cpu.shares is 1024, and /dev/cpuctl/bg_non_interactive/cpu.shares is 52 (on > my android device). How do you know those are the right settings for our B2G device? How do you know those are the right settings for all B2G devices?
(In reply to Justin Lebar [:jlebar] from comment #9) > > Why do we need to modify cpu.shares? > > You said earlier > > > /dev/cpuctl/cpu.shares is 1024, and /dev/cpuctl/bg_non_interactive/cpu.shares is 52 (on > > my android device). > > How do you know those are the right settings for our B2G device? How do you > know those are the right settings for all B2G devices? Oh, I didn't say it clearly. I mean, do we need to modify those value at run time? For each device, we could set those just once, maybe at build time.
Ah, I understand. We don't need to set them in Gecko (at runtime), but I had a long discussion with cjones about why I think it's good to set them there. See bug 771195 comment 41 and on. And on and on... :)
Hi Justin, I made this revised patch. Just as EnsureKernelLowMemKillerParamsSet, I made EnsureCgroupCPUSharesParamsSet but currently lack the calculation of /dev/cpuctl/bg_non_interactive/cpu.shares. I have no idea what bases should we consider to decide this value yet. Also, I checked android kernel and found it seems didn't define fair group schedule. So how much cpu resource bg tasks could get could be expressed in a quite simple formulation: (/dev/cpuctl/bg_non_interactive/cpu.shares)/(/dev/cpuctl/cpu.shares), which is 52/1024 = 5% in my otoro.
Attachment #682352 - Flags: feedback?(justin.lebar+bug)
> So how much cpu resource bg tasks could get could be expressed in a quite simple > formulation: > (/dev/cpuctl/bg_non_interactive/cpu.shares)/(/dev/cpuctl/cpu.shares), which is > 52/1024 = 5% in my otoro. What happens if the foreground processes do not want any CPU? Are the bg processes still limited to 5%, or do they get 100%?
Comment on attachment 682352 [details] [diff] [review] Set threads' control group for scheduler in SetProcessPriority, v2 This looks like the right idea, but we need to figure out how to tweak these numbers. At the very least you'll need to set the nice value of background processes to 0.
Attachment #682352 - Flags: feedback?(justin.lebar+bug) → feedback+
(In reply to Justin Lebar [:jlebar] from comment #13) > > So how much cpu resource bg tasks could get could be expressed in a quite simple > > formulation: > > (/dev/cpuctl/bg_non_interactive/cpu.shares)/(/dev/cpuctl/cpu.shares), which is > > 52/1024 = 5% in my otoro. > > What happens if the foreground processes do not want any CPU? Are the bg > processes still limited to 5%, or do they get 100%? It could get 100% in this case. (In reply to Justin Lebar [:jlebar] from comment #13) > > So how much cpu resource bg tasks could get could be expressed in a quite simple > > formulation: > > (/dev/cpuctl/bg_non_interactive/cpu.shares)/(/dev/cpuctl/cpu.shares), which is > > 52/1024 = 5% in my otoro. > > What happens if the foreground processes do not want any CPU? Are the bg > processes still limited to 5%, or do they get 100%? It could get 100% in this case.(In reply to Justin Lebar [:jlebar] from comment #14) > Comment on attachment 682352 [details] [diff] [review] > Set threads' control group for scheduler in SetProcessPriority, v2 > > This looks like the right idea, but we need to figure out how to tweak these > numbers. At the very least you'll need to set the nice value of background > processes to 0. I still keep set nice value here: https://bugzilla.mozilla.org/attachment.cgi?id=682352&action=diff#a/hal/gonk/GonkHal.cpp_sec5 I'm considering could we have this bug to apply setting control group first (using partner default numbers), and another bug about tweak those numbers?
> I still keep set nice value here: Right. But the nice value for background-group processes should be 0, right?
> I'm considering could we have this bug to apply setting control group first (using partner default > numbers), and another bug about tweak those numbers? You'll need to do some testing to show that, for example, the radio and music player apps work fine with cgroups while they're in the background. But if you can show that the device works fine with your parameters, we can "tweak" them in a separate bug. To be honest, I'm really not sure why we're spending time on this right now. It's not a blocker, and it's not clear to me that there's a lot of benefit to be had over nice.
(In reply to Justin Lebar [:jlebar] from comment #16) > > I still keep set nice value here: > Right. But the nice value for background-group processes should be 0, right? I think the nice value for background-group processes should be just as we set currently, 6. We set -1/0/6 for master/fg/bg now, may I know why you think we need to change that here? (In reply to Justin Lebar [:jlebar] from comment #17) > > I'm considering could we have this bug to apply setting control group first (using partner default > > numbers), and another bug about tweak those numbers? > > You'll need to do some testing to show that, for example, the radio and > music player apps work fine with cgroups while they're in the background. > But if you can show that the device works fine with your parameters, we can > "tweak" them in a separate bug. > I've been testing this on my phone for more than one day. Apps like music work fine even they were set to bg_non_interactive. I think this also came from that we never treat mediaserver as background process. They sometimes killed by LMK, but this patch would not change oom_adj, so I think this would still happens even without this patch. > To be honest, I'm really not sure why we're spending time on this right now. > It's not a blocker, and it's not clear to me that there's a lot of benefit > to be had over nice. I agree that this is not a blocker. Without setting scheduling policy, there won't be additional stability or memory consumption defects. But in our experience, setting correct scheduling policy to tasks could make user experience better.
> may I know why you think we need to change that here? I thought that was what we agreed to do in comments 7 and 8? I guess it doesn't matter much; if all of the tasks in the background cgroup have the same nice, it doesn't matter what that nice is. But setting it to 0 seems cleaner than setting it to 6. Unless I'm misunderstanding something?
Bug 861441 - "GonkHal::SetPriority does not set CPU priorities properly" would contains what we want to do here. Since Bug 861441 is tef+ and this is blocking-basecamp-, I mark this one as duplicate with bug 861441.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Turns out this was an important thing to do. Just not for the reasons we thought! :)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: