Closed Bug 1137124 Opened 10 years ago Closed 10 years ago

Write to cgroup.procs could take >10 ms when set process priority

Categories

(Firefox OS Graveyard :: GonkIntegration, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(b2g-v2.2 affected)

RESOLVED WONTFIX
Tracking Status
b2g-v2.2 --- affected

People

(Reporter: ting, Assigned: ting)

References

Details

Attachments

(3 files)

Summary: Write to /dev/cpuctl/*/cgroup.procs could take >10 ms when set process prirority → Write to /dev/cpuctl/*/cgroup.procs could take >10 ms when set process priority
Have you reproduced this consistently in other runs too? What base image are you using? When I did my testing these operations were measured in µs (I used clock_gettime(CLOCK_REALTIME) for those) which is why I didn't put them on a separate thread.
So, I don't see anything in your capture that indicates that they take more than 10 msec. I see 3 captures where that function happens to be on the top of the stack. Because these are individual samples, you can only conclude that they take upto 10 milliseconds. You can't conclude that they take at least 10 milliseconds until you can see 2 samples in a row with the same function.
Attached file debug patch (deleted) —
Attached file debug log (deleted) —
It is v18D I am using. I used clock_gettime(CLOCK_MONOTONIC) to check how long does the write() take. From the log, you can see they could consume >10ms when launch an app.
(In reply to Ting-Yu Chou [:ting] from comment #4) > I used clock_gettime(CLOCK_MONOTONIC) to check how long does the write() > take. From the log, you can see they could consume >10ms when launch an app. If I'm reading your traces correctly I can see a few times where this is taking exactly 10ms which sounds a lot like a context switch to me. Nevertheless I'll do some additional testing and we can move the actual write to the I/O thread, it shouldn't be too hard.
The log (attachment 8571135 [details]) was generated from attachment 8571129 [details]. Take the last launch from the log as an example: D/x ( 209): nsFrameLoader::TryRemoteBrowser > CreateBrowserOrApp D/x ( 209): PriorityClass::AddProcess mCpuCGroupProcsFd t0=186.452251912 t1=186.499630140 D/x ( 209): PriorityClass::AddProcess mMemCGroupProcsFd t0=186.499777172 t1=186.510176235 D/x ( 209): nsFrameLoader::TryRemoteBrowser < CreateBrowserOrApp t0/t1 are gotten before/after write(). So write to mCpuCGroupProcsFd takes ~47ms, and mMemCGroupProcsFd takes ~10ms. How could I determine whether it is a context switch?
(In reply to Ting-Yu Chou [:ting] from comment #6) > t0/t1 are gotten before/after write(). So write to mCpuCGroupProcsFd takes > ~47ms, and mMemCGroupProcsFd takes ~10ms. > > How could I determine whether it is a context switch? The default timeslice in Linux is 10ms so a delay of precisely 10ms is usually caused by a context switch (maybe the main process was waiting on some coarse grained lock within the kernel in the cgroup machinery?). That being said the ~47ms delay doesn't sound like a context switch but a genuine delay. AFAIK those writes should be fast so I suspect we're hitting some bottleneck in the kernel - possibly a coarse grained lock as I mentioned before - I'll do some testing later today and see if we can move the write()s into the I/O thread. If you've got time to do it yourself and you're in a hurry please go ahead and take this bug.
(In reply to Gabriele Svelto [:gsvelto] from comment #7) > > How could I determine whether it is a context switch? > > The default timeslice in Linux is 10ms so a delay of precisely 10ms is > usually caused by a context switch (maybe the main process was waiting on > some coarse grained lock within the kernel in the cgroup machinery?). That > being said the ~47ms delay doesn't sound like a context switch but a genuine > delay. AFAIK those writes should be fast so I suspect we're hitting some > bottleneck in the kernel - possibly a coarse grained lock as I mentioned > before - I'll do some testing later today and see if we can move the > write()s into the I/O thread. Thank you for explaining me :) > If you've got time to do it yourself and you're in a hurry please go ahead > and take this bug. I am not in a hurry. If it is not resolved later when I have time, I'll take it.
Attached patch patch v1 (deleted) — Splinter Review
Assignee: nobody → janus926
Status: NEW → ASSIGNED
Attachment #8571824 - Flags: review?(gsvelto)
(In reply to Dave Hylands [:dhylands] from comment #2) > I see 3 captures where that function happens to be on the top of the stack. > Because these are individual samples, you can only conclude that they take > upto 10 milliseconds. You can't conclude that they take at least 10 > milliseconds until you can see 2 samples in a row with the same function. I haven't confirmed yet, but it seems there will be only one sample and the others are absent when a system call takes longer than 1 sample time.
Comment on attachment 8571824 [details] [diff] [review] patch v1 Since you're moving those writes to the I/O thread I wonder if it would be worth moving those to the oom_score_adj / oom_adj files too. Dave what do you think?
Flags: needinfo?(dhylands)
(In reply to Gabriele Svelto [:gsvelto] from comment #11) > Comment on attachment 8571824 [details] [diff] [review] > patch v1 > > Since you're moving those writes to the I/O thread I wonder if it would be > worth moving those to the oom_score_adj / oom_adj files too. Dave what do > you think? I think that the general consensus is that all I/O should be moved off of the main thread. Writing to any file in /proc will involve at least a context switch from userspace to kernel space, plus whatever the kernel side decides to do in addition. It doesn't look like oom_score_adj does anything particularly heavy weight (i.e. basically a bunch of checking and then it stores the value in a data structure). I think that changing stuff in cpuctl may cause an additional context switch to occur which may be why it sometimes takes longer.
Flags: needinfo?(dhylands)
Comment on attachment 8571824 [details] [diff] [review] patch v1 After thinking this through a little more I'm r-'ing it for two reasons: the first one is that we should also move the writes to oom_adj/oom_score_adj off the main thread since we're at it; but this is the least of my worries since it's rather fast already. The second is more important though. Since this patch moves the priority change to the I/O thread it means that the operation might be queued behind other I/O operations for 10s or 100s of milliseconds thus significantly delaying it and this in turn would slow down application launch time or other scenarios where a process must get a priority boost. Ouch. I think we might consider other options here. One would be to spawn a thread just for this but that seems a little overkill. The other thing that comes to mind is making our writes asynchronous; they're fire-and-forget after all so we don't care when they finish. Is it possible to open files under a pseudo-filesystem using O_ASYNC? We'd have to check that the write() succeeds and repeat it if it doesn't and this might also introduce a delay if it fails too often or for too long but I suppose it's worth a try.
Attachment #8571824 - Flags: review?(gsvelto) → review-
Blocks: 1110624
No longer blocks: AppStartup
Making it off main thread will always create a delay, no matter which way we choose. WriteToFile() is used not only for oom_adj/oom_score_adj but also other files. Add a thread and write synchronously sounds reasonable and easier to me.
(In reply to Gabriele Svelto [:gsvelto] from comment #13) > I think we might consider other options here. One would be to spawn a thread > just for this but that seems a little overkill. The other thing that comes > to mind is making our writes asynchronous; they're fire-and-forget after all > so we don't care when they finish. Is it possible to open files under a > pseudo-filesystem using O_ASYNC? We'd have to check that the write() > succeeds and repeat it if it doesn't and this might also introduce a delay > if it fails too often or for too long but I suppose it's worth a try. Async I/O requires kernel-side support. I seem to recall that all of the I/O to the sys and proc file system is synchronous. I also question moving any of this to another thread. There are basically 2 scenarios: 1 - The currently running process is lowering its priority. If this is deferred then the higher priority process that should be running will be delayed until the priority change occurs 2 - The currently running process is raising its priority. If this is deferred, then it will take longer for the priority change to occur, which means that we'll be delayed even further.
(In reply to Dave Hylands [:dhylands] from comment #15) > I also question moving any of this to another thread. Does this mean moving the write()s to another thread is also not a good idea?
(In reply to Ting-Yu Chou [:ting] from comment #16) > Does this mean moving the write()s to another thread is also not a good idea? Yes, in both cases we are somewhat delaying the priority change which can impact startup time even more. Personally I'd give a spin to O_ASYNC just to check if it works even though as per Dave's comment it most likely doesn't. We could also delegate the change to a real-time scheduled thread (we've got support under HAL to set the thread's priority appropriately) but that sounds like overkill to me and we should still check if the scheduler is not introducing a significant delay anyway.
I added O_ASYNC to the flag of OpenCpuCGroupProcs() and OpenMemCGroupProcs(), but it still takes time: D/x (27199): PriorityClass::AddProcess mMemCGroupProcsFd t0=16276.040237286 t1=16276.081491817 D/x (27199): PriorityClass::AddProcess mCpuCGroupProcsFd t0=16276.328065462 t1=16276.370156973 D/x (27199): PriorityClass::AddProcess mMemCGroupProcsFd t0=16276.370226608 t1=16276.430427858 D/x (27199): PriorityClass::AddProcess mCpuCGroupProcsFd t0=16282.772551553 t1=16282.790109314 D/x (27199): PriorityClass::AddProcess mMemCGroupProcsFd t0=16282.790169470 t1=16282.820122439 D/x (27199): PriorityClass::AddProcess mCpuCGroupProcsFd t0=16283.464762178 t1=16283.464807126 D/x (27199): PriorityClass::AddProcess mMemCGroupProcsFd t0=16283.465154209 t1=16283.465181084 D/x (27199): PriorityClass::AddProcess mCpuCGroupProcsFd t0=16283.471997074 t1=16283.472066658 D/x (27199): PriorityClass::AddProcess mMemCGroupProcsFd t0=16283.472163168 t1=16283.472203012 D/x (27199): PriorityClass::AddProcess mCpuCGroupProcsFd t0=16283.472339939 t1=16283.540280199 D/x (27199): PriorityClass::AddProcess mMemCGroupProcsFd t0=16283.540427178 t1=16283.553304626 D/x (27199): PriorityClass::AddProcess mCpuCGroupProcsFd t0=16285.605023792 t1=16285.630133167 D/x (27199): PriorityClass::AddProcess mMemCGroupProcsFd t0=16285.630206188 t1=16285.631333584 D/x (27199): PriorityClass::AddProcess mCpuCGroupProcsFd t0=16286.646602802 t1=16286.661171865 D/x (27199): PriorityClass::AddProcess mMemCGroupProcsFd t0=16286.661274990 t1=16286.680204313 From http://man7.org/linux/man-pages/man2/open.2.html, O_ASYNC is available for terminals, pseudoterminals, sockets, and (since Linux 2.6) pipes and FIFOs.
Do we have any other options (O_ASYNC, a new thread, IO thread)?
(In reply to Ting-Yu Chou [:ting] from comment #18) > From http://man7.org/linux/man-pages/man2/open.2.html, O_ASYNC is available > for terminals, pseudoterminals, sockets, and (since Linux 2.6) pipes and > FIFOs. The sysfs and profs filesystems are special filesystems which are none of the above.
(In reply to Gabriele Svelto [:gsvelto] from comment #17) > (In reply to Ting-Yu Chou [:ting] from comment #16) > > Does this mean moving the write()s to another thread is also not a good idea? > > Yes, in both cases we are somewhat delaying the priority change which can > impact startup time even more. Personally I'd give a spin to O_ASYNC just to > check if it works even though as per Dave's comment it most likely doesn't. > > We could also delegate the change to a real-time scheduled thread (we've got > support under HAL to set the thread's priority appropriately) but that > sounds like overkill to me and we should still check if the scheduler is not > introducing a significant delay anyway. The realtime threads take priority over any of the non-realtime threads. The only advantage will be if the realtime thread actually blocks for a period of time and does nothing. My guess is that in order to really figure out what's going on here we need to do a detailed analysis of what's actually happening in terms of context switches, interrupts, etc.
(In reply to Dave Hylands [:dhylands] from comment #21) > My guess is that in order to really figure out what's going on here we need > to do a detailed analysis of what's actually happening in terms of context > switches, interrupts, etc. Agreed, this requires some solid profiling before we decide on how to proceed. I'd say we should consider fixing this in the v3 timeframe when we have more time. Personally I've got my hands full of v2.2 blockers and I wouldn't be able to work on this properly right now.
There has been a patch to get rid of the synchronize_rcu(): http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5d65bc0ca1bceb73204dab943922ba3c83276a8c Since our cgroup.c is different, the patch can't be merged directly. Any rules I should follow for taking this change? Another concern is even we apply it for flame-kk, there're still chances that other manufacturers' kernel does not have it. Probably off the write()s to the other thread is still needed?
If the problem has been fixed upstream then I'd rather not do anything else on our side. We've already wrote workarounds in gecko for kernel/driver issues and I'd rather not do it again. As for taking the patch for the Flame I'm not sure what the procedure is; you probably need to talk to our contact with T2M (unfortunately I keep forgetting who are the TAMs and I can't find the relevant page on the wiki).
(In reply to Ting-Yu Chou [:ting] from comment #24) > There has been a patch to get rid of the synchronize_rcu(): > > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/ > ?id=5d65bc0ca1bceb73204dab943922ba3c83276a8c Note it is included in android-msm-hammerhead-3.4-lollipop-mr1, but not in android-msm-hammerhead-3.4-kitkat-mr2 and android-msm-hammerhead-3.4-lollipop-release. Check https://android.googlesource.com/kernel/msm.git/.
How about set this bug won't fix, and treat it as a known issue?
(In reply to Ting-Yu Chou [:ting] from comment #27) > How about set this bug won't fix, and treat it as a known issue? I think it's a good idea.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Summary: Write to /dev/cpuctl/*/cgroup.procs could take >10 ms when set process priority → Write to cgroup.procs could take >10 ms when set process priority
This issue also exists on Aries.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: