Closed
Bug 870203
Opened 12 years ago
Closed 12 years ago
Weird immediate downgrading of priority for Communications process
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(blocking-b2g:tef+, b2g18 fixed, b2g18-v1.0.1 fixed)
People
(Reporter: bzbarsky, Assigned: justin.lebar+bug)
References
Details
(Whiteboard: [target:resolved already?])
Attachments
(2 files, 2 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
text/plain
|
Details |
I was logging things as I tried to call while running the process buster thing, and some logs like so:
05-08 21:49:11.317 I/Gecko:ProcessPriorityManager( 104): [Communications, child-id=133, pid=2757] Creating ParticularProcessPriorityManager.
05-08 21:49:11.317 I/Gecko:ProcessPriorityManager( 104): [Communications, child-id=133, pid=2757] Done starting up. mHoldsCPUWakeLock=0, mHoldsHighPriorityWakeLock=0
05-08 21:49:11.317 I/Gecko:ProcessPriorityManager( 104): [Communications, child-id=133, pid=2757] Changing priority from UNKNOWN to FOREGROUND_HIGH.
then setting the priorities to low on all the other processes, then:
05-08 21:49:11.397 I/Gecko:ProcessPriorityManager( 104): [Communications, child-id=133, pid=2757] Scheduling reset timer to fire in 1000ms.
05-08 21:49:11.447 I/Gecko:ProcessPriorityManager( 104): [Communications, child-id=133, pid=2757] Changing priority from FOREGROUND_HIGH to BACKGROUND.
Note the timestamps. Why did we just go back from HIGH to BACKGROUND after about 150ms of being HIGH?
Note that right after this in the log we have:
05-08 21:49:11.918 I/Gecko ( 104): XXX ril_worker::_process_calls() got new call.
so this is in the middle of us starting up the Communications app to take the call!
Reporter | ||
Comment 1•12 years ago
|
||
To be clear, this is with the STR from bug 847592 comment 19.
Assignee | ||
Comment 2•12 years ago
|
||
Ah, I think I may understand this better now.
The process should hold a wake lock when it starts up, due to ContentParent::MaybeTakeCPUWakeLock. It doesn't: "Done starting up. mHoldsCPUWakeLock=0, mHoldsHighPriorityWakeLock=0".
We force the priority to FOREGROUND_HIGH after starting it up (that's the third log line), but then if anybody takes or releases a wake lock for any reason, this forces us to reconsider the priority we've given the process.
Assignee | ||
Comment 3•12 years ago
|
||
> The process should hold a wake lock when it starts up, due to
> ContentParent::MaybeTakeCPUWakeLock. It doesn't: "Done starting up. mHoldsCPUWakeLock=0,
> mHoldsHighPriorityWakeLock=0".
Oh, okay. We don't take the wake lock immediately after starting the process, but we take it within one shot of the event loop after creating the ContentParent. That should be soon enough.
When we're starting up the communications app (that is, when it's not already loaded), you should see
> I/Gecko:ProcessPriorityManager( 144): [Communications, child-id=32, pid=932] Got wake lock changed event. Now mHoldsCPUWakeLock=1, mHoldsHighPriorityWakeLock=0
immediately after we renice all of the other processes due to our getting FOREGROUND_HIGH priority.
Reporter | ||
Comment 4•12 years ago
|
||
I do NOT see that. What I see in my log is lots of things like:
05-08 21:49:11.357 I/Gecko:ProcessPriorityManager( 104): [Usage, child-id=21, pid=841] Changing priority from FOREGROUND to FOREGROUND (low CPU).
05-08 21:49:11.357 I/Gecko ( 104): XXX Setting nice for pid 841 to 18
05-08 21:49:11.367 I/Gonk ( 104): Changed nice for pid 841 from 1 to 18.
as we renice all the processes and then immediately after that is the second log quote from comment 0.
Note that if we happen to set visibility in there we'll immediately and synchronously reset our priority.. I wonder if that's what happens here.
In any case, in the relevant log I never see mHoldsCPUWakeLock=1. It's always 0 every time it's logged.
Assignee | ||
Comment 5•12 years ago
|
||
Assignee | ||
Comment 6•12 years ago
|
||
Comment on attachment 747269 [details] [diff] [review]
Moar logging
This may clear things up.
Attachment #747269 -
Flags: feedback?(bzbarsky)
Reporter | ||
Comment 7•12 years ago
|
||
Comment on attachment 747269 [details] [diff] [review]
Moar logging
Add something to OnFrameloaderVisibleChanged too, for its ResetPriorityNow call?
Attachment #747269 -
Flags: feedback?(bzbarsky) → feedback+
Assignee | ||
Comment 8•12 years ago
|
||
Attachment #747269 -
Attachment is obsolete: true
Reporter | ||
Comment 9•12 years ago
|
||
Managed to get this while profiling... still not sure whether the profiler bit matters.
The profile is at http://people.mozilla.com/~bgirard/cleopatra/#report=99acc84a9c8c3b4462f16b539bc62b417421d054
This was a phone that had tried to take a call once already, and I _think_ had a Communications running when I started the testcase.
Salient bits from this log:
1) There is a Communications with pid=478 but then it looks like it dies and we create a new Communications with pid=928?
2) When this second Communications finished startup, it has
"mHoldsCPUWakeLock=0, mHoldsHighPriorityWakeLock=0"
3) Right after going into the FOREGROUND_HIGH state, this Communications gets
a ResetPriority via OnRemoteBrowserFrameShown that schedules a timer to
reset the priority back to FOREGROUND.
4) Right after that, we get a OnFrameloaderVisibleChanged and are now
isVisible=0(?), and synchronously go into BACKGROUND state.
5) Much later (12s later) we get mHoldsHighPriorityWakeLock=1 and switch back
to FOREGROUND_HIGH mode.
Assignee | ||
Comment 10•12 years ago
|
||
> 1) There is a Communications with pid=478 but then it looks like it dies and we create a new
> Communications with pid=928?
Looks like it, particularly since we destroy its particular process priority manager.
> I/Gecko:ProcessPriorityManager( 104): [child-id=4, pid=-1] Destroying ParticularProcessPriorityManager.
Then we create a new preallocated process (pid 697) and then it OOMs. This should be fixed by bug 868521. Hopefully, anyway.
But then this is bad.
> XXX ContentParent::MaybeTakeCPUWakeLock taking wake lock
> [Communications, child-id=34, pid=928] OnRemoteBrowserFrameShown; calling ResetPriority.
> [Communications, child-id=34, pid=928] ComputePriority[mHoldsCPUWakeLock=0, mHoldsHighPriorityWakeLock=0, HasAppType('critical')=1, isVisible=1, IsExpectingSystemMessage=1
It looks like taking the wake lock on behalf of the new process isn't working right; we should have mHoldsCPUWakeLock=1 there.
Assignee | ||
Comment 11•12 years ago
|
||
Ohhhh.
We don't notify observers of wake-lock changes if the state remains the same.
Which is to say, if the CPU wake lock is held, and then we acquire the CPU wake lock on behalf of some process, we don't hear about that.
Assignee | ||
Comment 12•12 years ago
|
||
> 4) Right after that, we get a OnFrameloaderVisibleChanged and are now
> isVisible=0(?), and synchronously go into BACKGROUND state.
This is separately problematic. If a process is loading and doesn't happen to hold a wake lock, it still shouldn't get shoved into the bg.
If you have time to look into this bug, I'd want to understand why this is happening (e.g., what's the backtrace for this call).
Assignee | ||
Comment 13•12 years ago
|
||
This also fixes an important bug where, if a process crashed while holding a wake lock, we'd send a wake lock changed notification without including the list of locking processes. This could cause us to incorrectly believe that no processes were holding the lock.
Unfortunately both T-Mobile and AT&T cell service is currently not functioning at my house (?!), so I haven't been able to test this. But I also haven't been able to reproduce this problem in the first place.
Assignee | ||
Comment 14•12 years ago
|
||
Comment on attachment 747538 [details] [diff] [review]
Patch v1: Notify wake lock listeners when a process adds or removes a wake lock, even if the wake lock itself doesn't change state.
Moving this to bug 870480.
Attachment #747538 -
Attachment is obsolete: true
Assignee | ||
Comment 15•12 years ago
|
||
This is tef+ because we've determined it blocks bug 847592.
blocking-b2g: --- → tef+
Assignee | ||
Comment 16•12 years ago
|
||
> 4) Right after that, we get a OnFrameloaderVisibleChanged and are now
> isVisible=0(?), and synchronously go into BACKGROUND state.
>
> This is separately problematic. If a process is loading and doesn't happen to hold a wake lock, it
> still shouldn't get shoved into the bg.
Okay, I finally tracked down this OnFrameloaderVisibleChanged. It's coming from the Gaia window manager calling setVisible on this particular frame because, as far as Gaia knows at the time, it's launching a background service. This should have no effect on the process's priority, since it's holding a wake lock here.
In other words, this is totally fine, and we may only need the patch in bug 847592 here.
I do want to write a patch to avoid sending frameloader-visible-changed when the visibility didn't actually change. That will help avoid races around the grace period timer. But I'll explain this in more detail in a separate bug.
Assignee | ||
Comment 17•12 years ago
|
||
> But I'll explain this in more detail in a separate bug.
Bug 870598.
Assignee | ||
Comment 18•12 years ago
|
||
The only remaining work to be done here, I think, is to verify that bug 870480 fixes this issue. I've never been able to reliably reproduce myself, so I'd appreciate some help here.
Reporter | ||
Comment 19•12 years ago
|
||
> The only remaining work to be done here, I think, is to verify that bug 870480 fixes this
> issue.
I could reproduce it pretty reliably without that patch; I haven't managed to with the patch yet. And the responsiveness to an incoming call when running the process buster thing seems to be a good bit better, too. So I'd say that patch fixes this problem, yes.
Assignee | ||
Comment 20•12 years ago
|
||
I'll figure out tomorrow if I want to land any of this logging. It's a fine line between being helpful and being too verbose.
Flags: needinfo?(justin.lebar+bug)
Updated•12 years ago
|
Target Milestone: --- → 1.0.1 Cert2 (28may)
Assignee | ||
Comment 22•12 years ago
|
||
Yes, I'm the right owner. I think we're good here; I'm waiting for QA in bug 847592, and if they don't find any problems, we can close this bug.
I'd still like to add some logging, but I can do that separately.
Flags: needinfo?(justin.lebar+bug)
Updated•12 years ago
|
Whiteboard: [target:resolved already?]
Assignee | ||
Comment 23•12 years ago
|
||
Okay, QA is doing work in bug 847592; we don't need to have two bugs open for that. We can reopen this if we find that this is still happening somehow.
Comment 24•12 years ago
|
||
Closing per jlebar's earlier comment.
Status: NEW → RESOLVED
Closed: 12 years ago
status-b2g18:
--- → fixed
status-b2g18-v1.0.1:
--- → fixed
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•