Reduce gap between CPU measurements and sample timestamp
Categories
(Core :: Gecko Profiler, enhancement, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox86 | --- | fixed |
People
(Reporter: mozbugz, Assigned: mozbugz)
References
(Blocks 1 open bug)
Details
Attachments
(4 files)
A profile taken on MacOS after bug 1329600 seems to have 0% CPU usage everywhere.
Looking at the raw data (accessible through profile
in the JS console), this is due to apparent spikes in CPU usage, which themselves are most probably due to gaps between the time CPU measurements are taken and the sample timestamp. These gaps can be caused by the OS scheduling another thread.
Consider the following sequence:
- sampler loop 1: CPU, (long gap), timestamp, stack, no sleep (to "catch up")
- sampler loop 2: CPU, (NO gap), timestamp, stack...
Because of the gap, we have CPU[2]-CPU[1] covering a much longer time than timestamp[2]-timestamp[1]. So when we compute the CPU usage over time, we get a spike. The above profile shows spike going for an apparent 15x the theoretical maximum!
One issue is the tiny sleep between sampler loops, which makes things worse. This will be dealt in separate bug 1685937.
The main issue we have is due to the gap between CPU measurements and timestamp.
On all supported systems, these are collected with different APIs, so we cannot guarantee that there couldn't be such a gap.
I doubt OSes provide ways to prevent scheduling interruptions. Maybe a higher thread priority could help though? In any case, I'll start with a generic solution, we can try to tweak it on different OSes later on.
The proposed solution here is to record timestamps immediately before and after the CPU measurements, and in cases where the time difference is "too long", we'd repeat the operation.
"Too long" can be determined with a quick test loop at the start of the profiling session.
With this, long gaps should be eliminated, so we can trust that (CPU[2]-CPU[1])/(timestamp[2]-timestamp[1]) is reliable.
The cost of repeating measurements should be negligible: The API calls themselves are short (in the order of a microsecond), so when we detect a "too long" gap (probably in tens of microseconds or more), it would indicate that the system was probably busy with something else (so there's little CPU waste), and the stack sample would have been delayed anyway, so it shouldn't hurt to delay it a little more, in order to get better CPU usage values.
Assignee | ||
Comment 1•4 years ago
|
||
Assignee | ||
Comment 2•4 years ago
|
||
This handles the conversion (from TimeStamp to number of milliseconds since process start) once and gives it to subroutines.
It will also help in a following patch where this value will be more closely tied with the CPU usage value.
Disambiguated nested delta
variables in the Sampler:
sampleStartDeltaMs
is at the start of each sampling loop,threadSampleDeltaMs
is associated to one thread being sampled during that loop.
Depends on D101543
Assignee | ||
Comment 3•4 years ago
|
||
mPostMeasurementTimeStamp records the time right after CPU measurements ended.
It is then used as the main sample timestamp, to both avoid another TimeStamp::Now() call, and to keep them as close together as possible.
Depends on D101544
Assignee | ||
Comment 4•4 years ago
|
||
Depends on D101545
Comment 6•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/cefa452f7131
https://hg.mozilla.org/mozilla-central/rev/a42b21d1612a
https://hg.mozilla.org/mozilla-central/rev/5eb3bdbbdcd5
https://hg.mozilla.org/mozilla-central/rev/090fc4f02d83
Description
•