Consider always running Raptor jobs with gecko profiler and "nostacksampling" feature enabled
Categories
(Testing :: Raptor, enhancement, P3)
Tracking
(Not tracked)
People
(Reporter: whimboo, Unassigned)
References
(Depends on 2 open bugs, Blocks 2 open bugs)
Details
(Whiteboard: [fxp])
Currently Raptor jobs in CI are running with the gecko profiler disabled and enabled:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=rap
The profiler jobs are within the prof
job group. None of the results are actually getting submitted to perfherder due to the overhead of the profiler. As such there is no way to compare actual results and how much the profiler affects the performance metrics. To enable submission we would have to enable:
For try jobs that would be fine as Greg mentioned to me. Otherwise there is still the perfherder-data.json
artifact for each job, which can be used to extract the numbers.
So if we want to have a low-overhead profiler running even for normal Raptor jobs, a feature list like js,leaf,stackwalk,nostacksampling
should do it. This will record markers and the JS stack only for those, but will not do interval based stack sampling.
It might be helpful to compare the metrics between a normal Raptor job, one with the full profiler, and another one with the low-overhead profiler. If the latter isn't adding overhead or additional noise to the page loads, maybe it could be enabled by default.
Comment 1•4 years ago
|
||
So it looks like we went through a lot of pain to prevent gecko-profiler data from getting out but with the extra-options split we have available to us now we could enable perfherder-data submission by adding the gecko_profile
extra option to the results here: https://dxr.mozilla.org/mozilla-central/source/testing/raptor/raptor/results.py#198-207
Similar to what this patch does: https://phabricator.services.mozilla.com/D73275
There are a few things disabled when gecko-profiling is active though, so we would need to undo those changes as well:
- https://dxr.mozilla.org/mozilla-central/source/testing/raptor/raptor/results.py#232,660
- https://dxr.mozilla.org/mozilla-central/source/testing/raptor/raptor/output.py#215
This sounds like a really neat use case so I think it would make sense for us to start collecting this data everywhere we run it if we can. :davehunt, what do you think?
Comment 2•4 years ago
|
||
(In reply to Greg Mierzwinski [:sparky] from comment #1)
This sounds like a really neat use case so I think it would make sense for us to start collecting this data everywhere we run it if we can. :davehunt, what do you think?
Yes, I like this idea. Could you file a bug and mark it as a dependency?
Updated•4 years ago
|
Comment 3•4 years ago
|
||
(In reply to Dave Hunt [:davehunt] [he/him] ⌚BST from comment #2)
Yes, I like this idea. Could you file a bug and mark it as a dependency?
Actually, is this a dupe of bug 1631784?
Reporter | ||
Comment 4•4 years ago
|
||
(In reply to Dave Hunt [:davehunt] [he/him] ⌚BST from comment #3)
Actually, is this a dupe of bug 1631784?
No, this only prevents submission of metrics for the profiler jobs. But this bug is for running the profiler with no stack sampling for all Raptor jobs.
Comment 5•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #4)
(In reply to Dave Hunt [:davehunt] [he/him] ⌚BST from comment #3)
Actually, is this a dupe of bug 1631784?
No, this only prevents submission of metrics for the profiler jobs. But this bug is for running the profiler with no stack sampling for all Raptor jobs.
If you look at the patch in that bug, I believe it's not preventing submission but in fact adding gecko_profile to extraOptions as described in comment 1.
Reporter | ||
Comment 6•4 years ago
|
||
Sorry, it should have been enables
and not prevents
.
Comment 7•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #6)
Sorry, it should have been
enables
and notprevents
.
I wasn't clear in comment 3. I'm not suggesting that this bug is a dupe, I meant the bug I was requesting in comment 2 is the dupe.
Comment 8•4 years ago
|
||
That bug is only for browsertime profiling jobs since they are submitting the data as if it was non-profiled data which gives us a bi-modal distribution. Once that patch lands, we will still have to enable the gecko-profile perfherder data to be output (steps are detailed in comment #1).
Updated•4 years ago
|
Comment 10•4 years ago
|
||
I would be very interested in knowing whether marker-only mode could be enabled by default without perturbing the results too much. Bug 1702310 makes it easier to run, but if it were always enabled then I wouldn't need that bug.
Bug 1700700 also fixes a problem in the gecko_profile
tag added to the perfdata, specifically that at least right now the schema validator doesn't want underscores in the name (I renamed it to gecko-profile
there, but perhaps the schema should be updated instead?)
I've experimented with this in the past months. It looked like there were a large number of differences in the perfherder results when we enabled the profiler by default. I suspect that it's because there are number of changes profiler does in the firefox runtime when it's enabled. We were okay with enabling them if the numbers were consistents across different tests and platforms. For example if we were getting a 15% hit on all test or on a specific platform. But numbers were very unpredictable.
Some of the reasons include, windows timer resolution change we do when we enable the profiler and the profiler button visual change that happens when profiler is started and stopped. But they weren't the only ones as they continued to be a large difference even after solving them.
So instead of enabling the profiler in the main browsertime run, we've decided to add an additional run with the profiler enabled. This way, we won't affect the performance results of the main run but still get a profile after. This work is done in Bug 1786400. Closing this one as wontfix in favor of it.
Updated•1 year ago
|
Description
•