Closed
Bug 1253298
Opened 9 years ago
Closed 9 years ago
TC Linux 64 Opt / PGO builds as Tier 2
Categories
(Firefox Build System :: Task Configuration, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: selenamarie, Assigned: mtabara)
References
Details
Attachments
(2 files)
This is a planning bug for tracking work related to getting Linux 64 Opt and PGO builds as Tier-2.
Updated•9 years ago
Assignee: nobody → mtabara
Comment 1•9 years ago
I put together a quick-and-dirty taskcluster pgo build so I could start looking at tests. I'm not sure it's completely "right", but it builds reliably: https://treeherder.mozilla.org/#/jobs?repo=try&revision=731455d9c8fc
:mtabara -- Is that helpful? You are welcome to inherit or scavenge my patch. Or, if you don't have work in progress here, want me to take this bug?
Flags: needinfo?(mtabara)
Assignee
Comment 2•9 years ago
I only switched my focus fully to this bug last Friday; sorry I haven't posted a status update yet.
:gbrown - this is a huge help, thanks a lot! I have been working on understanding the context, as all of this is new to me, and trying to set up a local environment to play with the PGO build.
I expect to have some tangible progress in the coming days.
Thanks again for the help!
Flags: needinfo?(mtabara)
Assignee
Comment 4•9 years ago
Throwing this back for review and potential check-in. It's basically gbrown's patch, for which I once again thank him! I only did some extra testing locally and on try to make sure it builds reliably.
The push is here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6c2cadcffc99&selectedJob=20888989
It doesn't handle the test part, as that is going to be treated separately in bug 1253300.
One concern was whether the build-linux.sh change affects the non-PGO builds. I tested the patch against linux64-pgo and linux64, and both seem to be working fine.
Log for pgo build is here: https://public-artifacts.taskcluster.net/FAplNO47QZyqqWRRxsTh5g/0/public/logs/live_backing.log
Log for opt build is here: https://public-artifacts.taskcluster.net/FoTkNnAHSB-STb02Dk6GRg/0/public/logs/live_backing.log
If my understanding is right, I may need to follow-up with a patch to promote the build to Tier 2, by specifying the tier param in the task yml description.
Attachment #8752966 -
Flags: review?(gbrown)
Attachment #8752966 -
Flags: feedback?(dustin)
Comment 5•9 years ago
Comment on attachment 8752966 [details] [diff] [review]
Enable PGO builds.
Review of attachment 8752966 [details] [diff] [review]:
-----------------------------------------------------------------
::: testing/taskcluster/scripts/builder/build-linux.sh
@@ +12,5 @@
>
> : MOZHARNESS_SCRIPT ${MOZHARNESS_SCRIPT}
> : MOZHARNESS_CONFIG ${MOZHARNESS_CONFIG}
> : MOZHARNESS_ACTIONS ${MOZHARNESS_ACTIONS}
> +: MOZHARNESS_PGO ${MOZHARNESS_PGO}
I'd like this to be a little more generic. How about MOZHARNESS_OPTIONS, and put --enable-pgo in that variable?
::: testing/taskcluster/tasks/builds/opt_linux64_pgo.yml
@@ +25,5 @@
> + groupName: Submitted by taskcluster
> + machine:
> + # see https://github.com/mozilla/treeherder/blob/master/ui/js/values.js
> + platform: linux64
> + symbol: B
Does this need some other symbol, so it's not confused with non-PGO builds?
Attachment #8752966 -
Flags: feedback?(dustin) → feedback+
Comment 6•9 years ago
(In reply to Dustin J. Mitchell [:dustin] from comment #5)
> I'd like this to be a little more generic. How about MOZHARNESS_OPTIONS,
> and put --enable-pgo in that variable?
+1 -- I like that idea.
> Does this need some other symbol, so it's not confused with non-PGO builds?
I think it is okay this way. "Linux x64 pgo" gets its own section on treeherder, and the existing buildbot pgo builds use "B".
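The MOZHARNESS_OPTIONS idea discussed above can be illustrated with a minimal sketch. It is written in Python for illustration only; the real build-linux.sh is a shell script, and the way each variable is expanded here (one `--config` per entry, one `--<action>` flag per action, options passed through verbatim) is an assumption rather than the script's actual behavior. The helper name `build_mozharness_command` is hypothetical.

```python
import shlex


def build_mozharness_command(env):
    """Assemble a command line from MOZHARNESS_* variables.

    Hypothetical helper: the variable names come from the thread, but the
    expansion rules below are assumptions, not what build-linux.sh does.
    """
    cmd = ["python", env["MOZHARNESS_SCRIPT"]]
    # Assumed: each whitespace-separated config becomes its own --config flag.
    for config in env.get("MOZHARNESS_CONFIG", "").split():
        cmd += ["--config", config]
    # Assumed: each action name becomes a --<action> flag.
    for action in env.get("MOZHARNESS_ACTIONS", "").split():
        cmd.append("--" + action)
    # Generic pass-through: any extra flag (for example --enable-pgo) goes
    # here, so no dedicated MOZHARNESS_PGO variable is needed.
    cmd += shlex.split(env.get("MOZHARNESS_OPTIONS", ""))
    return cmd
```

The point of the generic variable is that a non-PGO task simply leaves MOZHARNESS_OPTIONS unset and gets the same command as before, which matches the concern about not affecting the existing opt build.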
Comment 7•9 years ago
Comment on attachment 8752966 [details] [diff] [review]
Enable PGO builds.
Review of attachment 8752966 [details] [diff] [review]:
-----------------------------------------------------------------
> If my understanding is right, I may need to follow-up with a patch to promote the build to Tier 2, by specifying the tier param in the task yml description.
I agree. Until we get the tests running and the build is proven, it should be tier 2. You should be able to add to opt_linux64_pgo.yml:
extra:
  treeherder:
    tier: 2
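As a rough illustration of where that key ends up, here is a small Python sketch that sets `extra.treeherder.tier` on a task definition held as a nested dict. The helper name `with_treeherder_tier` is hypothetical, and the example dict only borrows the `symbol` value from the reviewed patch; it is not the full task definition.

```python
import copy


def with_treeherder_tier(task, tier):
    """Return a copy of a task definition with extra.treeherder.tier set.

    Hypothetical helper for illustration; the input is left unmodified.
    """
    updated = copy.deepcopy(task)
    # Create the nested sections if they are missing, then set the tier.
    treeherder = updated.setdefault("extra", {}).setdefault("treeherder", {})
    treeherder["tier"] = tier
    return updated


# Fragment modelled on the reviewed opt_linux64_pgo.yml (not the full file).
task = {"extra": {"treeherder": {"symbol": "B"}}}
tier2_task = with_treeherder_tier(task, 2)
```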
Attachment #8752966 -
Flags: review?(gbrown) → review+
Assignee
Comment 8•9 years ago
***
Bug 1253298 - Enable TC Linux64 PGO builds as Tier 2. r=gbrown
Review commit: https://reviewboard.mozilla.org/r/53272/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/53272/
Attachment #8753435 -
Flags: review?(gbrown)
Assignee
Comment 9•9 years ago
Thanks gbrown and dustin for review & feedback.
I did the refactoring and made a second try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c262994f743
It's working fine.
The PGO build log is here: https://public-artifacts.taskcluster.net/TVJwt3KpRM25OvVfspvnhA/0/public/logs/live_backing.log
and the non-PGO build log is here: https://public-artifacts.taskcluster.net/N_fOfW-RRJymPUyHXDA2aw/0/public/logs/live_backing.log
I pushed the patch to MozReview against inbound; it contains the tier change as well.
Comment 10•9 years ago
Comment on attachment 8753435 [details]
MozReview Request: Bug 1253298 - Enable TC Linux64 PGO builds as Tier 2. r=gbrown
https://reviewboard.mozilla.org/r/53272/#review50064
That looks fine to me. Thanks!
Attachment #8753435 -
Flags: review?(gbrown) → review+
Comment 11•9 years ago
Comment 12•9 years ago
bugherder
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment 13•9 years ago
looking into this, the runtime here is about an hour faster than the runtime on buildbot. Why is that? are we generating proper builds?
looking at this revision:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=d3d23c5640717bfb9c72db8e951c462685991854&filter-searchStr=pgo&selectedJob=28431651
we have:
bb: 159 minutes
* 3:09 (min:sec) to set up and start |gmake -f client.mk|
* 145:36 (min:sec) to do the build |Finished build step (success)|
tc: 89 minutes
* 0:48 (min:sec) to set up and start |gmake -f client.mk|
* 79:03 (min:sec) to do the build |Finished build step (success)|
So the build step is the biggest difference. Given a 70-minute difference in total run time (about 66 minutes of it in the build step), it would be nice to confirm we are doing the same things and resolve any issues now rather than later.
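The per-step arithmetic behind those numbers can be checked with a short Python sketch. The timings are the ones quoted in this comment, read as minutes:seconds; the helper names are illustrative only.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '145:36' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def fmt(seconds):
    """Format a number of seconds as 'minutes:seconds'."""
    return "%d:%02d" % divmod(seconds, 60)


def step_deltas(slow, fast):
    """Per-step time saved, as 'minutes:seconds' strings."""
    return {step: fmt(to_seconds(slow[step]) - to_seconds(fast[step]))
            for step in slow}


# Timings quoted above for buildbot (bb) and taskcluster (tc).
BB = {"setup": "3:09", "build": "145:36"}
TC = {"setup": "0:48", "build": "79:03"}

for step, delta in step_deltas(BB, TC).items():
    print(step, delta)
# setup 2:21
# build 66:33
```

So roughly 66.5 of the 70 minutes saved come from the build step itself, and only a couple of minutes from setup.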
Comment 14•9 years ago
Another quirk is that this is posting data to perfherder, which now produces a bimodal graph:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,8555b3405a4d6fca6716db38c31dc94d1c4f8fe1,1,2%5D&zoom=1463435529229.1667,1463590048635.4167,4125.27885578822,9701.487034226882&selected=%5Bmozilla-inbound,8555b3405a4d6fca6716db38c31dc94d1c4f8fe1,31804,28048691,2%5D
That graph is why I am looking into this, but we should determine whether we want tc data posted separately from bb, or how we deal with things like this.
Assignee
Comment 15•9 years ago
(In the same context of making sure we're generating the same PGO builds as with buildbot.)
Before we go forward with promoting PGO builds to Tier-1 (bug 1274306): in a separate conversation, mshal also suggested running the talos suite against the build and comparing the results to the buildbot PGO build results to see if they are reasonable.
I'm not sure I'm running them properly, but I triggered https://treeherder.mozilla.org/#/jobs?repo=try&revision=d6b9d39e4a1d to follow up.
Comment 16•9 years ago
Oh, good idea. I have 5 linux boxes that I am testing for talos right now; they should be done in ~4 hours. Most likely we could take my patch, hook it up to pgo, and then run the build as you see fit.
This is my patch:
https://hg.mozilla.org/try/rev/80cca75e6d5999ca2ba2840005092a6efcdc11a5
It is sloppy and treats talos as unittests; my goal is to work on getting talos tests defined properly with the -t flag.
Comment 17•9 years ago
(In reply to Joel Maher (:jmaher) from comment #13)
> looking into this, the runtime here is about an hour faster than the runtime
> on buildbot. Why is that? are we generating proper builds?
One thing to keep in mind is that the buildbot Linux 64 pgo builds run on c1.xlarge while the taskcluster builds run on m3.2xlarge: different hardware characteristics may explain different run-times.
Assignee
Comment 18•9 years ago
:jmaher: not sure I'm following. Two questions, if I may:
1) I am trying to run the Talos suite against the current TC Linux64 build and compare it to the Buildbot PGO build results to see if they make sense. It looks like I failed to do that in https://treeherder.mozilla.org/#/jobs?repo=try&revision=d6b9d39e4a1d, as Talos is being run against the Linux64 opt build. Can you guide me on how to do this the right way?
2) As to your patch: do you want me to take your patch, add the PGO-enabling patch as well (http://hg.mozilla.org/integration/mozilla-inbound/raw-rev/3e069aea556f), and trigger a new linux64-pgo build on Try with talos fully enabled?
Thanks for all the help in this.
Flags: needinfo?(jmaher)
Comment 19•9 years ago
pushed to try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b6874f3fa403c3bf10aafd5eb0ae82e30f836948
I will verify tests are scheduled and repush as needed
Flags: needinfo?(jmaher)
Assignee
Comment 20•9 years ago
I see a lot of green in that Try push, good job :)
:jmaher - is there anything I could help you with on the Talos thing?
Flags: needinfo?(jmaher)
Comment 21•9 years ago
Oh, it took me a bit of focused time to get the data and validate it:
https://docs.google.com/spreadsheets/d/1e-8R6UGyrTJO4jX0RfDvwEi9QkU213E4l8MHNNr1_bw/edit?usp=sharing
Overall we are good. There are 4 tests that differ, but they differ on opt as well in a similar fashion.
I would say we are getting the same benefits of pgo in these taskcluster builds!
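The reasoning in this comment (a test that differs between buildbot and taskcluster PGO builds is not a PGO concern if it differs on opt in the same way) can be sketched in Python as below. This is only an illustration under stated assumptions: the 5% threshold, the function names, and the dict-of-averages input shape are hypothetical, and nothing here reproduces the spreadsheet's actual data or method.

```python
def relative_diff(baseline, candidate):
    """Relative change from baseline to candidate (baseline must be non-zero)."""
    return (candidate - baseline) / baseline


def differing_tests(baseline, candidate, threshold=0.05):
    """Sorted names of tests whose relative difference exceeds the threshold.

    Both arguments map test name -> average result; only tests present in
    both are compared. The 5% default is an arbitrary illustrative value.
    """
    return sorted(
        name for name in baseline.keys() & candidate.keys()
        if abs(relative_diff(baseline[name], candidate[name])) > threshold
    )


def pgo_specific_differences(bb_pgo, tc_pgo, bb_opt, tc_opt, threshold=0.05):
    """Tests that differ between bb and tc on PGO but not on opt."""
    pgo_diff = set(differing_tests(bb_pgo, tc_pgo, threshold))
    opt_diff = set(differing_tests(bb_opt, tc_opt, threshold))
    return sorted(pgo_diff - opt_diff)
```

An empty result from `pgo_specific_differences` would correspond to the conclusion above: whatever differs between the two infrastructures is not specific to the PGO build.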
Flags: needinfo?(jmaher)
Assignee
Comment 22•9 years ago
:jmaher - that's awesome work you've done there, good job!
Is there anything else we need to measure or check before we go for Tier-1? (Other than the scheduling basis mechanism from https://bugzilla.mozilla.org/show_bug.cgi?id=1274310#c12)
Comment 23•9 years ago
from a perf perspective there is nothing wrong with these pgo builds. I assume we need signing or l10n or something to make pgo tier-1.
Comment 24•9 years ago
(In reply to Joel Maher (:jmaher) from comment #23)
> from a perf perspective there is nothing wrong with these pgo builds. I
> assume we need signing or l10n or something to make pgo tier-1.
We need to address nightlies in TC before we can do a full migration here.
It's a big leap of faith to move PGO CI builds to TC as tier 1, but still rely on PGO nightlies generated in buildbot. That's why we started migration with debug builds.
Assignee
Updated•8 years ago
Updated•7 years ago
Product: TaskCluster → Firefox Build System