Closed
Bug 1243039
Opened 9 years ago
Closed 9 years ago
Pushes to try with --trigger-tests should make us schedule TC test jobs as many times as requested
Categories
(Testing :: General, defect)
Tracking
(firefox47 fixed)
RESOLVED
FIXED
mozilla47
People
(Reporter: armenzg, Assigned: armenzg)
Attachments
(1 file)
We currently accomplish this on the Buildbot side by having a pulse listener watch for '--times' in the try syntax and scheduling extra jobs.
Is there something we can put in a task definition to schedule a job multiple times instead of just once?
Or should we create one task per the number indicated with --times?
We could also change the pulse monitor to do this instead of doing it through the gecko decision task.
Preference? Recommendations?
Comment 1•9 years ago
When "--times" is specified, does it schedule *all* jobs that many times? If so, it sounds like the integration component that schedules the decision task (mozilla-taskcluster) would just do so N times. You'll get a graph for each of them. I don't think we should add them all to the same graph, because there is an upper limit to the number of tasks that can be in a graph.
Assignee
Comment 2•9 years ago
We schedule the test jobs N times. Not the builds.
Comment 3•9 years ago
We should be able to do this within the decision task by parsing out that try flag and, when iterating over the tests, adding each one to the graph N times.
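A minimal sketch of that idea, assuming the flag is an integer option read from the try message with argparse. The helper names `parse_repeat_count` and `expand_tests` are hypothetical and this is not the actual decision-task code:

```python
import argparse
import shlex


def parse_repeat_count(message, flag="--times"):
    """Return how many times each test should run, defaulting to 1."""
    # Only the text after "try:" is treated as try syntax.
    _, _, syntax = message.partition("try:")
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument(flag, dest="count", type=int, default=1)
    # parse_known_args skips the try options this sketch does not model.
    args, _unknown = parser.parse_known_args(shlex.split(syntax))
    return max(args.count, 1)


def expand_tests(test_names, count):
    """List every test `count` times, keeping copies of a test adjacent."""
    return [name for name in test_names for _ in range(count)]
```

Builds are left out of `expand_tests` on purpose, since only the test jobs are meant to be repeated.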
Assignee
Comment 4•9 years ago
What is the upper limit of tasks for a graph?
We want to run an experiment this weekend with 100 jobs per task.
I think I can create graphs from mozci. If I know the limit I will make sure not to create graphs bigger than that.
On another note, would it be better to create independent tasks instead of graphs of tasks?
Assignee: nobody → armenzg
Comment 5•9 years ago
I was told that the upper limit would be somewhere around 1300 tasks but you could experience issues before that is reached depending on the size of the graph.
These tests would need to be part of the same graph as the build for the tests to wait until the build is complete before running. With our current scheduler you cannot require a task from another graph to be completed before running.
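As a rough illustration of the sizing question, here is a sketch that assumes a graph holds its builds plus N copies of each test, and that takes the approximate figure of 1300 quoted above as the ceiling. The helper names are hypothetical:

```python
# Approximate limit quoted in this bug; not an exact constant.
MAX_TASKS_PER_GRAPH = 1300


def graph_size(num_builds, num_tests, repeat):
    """Total tasks when every test is scheduled `repeat` times next to its build."""
    return num_builds + num_tests * repeat


def max_repeat(num_builds, num_tests, limit=MAX_TASKS_PER_GRAPH):
    """Largest repeat count that keeps the graph at or under `limit`."""
    if num_tests == 0:
        return 0
    return max((limit - num_builds) // num_tests, 0)
```

For one build and five test jobs, three repetitions give 1 + 5 × 3 = 16 tasks, far below the ceiling. Since problems may appear before the limit is reached, the value from `max_repeat` is best treated as an upper bound rather than a target.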
Comment 6•9 years ago
1300 tasks: is that the total number of jobs we can run for a given push, or is this per platform? One of the goals of switching to taskcluster is being able to run in smaller chunks, which would imply about 250 jobs for each platform, so linux32/64/asan opt/debug already puts us at 1250, and that is just for linux. Our current chunking, which is limited by buildbot buildernames, is not ideal and leaves us with about 50 per platform (except for android, which has more!).
Comment 7•9 years ago
Jonas, is the rough limit of 1300 tasks for all tasks within a graph or the number of tasks that could be submitted to extend the graph at a single time?
If I recall correctly, the limit of tasks is related to the size of the document we can store in azure table storage, and after 1300 tasks it becomes too large to be stored but I might be mistaken. I'll let Jonas weigh in.
Flags: needinfo?(jopsen)
Assignee
Comment 8•9 years ago
I believe we could schedule a graph per platform if we wanted to (a build with its associated tests).
Assignee
Comment 9•9 years ago
Assignee
Comment 10•9 years ago
Assignee
Comment 11•9 years ago
Review commit: https://reviewboard.mozilla.org/r/32797/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/32797/
Assignee
Comment 12•9 years ago
Hi garndt, I believe my patch is correct; however, the gecko decision task discards the extra tasks.
If you look at my local output [1], you will see that mochitest-chrome 3 appears three times, yet the scheduled graph lists it only once [2]. When I count how many tasks should have been scheduled, the number is 16 [3], but I only see 6 on the graph.
Would you mind having a look at what happens on the scheduling side?
[1]
armenzg@armenzg-thinkpad:~/repos/mozilla-central$ ./mach taskcluster-graph --pushlog-id=107023 --project=try '--message=try: -b d -p linux64 -u mochitest-3 -t none --rebuild 3' --owner=armenzg@mozilla.com --level=1 --revision-hash=25014ae86fb9e850ff07ffa4b829b8f6be06455d --extend-graph | grep "mochitest-chrome 3"
Querying URL for pushdate: None/json-pushes?changeset=None
Error querying pushinfo for repository 'None' revision 'None'
"name": "[TC] Linux64 mochitest-chrome 3",
"name": "[TC] Linux64 mochitest-chrome 3",
"name": "[TC] Linux64 mochitest-chrome 3",
[2] https://tools.taskcluster.net/task-graph-inspector/#chSsS1XkQ36QlHs1eJbpkg/
[3]
armenzg@armenzg-thinkpad:~/repos/mozilla-central$ ./mach taskcluster-graph --pushlog-id=107023 --project=try '--message=try: -b d -p linux64 -u mochitest-3 -t none --rebuild 3' --owner=armenzg@mozilla.com --level=1 --revision-hash=25014ae86fb9e850ff07ffa4b829b8f6be06455d --extend-graph | grep '"name"' | grep "TC"
Querying URL for pushdate: None/json-pushes?changeset=None
Error querying pushinfo for repository 'None' revision 'None'
"name": "[TC] Linux64 Dbg",
"name": "[TC] Linux64 mochitest-plain e10s 3",
"name": "[TC] Linux64 mochitest-plain e10s 3",
"name": "[TC] Linux64 mochitest-plain e10s 3",
"name": "[TC] Linux64 mochitest-plain 3",
"name": "[TC] Linux64 mochitest-plain 3",
"name": "[TC] Linux64 mochitest-plain 3",
"name": "[TC] Linux64 mochitest-browser-chrome e10s M(bc3)",
"name": "[TC] Linux64 mochitest-browser-chrome e10s M(bc3)",
"name": "[TC] Linux64 mochitest-browser-chrome e10s M(bc3)",
"name": "[TC] Linux64 mochitest-browser-chrome M(bc3)",
"name": "[TC] Linux64 mochitest-browser-chrome M(bc3)",
"name": "[TC] Linux64 mochitest-browser-chrome M(bc3)",
"name": "[TC] Linux64 mochitest-chrome 3",
"name": "[TC] Linux64 mochitest-chrome 3",
"name": "[TC] Linux64 mochitest-chrome 3",
Flags: needinfo?(garndt)
Comment 13•9 years ago
Each of those tasks must have a unique task ID; otherwise the queue will think the same task is being submitted again and ignore it.
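To make the effect concrete, here is a toy model (not the real queue implementation) in which submissions are keyed by task ID, together with a hypothetical `new_task_id` helper that produces a random 22-character URL-safe ID. The ID format is an assumption made for illustration:

```python
import base64
import uuid


def new_task_id():
    """Generate a random 22-character URL-safe ID (assumed, slugid-like format)."""
    raw = uuid.uuid4().bytes
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def submit(queue, task_id, definition):
    """Toy queue keyed by task ID: a repeated ID is ignored, not scheduled again."""
    if task_id in queue:
        return False
    queue[task_id] = definition
    return True
```

Submitting the same definition three times under one ID leaves a single entry in `queue`, while giving each copy its own `new_task_id()` keeps all three.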
Flags: needinfo?(garndt)
Assignee
Comment 14•9 years ago
Assignee
Comment 15•9 years ago
Comment 16•9 years ago
> Jonas, is the rough limit of 1300 tasks for all tasks within a graph
Max number of tasks in a graph, regardless of how many times you call extend.
Note: big-graph scheduler won't have this limitation. Hence, the limitation is going away.
Flags: needinfo?(jopsen)
Comment 17•9 years ago
Will the big-graph scheduler require changes to how we currently hack the graph? What is the timeline for big-graph?
Assignee
Updated•9 years ago
Attachment #8713652 -
Attachment description: MozReview Request: Bug 1243039 - Allow on try to schedule TC test jobs multiple times. → MozReview Request: Bug 1243039 - Allow on try to schedule TC test jobs multiple times. r=garndt
Attachment #8713652 -
Flags: review?(garndt)
Assignee
Comment 18•9 years ago
Comment on attachment 8713652 [details]
MozReview Request: Bug 1243039 - Allow, on try, to schedule TaskCluster test jobs multiple times. DONTBUILD. r=garndt
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32797/diff/1-2/
Assignee
Updated•9 years ago
Summary: Pushes to try with --times should make us schedule test jobs as many times as requested → Pushes to try with --rebuild should make us schedule TC test jobs as many times as requested
Comment 19•9 years ago
Comment on attachment 8713652 [details]
MozReview Request: Bug 1243039 - Allow, on try, to schedule TaskCluster test jobs multiple times. DONTBUILD. r=garndt
https://reviewboard.mozilla.org/r/32797/#review29681
Just a nit about the name of the option, but I don't have a better name to suggest, and I think we can get away with not needing to copy the test_task each time.
::: testing/taskcluster/mach_commands.py:545
(Diff revision 2)
> + test_task = copy.deepcopy(test_task)
Is there a need to do any kind of copy of this or couldn't we just update the task ID and append it?
::: testing/taskcluster/taskcluster_graph/commit_parser.py:26
(Diff revision 2)
> -def escape_whitspace_in_brackets(input_str):
> +def escape_whitespace_in_brackets(input_str):
ah thanks for catching this!
::: testing/taskcluster/taskcluster_graph/commit_parser.py:262
(Diff revision 2)
> + parser.add_argument('--rebuild', dest='rebuild', type=int, default=1)
This option being called "rebuild" makes it seem like we are actually rebuilding something when really we're just running the tests N times.
Attachment #8713652 -
Flags: review?(garndt) → review+
Assignee
Comment 20•9 years ago
Assignee
Comment 21•9 years ago
Assignee
Comment 22•9 years ago
Assignee
Comment 23•9 years ago
Assignee
Comment 24•9 years ago
Assignee
Comment 25•9 years ago
(In reply to Greg Arndt [:garndt] from comment #19)
..
> ::: testing/taskcluster/mach_commands.py:545
> (Diff revision 2)
> > + test_task = copy.deepcopy(test_task)
>
> Is there a need to do any kind of copy of this or couldn't we just update
> the task ID and append it?
>
Yes; otherwise all three tasks end up with whichever task ID I last wrote to that field.
Each appended entry is a reference to the same object within the graph, hence the need for a deepcopy.
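The aliasing problem can be reproduced in isolation. This is a minimal sketch with an illustrative `taskId` key and hypothetical function names, not the code from mach_commands.py:

```python
import copy


def repeat_shared(task, ids):
    """Buggy variant: every entry is the same dict, so the last ID wins."""
    out = []
    for task_id in ids:
        task["taskId"] = task_id
        out.append(task)
    return out


def repeat_copied(task, ids):
    """Each entry is an independent deep copy carrying its own ID."""
    out = []
    for task_id in ids:
        clone = copy.deepcopy(task)
        clone["taskId"] = task_id
        out.append(clone)
    return out
```

A shallow copy would be enough for a top-level `taskId`, but a deep copy also protects nested fields that might be edited per copy.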
> ::: testing/taskcluster/taskcluster_graph/commit_parser.py:262
> (Diff revision 2)
> > + parser.add_argument('--rebuild', dest='rebuild', type=int, default=1)
>
> This option being called "rebuild" makes it seem like we are actually
> rebuilding something when really we're just running the tests N times.
I was re-using what chmanchester used on his trigger bot.
I believe --trigger-tests is the compromise we came up with.
I will land with that naming and file bugs for the other tools.
Assignee
Updated•9 years ago
Summary: Pushes to try with --rebuild should make us schedule TC test jobs as many times as requested → Pushes to try with --trigger-tests should make us schedule TC test jobs as many times as requested
Assignee
Updated•9 years ago
Attachment #8713652 -
Attachment description: MozReview Request: Bug 1243039 - Allow on try to schedule TC test jobs multiple times. r=garndt → MozReview Request: Bug 1243039 - Allow, on try, to schedule TaskCluster test jobs multiple times. DONTBUILD. r=garndt
Assignee
Comment 26•9 years ago
Comment on attachment 8713652 [details]
MozReview Request: Bug 1243039 - Allow, on try, to schedule TaskCluster test jobs multiple times. DONTBUILD. r=garndt
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32797/diff/2-3/
Assignee
Comment 28•9 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/de6d4626e415e1aa93dc00209197e6daeaf15ddd
Bug 1243039 - Allow, on try, to schedule TaskCluster test jobs multiple times. DONTBUILD. r=garndt
Assignee
Comment 29•9 years ago
https://reviewboard.mozilla.org/r/32797/#review29681
> This option being called "rebuild" makes it seem like we are actually rebuilding something when really we're just running the tests N times.
I replied to this in the bug. We will use --trigger-tests instead.
Comment 30•9 years ago
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox47:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Assignee
Comment 31•9 years ago
I want to give triggerbot the ability to schedule as many TC jobs as required.
As we add more platforms, the current fix will not handle a very high number of tasks.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 32•9 years ago
:armen, can we outline what issues we have here? Are we blocked on big-graph support? Are there other related bugs?
Flags: needinfo?(armenzg)
Assignee
Comment 33•9 years ago
For now I will close this again.
We have a working solution with the first version we landed.
I've filed bug 1250988 to take my work in progress and implement it with pulse_actions.
It will help with adding full TaskCluster support in mozci/pulse_actions.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Flags: needinfo?(armenzg)
Resolution: --- → FIXED