Closed Bug 1318253 Opened 8 years ago Closed 7 years ago

Use dummy tasks to handle task.dependencies maxItems limit

Categories

(Release Engineering :: Release Automation: Other, defect, P5)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rail, Assigned: mozilla)

References

Details

Attachments

(1 file)

In bug 1316340 I tried to use task.dependencies instead of the graph-level "requires". The maximum number of dependencies we can use is 100 per task, and some of our tasks have more than that. We need to use dummy tasks so we can reduce this number.
Priority: -- → P5
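A minimal sketch of the dummy-task idea, for illustration only (the helper names below are hypothetical, not the eventual in-tree code): split an oversized dependency list into chunks of at most 100, give each chunk to a dummy task, and have the real task depend only on the dummy tasks.

# Hedged sketch only; names are illustrative, not the in-tree transform.
MAX_DEPS = 100  # Taskcluster's limit on task.dependencies entries per task


def chunk_dependencies(dependencies, max_deps=MAX_DEPS):
    """Yield consecutive chunks of at most max_deps dependencies."""
    for i in range(0, len(dependencies), max_deps):
        yield dependencies[i:i + max_deps]


def add_dummy_tasks(label, dependencies, max_deps=MAX_DEPS):
    """Split an oversized dependency list across dummy tasks.

    Returns (dummy_tasks, new_dependencies): each dummy task depends on one
    chunk of the original dependencies, and the real task depends only on
    the dummy tasks.
    """
    if len(dependencies) <= max_deps:
        return {}, list(dependencies)
    dummy_tasks = {}
    for n, chunk in enumerate(chunk_dependencies(dependencies, max_deps), 1):
        dummy_tasks['{}-dummy-{}'.format(label, n)] = chunk
    return dummy_tasks, sorted(dummy_tasks)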
The in-tree part of this would be pretty easy, especially after some stuff I'm doing in bug 1333255. The question is, how do we make a dummy task that resolves immediately once it becomes pending? I'd like some feedback from Jonas here. The obvious option is a worker implementation that just runs claimWork and resolves the tasks immediately. But I think this will be a common requirement, so it would be a worker we'd want to provide to all of our users, and it hardly seems worth running a whole service just for that purpose. Another solution would be to have the queue know of a particular provisionerId/workerType for which it short-circuits pending -> resolved.
Flags: needinfo?(jopsen)
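For context, a rough sketch of the "trivial worker" option described above, assuming the taskcluster Python client (the rootUrl, worker identifiers, and batch size are placeholders, and error handling is omitted; this is a sketch, not a service we run):

# Rough sketch of a "resolve immediately" worker; rootUrl, credentials, and
# worker identifiers are placeholders, and error handling is omitted.
import taskcluster


def run_noop_worker(provisioner_id, worker_type, worker_group, worker_id):
    queue = taskcluster.Queue({'rootUrl': 'https://taskcluster.example.com'})
    while True:
        # claimWork long-polls for pending tasks on this workerType.
        claimed = queue.claimWork(provisioner_id, worker_type, {
            'workerGroup': worker_group,
            'workerId': worker_id,
            'tasks': 32,
        })
        for claim in claimed.get('tasks', []):
            task_id = claim['status']['taskId']
            run_id = claim['runId']
            # Resolve the task without doing any work.
            queue.reportCompleted(task_id, run_id)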
In http://escapewindow.dreamwidth.org/243472.html#dummy_jobs , I brainstormed that we may want to have:

* noop tasks
* notification tasks (irc? slack? email? pulse? other? might be handled by routes already?)
* timer tasks - mark as STATUS at datetime

These are small enough features that they could all fit in one type of task, if we want all of the above.
> The question is, how do we make a dummy task that resolves immediately once it becomes pending?

From what I sent earlier today: running the decision-task image on gecko-decision with command: ['echo', 'dummy-task'] takes 3 seconds, sometimes 5 seconds.

Unless we have on the order of >10k tasks per day, it's not worth it to build a special-cased workerType. Replacing a dummy task template with an empty payload and using a special-cased workerType for it would be easy to do later, so let's reconsider that if/when we have >10k dummy tasks per day.

10k dummy tasks at 5s each would take ~14 compute hours. I think we can find better places to spend our time making things faster/cheaper. We can re-evaluate this solution later, as changing the template for dummy tasks should be easy.
Flags: needinfo?(jopsen)
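For reference, a docker-worker payload along those lines might look roughly like the following (the image tag and maxRunTime are placeholders, not the actual gecko-decision configuration):

# Sketch only: placeholder image and maxRunTime, not the real configuration.
dummy_task_payload = {
    'image': 'example/decision-image:latest',  # placeholder image tag
    'command': ['echo', 'dummy-task'],
    'maxRunTime': 600,  # placeholder timeout in seconds
}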
The "graph morph" support I'm adding in bug 1333255 may help with all of those.
Assignee: nobody → aki
Blocks: 1397773
Attached patch reverse_chunk.patch (deleted) — Splinter Review
f? to land on maple. We'll need to add a post-balrog-dummy kind once we have balrog tasks in publish_firefox, I think.
Attachment #8928737 - Flags: feedback?(rail)
Avoiding a post-target_tasks_method morph, because that may negatively affect CoT verification.
Comment on attachment 8928737 [details] [diff] [review]
reverse_chunk.patch

Review of attachment 8928737 [details] [diff] [review]:
-----------------------------------------------------------------

This should work imo. Another approach would be adding something generic at the end of taskgraph generation (morph?) to scan all tasks, identify the problematic ones, and add dummy deps to all of them, not only to the beetmover task. What do you think?

::: taskcluster/taskgraph/transforms/reverse_chunk_deps.py
@@ +11,5 @@
> +from taskgraph.transforms.base import TransformSequence
> +import taskgraph.transforms.release_deps as release_deps
> +
> +transforms = TransformSequence()
> +MAX_DEPS = 100

Can you add a pointer to the docs https://docs.taskcluster.net/reference/platform/taskcluster-queue/references/api#createTask or the schema, to make it easier to understand where this limit comes from?

@@ +35,5 @@
> +            deps = {}
> +            count += 1
> +        if deps:
> +            yield yield_job(job, deps, count)
> +            count += 1

Shouldn't you reset the counter here? It looks like you are switching to the next job after this, but keeping `count` global.
Attachment #8928737 - Flags: feedback?(rail) → feedback+
Thanks!

(In reply to Rail Aliiev [:rail] ⌚️ET from comment #7)
> Comment on attachment 8928737 [details] [diff] [review]
> reverse_chunk.patch
> 
> Review of attachment 8928737 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This should work imo. Another approach would be adding something generic at
> the end of taskgraph generation (morph?) to scan all tasks, identify the
> problematic ones and add dummy deps to all of them, not only to the
> beetmover task. What do you think?

I explicitly avoided this solution, to prevent CoT bustage. CoT checks against task definitions at the target-graph level. Morphing happens afterwards, so we may end up with altered task definitions, which CoT would reject.

> ::: taskcluster/taskgraph/transforms/reverse_chunk_deps.py
> @@ +11,5 @@
> > +from taskgraph.transforms.base import TransformSequence
> > +import taskgraph.transforms.release_deps as release_deps
> > +
> > +transforms = TransformSequence()
> > +MAX_DEPS = 100
> 
> Can you add a pointer to the docs
> https://docs.taskcluster.net/reference/platform/taskcluster-queue/references/api#createTask
> or the schema to make it easier to understand where this limit comes from.

Sure!

> @@ +35,5 @@
> > +            deps = {}
> > +            count += 1
> > +        if deps:
> > +            yield yield_job(job, deps, count)
> > +            count += 1
> 
> Shouldn't you reset the counter here? It looks like you are switching to the
> next job after this, but keeping `count` global.

I kept it since I was hitting dup label issues. I can probably reset it at this point, now that I adjusted my transform workflow... I'll try that.
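For illustration, a simplified sketch of the transform shape under discussion, with the counter reset per job as suggested in comment #7 (the helper names approximate the quoted hunks and are not the actual reverse_chunk_deps.py patch):

# Simplified sketch approximating the quoted hunks; not the actual patch.
MAX_DEPS = 100  # createTask schema limit on task.dependencies


def yield_job(orig_job, deps, count):
    """Build one dummy job carrying a chunk of the original dependencies."""
    job = dict(orig_job)
    job['dependent-tasks'] = deps
    job['name'] = '{}-{}'.format(orig_job['name'], count)
    return job


def add_dummy_dependencies(config, jobs):
    for job in jobs:
        count = 1  # reset per job, rather than keeping a global counter
        deps = {}
        for label, dep in sorted(job.get('dependent-tasks', {}).items()):
            deps[label] = dep
            if len(deps) == MAX_DEPS:
                yield yield_job(job, deps, count)
                deps = {}
                count += 1
        if deps:
            yield yield_job(job, deps, count)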
https://hg.mozilla.org/projects/maple/rev/946a722551ce. We'll probably open a new bug to get everything reviewed+landed on m-c; marking this bug as fixed.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
(In reply to Aki Sasaki [:aki] from comment #8)
> Thanks!
> 
> (In reply to Rail Aliiev [:rail] ⌚️ET from comment #7)
> > Comment on attachment 8928737 [details] [diff] [review]
> > reverse_chunk.patch
> > 
> > Review of attachment 8928737 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > This should work imo. Another approach would be adding something generic at
> > the end of taskgraph generation (morph?) to scan all tasks, identify the
> > problematic ones and add dummy deps to all of them, not only to the
> > beetmover task. What do you think?
> 
> I explicitly avoided this solution, to prevent CoT bustage.
> CoT checks against task definitions at the target-graph level. Morphing
> happens afterwards, so we may end up with altered task definitions, which
> CoT would reject.

I'm wrong here: task-graph.json is written after morphing, so we won't end up with modified task definitions. However, scriptworker tasks with morphed deps may still barf, since we expect upstreamArtifacts to come from taskIds in task.dependencies. This probably isn't a huge concern... we may end up ignoring task.dependencies in jsone-verified-cot, and only ensuring that the task definition hasn't changed.
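As a rough sketch of the scriptworker-side expectation mentioned above (not scriptworker's actual verification code): every taskId referenced in payload.upstreamArtifacts should also appear in task.dependencies.

# Rough sketch of the expectation described above; not scriptworker's
# actual verification code.
def upstream_artifacts_in_dependencies(task_def):
    """True if every upstreamArtifacts taskId is listed in task.dependencies."""
    deps = set(task_def.get('dependencies', []))
    upstream = task_def.get('payload', {}).get('upstreamArtifacts', [])
    return all(entry['taskId'] in deps for entry in upstream)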
Are we going to need a separate dummy kind for each different phase? I see you've set this one to the "ship" phase, but we also have kinds that depend on beetmover-repackage in the promote (update verify) and publish (final verify) phases.
Flags: needinfo?(aki)
Ah, we should probably move these to 'push'.
Flags: needinfo?(aki)
(In reply to Ben Hearsum (:bhearsum) from comment #11)
> Are we going to need a separate dummy kind for each different phase? I see
> you've set this one to the "ship" phase, but we also have kinds that depend
> on beetmover-repackage in the promote (update verify) and publish (final
> verify) phases.

And yeah, if we have multiple phases to support, we probably need some for each phase. I'm not sure it requires a new kind, but a new job type definitely.