1465117 - Update the backfill action task to support performance sheriffing/test-verify workflows

Reporter

Description

•

7 years ago

there are 2 specific use cases here: 1) a performance regression and we want to edit/retrigger a specific job and add '--geckoProfile' to the mozharness options, and change the treeherder symbol to add a -p to indicate this is a profile run (ex: 'tp' -> 'tp-p'). We want to initiate this on push X and have the same change run on push X-1 2) test-verify: when a test fails inside of test-verify, on a push X we want to run the same test on push X-1. We can do force the list of tests to run by edit+retrigger a job to have an extra environment variable: |MOZHARNESS_TEST_PATHS: '<testname>'|. Ideally this should be run with an different symbol (ex: 'TV' -> 'TV-b'). What is in common is: 1) add a value to an existing job via edit/retrigger 2) adjust the job name on treeherder to distinguish it from a normal run 3) initiate it on push X and have it run on push X-1 I do not know the magic needed here, possibly there is some easy work we can do with action tasks to make this less error prone and closer to a "one click" operation.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 1

•

7 years ago

:bstack- would you have some thoughts on how we could achieve this? If there are other ways to simplify this workflow from our current way of doing things, then I would be happy to edit our other tooling to work in that fashion.

Flags: needinfo?(bstack)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 2

•

7 years ago

as a note, using our existing 'backfill' feature doesn't retain any of the customizations that I do (such as treeherder symbol, mozharness options, env vars)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

7 years ago

Blocks: test-verify

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

7 years ago

Blocks: 1405428

Dustin J. Mitchell [:dustin] (he/him)

Comment 3

•

7 years ago

This sounds like it would make a good action task. The "X-1" sounds like backfilling, while the adding of options is something retriggering already does. So this is just combining some existing behaviors.

Brian Stack [:bstack]

Assignee

Comment 4

•

7 years ago

Yep, I agree with dustin!

Flags: needinfo?(bstack)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 5

•

7 years ago

how can we move forward with this? My use of retriggering and backfilling doesn't work for this- but that is the interface built into treeherder. Is this something I could configure with existing interfaces to action tasks between treeherder/taskcluster? I don't have ideas of starting points and to be honest I don't understand the terminology, even when looking online for action tasks.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 6

•

7 years ago

:bstack, can you help me find some places to start here- I would be happy to hack around for a couple days with a few places to start.

Flags: needinfo?(bstack)

Brian Stack [:bstack]

Assignee

Comment 7

•

7 years ago

I'm going to hack up the action tasks myself and then put them up for review. After that we can try to bug some treeherder folks to wire them into the UI like backfilling is now!

Flags: needinfo?(bstack)

Comment hidden (mozreview-request)

Brian Stack [:bstack]

Assignee

Updated

•

7 years ago

Assignee: nobody → bstack

Status: NEW → ASSIGNED

Comment hidden (mozreview-request)

Brian Stack [:bstack]

Assignee

Updated

•

7 years ago

Attachment #8982744 - Flags: review?(jmaher)

Attachment #8982744 - Flags: review?(dustin)

Brian Stack [:bstack]

Assignee

Comment 11

•

7 years ago

Ok, I've created a patch that should allow backfilling to do what you wanted if I've understood what you want! Once we land this, people can use this the hard way by using the "custom action..." button on treeherder and filling in the fields the way they want. It should be a couple hours of work at most for someone to make dedicated buttons in treeherder for this after that!

Dustin J. Mitchell [:dustin] (he/him)

Comment 12

•

7 years ago

mozreview-review

Comment on attachment 8982744 [details] Bug 1465117 - Take two at updating backfill task. Use modifier this time. https://reviewboard.mozilla.org/r/248638/#review255108 Joel, hopefully this helps identify how actions like this are implemented, and how other similar options might be added? ::: taskcluster/mach_commands.py (Diff revision 3) > root = options['root'] > > return taskgraph.actions.trigger_action_callback( > task_group_id=options['task_group_id'], > task_id=task_id, > - task=task, Whoops - I should have removed this a few weeks ago. Thanks for the fix! ::: taskcluster/taskgraph/actions/backfill.py:116 (Diff revision 3) > - create_tasks( > - [label], full_task_graph, label_to_taskid, > - push_params, push_decision_task_id, push) > + task_def = fix_task_dependencies(full_task_graph.tasks[label], label_to_taskid) > + task_def['taskGroupId'] = push_decision_task_id > + > + if input.get('addGeckoProfile'): > + mh_options = task_def['payload'].setdefault('env', {}) \ > + .get('MOZHARNESS_OPTIONS', '') I suspect the python lint will not like this indentation - try running `./mach lint -l flake8 -l py2 -l py3 taskcluster/`

Attachment #8982744 - Flags: review?(dustin) → review+

Brian Stack [:bstack]

Assignee

Comment 13

•

7 years ago

mozreview-review

Comment on attachment 8982744 [details] Bug 1465117 - Take two at updating backfill task. Use modifier this time. https://reviewboard.mozilla.org/r/248638/#review255160 ::: taskcluster/taskgraph/actions/backfill.py:116 (Diff revision 3) > - create_tasks( > - [label], full_task_graph, label_to_taskid, > - push_params, push_decision_task_id, push) > + task_def = fix_task_dependencies(full_task_graph.tasks[label], label_to_taskid) > + task_def['taskGroupId'] = push_decision_task_id > + > + if input.get('addGeckoProfile'): > + mh_options = task_def['payload'].setdefault('env', {}) \ > + .get('MOZHARNESS_OPTIONS', '') It's funny because it looks weird to me too but that actually seems to be want flake8 wants. I had a few previous tries where it was complaining with what I did with that line. Running that now gets: ✖ 0 problems (0 errors, 0 warnings)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 14

•

7 years ago

mozreview-review

Comment on attachment 8982744 [details] Bug 1465117 - Take two at updating backfill task. Use modifier this time. https://reviewboard.mozilla.org/r/248638/#review255224 this looks really easy and very hackable. Thanks for putting this together!

Attachment #8982744 - Flags: review?(jmaher) → review+

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 15

•

6 years ago

:bstack, is it ok to land this so I can play with it live as well as work with the treeherder team to build a "one click" solution?

Flags: needinfo?(bstack)

Brian Stack [:bstack]

Assignee

Comment 16

•

6 years ago

Ah yes! Please feel free to land. If you can promptly try a normal backfill afterward to see if it works, I would be happy. I don’t _think_ this will conflict with dustin’s most recent changes, but if it does I’ll fix.

Flags: needinfo?(bstack)

Pulsebot

Comment 17

•

6 years ago

Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3d3fe54d0cb6 Add additional options to backfilling action task r=dustin,jmaher

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 18

•

6 years ago

leave-open to verify and get treeherder UI working.

Keywords: leave-open

Natalia Csoregi [:nataliaCs]

Comment 19

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/3d3fe54d0cb6

Andrei Ciure[:aciure]

Comment 20

•

6 years ago

Backed out 1 changesets (bug 1465117) for fix_task_dependencies not working as expected backed out revision: https://hg.mozilla.org/mozilla-central/rev/3d3fe54d0cb6 backout: https://hg.mozilla.org/mozilla-central/rev/e16a3d8427b8bb6ceeb0b970702fc2eaccd997b8

Flags: needinfo?(bstack)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 21

•

6 years ago

thanks for backing this out. As a note, the MOZHARNESS_TEST_PATH worked well, the --geckoProfile didn't, but when this lands again I can fiddle and figure out what is wrong. I also need to adjust the MOZHARNESS_TEST_PATH to actually launch --test-verify instead, or consider adding an additional cli param; details I can sort out and fine tune.

Comment hidden (mozreview-request)

Brian Stack [:bstack]

Assignee

Updated

•

6 years ago

Flags: needinfo?(bstack)

Attachment #8982744 - Flags: review+ → review?

Brian Stack [:bstack]

Assignee

Updated

•

6 years ago

Attachment #8982744 - Flags: review? → review?(dustin)

Brian Stack [:bstack]

Assignee

Updated

•

6 years ago

Attachment #8982744 - Flags: review?(dustin)

Brian Stack [:bstack]

Assignee

Updated

•

6 years ago

Attachment #8982744 - Flags: feedback?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Comment 23

•

6 years ago

mozreview-review

Comment on attachment 8982744 [details] Bug 1465117 - Take two at updating backfill task. Use modifier this time. https://reviewboard.mozilla.org/r/248638/#review258230 One relatively small fix.. ::: taskcluster/taskgraph/actions/util.py:122 (Diff revisions 3 - 4) > this function to create new tasks, > allowing easy debugging with `mach taskgraph action-callback --test`. > This builds up all required tasks to run in order to run the tasks requested. > > + Optionally this function takes a `modifier` function that is passed in each > + task before it is put into a new graph. It should return a valid task. I like this! ::: taskcluster/taskgraph/actions/util.py:135 (Diff revisions 3 - 4) > full_task_graph = copy.deepcopy(full_task_graph) > label_to_taskid = label_to_taskid.copy() > > target_graph = full_task_graph.graph.transitive_closure(to_run) > target_task_graph = TaskGraph( > - {l: full_task_graph[l] for l in target_graph.nodes}, > + {l: modifier(full_task_graph[l]) for l in target_graph.nodes}, This will apply to *every* task in the transitive closure, where I think the caller expects it to only apply to tasks in `to_run`. I suppose it might be useful in some cases to modify all tasks, so perhaps the right fix here is to document the set of tasks passed to modifier, and to add some filtering in the backfilling modifier function that limits its effects to `to_run`.

Attachment #8982744 - Flags: review-

Dustin J. Mitchell [:dustin] (he/him)

Updated

•

6 years ago

Attachment #8982744 - Flags: feedback?(dustin)

Comment hidden (mozreview-request)

Dustin J. Mitchell [:dustin] (he/him)

Comment 25

•

6 years ago

mozreview-review

Comment on attachment 8982744 [details] Bug 1465117 - Take two at updating backfill task. Use modifier this time. https://reviewboard.mozilla.org/r/248638/#review258318

Attachment #8982744 - Flags: review?(dustin) → review+

Brian Stack [:bstack]

Assignee

Updated

•

6 years ago

Keywords: checkin-needed

Pulsebot

Comment 26

•

6 years ago

Pushed by csabou@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/2731368a39e5 Take two at updating backfill task. Use modifier this time. r=dustin,jmaher

Keywords: checkin-needed

Narcis Beleuzu [:NarcisB]

Comment 27

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/2731368a39e5

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

6 years ago

Depends on: 1471227

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 28

•

6 years ago

as a note, this works great for --geckoProfile, I am working on tweaking the parameters I didn't think through for the backfill/retrigger task in bug 1471227. Once that is done (solved, reviewed, landed, tested) I will move this to the treeherder team to make this more of a "one click" solution for the two scenarios.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 29

•

6 years ago

now that bug 1471227 is completed, I want to use it for a few days before recommending we make the treeherder changes. From my testing I have high confidence we are ready. I see a few things here: 1) for talos jobs adding the --geckoProfile will be straightforward, a few default to set on the existing backfill job and ensuring this is matched on talos jobs (will be raptor as well in the near future) 2) for test-verify stuff, this gets trickier, in general this pattern works well for mochitest*/web-platform/xpcshell/reftest/crashtest/jsreftest (not gtest, junit, robocop, cppunittest, marionette, firefox-ui, talos, etc.): * change |depth: 1| * change |inclusive: true| * add |testPath: <path>| 3) using <path>, that will be difficult as we need a full test path and this doesn't work on all test types. In the failure_line.test field, we can find the <path> we are looking for. There are simple rules and we need to adjust a few paths: * if path starts with /, add testing/web-platform/tests to the beginning of the path * if path is like file:///Users/cltbld/tasks/task_1520511336/build/tests/jsreftest/tests/jsreftest.html?test=non262/Array/regress-488989.js, replace |*jsreftest.html?test=| with |/js/src/tests/| * there will be quirks on android tests and other reftests, but I think we can get 95% of cases with the above.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 30

•

6 years ago

here is an example of android reftest failure message: REFTEST TEST-UNEXPECTED-FAIL | http://10.0.2.2:8854/tests/layout/reftests/bidi/dirAuto/dynamicDirAuto-setLTR-NoDir3.html == http://10.0.2.2:8854/tests/layout/reftests/bidi/dirAuto/dynamicDirAuto-refLTR-LTR.html | image comparison, max difference: 1, number of differing pixels: 1

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 31

•

6 years ago

another exception: testing/web-platform/tests/WebCryptoAPI/generateKey/failures_RSA-OAEP.https.any.html needs this file: testing/web-platform/tests/WebCryptoAPI/generateKey/failures_RSA-OAEP.https.any.js that is rare though

Geoff Brown [:gbrown]

Updated

•

6 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1443867

Chris Cooper [:coop] (he/him)

Comment 32

•

6 years ago

Brian: is there more work for you/the TC team here? It's unclear to me from the last few comments.

Flags: needinfo?(bstack)

Brian Stack [:bstack]

Assignee

Comment 33

•

6 years ago

I don't believe there's anything for tc team to be doing at this time.

Flags: needinfo?(bstack)

Chris Cooper [:coop] (he/him)

Comment 34

•

6 years ago

Putting this on the Treeherder teams radar, per comment #18.

Assignee: bstack → nobody

Status: ASSIGNED → NEW

Component: General → Treeherder: Job Triggering & Cancellation

Product: Taskcluster → Tree Management

Version: unspecified → ---

Ed Morley [:emorley]

Comment 35

•

6 years ago

Could someone file a new bug in the Treeherder component with a description of: * what changes you'd like to see to the UI (ie how best to expose these features?) * what the backfill actions payload should look like for them My preference would be for us to move away from sharing bugs for changes to different systems, since: * in-tree vs out of tree have different workflows (eg it's likely we'll need this backported to older branches, which will get even more confusing once there are treeherder GitHub PRs in here) * it's easier to follow status/next steps/what's being requested when there are not multiple attachments/50 comments etc * bugs are cheap Many thanks :-)

Assignee: nobody → bstack

Status: NEW → RESOLVED

Closed: 6 years ago

Component: Treeherder: Job Triggering & Cancellation → General

Product: Tree Management → Taskcluster

Resolution: --- → FIXED

Summary: create a method which is easily initiated in treeherder by the sheriffs to retrigger a job with extra options on 2 revisions → Update the backfill action task to support performance sheriffing/test-verify workflows

Version: --- → unspecified

Dustin J. Mitchell [:dustin] (he/him)

Comment 36

•

6 years ago

It's worth noting, I think the best "flow" for creating new actions is to initially develop them in-tree, using the "Custom Action.." option in treeherder to test and experiment with them. Once that's fully landed and if necessary uplifted to the relevant trees, only then would a new bug be filed to add some UI bindings for the action to Treeherder.

Sylvestre Ledru [:Sylvestre]

Updated

•

6 years ago

Keywords: leave-open