Closed Bug 1290620 Opened 8 years ago Closed 8 years ago

Use common command to perform common actions in TaskCluster tasks

Categories

(Firefox Build System :: Task Configuration, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gps, Assigned: gps)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

I propose changing the command for all taskcluster tasks to be a wrapper program that acts as a "harness" for the task. I see it performing a few fundamental roles: * Re-executing self as a non-root user * Performing a Gecko source checkout * Executing another command and prefixing its output with the current time Most existing tasks do at least one of these today. Except they all reinvent the wheel in shell scripts. And various tasks often forget a step, such as not running as root. This will bring taskcluster tasks in line with buildbot and mozharness jobs, which print "step" data and timestamps that downstream consumers can use to analyze timings. Contrast this with a TaskCluster task of today like https://public-artifacts.taskcluster.net/CCEb_7EQSruA1WfTH20KFw/0/public/logs/live_backing.log, which is just a wall of text. And you have no clue wow much time is being spent until you get to a `mach` invocation. We already have the beginnings of a wrapper/harness command in checkout-gecko-and-run. It should be relatively easy to port it to a generic "run-task" command. Regarding prefixing lines with timestamps, we should figure out how this interacts with mach and mozharness line prefixing...
Before, it ignored mach logging settings passed via command line arguments. Review commit: https://reviewboard.mozilla.org/r/68128/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/68128/
Attachment #8776233 - Flags: review?(dustin)
Attachment #8776234 - Flags: review?(dustin)
Attachment #8776235 - Flags: review?(dustin)
Before, we simply executed scripts inside Docker containers. This frequently resulted in a wall of text with command output. It was difficult to discern things like the time spent performing certain actions. Before, individual tasks had to drop permissions from the default root user themselves. Dropping permissions isn't exactly a trivial thing to do and a number of tasks didn't do it or did it wrong. Before, we had a "checkout-gecko-and-run" script that kinda/sorta did common activities for us. But it was written as a shell script and doing advanced things was difficult. This commit can be treated as a rewrite of "checkout-gecko-and-run" as a Python script. But it also does a bit more. It prefixes output with timestamps so we know how long operations took. It features more robust argument parsing, so we can add new features more easily. To prove the new wrapper script works, the lint image and all tasks using it have been converted to use it. Review commit: https://reviewboard.mozilla.org/r/68130/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/68130/
This commit does a lot. But it's really not too difficult to comprehend once you focus on the final state, which is basically the same as the "lint" image and derived tasks. Before, the "decision" image contained a "checkout-gecko" script and "run-action" and "run-decision" scripts. The latter 2 invoked the first script. The "run-task" script basically does what the combination of these scripts were doing before. So we switch to it. While we're here, we also replaced the custom Mercurial installation in this image with the shared install-mercurial.sh script. The system-setup.sh script for the decision image is now short and sweet. The YAML files for tasks using this image have been updated to use "run-task." We no longer have to pass an environment variable to hold command arguments. So we revert to putting these arguments inline in the task's command. Dustin never liked passing the arguments as environment variables, so it should make him happy ;) We add --log-no-times because "run-task" prefixes its own timestamps on output lines. The path to the Gecko checkout has been changed from /home/worker/workspace to /home/worker/checkouts to match changes made in bug 1289643. Finally, since "checkout-gecko-and-run" is no longer used, we delete it. The Docker image version has been bumped accordingly. Review commit: https://reviewboard.mozilla.org/r/68132/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/68132/
Attachment #8776233 - Flags: review?(dustin) → review+
Comment on attachment 8776233 [details] Bug 1290620 - Make `mach taskgraph` honor mach logging settings; https://reviewboard.mozilla.org/r/68128/#review65418 ::: taskcluster/mach_commands.py:195 (Diff revision 1) > if not quiet: > level = logging.DEBUG if verbose else logging.INFO > - self.log_manager.add_terminal_logging(fh=sys.stderr, level=level) > + self.log_manager.add_terminal_logging( > + fh=sys.stderr, level=level, > + write_interval=old.formatter.write_interval, > + write_times=old.formatter.write_times) I admit to blindly hacking my way through this when I wrote it. Is this the best approach? Would you prefer `terminal_handler.set_{level,fh}`, rather than enumerating all of the attributes that should not change here?
Attachment #8776234 - Flags: review?(dustin) → review+
Comment on attachment 8776234 [details] Bug 1290620 - Implement a run-task wrapper script; https://reviewboard.mozilla.org/r/68130/#review65420 ::: testing/docker/recipes/run-task:109 (Diff revision 1) > + task_args = [] > + > + parser = argparse.ArgumentParser() > + parser.add_argument('--user', default='worker') > + parser.add_argument('--group', default='worker') > + parser.add_argument('checkout') Why not make this a `--` style argument? ::: testing/docker/recipes/run-task:150 (Diff revision 1) > + # Drop permissions to requested user. > + print_line('setup', 'running as %s:%s\n' % (args.user, args.group)) > + os.setgroups(gids) > + os.umask(022) > + os.setresgid(gid, gid, gid) > + os.setresuid(uid, uid, uid) This will need some careful testing for the desktop-test image. We had a hell of a time getting pulseaudio to work using 'su', but switching to 'sudo' fixed it. Just as an FYI.
Attachment #8776235 - Flags: review?(dustin) → review+
https://reviewboard.mozilla.org/r/68128/#review65418 > I admit to blindly hacking my way through this when I wrote it. Is this the best approach? Would you prefer `terminal_handler.set_{level,fh}`, rather than enumerating all of the attributes that should not change here? The mach logging code is horribly hacky. The code in this file isn't ideal. But fixing it would likely require some new mach APIs. And I have to be in a special mood to touch that code :)
https://reviewboard.mozilla.org/r/68130/#review65420 > Why not make this a `--` style argument? Yeah, I could see us wanting to make the checkout optional. Best to change it now. > This will need some careful testing for the desktop-test image. We had a hell of a time getting pulseaudio to work using 'su', but switching to 'sudo' fixed it. Just as an FYI. I should have documented this in the commit message, but I based the system calls off what I observed `sudo` doing in a Docker container. The only major thing I'm not doing is calling setrlimit(). But I don't think that's too relevant inside a container anyway since containers have their own resource controls via cgroups. So I'm optimistic things will "just work" when ported. But, yeah, porting desktop-test could be "fun."
Comment on attachment 8776233 [details] Bug 1290620 - Make `mach taskgraph` honor mach logging settings; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/68128/diff/1-2/
Comment on attachment 8776234 [details] Bug 1290620 - Implement a run-task wrapper script; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/68130/diff/1-2/
Comment on attachment 8776235 [details] Bug 1290620 - Use run-task from decision task; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/68132/diff/1-2/
Assignee: nobody → gps
Status: NEW → ASSIGNED
Pushed by dmitchell@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/392d902c517d Make `mach taskgraph` honor mach logging settings; r=dustin https://hg.mozilla.org/integration/autoland/rev/9b080fde55c4 Implement a run-task wrapper script; r=dustin https://hg.mozilla.org/integration/autoland/rev/b18ad418cba8 Use run-task from decision task; r=dustin
Blocks: 1291070
I suspect this regressed the lint jobs. Here's a f8 log from central: https://public-artifacts.taskcluster.net/Mx9NdTyDR02RCYtjCvuOjw/0/public/logs/live_backing.log Notice there's a lint error, but the job remains green. This is because the task returns 0, though jgraham verified that the |mach lint| command is returning 1. I'm guessing we are returning the code for the run-task wrapper instead of |mach lint|.
And of course as soon as I comment someone points me at bug 1291070. Looks like the fix is landed but not merged yet.
Product: TaskCluster → Firefox Build System
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: