Closed Bug 1573987 Opened 5 years ago Closed 5 years ago

[admin-tool] generalize ci-admin to support the "public" deployment as well

Categories

(Taskcluster :: Operations and Service Requests, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

Attachments

(2 files, 10 obsolete files)

(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details

We need a way to administer Taskcluster resources in the public deployment, suitable for pull-requesting by project members.

ci-admin is most of what we need, but also contains some Firefox-CI specific bits.

Summary: generalize ci-admin to support the "public" deployment as well → [admin-tool] generalize ci-admin to support the "public" deployment as well

It looks like I'll be taking this on. Callek, would you like to be involved in design discussion, or should I just forge on?

Flags: needinfo?(bugspam.Callek)

I'd say forge on, but if you document as you go I could at least touch base if I have any thoughts as you go.

Flags: needinfo?(bugspam.Callek)
Assignee: nobody → dustin

Here's a high-level version of what I'm thinking:

  • keep the two-repo model, but move the generation code (in_tree_actions.py, etc.) into the ci-configuration repository, using a "plugin" approach
    • we may later be able to generalize some plugins, e.g., worker_pools; for the moment we'll probably just copy/paste
  • allow Resources.manage to have "holes" in the set of things it manages, so we can say "manage foo/bar/* but not foo/bar/abc/*"
    • this will allow one ci-configuration repository to "delegate" to another without trying to overwrite its changes
  • move ci-admin to a git repository and start doing numbered releases
    • TC can adopt a Python project, but Python and hg is just asking too much ;)
  • support ci-configuration in both git and hg repositories
Depends on: 1577605

Per brief word from tomprince, -1 on moving to git and -1 on moving code into ci-configuration..

Alternative plan, then, is to make ci-admin a Python library, and then use that from both the firefox-ci ci-admin tool and from the tool that the TC team uses.

I'm +0 on python library for ci-admin, and -1 on moving away from hg as well. I think the seperate library for public-ci-config is worthwhile and that can live wherever (git or hg) but for consistency I like hg.

If we are using two repos, I also like the idea of some ability to "merge" two config repos to both make sure no overlaps and to help us identify holes in the stuff we do manage.

I get the desire to keep Firefox stuff in Hg, but the rest of the world has pretty clearly made its choice so I don't think it is to our advantage to require other users of or contributors to this system to learn hg and find hg hosting. Bitbucket's recent decision to stop supporting Hg really underlined this for me.

Just to be clear on the proposal:

ci-admin becomes a distribution on pypi, still in Python, and developed in Git.

https://hg.mozilla.org/ci/ci-admin requires the ci-admin distribution and uses it as the core of its functionality, reading config from https://hg.mozilla.org/ci/ci-configuration.

https://github.com/taskcluster/community-tc-config is a Python app that requires the ci-admin distribution and uses it as the core of its functionality, reading some .yml files in the same repository for its configuration.

Regarding two repos, I think you're describing the delegation situation? Each one is self-sufficient: community-tc-config would delegate project "persona" to the "persona" team, and that team is free to use whatever tool they want to manage it, including clicking in the UI or something based on ci-admin or any other tool (Terraform, for example). So there's nothing to "merge". I don't expect the delegation approach to be useful in firefox-ci, by the way.

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #6)

Just to be clear on the proposal:

ci-admin becomes a distribution on pypi, still in Python, and developed in Git.

It is the "developed in git" part that concerns me here, I also should say that I don't expect ci-admin to have a ton of active dev on it, outside of (primarily) the firefox-ci team and to a lesser extent the taskcluster team.

I should also say that I would lean toward git being usable for local dev of ci-admin without having ci-admin need to be on github - using something like https://github.com/glandium/git-cinnabar/wiki/Mozilla:-A-git-workflow-for-Gecko-development

https://hg.mozilla.org/ci/ci-admin requires the ci-admin distribution and uses it as the core of its functionality, reading config from https://hg.mozilla.org/ci/ci-configuration.

I would imagine ci-admin distribution is both a distribution and a command line tool like it is now, so that it can just be installed and called pointing at a config repo/project

Regarding two repos, I think you're describing the delegation situation? Each one is self-sufficient: community-tc-config would delegate project "persona" to the "persona" team, and that team is free to use whatever tool they want to manage it, including clicking in the UI or something based on ci-admin or any other tool (Terraform, for example). So there's nothing to "merge". I don't expect the delegation approach to be useful in firefox-ci, by the way.

The merge was a sense of the transition and about the reason I understood you wanted to support the regex in ci-admin from another patch..

I wonder if it would make sense to separate out the firefox-ci parts of ci-admin structurally, but leave them in the same repo, as a first step. I (eventually) hope to do something like this for the in-tree taskgraph: have python/taskgraph (or third_party/python/taskgraph) and taskcluster/gecko_taskgraph, but developed in the same repo. That isn't to say that we should split things, but it means that we can easily move things back and forth to begin with.

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #6)

I get the desire to keep Firefox stuff in Hg, but the rest of the world has pretty clearly made its choice so I don't think it is to our advantage to require other users of or contributors to this system to learn hg and find hg hosting. Bitbucket's recent decision to stop supporting Hg really underlined this for me.

I fully support having ci-admin being able to work with git repositories. However, Mozilla is committed to hg (not exclusively but in terms of tooling and support), and has good tooling for it (better in many places than on github). For mozilla focused products, I don't think "what the rest of the world does" is a good reason for using git.

(In reply to Justin Wood (:Callek) from comment #7)

I should also say that I would lean toward git being usable for local dev of ci-admin without having ci-admin need to be on github - using something like https://github.com/glandium/git-cinnabar/wiki/Mozilla:-A-git-workflow-for-Gecko-development

I would be surprised if this doesn't work already.

Let's set the version-control issue aside for now.

As for splitting-but-not-splitting, I'm doing that as part of the refactoring I'm working on, but with plans to move the library portion out to a different repo almost immediately.

(In reply to Tom Prince [:tomprince] from comment #8)

I fully support having ci-admin being able to work with git repositories. However, Mozilla is committed to hg (not exclusively but in terms of tooling and support), and has good tooling for it (better in many places than on github). For mozilla focused products, I don't think "what the rest of the world does" is a good reason for using git.

This isn't true for non-Gecko products. For many teams at Mozilla, having to deal with hg is a blocker. I'm guessing mercurial users at Mozilla know how to use git, but many git users at Mozilla don't know how to use mercurial. Also, regarding the library portion, we're talking about a piece of the Taskcluster platform, which is completely on Github.

OK, I've got some work done on this. It's a pretty big patch series that has minimal actual impact, so I'll summarize it here.

The breaking change is that it renames the command ci-admin to tc-admin. Everything else remains unchanged.

The non-breaking impact is that the entire project is divided into tcadmin/, which implements a generic library for administering TC using the familiar patterns: generated, current, diff, apply, check; and ciadmin/, which contains the implementation of Firefox-CI administration using that library. There are no imports of ciadmin.* from tcadmin.*. There's a new top-level tc-admin.py that sets things up for the particular implementation, and checks are moved to checks/.

Next steps would be

  • Abstract out some of the ciconfig support for loading data used to generate resources, in the form of some utility classes capable of loading things from another repo or local files, and of parsing YAML and creating attrs-based objects.
  • Separate tests for the two packages
  • Document tc-admin as a library:
    • User guide (tc-admin diff, tc-admin apply, etc.)
    • How to write tc-admin.py
    • APIs available to users
  • When that's all stable, snap off tc-admin and stick it on pypi

Oh, and I'm happy to squash some of these, too, since they involved moving a lot of files around only to end up where they started. Just let me know if you'd like that.

Depends on D44546

Depends on D44548

Depends on D44552

I skimmed this patch set and I have no strong concerns.

My noted things are:

  • Some stuff is moved/added only to be moved back/removed later (blessings, ciadmin->tcadmin->ciadmin, click.options() etc)
  • Each patch would be easier to review standalone if we had more descriptive prose in the commit message to what it was doing.
  • the spin-event-loop for pytest magic is a bit scary, and it does concern me in the sense that we could possibly be doing stuff not expected and may break at a pytest whim in the future
  • I'm not a huge fan of pytest.skip automatically in any sort of test like this, which relies primarily on an env var being set. Because it makes it easy to forget to set it and then you break a test. Would be good to do pytest.fail("Set TCADMIN_SKIP_AUTH_TESTS=1 in your environment to skip") or some such instead imo.

I am neck deep in some other higher priority work so any further review by me would be at least until next week, let me know if you feel my delay would block you though (I'm fully ok with tom reviewing without me giving it a second look)

(In reply to Justin Wood (:Callek) from comment #23)

My noted things are:

  • Some stuff is moved/added only to be moved back/removed later (blessings, ciadmin->tcadmin->ciadmin, click.options() etc)
  • Each patch would be easier to review standalone if we had more descriptive prose in the commit message to what it was doing.

Yeah, maybe I should squash them down. I'm sure that will make Phabricator's brain explode. I also forgot that Phab can't show you a combined diff over all the Revs in a stack. I had separated the commits mostly to keep from losing my mind as I moved the parts around.

  • the spin-event-loop for pytest magic is a bit scary, and it does concern me in the sense that we could possibly be doing stuff not expected and may break at a pytest whim in the future

The "interesting" bit is happening outside of pytest where we need to get a value available only asynchronously. Pytest (pytest-asyncio in particular) is pretty clear that it can't be run inside a running event loop, and I don't see that decision being reversed on a whim.

That said, I can amend the registry to allow sync values, too, which check_path always should be.

  • I'm not a huge fan of pytest.skip automatically in any sort of test like this, which relies primarily on an env var being set. Because it makes it easy to forget to set it and then you break a test. Would be good to do pytest.fail("Set TCADMIN_SKIP_AUTH_TESTS=1 in your environment to skip") or some such instead imo.

Agreed, and it's sort of weird that these tests require any external access. They're also super-slow. I changed that because it's annoying that the tests fail "out of the box". The approach we've taken with Taskcluster is to set NO_TEST_SKIP in CI, and only allow skipping tests due to missing information when that variable is not set. So CI would still run all tests. I'll refactor to approach it that way.

The breaking change is that the command ci-admin is renamed to tc-admin.
Other external behaviors remain unchanged.

The entire project is divided into tcadmin/, which implements a generic
library for administering TC using the familiar patterns: generated, current,
diff, apply, check; and ciadmin/, which contains the implementation of
Firefox-CI administration using that library. There are no imports of
ciadmin.* from tcadmin.*. There's a new top-level tc-admin.py that sets
things up for the particular implementation, and checks are moved to checks/.

This is a set-up for subsequent work to make tc-admin a separate Python
distribution package, on which ci-admin depends.

Attachment #9090202 - Attachment is obsolete: true
Attachment #9090203 - Attachment is obsolete: true
Attachment #9090204 - Attachment is obsolete: true
Attachment #9090205 - Attachment is obsolete: true
Attachment #9090206 - Attachment is obsolete: true
Attachment #9090207 - Attachment is obsolete: true
Attachment #9090208 - Attachment is obsolete: true
Attachment #9090209 - Attachment is obsolete: true
Attachment #9090210 - Attachment is obsolete: true
Attachment #9090211 - Attachment is obsolete: true

OK, squashed except for the NO_TEST_SKIP bit (because it was easily separated). More commit comments. Review of the big squash will require some careful attention to where files are renamed!

Attachment #9090208 - Attachment is obsolete: false

https://github.com/taskcluster/tc-admin/pull/1

TODO after that's merged:

  • integration tests for tcadmin/boot.py
  • release taskcluster-admin v1.0.0
  • make a Rev for ci-admin to require and use taskcluster-admin
  • Abstract out some of the ciconfig support for loading data used to generate resources, in the form of some utility classes capable of loading things from another repo or local files, and of parsing YAML and creating attrs-based objects.
  • set up pyup and do a whole bunch of updates

https://pypi.org/project/tc-admin/1.0.0/

Filed an issue for the testing, and another for the ciconfig support -- https://github.com/taskcluster/tc-admin/issues/6

I set up renovate.. let's see what it can do with a Python project.

Tom, how would you like to proceed here? We can leave ci-admin and tc-admin separate for now, or try to land that last patch to unify the two.

Flags: needinfo?(mozilla)
Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(mozilla)
Resolution: --- → FIXED
Attachment #9094673 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: