1563341 - [worker-manager] update static provider to require workers to authenticate themselves

Assignee

Description

5 years ago

Per some discussions with relops, the idea is to pre-configure static workers with a shared secret that those workers can use to authenticate themselves to worker-manager and get taskcluster credentials.

We'll need an API to add and delete workers, with the ability for a provider to "reject" a worker (so that, e.g., the google and ec2 providers won't allow adding workers). Users can then add workers to worker pools managed by a static provider.

The static provider will then need a registerWorker implementation that verifies the shared secret.

Note that those who would like to can still run workers without this sort of pre-configuration, just as they always have done: configure the worker with credentials including queue:claim-task:<workerId> and set it running.

I'll make the credential lifetime a provider configuration parameter. Firefox CI uses static workers that restart frequently, and thus call registerWorker frequently, but other use cases will likely have long-running workers. Since we're issuing temporary credentials that have a maximum lifetime of 30 days, we have two options:

1. require workers to shut down and re-register before their credentials expire (worker-runner could do this pretty easily); or
2. issue permacreds for workers

We can solve that in a followup.
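
To make the first option concrete, here is a minimal sketch of how a configured lifetime could be clamped to the 30-day limit and how a worker might pick a time to re-register. The function names and the one-hour safety margin are assumptions made for illustration; they are not part of worker-manager or worker-runner.

```python
from datetime import datetime, timedelta, timezone

# Maximum lifetime of temporary credentials, as stated in the description.
MAX_TEMP_CREDENTIAL_LIFETIME = timedelta(days=30)


def credential_expiry(issued_at, configured_lifetime):
    """Return the expiry time, clamping the configured lifetime to 30 days."""
    lifetime = min(configured_lifetime, MAX_TEMP_CREDENTIAL_LIFETIME)
    return issued_at + lifetime


def reregister_deadline(expires, safety_margin=timedelta(hours=1)):
    """Return the latest time at which the worker should re-register.

    The safety margin is a hypothetical value chosen for this sketch.
    """
    return expires - safety_margin


issued = datetime(2019, 7, 1, tzinfo=timezone.utc)
# A provider configured with 90 days still yields 30-day credentials.
expires = credential_expiry(issued, timedelta(days=90))
deadline = reregister_deadline(expires)
```

A long-running worker would stop claiming new tasks and call registerWorker again once the current time passes `deadline`.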

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Updated

5 years ago

Blocks: tc-cloudops

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 1

5 years ago

AJ, this is solving the problem of identifying a hardware worker to the Taskcluster services such that it can start claiming work. Messing this up would potentially mean that anyone can get credentials to claim and execute tasks.

I'll add to the description above that I'd like to validate the shared secret in such a way that it is not revealed. That will probably be by sending

{"salt": "iethu6mishaeSho2Thai", "hash": "aegh9loo1Nongier9ko1xaj3ok6na1Ahvahb7iez"}

where the hash is HMAC(<secret>, <salt>). The secret will be user-specified.
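
As a rough sketch of that salt-and-hash proof, assuming HMAC-SHA256 with hex output (the comment does not name a digest or an encoding) and hypothetical helper names:

```python
import hashlib
import hmac
import secrets


def make_proof(secret: bytes) -> dict:
    """Worker side: pick a random salt and compute HMAC(secret, salt)."""
    salt = secrets.token_urlsafe(15)
    digest = hmac.new(secret, salt.encode("ascii"), hashlib.sha256).hexdigest()
    return {"salt": salt, "hash": digest}


def verify_proof(secret: bytes, proof: dict) -> bool:
    """Server side: recompute the HMAC and compare in constant time."""
    salt = str(proof["salt"]).encode("utf-8")
    expected = hmac.new(secret, salt, hashlib.sha256).hexdigest()
    presented = str(proof["hash"]).encode("utf-8")
    return hmac.compare_digest(expected.encode("ascii"), presented)


proof = make_proof(b"user-specified-secret")
accepted = verify_proof(b"user-specified-secret", proof)
```

Because the worker chooses the salt in this sketch, a captured salt/hash pair would verify again if replayed, so the scheme hides the secret itself but not the ability to use it.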

Do you see anything else to worry about here?

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Updated

5 years ago

Flags: needinfo?(abahnken)

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 2

5 years ago

First half will be https://github.com/taskcluster/taskcluster/pull/998

AJ Bahnken [:ajvb] (she/her)

Comment 3

5 years ago

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #1)

> AJ, this is solving the problem of identifying a hardware worker to the Taskcluster services such that it can start claiming work. Messing this up would potentially mean that anyone can get credentials to claim and execute tasks.
>
> I'll add to the description above that I'd like to validate the shared secret in such a way that it is not revealed. That will probably be by sending
> {"salt": "iethu6mishaeSho2Thai", "hash": "aegh9loo1Nongier9ko1xaj3ok6na1Ahvahb7iez"}
> where the hash is HMAC(<secret>, <salt>). The secret will be user-specified.
>
> Do you see anything else to worry about here?

This all sounds pretty good. I do have a few questions:

1. "validate the shared secret in such a way that it is not revealed" - Why? Why not have the workers have something adjacent to an API key that is sent along in the request?
2. How will these "shared secrets" be created? Who will be able to create them?
3. When you "add a worker", does that create a "shared secret" that is then used by the worker to claim tasks? If not, then what exactly does "adding a worker"/registerWorker do?

Flags: needinfo?(abahnken) → needinfo?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 4

5 years ago

> "validate the shared secret in such a way that it is not revealed" - Why? Why not have the workers have something adjacent to an API key that is sent along in the request?

The particular way I implemented this accomplishes nothing, now that I look at it again. Yeah, let's just use it as a bearer token.

> How will these "shared secrets" be created? Who will be able to create them?

They'll be created by the caller of the createWorker API method. The alternative is for createWorker to generate a random secret and return it. The chosen approach is a little more flexible for users, allowing cases where, for example, all workers in a pool have the same secret. Users can make that decision for themselves.

> When you "add a worker", does that create a "shared secret" that is then used by the worker to claim tasks? If not, then what exactly does "adding a worker"/registerWorker do?

Yes, basically. There are two steps here:

1. user calls createWorker with a secret value, and sets up worker with that same value
2. worker calls registerWorker on startup, using that secret value as an "identity proof", and gets Taskcluster credentials (which include scopes to claim tasks) in response
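
The two steps can be sketched with an in-memory stand-in for the static provider. The class, its method signatures, and the shape of the returned credentials are assumptions made for illustration and do not reflect the real worker-manager API; the scope string follows the pattern quoted in the description.

```python
import hmac


class StaticProviderSketch:
    """Hypothetical in-memory model of createWorker / registerWorker."""

    def __init__(self):
        # (worker_pool_id, worker_id) -> secret supplied by the user
        self._secrets = {}

    def create_worker(self, worker_pool_id, worker_id, secret):
        """Step 1: the user registers a worker together with its secret."""
        self._secrets[(worker_pool_id, worker_id)] = secret

    def register_worker(self, worker_pool_id, worker_id, presented_secret):
        """Step 2: the worker presents the secret as a bearer token."""
        expected = self._secrets.get((worker_pool_id, worker_id))
        if expected is None or not hmac.compare_digest(
            expected.encode("utf-8"), presented_secret.encode("utf-8")
        ):
            raise PermissionError("worker identity proof rejected")
        # Placeholder for the temporary Taskcluster credentials.
        return {"scopes": [f"queue:claim-task:{worker_id}"]}


provider = StaticProviderSketch()
provider.create_worker("pool-a", "worker-1", "a-long-random-secret")
creds = provider.register_worker("pool-a", "worker-1", "a-long-random-secret")
```

The constant-time comparison is a design choice of this sketch, made so that a failed guess does not leak how much of the secret matched.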

Flags: needinfo?(dustin)

AJ Bahnken [:ajvb] (she/her)

Comment 5

5 years ago

> > How will these "shared secrets" be created? Who will be able to create them?
>
> They'll be created by the caller of the createWorker API method. The alternative is for createWorker to generate a random secret and return it. The chosen approach is a little more flexible for users, allowing cases where, for example, all workers in a pool have the same secret. Users can make that decision for themselves.

This is interesting. So the user supplies their "shared secret"? Are there requirements on the format and entropy of the secret? How is accidental re-use prevented?

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 6

5 years ago

There aren't any such requirements -- those are all security requirements that the user/deployer would enforce, and not a threat to the TC platform itself. Re-use might be beneficial in some cases, e.g., a collection of identical VMs from the same template, where the distinction between the workers is not important.

That said, I can implement what you're suggesting. I'd like to get some feedback from the relops folks as to whether that makes their work more difficult first.

AJ Bahnken [:ajvb] (she/her)

Comment 7

5 years ago

> There aren't any such requirements -- those are all security requirements that the user/deployer would enforce, and not a threat to the TC platform itself. Re-use might be beneficial in some cases, e.g., a collection of identical VMs from the same template, where the distinction between the workers is not important.

I hear that. If this route is taken, I'd suggest some sort of format/entropy requirements to prevent bruteforcing.

> That said, I can implement what you're suggesting. I'd like to get some feedback from the relops folks as to whether that makes their work more difficult first.

Ok sounds good.

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Updated

5 years ago

Blocks: 1562975

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 8

5 years ago

> I hear that. If this route is taken, I'd suggest some sort of format/entropy requirements to prevent bruteforcing.

Hm, I think this could work. For client accessTokens, we use two slugids back to back, so 44 characters matching some charset. We could do the same here and include in the docs that we recommend using slugid() + slugid() and that each worker have a unique secret. That doesn't prohibit aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa but does prohibit TODO and foobar.
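
A minimal sketch of that recommendation, assuming a slugid is the URL-safe base64 encoding of a random UUID with the padding stripped (22 characters) and that the format check is just a length-and-charset test; the helper names are hypothetical:

```python
import base64
import re
import uuid

# Assumed format check: exactly 44 URL-safe base64 characters.
SECRET_PATTERN = re.compile(r"[A-Za-z0-9_-]{44}")


def slugid_v4() -> str:
    """Encode a random UUID as 22 URL-safe base64 characters."""
    encoded = base64.urlsafe_b64encode(uuid.uuid4().bytes)
    return encoded.decode("ascii").rstrip("=")


def generate_worker_secret() -> str:
    """Two slugids back to back, giving a 44-character secret."""
    return slugid_v4() + slugid_v4()


def is_acceptable_secret(secret: str) -> bool:
    """Check the format only; this cannot measure actual entropy."""
    return SECRET_PATTERN.fullmatch(secret) is not None


secret = generate_worker_secret()
```

With this check, `is_acceptable_secret("TODO")` and `is_acceptable_secret("foobar")` are false, while a string of 44 `a` characters still passes, matching the trade-off described above.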

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 9

5 years ago

https://github.com/taskcluster/taskcluster/pull/1044

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Updated

5 years ago

Status: NEW → RESOLVED

Closed: 5 years ago

Resolution: --- → FIXED

Categories

(Taskcluster :: Services, task)

Tracking

(Not tracked)

People

(Reporter: dustin, Assigned: dustin)

Security

(public)
