Bug 1284991 - enable chain-of-trust artifact generation in docker-worker
(Taskcluster :: Workers, defect)
Status: RESOLVED FIXED
Opened 8 years ago; closed 8 years ago
Reporter: mozilla; Assignee: garndt
Attachments: 1 file

Description
For TaskCluster nightly security, we need chain-of-trust artifacts generated and signed by the workers. This will be enabled by a boolean in the task definition (task.payload.features.generateCertificate = true). The artifact will be signed by an embedded GPG key, which will be unique per AMI. The worker will:
* generate hashes for all task artifacts
* generate CoT json
* plaintext-sign the json with embedded private GPG key
* upload signed json
The chain-of-trust artifact (public/certificate.json.gpg) is as follows:

{
  "artifacts": [
    {
      "name": "public/live_backing.log",
      "sha256": "..."
    },
    ...
  ],
  "task": {
    // task definition
  },
  "taskId": "...",        // taskId of the current task
  "runId": ...,           // runId of the current run
  "workerGroup": "...",
  "workerId": "...",
  "extra": {
    "imageHash": "<hash(dockerImage)>",   // as given by Docker Hub (maybe nice to have)
    "imageArtifactSha256": "...",         // sha256 of the image artifact, if any
    // link to the docker image artifact builder, if applicable
    "region": "us-west-2",
    "instanceId": "...",
    "instanceType": "...",
    "publicIpAddress": "...",
    "privateIpAddress": "..."
  }
}
This JSON is then plaintext-GPG signed, so it's both human-readable and machine-verifiable.
Then add it to the list of artifacts to upload, and upload.
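For reference, a minimal sketch of those steps (hash the artifacts, build the JSON, clearsign it with the embedded key). The real implementation is in docker-worker, which is Node.js; the helper names and gpg invocation below are illustrative assumptions, not the actual code:

# Illustrative sketch only: docker-worker itself is Node.js. Helper names,
# paths, and the gpg invocation are assumptions, not the real implementation.
import hashlib
import json
import subprocess

def sha256_of(path, chunk_size=1024 * 1024):
    # Stream the file so large artifacts don't have to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_certificate(task, task_id, run_id, worker_group, worker_id,
                      artifacts, extra):
    # artifacts: mapping of artifact name -> local file path
    return {
        "artifacts": [{"name": name, "sha256": sha256_of(path)}
                      for name, path in sorted(artifacts.items())],
        "task": task,
        "taskId": task_id,
        "runId": run_id,
        "workerGroup": worker_group,
        "workerId": worker_id,
        "extra": extra,
    }

def clearsign(certificate, gpg_home, output_path="certificate.json.gpg"):
    # Plaintext-sign the JSON with the worker's embedded key; the result is
    # then uploaded alongside the other artifacts as public/certificate.json.gpg.
    unsigned = json.dumps(certificate, indent=2)
    subprocess.run(
        ["gpg", "--homedir", gpg_home, "--batch", "--yes",
         "--clearsign", "--output", output_path],
        input=unsigned.encode(), check=True,
    )
    return output_path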
Ideally the private key is never in a human's hands, only the public key, which will be added to a git repo or GPG keyserver. (Discussion on this is in bug 1284968.)
Comment 1 • 8 years ago (Assignee)
When hashing artifacts, are we only concerned with the artifacts that were produced by the task itself or do we want *all* artifacts including those produced by the worker (such as live_backing.log)?
Comment 2 • 8 years ago (Reporter)
I hashed all artifacts, including live_backing.log, in my proof of concept repo.
Will the live_backing.log be quiescent at that point? If there are technical (or other) reasons why this is difficult, we might be able to do without it in the artifact list, especially once we get the image sha into the CoT artifact itself. Otherwise, knowing that the log we're auditing hasn't been modified is a good thing.
Comment 3 • 8 years ago (Assignee)
Just putting some notes here based on an IRC conversation: at artifact-upload time, the task log will be copied to a temporary file that is hashed and uploaded as something like "certified.log", because at that point in the task life cycle the task isn't complete and the task log isn't fully closed.
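A tiny sketch of that idea (the helper name and artifact path are assumptions):

# Sketch only: snapshot the still-open task log so the hash in the certificate
# refers to a quiescent file.
import shutil
import tempfile

def snapshot_task_log(live_log_path):
    snapshot = tempfile.NamedTemporaryFile(delete=False, suffix=".log")
    snapshot.close()
    shutil.copyfile(live_log_path, snapshot.name)
    # The copy is hashed and uploaded as e.g. "public/logs/certified.log",
    # while live_backing.log keeps streaming until the task fully closes.
    return snapshot.name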
Comment 4 • 8 years ago
Aki's proof of concept links, for reference:
* https://github.com/escapewindow/nightly-cot-poc/blob/master/download.py
* http://people.mozilla.org/~asasaki/certificate.tgz
* http://people.mozilla.org/~asasaki/certs.tar.bz2
Updated • 8 years ago (Assignee)
Component: Docker-Worker → Worker

Comment 5 • 8 years ago (Assignee)
Some of the pieces in the extra section are very EC2-specific, such as instance type, region, and instance ID. Do we really want those, or is there some way we can make that data more generic and useful for things that are not run in EC2? Maybe it doesn't matter that it's EC2-specific.
Also, for private IP: I assume this means the private IP of the host, not of the container (at least for docker-worker).
Comment 6 • 8 years ago (Reporter)
(In reply to Greg Arndt [:garndt] from comment #5)
> Some of the pieces in the extra section are very EC2-specific, such as
> instance type, region, and instance ID. Do we really want those, or is there
> some way we can make that data more generic and useful for things that are
> not run in EC2? Maybe it doesn't matter that it's EC2-specific.
I think they're good things to know about EC2 instances.
I'm open here. I'm thinking maybe specify those things for EC2 instances, and specify a different set of things for non-EC2 instances. We'll similarly have a different set of things for generic worker vs docker worker. These may be more important for a human audit trail than automated checks, though we could potentially add some checks around these.
> Also, for private IP: I assume this means the private IP of the host, not of
> the container (at least for docker-worker).
I think so. I think :jonasfj suggested that one.
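For reference, on EC2 most of those fields come straight from the instance metadata service, so the worker could populate the extra block with something like this (an illustrative sketch, not the docker-worker code):

# Sketch only: gathering the EC2-specific "extra" fields from the instance
# metadata service (only reachable from inside the instance).
import json
import urllib.request

METADATA = "http://169.254.169.254/latest"

def _get(path):
    with urllib.request.urlopen(f"{METADATA}/{path}", timeout=2) as resp:
        return resp.read().decode()

def ec2_extra():
    # The instance identity document bundles region, instanceId, instanceType,
    # and privateIp into a single JSON response.
    identity = json.loads(_get("dynamic/instance-identity/document"))
    return {
        "region": identity["region"],
        "instanceId": identity["instanceId"],
        "instanceType": identity["instanceType"],
        "privateIpAddress": identity["privateIp"],  # the host's IP, not the container's
        "publicIpAddress": _get("meta-data/public-ipv4"),
    }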
Comment 7 • 8 years ago (Reporter)
Thanks Greg!
I took a look at https://tools.taskcluster.net/task-inspector/#wMmC2C6PR_aRzHWLfvbIzw/ . I'm aware this isn't in a complete state yet; I just wanted to point out what I noticed just in case.
* For the artifacts, we have
"hash": "7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730"
I was thinking maybe "sha256:HASH" so we can explicitly set or change hashing algorithms as needed (a small sketch of that format is at the end of this comment).
* The task definition looks good. We seem to be missing 'extra' and 'runId'.
* I'm unable to validate the signature without the public key
* The names are slightly different:
certificate.json.gpg -> certificate.gpg
task.payload.features.generateCertificate -> task.payload.features.certificateOfTrust
and we have a new certified.log.
I don't have strong opinions here, other than the names we choose should be consistent across worker implementations. It's likely the names we choose will stick around for a while: once they're implemented in nightly graphs in-tree and across all worker types, they may be tricky to change.
Pretty sure you're aware of most, if not all of the above, already. Great to see this nearly implemented! I should be able to focus on the verification portion in scriptworker next week.
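A small sketch of that algorithm-prefixed hash format (illustrative only; the field name and the set of supported algorithms are assumptions):

# Sketch of the "algorithm:hex" digest idea, so the hash algorithm can change
# later without changing the artifact schema.
import hashlib

SUPPORTED = {"sha256": hashlib.sha256, "sha512": hashlib.sha512}

def format_digest(data, algorithm="sha256"):
    # data: bytes of the artifact being hashed
    return f"{algorithm}:{SUPPORTED[algorithm](data).hexdigest()}"

def verify_digest(data, prefixed):
    algorithm, _, expected = prefixed.partition(":")
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return SUPPORTED[algorithm](data).hexdigest() == expected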
Comment 8 • 8 years ago (Reporter)
(In reply to Aki Sasaki [:aki] from comment #7)
> * I'm unable to validate the signature without the public key
n/m, used https://raw.githubusercontent.com/gregarndt/docker-worker/aad4b90727a39d209edda23653388288e713831a/test/fixtures/gpg_signing_key.asc :)
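For anyone else checking one of these certificates, verification amounts to importing the worker's public key and running a clearsign verify; a sketch using the gpg CLI (file names are placeholders):

# Sketch: verify a downloaded certificate against the worker's public key
# using the gpg CLI. File names are placeholders.
import subprocess
import tempfile

def verify_certificate(public_key_path, certificate_path):
    # Use a throwaway keyring so the check doesn't touch the user's own keyring.
    with tempfile.TemporaryDirectory() as gpg_home:
        subprocess.run(
            ["gpg", "--homedir", gpg_home, "--import", public_key_path],
            check=True,
        )
        result = subprocess.run(
            ["gpg", "--homedir", gpg_home, "--verify", certificate_path],
        )
        # gpg exits 0 for a good signature (it will warn that the key is untrusted).
        return result.returncode == 0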
Comment 9 • 8 years ago (Assignee)
OK, I made some changes to fix the problems noticed.
Here is a new task:
https://tools.taskcluster.net/task-inspector/#T2v5ZGD2Scywq0UAVbKuEQ/0
Comment 10 • 8 years ago (Reporter)
That looks good!
We seem to not have uploaded the xfoo, bar, or live[_backing].log, though.
Comment 11 • 8 years ago (Assignee)
Ah shoot, let me adjust the expiration time for that stuff. This was created while running a test, so we set the expiration of the artifacts very low, like 15 minutes.
Comment 12 • 8 years ago (Assignee)
Here's a new one with artifacts that will expire in a month: https://tools.taskcluster.net/task-inspector/#HJWHCXN1QvW-Yzc_oltG_g/0
Comment 13 • 8 years ago (Assignee)
Attachment #8783622 - Flags: review?(wcosta)

Updated • 8 years ago
Assignee: nobody → garndt
Status: NEW → ASSIGNED
Comment 14 • 8 years ago
Comment on attachment 8783622 [details]
docker-worker PR 237
lgtm, as long as the tests pass.
Attachment #8783622 - Flags: review?(wcosta) → review+

Updated • 8 years ago
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated • 6 years ago
Component: Worker → Workers