Closed Bug 1545524 Opened 6 years ago Closed 5 years ago

Signing shippable builds doesn't work on GCP

Categories

(Release Engineering :: Release Automation: Signing, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: coop, Unassigned)

We are hitting chain-of-trust errors on the Bgs signing builds that were just enabled in GCP. Specifically, the jobs can't find the chain-of-trust json or signature, e.g.:

2019-04-17T23:41:49 DEBUG - 404 downloading https://queue.taskcluster.net/v1/task/DMtFwlJTT_mhdus60qpabw/artifacts/public%2Fchain-of-trust.json: 404; body={
  "code": "ResourceNotFound",
  "message": "Artifact not found\n\n---\n\n* method: getLatestArtifact\n* errorCode: ResourceNotFound\n* statusCode: 404\n* time: 2019-04-17T23:41:49.240Z",
  "requestInfo": {
    "method": "getLatestArtifact",
    "params": {
      "0": "public/chain-of-trust.json",
      "taskId": "DMtFwlJTT_mhdus60qpabw",
      "name": "public/chain-of-trust.json"
    },
    "payload": {},
    "time": "2019-04-17T23:41:49.240Z"
  }
}

(Taken from https://tools.taskcluster.net/groups/R6LARtCYSCaQttAPnCu1Lw/tasks/CDCO6hZ5QKSC3t6l_c_iKw/details)
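For anyone reproducing the 404 by hand, here is a minimal sketch of how the artifact URL in the log is put together. The queue root and the unsigned artifact name come from the log line above; the `.sig` artifact name and the helper names are assumptions for illustration, not taken from the signing scripts.

```python
from urllib.parse import quote

# Queue root as it appears in the log above.
QUEUE_ROOT = "https://queue.taskcluster.net/v1"

# The first name is from the log; the ".sig" name is an assumption.
COT_ARTIFACTS = (
    "public/chain-of-trust.json",
    "public/chain-of-trust.json.sig",
)


def artifact_url(task_id, name):
    """Build the latest-run artifact URL, percent-encoding '/' in the name."""
    return "{}/task/{}/artifacts/{}".format(QUEUE_ROOT, task_id, quote(name, safe=""))


def cot_urls(task_id):
    """Return the URLs a chain-of-trust check would need for one task."""
    return [artifact_url(task_id, name) for name in COT_ARTIFACTS]


if __name__ == "__main__":
    for url in cot_urls("DMtFwlJTT_mhdus60qpabw"):
        print(url)
```

Fetching each URL (for example with `curl -I`) shows which of the two artifacts the upstream build task actually published.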

Note: this is absolutely expected right now since we just turned these builds on at tier 3. We don't need a fix until we try to push these builds to tier 1, but wanted to get this on releng's radar ASAP.

This would be because chain of trust is not enabled on the gcp workers. Once that is done, this should just work, and there is nothing that needs to be done on releng's side.

When shipping builds are moved, the appropriate keys will need to be added here. We should probably generate a new set of keys for GCP.

Workers should be able to generate the unsigned chain-of-trust.json if properly configured, even if they're not level 3. Once they have a valid keypair (once they're level 3) they should be able to generate a valid .sig. The error appears to be a missing unsigned CoT artifact, which the Taskcluster team should be able to fix. Once we have that, we should be able to dep sign, at least.
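The three states described in this comment can be sketched as a small decision function. This is only an illustration of the reasoning above; the function name and the returned labels are hypothetical and do not correspond to anything in the worker or signing code.

```python
def signing_readiness(has_cot_json, has_valid_sig):
    """Classify a build task by which chain-of-trust artifacts it produced.

    has_cot_json:  the unsigned chain-of-trust.json artifact exists
    has_valid_sig: a .sig made with a trusted (level 3) keypair exists
    """
    if not has_cot_json:
        # Worker is not configured for chain of trust at all (the 404 case).
        return "no-cot"
    if not has_valid_sig:
        # Unsigned CoT only: enough for dep signing, per the comment above.
        return "dep-only"
    # Both artifacts present: the build can go through trusted signing.
    return "trusted"


if __name__ == "__main__":
    print(signing_readiness(False, False))
    print(signing_readiness(True, False))
    print(signing_readiness(True, True))
```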

Blocks: 1547111

This involves generating trusted docker-worker GCE images and configuring a new worker-type to run these images. Through scopes, we must guarantee that only level 3 tasks can run on trusted instances.
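As a rough illustration of how scopes can restrict who may target a worker-type, the sketch below implements the trailing-`*` scope matching rule and applies it to task-creation scopes. The provisioner name, worker-type names, and role contents are all hypothetical; the real scope grants would live in the Taskcluster role configuration.

```python
def satisfies(held, required):
    """A held scope satisfies a required one if equal, or if it ends in '*'
    and the required scope starts with everything before that '*'."""
    if held.endswith("*"):
        return required.startswith(held[:-1])
    return held == required


def can_create_task(held_scopes, provisioner, worker_type, priority="highest"):
    """Check whether a set of scopes allows creating a task on a worker-type."""
    required = "queue:create-task:{}:{}/{}".format(priority, provisioner, worker_type)
    return any(satisfies(scope, required) for scope in held_scopes)


# Hypothetical scope sets: only the level 3 role may reach trusted workers.
LEVEL_1_SCOPES = ["queue:create-task:highest:example-gcp/untrusted-*"]
LEVEL_3_SCOPES = [
    "queue:create-task:highest:example-gcp/untrusted-*",
    "queue:create-task:highest:example-gcp/trusted-*",
]

if __name__ == "__main__":
    print(can_create_task(LEVEL_1_SCOPES, "example-gcp", "trusted-build"))
    print(can_create_task(LEVEL_3_SCOPES, "example-gcp", "trusted-build"))
```

Keeping trusted and untrusted worker-types under distinct name prefixes is what makes a single wildcard grant per level sufficient in this sketch.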

Can we turn off those signing jobs, at least on integration (mozilla-inbound, autoland)? I'm not sure whether it makes sense to reduce all GCP builds on integration to something like hourly runs, or to run them only on request, as long as they don't run tests.

Had a meeting with jlund and fubar yesterday to discuss moving release builds to GCP. Turns out shippable builds are the linchpin, so we'll need to fix this.

We did determine yesterday that we can have multiple CoT keys in a single taskgraph, so between comment #2 and comment #3, I think we have all the info we need to roll new versions of the workers for use in GCP.

Just a note that when we briefly ran parallel Windows builds on GCP, we deliberately omitted the ed25519 key from the Windows level 3 workers there, so as not to generate signed, trusted builds on a platform we had just started testing on.

Adding the trusted key to the level 3 Windows builders on GCP should also resolve this issue for Windows builds there. Please let me know if I should proceed with adding the trusted key on Windows.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED