Closed
Bug 1317747
Opened 8 years ago
Closed 8 years ago
enable chain of trust verification in beetmoverworker
Categories
(Release Engineering :: General, defect)
Release Engineering
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mozilla, Assigned: jlund)
References
Details
Attachments
(3 files, 1 obsolete file)
Looks like beetmover scriptworker is ready to be tier-1 enabled.
We'll have to make these changes:
a. scopes
For signing, we have 3 levels of permissions, guarded by scopes.
project:releng:signing:cert:release-signing [1], which we only allow on release-capable branches, project:releng:signing:cert:nightly-signing [2], which we only allow on nightly-capable branches, and project:releng:signing:cert:dep-signing, which can be used anywhere.
Scriptworker uses the above-linked data structures to verify that a privileged scope is only used on an appropriate branch. Signingscript determines which level of access to grant based on those scopes [3].
We need to follow this model for the other *scripts. For beetmover, I think this would be bucket credentials. Release bucket creds, nightly bucket creds, and staging bucket creds. That would allow chain of trust verification to make sure we're not pushing to a privileged bucket from a non-privileged branch.
If we can't separate creds at first, let's file a followup bug to do so, and verify our upload bucket location matches release, nightly, or staging based on these scopes.
[1] https://github.com/mozilla-releng/scriptworker/blob/121c474f5b21084a4a3742f21c3f30c018e5c766/scriptworker/constants.py#L219
[2] https://github.com/mozilla-releng/scriptworker/blob/121c474f5b21084a4a3742f21c3f30c018e5c766/scriptworker/constants.py#L232
[3] https://github.com/mozilla-releng/signingscript/blob/master/signingscript/task.py#L19
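A minimal sketch of how beetmoverscript might pick a bucket level from the task scopes, following the signing model above. The scope prefix and the three level names are hypothetical, modelled on the signing scopes; the real beetmover scopes may be named differently.

```python
# Hypothetical scope prefix, modelled on project:releng:signing:cert:*.
BUCKET_SCOPE_PREFIX = "project:releng:beetmover:bucket:"
# Assumed levels: release, nightly, and staging bucket creds.
KNOWN_BUCKETS = ("release", "nightly", "staging")


def bucket_from_scopes(scopes):
    """Return the single bucket level named by the task's scopes.

    Raises ValueError if there is no bucket scope, more than one,
    or a level that is not in KNOWN_BUCKETS.
    """
    levels = [
        scope[len(BUCKET_SCOPE_PREFIX):]
        for scope in scopes
        if scope.startswith(BUCKET_SCOPE_PREFIX)
    ]
    if len(levels) != 1:
        raise ValueError("expected exactly one bucket scope, got {}".format(levels))
    if levels[0] not in KNOWN_BUCKETS:
        raise ValueError("unknown bucket level {!r}".format(levels[0]))
    return levels[0]
```

Requiring exactly one bucket scope keeps the choice unambiguous, so the restricted-scope check in scriptworker and the creds chosen by the script cannot disagree.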
b. downloads / upstreamArtifacts
For signing, we used to have task.payload.unsignedArtifacts, which was a list of URLs. Now we have task.payload.upstreamArtifacts [4], which is a list of dictionaries that look like
"upstreamArtifacts": [
  {
    "paths": [
      "public/build/target.tar.bz2",
      "public/build/target.checksums"
    ],
    "formats": [
      "gpg"
    ],
    "taskId": "GFPKeLbAQN2fytGOXgatIg",
    "taskType": "build"
  },
  { ... }
]
The taskId is the taskId of the task we're downloading from. The paths are the artifact paths we're downloading. I don't know if you need to use the "formats" key or need to embed any additional information; we can play with this schema. "taskType" is there for chain of trust verification. Currently we only support "build", "l10n", "decision", "docker-image", but we can add more.
Scriptworker will pre-download these artifacts into $artifact_dir/public/cot/$task_id/$path , and verify their SHAs before calling the script. Beetmoverscript no longer needs to download these artifacts; it can and should use the pre-downloaded artifacts on disk.
I don't know how many artifacts we download, and how large this is going to get. If we don't want to upload all of these upstreamArtifacts at the end of the beetmover task, we can move them to $work_dir or otherwise remove from $artifact_dir before the end of the task, or change where scriptworker downloads them to.
[4] https://queue.taskcluster.net/v1/task/M81unWcDQje2XEwhtmDXrw
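As a rough illustration of using the pre-downloaded artifacts, the sketch below maps each upstreamArtifacts entry to the on-disk location described above ($artifact_dir/public/cot/$task_id/$path). The function name is hypothetical and not part of scriptworker or beetmoverscript.

```python
import os


def upstream_artifact_paths(artifact_dir, upstream_artifacts):
    """Map each upstream taskId to the local paths of its pre-downloaded artifacts.

    Assumes the layout $artifact_dir/public/cot/$task_id/$path.
    """
    paths = {}
    for entry in upstream_artifacts:
        task_id = entry["taskId"]
        for path in entry["paths"]:
            local = os.path.join(artifact_dir, "public", "cot", task_id, path)
            paths.setdefault(task_id, []).append(local)
    return paths
```

The script would then read from these paths instead of downloading anything itself.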
c. scriptworker.cot.verify will need to support beetmover type workers
We'll also have to support any other new task types that we depend on.
d. upstream tasks will need to point at the right deps and have chain of trust generation enabled.
To enable chain of trust generation in a non-scriptworker task, set task.payload.features.chainOfTrust to true.
When there are additional tasks we need to set as chain of trust dependencies in non-scriptworker tasks, we add them to task.extra.chainOfTrust.inputs, which looks like
"inputs": {
  "docker-image": "taskId",
  ...
}
For upstream scriptworker tasks, we have sign_chain_of_trust [5] and upstreamArtifacts. We can also follow the same task.extra.chainOfTrust.inputs model if that's easiest.
This may prevent us from fully enabling chain of trust verification on beetmover if we depend directly on other non-signing scriptworker tasks that don't yet have chain of trust enabled, but we have some prefs [6] we can use until it's all enabled end-to-end.
[5] https://github.com/mozilla-releng/scriptworker/blob/121c474f5b21084a4a3742f21c3f30c018e5c766/scriptworker/constants.py#L58
[6] https://github.com/mozilla-releng/scriptworker/blob/121c474f5b21084a4a3742f21c3f30c018e5c766/scriptworker/constants.py#L57-L60
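A small sketch of what section d asks of an upstream non-scriptworker task definition, treating the task as a plain dict. The helper name is hypothetical, and the key spelling follows the task.extra.chainOfTrust.inputs form mentioned in this section; treat the exact casing as an assumption to check against the worker documentation.

```python
import copy


def enable_chain_of_trust(task, inputs=None):
    """Return a copy of `task` with chain of trust generation switched on.

    `inputs` maps a task type (e.g. "docker-image") to the taskId it depends on.
    """
    task = copy.deepcopy(task)
    # Feature flag that asks the worker to generate chain of trust artifacts.
    features = task.setdefault("payload", {}).setdefault("features", {})
    features["chainOfTrust"] = True
    if inputs:
        # Extra chain of trust dependencies beyond the defaults.
        cot = task.setdefault("extra", {}).setdefault("chainOfTrust", {})
        cot.setdefault("inputs", {}).update(inputs)
    return task
```

Working on a deep copy leaves the original definition untouched, which makes it easy to compare the two when debugging a verification failure.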
e. puppet
With bug 1316702, we now have a shared scriptworker puppet module. Let's use that.
* There are updated dependencies, all pushed to the python3.5 pypi location.
* We now use a scriptworker.yaml which is much larger than our previous config.json. This is populated in the scriptworker module, using variables you pass [7]. I still have the supervisord settings in the signing scriptworker area, because the watch file list can be different per instance type.
* gpg keys - we'll need to create new gpg keys per scriptworker instance, and make sure they're signed by an appropriate key. The trusted keys are in scriptworker/trusted and the worker keys go into scriptworker/valid in the cot-gpg-keys repo [8].
[7] https://hg.mozilla.org/build/puppet/file/tip/modules/signing_scriptworker/manifests/init.pp#l54
[8] https://github.com/mozilla-releng/cot-gpg-keys
Updated•8 years ago
Summary: enable chain of trust verification in beetmover → enable chain of trust verification in beetmoverworker
Comment 1•8 years ago (Reporter)
I imagine we're going to hit similar issues, and you may have more context around balrog scriptworker, so let's work closely on these bugs. And thank you!
Assignee: nobody → jlund
Comment 2•8 years ago (Reporter)
I have a wip date patch in 1317800 that partially addresses beetmover, and started addressing the jsonschema in https://github.com/escapewindow/beetmoverscript/commits/cot ... I'm hoping those are helpful; if not, we don't have to use them.
Comment 3•8 years ago (Reporter)
https://github.com/escapewindow/scriptworker/commit/914e5c7b3e8604fd3cf7aacfd9649f4e7638f803 should check the restricted scopes against the tree in scriptworker. Once beetmoverscript determines which bucket/creds to use based on scopes (and uses the latest scriptworker with that patch), that will complete the scopes circuit.
Comment 4•8 years ago
* Since I'm already testing the balrogworker patch, I thought it'd be a good idea to tweak the beetmoverworker side while all the puppet knowledge is still fresh.
* First iteration of the beetmoverworker puppet refactoring using the shared scriptworker module. I haven't tested it yet; it will likely need a few more tweaks before being production-ready.
* I won't add a reviewer yet; I'll follow up with more tweaks later.
Updated•8 years ago
Comment 5•8 years ago
Dropping here, for later use, the PR used in puppet to pin the loaner environment and prepare the CoT-enabled beetmoverworker for the production switch. I won't request feedback or review, as it's for testing purposes only and we're going to rework this diff before going to production to get rid of all the staging-environment-dependent variables.
Attachment #8823182 -
Attachment is obsolete: true
Comment 6•8 years ago
* Used a hello-world dummy task https://tools.taskcluster.net/task-inspector/#RHF9KDIGRaCZzuh73RsIpg/0 to make sure we reach the point where the task script runs. I'm ready to push a new version of beetmoverscript (most likely 0.1.0, as jlund pointed out in his last PR) and adapt the task accordingly to see whether the new CoT changes work as expected.
* Created gpg keys for the existing beetmoverworker-1; PR: https://github.com/mozilla-releng/cot-gpg-keys/pull/12
* Added the corresponding gpg keys in hiera.
Attachment #8825057 -
Flags: review?(aki)
Comment 7•8 years ago (Reporter)
Comment on attachment 8825057 [details]
Add beetmoverworker-1 key in cot-gpg-keys.
Merged.
Attachment #8825057 -
Flags: review?(aki) → review+
Comment 8•8 years ago
Prepping build-cloud-tools for ramping up new instances for both {beetmover,balrog}workers.
Attachment #8825097 -
Flags: review?(rail)
Updated•8 years ago
Attachment #8825097 -
Flags: review?(rail) → review+
Comment 9•8 years ago
Brought up another instance to prepare for the cut-over: beetmoverworker-2.srv.releng.usw2.mozilla.com
Comment 10•8 years ago (Reporter)
https://tools.taskcluster.net/task-inspector/#C44A3KawSoyXFt_xP2mnlQ/0 -
2017-01-13 00:30:16,502 - beetmoverscript.utils - INFO - {'mapping': {'en-US': {'target.complete.mar': {'s3_key': 'firefox-53.0a1.en-US.linux-i686.complete.mar',
'update_balrog_manifest': True},
'target.tar.bz2': {'s3_key': 'firefox-53.0a1.en-US.linux-i686.tar.bz2'}}},
'metadata': {'description': 'Maps Firefox Nightly artifacts to pretty names '
'for en-US',
'name': 'Beet Mover Manifest',
'owner': 'release@mozilla.com'},
's3_prefix_dated': 'pub/firefox/nightly/2017/01/2017-01-12-22-01-43-date/',
's3_prefix_latest': 'pub/firefox/nightly/latest-date/'}
Traceback (most recent call last):
  File "/builds/beetmoverworker/bin/beetmoverscript", line 9, in <module>
    load_entry_point('beetmoverscript==0.1.2', 'console_scripts', 'beetmoverscript')()
  File "/builds/beetmoverworker/lib/python3.5/site-packages/beetmoverscript/script.py", line 217, in main
    loop.run_until_complete(async_main(context))
  File "/tools/python35/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
    return future.result()
  File "/tools/python35/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/tools/python35/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "/builds/beetmoverworker/lib/python3.5/site-packages/beetmoverscript/script.py", line 52, in async_main
    await move_beets(context, context.artifacts_to_beetmove, mapping_manifest)
  File "/builds/beetmoverworker/lib/python3.5/site-packages/beetmoverscript/script.py", line 65, in move_beets
    manifest['mapping'][locale][artifact]['s3_key'])
KeyError: 'target.checksums.asc'
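One way to surface this kind of mismatch before any upload starts is to compare the artifacts to be moved against the manifest mapping up front. This is only a sketch: the function name and the shape of `artifacts_by_locale` (locale mapped to a list of artifact names) are assumptions, not beetmoverscript's actual data structures.

```python
def missing_manifest_entries(manifest, artifacts_by_locale):
    """Return sorted (locale, artifact) pairs that have no s3_key in the manifest."""
    missing = []
    mapping = manifest.get("mapping", {})
    for locale, artifacts in artifacts_by_locale.items():
        for artifact in artifacts:
            # A missing locale, artifact, or s3_key all count as missing.
            if "s3_key" not in mapping.get(locale, {}).get(artifact, {}):
                missing.append((locale, artifact))
    return sorted(missing)
```

Failing early with the full list of unmapped artifacts is more useful than a bare KeyError on the first one.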
Comment 11•8 years ago (Reporter)
https://tools.taskcluster.net/task-inspector/#XpvkNZbqTSeylnuxVYJAOg/0
scriptworker.exceptions.CoTError: 'path public/build/sv-SE/update/target.complete.mar not in beetmover:signing DsvGCksVRWupNNEDy8izXw chain of trust artifacts!'
It looks like we have an errant update/ (should be public/build/sv-SE/target.complete.mar )
Comment 12•8 years ago (Reporter)
Comment 13•8 years ago (Reporter)
https://hg.mozilla.org/projects/date/rev/3a13e245a272e5e58ff13c3ef4e3f5769b5c29a6
bug 1317747 - remove target.checksums{,.asc} references in beetmover. r=bustage
Comment 14•8 years ago (Reporter)
https://public-artifacts.taskcluster.net/IFJ90KFDRZSzm_UQqjr4uQ/0/public/logs/task_error.log
and https://public-artifacts.taskcluster.net/VzI_ye6ASvSmL3bQNGG9cg/0/public/logs/task_error.log
  File "/builds/beetmoverworker/lib/python3.5/site-packages/beetmoverscript/script.py", line 65, in move_beets
    manifest['mapping'][locale][artifact]['s3_key'])
KeyError: 'target.tar.bz2.asc'
https://public-artifacts.taskcluster.net/J6oqzvchT9yAcOy5OqsX9g/0/public/logs/task_error.log
uploaded 34 artifacts to s3 in parallel; we got
2017-01-13 04:41:51,867 - beetmoverscript.script - INFO - 400
2017-01-13 04:41:51,912 - beetmoverscript.script - INFO - <?xml version="1.0" encoding="UTF-8"?><Error><Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.</Message><RequestId>662ECD237729A92F</RequestId><HostId>kX6DO
I'm thinking we should limit the parallel uploads to... 20? 10?
I'm doing this in scriptworker here: https://github.com/mozilla-releng/scriptworker/blob/master/scriptworker/worker.py#L104
We could do that here: https://github.com/mozilla-releng/beetmoverscript/blob/master/beetmoverscript/script.py#L213
Hoping those are the last 2 errors, but we'll see.
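A minimal sketch of capping the parallel uploads with an asyncio.Semaphore. `bounded_gather` is a hypothetical helper, not the code in scriptworker or beetmoverscript, and it takes zero-argument coroutine functions so that no upload starts before it holds a slot.

```python
import asyncio


async def bounded_gather(coro_funcs, limit):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run(func):
        # Each upload waits here until one of the `limit` slots is free.
        async with semaphore:
            return await func()

    return await asyncio.gather(*(run(func) for func in coro_funcs))
```

With 34 artifacts and a limit of 10, the remaining 24 uploads wait for a free slot instead of all opening connections at once; the limit can be tuned down if retries still show up.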
Comment 15•8 years ago (Reporter)
Green BM and BM-S!
I still see a number of retries for the upload at 20. We can try 15 or 10 to see if that improves things; I'd aim for 100% successful as the general case, and use retries for actual errors, rather than always relying on retries and not having many left if there's a real hiccup.
Comment 16•8 years ago (Reporter)
Done on date. Please comment+reopen if that's not the case.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•7 years ago
Component: General Automation → General