Closed
Bug 1307826
Opened 8 years ago
Closed 8 years ago
Deploy PushApkWorker on its own production machine
Categories
(Release Engineering :: Release Automation: Other, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jlorenzo, Assigned: jlorenzo)
References
Details
Attachments
(3 files)
Bug 1306307 is making the staging instance live. This bug tracks the work related to the production environment.
Assignee
Comment 1•8 years ago
:mtabara told me you were the expert on that matter :) Feel free to redirect.
Attachment #8809077 -
Flags: review?(rail)
Assignee
Updated•8 years ago
Assignee: nobody → jlorenzo
Comment 2•8 years ago
Comment on attachment 8809077 [details]
Build cloud tool PR
commented in the PR
Attachment #8809077 -
Flags: review?(rail) → review+
Assignee
Comment 6•8 years ago
Attachment #8812754 -
Flags: review?(rail)
Assignee
Comment 7•8 years ago
Comment on attachment 8809433 [details]
Bug 1307826 - Deploy PushApkWorker on its own production machine
Here are the puppet changes for pushapkworker production. Staging was already reviewed by Aki in bug 1306307.
As of now, the production instance is already live (via my personal environment). I followed the steps you mentioned in bug 1308042 comment 33:
1. Packages with the correct version (0.1.3) exist at [1]
2. Hiera secrets are fed in releng-puppet2:/etc/hiera/secrets.eyaml
3. 700 on /build/pushapkworker[2]
4. no nuke needed, for now
5. verbose is currently on[3]
Regarding your other points:
a) production instance was built against a clean ec2
e) PR in build-cloud-tools already landed (see other attachments of this bug)
f) fqdn updated
g) the scopes needed are held here[4]
j) I'm waiting on your review before pushing to hg.m.o :)
[1] https://releng-puppet2.srv.releng.scl3.mozilla.com/python/packages-3.5/
[2] https://reviewboard.mozilla.org/r/92024/diff/3#38
[3] https://reviewboard.mozilla.org/r/92024/diff/3#0
[4] https://tools.taskcluster.net/auth/clients/#project%252freleng%252fscriptworker%252fpushapk%252fproduction
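Step 3 above (mode 700 on /build/pushapkworker) could be double-checked on the host with a few lines of Python. This is only a minimal sketch: `has_mode` is a hypothetical helper, not part of the puppet patch, and the demo runs against a temporary directory rather than the real path.

```python
import os
import stat
import tempfile


def has_mode(path, expected=0o700):
    """Return True if the permission bits of `path` equal `expected`."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected


# Demo on a throwaway directory (assumption: a POSIX filesystem).
with tempfile.TemporaryDirectory() as demo_dir:
    os.chmod(demo_dir, 0o700)
    owner_only = has_mode(demo_dir)

print(owner_only)
```

On the worker itself, the same helper would be called with `/build/pushapkworker` as the path.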
Attachment #8809433 -
Flags: review?(mtabara)
Updated•8 years ago
Attachment #8812754 -
Flags: review?(rail) → review+
Comment 8•8 years ago
mozreview-review
Comment on attachment 8809433 [details]
Bug 1307826 - Deploy PushApkWorker on its own production machine
https://reviewboard.mozilla.org/r/92024/#review94970
All in all it looks good ;) However, I'd advise a rebase against the current head of puppet: a lot has changed in terms of signing/scriptworker there, and there might be numerous conflicts you'd have to tackle. Also, please make sure you push these changes to a temporary PR on https://github.com/mozilla/build-puppet to check the linters. You can close the PR once the tests pass. It saves lots of trouble later on.
I'll mark this as r- for now since I'm not sure if all the changes are successfully merge-able in the main repo.
Please correct me if I'm wrong.
::: manifests/moco-config.pp:406
(Diff revision 3)
> # TC signing workers
> $signingworker_exchange = "exchange/taskcluster-queue/v1/task-pending"
> $signingworker_worker_type = "signing-worker-v1"
>
> - # TC signing scriptworkers
> - $signing_scriptworker_provisioner_id = "scriptworker-prov-v1"
> + # TC Scriptworkers
> + $scriptworker_provisioner_id = "scriptworker-prov-v1"
Might need to rebase. This var no longer exists.
::: manifests/moco-config.pp
(Diff revision 3)
>
> # TC beetmover scriptworkers
> $beetmover_scriptworker_task_max_timeout = 2400
> $beetmover_scriptworker_artifact_expiration_hours = 336
> $beetmover_scriptworker_artifact_upload_timeout = 600
> - $beetmover_scriptworker_verbose_logging = false
Slippery fingers? :P Please leave this be, it's used in the beetmoverworker templates :)
::: manifests/moco-config.pp:465
(Diff revision 3)
> beetmover_aws_s3_fennec_bucket => "net-mozaws-stage-delivery-archive",
> }
> }
>
> + ## TC pushapk scriptworkers
> + $pushapk_scriptworker_taskcluster_client_id = secret("pushapk_scriptworker_taskcluster_client_id")
Don't forget to add these two to hiera before merging default to production, or else it'll complain loudly.
::: manifests/moco-nodes.pp:1189
(Diff revision 3)
> + $timezone = "UTC"
> + include toplevel::server::pushapkworker
> +}
> +
> ## Loaners
> +
All good here, but I'm wondering if this should land on production branch. Might need to double-check with @rail or @aki on this one.
The way I did this was to take the diff and add the ## Loaners section as a patch on my puppet environment. You get the same result, absent having it deployed on production. But again, I could be completely wrong, please double check with more experienced puppet folks from our team ;)
::: modules/pushapkworker/manifests/init.pp:102
(Diff revision 3)
> + group => "${users::signer::group}",
> + content => secret('pushapk_scriptworker_release_google_play_certificate'),
> + show_diff => false;
> + }
> +
> + service {
No longer need this - since https://hg.mozilla.org/build/puppet/rev/f295d0822bd4 it's been a common disabled service
::: modules/pushapkworker/templates/config.json.erb:21
(Diff revision 3)
> + "verify_chain_of_trust": false,
> + "sign_chain_of_trust": false,
> +
> + "credentials": {
> + "clientId": "<%= scope.function_secret(["pushapk_scriptworker_taskcluster_client_id"]) %>",
> + "accessToken": "<%= scope.function_secret(["pushapk_scriptworker_taskcluster_access_token"]) %>"
@rail discouraged me from using scope.function_secret directly in the templates, as it hides the secrets from the main moco-config.pp file. Instead, you can use the same mechanism used for worker_id above ^
::: modules/pushapkworker/templates/script_config.json.erb:8
(Diff revision 3)
> + "schema_file": "<%= scope.lookupvar("config::pushapk_scriptworker_root") %>/lib/python3.5/site-packages/pushapkscript/data/pushapk_task_schema.json",
> + "verbose": <%= @env_config["pushapk_scriptworker_verbose_logging"] %>,
> +
> + "google_play_accounts": {
> + "aurora": {
> + "service_account": "<%= scope.function_secret(["pushapk_scriptworker_aurora_google_play_service_account"]) %>",
More or less a nit: @rail discouraged me from using scope.function_secret directly in the templates, as it hides the secrets from the main moco-config.pp file. Instead, you can use the same mechanism used for verbose above ^
::: modules/pushapkworker/templates/script_config.json.erb:12
(Diff revision 3)
> + "aurora": {
> + "service_account": "<%= scope.function_secret(["pushapk_scriptworker_aurora_google_play_service_account"]) %>",
> + "certificate": "<%= scope.lookupvar("config::pushapk_scriptworker_aurora_google_play_certificate") %>"
> + },
> + "beta": {
> + "service_account": "<%= scope.function_secret(["pushapk_scriptworker_beta_google_play_service_account"]) %>",
@rail discouraged me from using scope.function_secret directly in the templates, as it hides the secrets from the main moco-config.pp file. Instead, you can use the same mechanism used for verbose above ^
::: modules/pushapkworker/templates/script_config.json.erb:16
(Diff revision 3)
> + "beta": {
> + "service_account": "<%= scope.function_secret(["pushapk_scriptworker_beta_google_play_service_account"]) %>",
> + "certificate": "<%= scope.lookupvar("config::pushapk_scriptworker_beta_google_play_certificate") %>"
> + },
> + "release": {
> + "service_account": "<%= scope.function_secret(["pushapk_scriptworker_release_google_play_service_account"]) %>",
same here.
Attachment #8809433 -
Flags: review?(mtabara) → review-
Assignee
Comment 13•8 years ago
Comment on attachment 8809433 [details]
Bug 1307826 - Deploy PushApkWorker on its own production machine
Thanks for the review Mihai! Here's a new revision that addresses the following points:
* Rebased on top of the latest tip[1]
* `function_secret()` is not used anymore in templates. Vars are now retrieved in moco-config.pp. This allows tc_credentials to be defined in the same hiera file
* PushApkScript has been upgraded to 0.1.4[2] (to avoid debug logs in oauth2)
* Single quotes are now used when a string doesn't need to be evaluated
* The loaner entry has disappeared
* So has the rpcbind one
* The signer user is not the owner anymore. cltbld is.
This patch has been tested against
* the linter running in Github[3]
* the staging instance[4]
* the production instance[5] (still pinned to my personal env)
Requesting a new review from Rail, as Mihai is on PTO this week.
[1] https://hg.mozilla.org/build/puppet/rev/43cd21057086
[2] https://github.com/mozilla-releng/pushapkscript/releases/tag/0.1.4
[3] https://github.com/mozilla/build-puppet/pull/20
[4] https://tools.taskcluster.net/task-inspector/#J9WN95hvQOOw20tqaFfXyA/0
[5] https://tools.taskcluster.net/task-inspector/#JTWEANzkSBSZBL_ZPxfT7A/0
Attachment #8809433 -
Flags: review?(rail)
Assignee
Comment 15•8 years ago
Comment on attachment 8809433 [details]
Bug 1307826 - Deploy PushApkWorker on its own production machine
Now that bug 1320672 is fixed, I changed the PR so we have different certificates for dev and prod. This has been tested against staging[1] and against prod[2]. When puppet agent applied the changes, nothing changed in production (as expected) and the new certs were applied to staging. I am sure the new certs were picked up because [3] complained about insufficient permissions (this has since been fixed).
As you can see, [1] and [2] are marked as failed. This is a limitation of Google Play, which doesn't accept the same APK being uploaded twice. Today's APKs were already uploaded earlier today, in my previous comment.
r? Rail
[1] https://tools.taskcluster.net/task-inspector/#CIYaqzgdQc6GHIeVeN050w/0
[2] https://tools.taskcluster.net/task-inspector/#IixzRzW8TYuo3jMcVnk78g/0
[3] https://tools.taskcluster.net/task-inspector/#AD9kUpfNQyW43OhkKALY-Q/0
Attachment #8809433 -
Flags: review?(rail)
Comment 16•8 years ago
mozreview-review
Comment on attachment 8809433 [details]
Bug 1307826 - Deploy PushApkWorker on its own production machine
https://reviewboard.mozilla.org/r/92024/#review96122
::: modules/pushapkworker/manifests/mime_types.pp:7
(Diff revision 8)
> +
> + case $::operatingsystem {
> + CentOS: {
> + file { '/etc/mime.types':
> + mode => '0644',
> + content => 'application/vnd.android.package-archive apk',
This one is a bit brutal. :) I hope we don't use this file anywhere else.
Attachment #8809433 -
Flags: review?(rail) → review+
Assignee
Comment 17•8 years ago
mozreview-review-reply
Comment on attachment 8809433 [details]
Bug 1307826 - Deploy PushApkWorker on its own production machine
https://reviewboard.mozilla.org/r/92024/#review96122
> This one is a bit brutal. :) I hope we don't use this file anywhere else.
This file is used by google-api-python-client to make sure it's pushing an APK. It relies on https://docs.python.org/3/library/mimetypes.html, which needs this file no matter what distro we're on. Without this file, google-api-python-client just errors out, saying it can't handle the given file. I'll add a comment in the code about that.
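To illustrate the dependency described above, here is a minimal sketch of how the standard `mimetypes` module resolves the `.apk` extension from a mime.types-style file. It assumes nothing about how google-api-python-client calls the module internally; `guess_with_mime_file` is a hypothetical helper, and the demo writes a temporary file instead of touching the real /etc/mime.types.

```python
import mimetypes
import os
import tempfile

# Same single entry that the puppet manifest writes to /etc/mime.types.
APK_MIME_LINE = "application/vnd.android.package-archive apk\n"


def guess_with_mime_file(mime_file_path, filename):
    """Guess the MIME type of `filename` using an extra mime.types file."""
    db = mimetypes.MimeTypes(filenames=(mime_file_path,))
    mime_type, _encoding = db.guess_type(filename)
    return mime_type


with tempfile.TemporaryDirectory() as tmp_dir:
    mime_path = os.path.join(tmp_dir, "mime.types")
    with open(mime_path, "w", encoding="utf-8") as handle:
        handle.write(APK_MIME_LINE)
    apk_type = guess_with_mime_file(mime_path, "fennec.apk")

print(apk_type)  # application/vnd.android.package-archive
```

The module-level `mimetypes.guess_type()` reads system files such as /etc/mime.types when it initializes, which is why the entry has to exist on the worker.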
Assignee
Comment 19•8 years ago
https://hg.mozilla.org/build/puppet/rev/94d76acb93cf9e88d18183aef12a5e237edd705a
Bug 1307826 - Deploy PushApkWorker on its own production machine
Assignee
Comment 20•8 years ago
Comment on attachment 8809433 [details]
Bug 1307826 - Deploy PushApkWorker on its own production machine
Carrying over r+. I just added a comment to explain why /etc/mime.types was necessary https://reviewboard.mozilla.org/r/92024/diff/8-9/
Landed on default branch at https://hg.mozilla.org/build/puppet/rev/94d76acb93cf
Attachment #8809433 -
Flags: review+
Comment 21•8 years ago
Merged to production: https://hg.mozilla.org/build/puppet/rev/190761744079
Assignee
Comment 22•8 years ago
I manually ran `sudo puppet agent --test`, which gave:
> Info: Retrieving pluginfacts
> Info: Retrieving plugin
> Info: Loading facts
> Info: Caching catalog for pushapkworker-1.srv.releng.use1.mozilla.com
> Info: Applying configuration version '190761744079'
The process still seems to be up and running, even after a `sudo supervisorctl restart pushapkworker`. I'll wait until the next aurora comes to see if the whole pipeline works.
Assignee
Comment 23•8 years ago
It worked:
* Job appearing in Treeherder: https://treeherder.mozilla.org/#/jobs?repo=mozilla-aurora&revision=96503957841c8c7617a416719c89a06778de396a&filter-tier=3&selectedJob=4340617
* TC task: https://tools.taskcluster.net/task-inspector/#Lpu-6VuHT1ekFOP8FnlqTg/0
Due to yesterday's aurora bustage, I discovered a minor bug that prevents some results from being seen in Treeherder: https://github.com/mozilla-releng/fennec-aurora-task-creator/issues/9. As it's not related to the production deployment per se, I'll fix it there.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Comment 24•8 years ago
Good job on finishing this, jlorenzo++ ;)