Closed Bug 1422740 Opened 7 years ago Closed 7 years ago

Put symbol upload token in Taskcluster secrets service, stop using private Docker image

Categories

(Firefox Build System :: General, enhancement)

enhancement
Not set
normal

Tracking

(firefox59 fixed)

RESOLVED FIXED
mozilla59

People

(Reporter: ted, Assigned: ted)

References

(Blocks 1 open bug)

Details

Attachments

(5 files)

We currently use a private Docker image with a baked-in Socorro authentication token to upload crashreporter symbols to Socorro:
https://dxr.mozilla.org/mozilla-central/rev/574f4f58fe09dd590ea892406e237318c31705b4/taskcluster/ci/upload-symbols/kind.yml#46

bug 1315287 moved the Dockerfile into the tree, but it's not actually built on-demand like our other in-tree Docker images because of the baked-in token:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/docker/upload-symbols/Dockerfile

The Taskcluster secrets service didn't exist when this was originally implemented, but now it does, so we should store the token there and change the symbol upload script to fetch the token from the secrets service. We can then change the symbol upload Docker image to be a non-private image and life will be better. Alternatively, we could get rid of the image entirely and just use desktop-build or something similar to run the symbol upload script.
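A rough sketch of what fetching the token from the secrets service could look like in the upload script. This is illustrative only, not the landed patch: it assumes the task runs with the Taskcluster proxy enabled (so the services are reachable at the `taskcluster` hostname), and the `token` key inside the secret is a hypothetical name. `fetch_json` and `get_upload_token` are made-up helper names.

```python
import json
import urllib.request

# Assumption: the task runs with the Taskcluster proxy enabled, which exposes
# Taskcluster services to the task at the hostname "taskcluster".
SECRETS_URL_TEMPLATE = "http://taskcluster/secrets/v1/secret/{name}"


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def get_upload_token(secret_name, fetch=fetch_json):
    """Return the symbol upload token stored under secret_name.

    The secrets service wraps the stored value in a "secret" object; the
    "token" key inside it is a hypothetical name chosen for this sketch.
    """
    payload = fetch(SECRETS_URL_TEMPLATE.format(name=secret_name))
    return payload["secret"]["token"]
```

Passing `fetch` as a parameter keeps the lookup testable outside of a task, where the `taskcluster` hostname does not resolve.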

This doesn't strictly block bug 1422735, since the token we currently use will be migrated to Tecken so that the existing setup should Just Work, but it will make testing changes easier. In the interim we can wire things up to use Tecken's staging instance (https://symbols.stage.mozaws.net/) for uploads from Try so we can test that changes to the symbol upload task work as expected. peterbe has some followup work for Tecken that will fix bug 1138617 and let us actually upload Try symbols somewhere useful, so fixing this will make it simpler to enable that as well.
dustin: you seem to be the last one to locate the current token we're using (in bug 1315287). Would you be able to dig it out and put it in a secret in Taskcluster? I'm planning to try to make things better so that we can use this on try as well, so we should probably name them something like `project/releng/gecko/build/level-{1,3}/gecko-symbol-upload`. I can create a token on the Tecken staging server to use for level 1 uploads for now. (It's hard to get symbol uploads without explicitly asking for them anyway, so uploading to a staging server should be fine.)
Flags: needinfo?(dustin)
Hm, I don't have access to puppet secrets anymore.  I can ask someone who does to get me that secret, or would it be easier to just generate a new one?
I'd like to try to minimize the changes here, so if we can dig out the existing secret that would be best. Let's find someone in releng who has access.
peterbe gave me admin access on the Tecken staging server: https://symbols.stage.mozaws.net/

So I generated a symbol upload token there and dustin helpfully stored it in a level-1 secret in taskcluster: project/releng/gecko/build/level-1/gecko-symbol-upload
https://tools.taskcluster.net/secrets/project%2Freleng%2Fgecko%2Fbuild%2Flevel-1%2Fgecko-symbol-upload

I can now write a patch to test using that to upload to Tecken stage on try.
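For reference, a minimal sketch of the request the upload script might build once it has the token. Assumptions, none of which are confirmed in this bug: the server accepts a POST to an `upload/` path, authenticates via an `Auth-Token` header, and takes a `url` form field pointing at the symbols zip. `build_upload_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request

# Staging server named in this bug; the "upload/" path is an assumption.
STAGE_UPLOAD_URL = "https://symbols.stage.mozaws.net/upload/"


def build_upload_request(token, symbols_zip_url, upload_url=STAGE_UPLOAD_URL):
    """Build (but do not send) a POST asking the server to fetch a symbols zip."""
    # Assumed form field: "url" carries the location of the zip to download.
    body = urllib.parse.urlencode({"url": symbols_zip_url}).encode("ascii")
    return urllib.request.Request(
        upload_url,
        data=body,
        # Assumed header name for the upload token.
        headers={"Auth-Token": token},
        method="POST",
    )
```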

We'll still need to get someone from Releng with access to put the current prod Socorro token into a level-3 secret.
Flags: needinfo?(dustin)
project/releng/gecko/build/level-3/gecko-symbol-upload has been created
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #6)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a747220d5068a70415eec957f2d51c4d9048683

I went too far with this patch: I tried to have the symbol-upload image built in-tree but didn't get that fully working, so it broke the decision task.

(In reply to Ted Mielczarek [:ted.mielczarek] from comment #7)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=7e27cd9a70a7bc581dcf75db042d0c5d710eb286

This was closer, as I kept the existing private docker image, but I:
a) forgot to put back the scope I had removed for access to that private image and
b) made a dumb copy/paste error in the Python script

I fixed both of those so hopefully the try push in comment 8 will work. I'm planning on writing another commit on top of this to just make the upload-symbols task use an existing in-tree image, like maybe the `lint` image, and get rid of the upload-symbols image entirely.
No longer blocks: 1422735
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #10)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=b72274942466d7fc74b959cf380bbf1b7626dfd5

Nope, I'm *still* bad at this. I forgot that while `{level}` in scopes gets replaced by one of the existing transforms, the same thing isn't true of environment variables, so the env var in the resulting task had a literal `{level}` in it. :-/

Fixed that, maybe the fifth time's the charm?
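To illustrate the `{level}` problem described above: a small sketch in which the placeholder is substituted in both the scopes and the environment variables, so neither ends up with a literal `{level}`. This is not the actual transform code; `resolve_task` and the `SYMBOL_SECRET` variable name are hypothetical, and the task dict is a simplified stand-in for a real task definition.

```python
# Hypothetical task fragment: the secret name appears both in a scope and in
# an environment variable, and both contain a "{level}" placeholder.
SECRET_TEMPLATE = "project/releng/gecko/build/level-{level}/gecko-symbol-upload"


def resolve_task(task, level):
    """Return a copy of task with "{level}" filled in for scopes and env vars."""
    return {
        "scopes": [scope.format(level=level) for scope in task["scopes"]],
        "env": {key: value.format(level=level) for key, value in task["env"].items()},
    }


task = {
    "scopes": ["secrets:get:" + SECRET_TEMPLATE],
    "env": {"SYMBOL_SECRET": SECRET_TEMPLATE},
}
resolved = resolve_task(task, level=1)
print(resolved["env"]["SYMBOL_SECRET"])
```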
This last try push *finally* worked! I don't think this link will work for anyone without the right permissions, but:
https://symbols.stage.mozaws.net/uploads/upload/699

The symbols wound up in the staging bucket like:
https://symbols.stage.mozaws.net/firefox/B6978DA8BE95980FF9946AA3AE199DA40/firefox.dbg.gz
https://symbols.stage.mozaws.net/firefox/B6978DA8BE95980FF9946AA3AE199DA40/firefox.sym
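Both URLs follow the same `<debug file>/<debug ID>/<file name>` shape. A tiny sketch of that layout, inferred only from the two URLs above (`symbol_url` is a hypothetical helper, not something in the tree):

```python
STAGE_BASE_URL = "https://symbols.stage.mozaws.net/"


def symbol_url(debug_file, debug_id, filename, base_url=STAGE_BASE_URL):
    """Join the three path components seen in the uploaded symbol URLs."""
    return "{}{}/{}/{}".format(base_url, debug_file, debug_id, filename)


print(symbol_url("firefox", "B6978DA8BE95980FF9946AA3AE199DA40", "firefox.sym"))
```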
Assignee: nobody → ted
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #15)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2c0dc47e18a4d8d60ebf3360780e7396f48b12a

This looks good too! \o/

This push also switched the symbol-upload tasks to use run-task and the in-tree lint image.
Oh, I spoke too soon. One of the two symbol-upload tasks on that try push failed because it exceeded its max runtime. Adding the clone+checkout time adds just enough overhead to make this take longer than 10 minutes, I guess. I should look into using a sparse checkout profile.
Blocks: 1423881
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #18)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=58bde84f05ec1c346bb89224ff83b08e6054639b

I added a sparse profile and used it here and the upload-symbols tasks completed successfully, but the build-linux64-nightly/opt-upload-symbols task took ~10 mins, which is a little too close for comfort, so I'll bump that timeout up a bit to have a margin for safety.
Blocks: 1423900
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #20)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=ebed78ff53d1fab074b004592210c6bed11b6f56

I made a few additional changes and wanted to make sure they worked. I also rebased on top of autoland to merge with my patch from bug 1422735, but something else on autoland broke things. :-/
Note to my future self: pulling from autoland is not a good idea. Two burned try pushes in a row! For the most recent try push I picked a green changeset from autoland and rebased on that. In any event, I had enough green builds before then that I'm pretty confident in most of the patch stack; only the last changeset (changing the worker type) hasn't been tested on try yet.
Comment on attachment 8935461 [details]
bug 1422740 - Change upload_symbols script to fetch token from Taskcluster secrets service.

https://reviewboard.mozilla.org/r/206362/#review212002
Attachment #8935461 - Flags: review?(gps) → review+
Comment on attachment 8935462 [details]
bug 1422740 - change upload-symbols tasks to use in-tree lint image.

https://reviewboard.mozilla.org/r/206364/#review212004
Attachment #8935462 - Flags: review?(gps) → review+
Comment on attachment 8935463 [details]
bug 1422740 - Define and use a sparse profile for upload-symbols tasks.

https://reviewboard.mozilla.org/r/206366/#review212006
Attachment #8935463 - Flags: review?(gps) → review+
Comment on attachment 8935464 [details]
bug 1422740 - Remove the upload-symbols Docker image files.

https://reviewboard.mozilla.org/r/206368/#review212008
Attachment #8935464 - Flags: review?(gps) → review+
Comment on attachment 8935465 [details]
bug 1422740 - Use the gecko-{level}-b-linux worker for upload-symbols tasks.

https://reviewboard.mozilla.org/r/206370/#review212010

I'll approve this because it eliminates a worker type. But using a build worker for this lightweight task is wasteful. gecko-t-linux-large feels better. I just don't know if it needs to be associated with a level-specific worker so credentials can't leak. That's a question for Dustin.
Attachment #8935465 - Flags: review?(gps) → review+
It should be level-specific, yes.  I think we would still have issues running lint (CentOS-based) and test (Ubuntu-based) docker images on the same hosts?  At any rate, either a new workerType or pairing with one of the test workerTypes makes sense down the road.
Dustin recommended the build worker type because it is likely to have the lint image already cached, so startup time should be faster. I agree that the worker type is overkill. I filed bug 1424042 on making this better.
Pushed by tmielczarek@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/57736bb879c1
Change upload_symbols script to fetch token from Taskcluster secrets service. r=gps
https://hg.mozilla.org/integration/autoland/rev/b93986a970e0
change upload-symbols tasks to use in-tree lint image. r=gps
https://hg.mozilla.org/integration/autoland/rev/8b5417a83046
Define and use a sparse profile for upload-symbols tasks. r=gps
https://hg.mozilla.org/integration/autoland/rev/12d4832c9a29
Remove the upload-symbols Docker image files. r=gps
https://hg.mozilla.org/integration/autoland/rev/1117fd435051
Use the gecko-{level}-b-linux worker for upload-symbols tasks. r=gps
Blocks: 1424236
Blocks: 1424651
Depends on: 1424967
Product: Core → Firefox Build System