Closed Bug 1385887 Opened 7 years ago Closed 7 years ago

Create an S3 bucket to hold uploaded generated sources from builds

Categories: Taskcluster :: General, enhancement
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: ted, Assigned: dustin

I'd like to upload generated source files from builds that we ship to users, so that we can link to the source from crashes in crash-stats, and also serve the source to people debugging Nightly/Release builds. I think the simplest thing we can do here is set up an S3 bucket and upload the source files as individual keys.

The source files will likely vary by build type (across platform/bitness), but probably won't change much from build to build. In light of that, my thought was to upload them with a content hash in the key, so something like:

    c1cb54a78cb0391f917c3dc48cd5c70973ed4d73f8a8a25326927dea220280deaa3e1a7b27c89308cb4e7323715145ce7d74480626fcc07b8c78bcf33cfca828/toolkit/components/telemetry/TelemetryEventEnums.h

...for the file ${objdir}/toolkit/components/telemetry/TelemetryEventEnums.h with that SHA-512 hash. That way files that don't change from build to build will not take up additional space, and files with the same path but different content (because they're from a different build, or because the generation process has changed) will not conflict. We could omit the filename and just use the content hash, but that would make the URLs unfriendly, and there may be consumers (like Microsoft debuggers) that display the URL, so I'd like to have the filename in there as well.

In my local Linux64 build there are ~65MB worth of generated files that would be uploaded. As I said, I'd expect most of these not to change between builds of the same type, so the bucket probably won't grow very large over time. I think we could set a default bucket policy to expire entries after 6 months or so, and configure the upload process to either overwrite existing entries or bump the expiration time, so that we don't expire files that haven't changed in a while (but are used in newer builds).
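A minimal Python sketch of that key scheme, for illustration (the helper name is hypothetical, not from an actual build task):

    import hashlib

    def s3_key_for(objdir_path, relative_path):
        # Hash the file contents in chunks to avoid loading large files at once.
        h = hashlib.sha512()
        with open(objdir_path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        # e.g. "c1cb54a7.../toolkit/components/telemetry/TelemetryEventEnums.h"
        return "%s/%s" % (h.hexdigest(), relative_path)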
Is this a purpose to which https://docs.taskcluster.net/manual/using/s3-uploads could be put? I suppose the question remains which account to create the bucket under, but ultimately that's probably not terribly important. I think the code to handle the check-then-upload process will be a bit tricky to do economically in terms of data transfer and API calls, and I'm not sure how best to handle the lifetime; I don't know if there's a way to "bump" an object's expiration. But that could all be solved. I'm doing something similar in bug 1382729 to upload docs.
That looks like it'd work fine (I saw it in that bug just recently). I can't find any way to bump the expiration date an S3 object gets from a lifecycle policy, so we might as well just re-upload all the files every time, even if they already exist. It's ~3,000 files per build, so we're talking on the order of $0.015 in request costs per build. I'm planning on doing this in a separate task after the build (like we do for uploadsymbols), so the time shouldn't be a huge factor.
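With existence checks dropped, the per-build task reduces to a loop of PUTs. A boto3 sketch under that assumption (the function name and the precomputed (local path, key) pairs are illustrative):

    import boto3

    def upload_generated_sources(files, bucket="gecko-generated-sources"):
        # files: iterable of (local_path, s3_key) pairs for one build.
        s3 = boto3.client("s3")
        for local_path, key in files:
            # Unconditional PUT: ~3,000 requests per build, as estimated above.
            s3.upload_file(local_path, bucket, key)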
dustin: when you have time, would you mind creating a bucket for this purpose and putting credentials for it in secrets? It should be fine for them to be level-3 only, since this will only get triggered on builds that we ship to users. (I'd like to make it work for try builds in the future but we don't currently upload symbols for try builds either.)
Flags: needinfo?(dustin)
Oh, apparently you can configure CORS for an entire bucket, which is something I'd want (I'm likely to consume these files in a webapp eventually): http://docs.aws.amazon.com/AmazonS3/latest/dev/cors.html

We probably just want something like:

    <CORSConfiguration>
      <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
      </CORSRule>
    </CORSConfiguration>

...to allow any site to GET files from this bucket. Similarly, we'd want to allow anonymous read permission, like: http://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html#example-bucket-policies-use-case-2
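For reference, the same rule can be applied with boto3's put_bucket_cors, assuming the caller has bucket-admin credentials; this is a sketch of one way to do it, not a record of what was run:

    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_cors(
        Bucket="gecko-generated-sources",
        CORSConfiguration={
            "CORSRules": [
                {"AllowedOrigins": ["*"], "AllowedMethods": ["GET"]}
            ]
        },
    )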
Bucket: gecko-generated-sources
Region: us-west-2 (Oregon)

Bucket Policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AddPerm",
          "Effect": "Allow",
          "Principal": "*",
          "Action": ["s3:GetObject"],
          "Resource": ["arn:aws:s3:::gecko-generated-sources/*"]
        }
      ]
    }

CORS Configuration:

    <CORSConfiguration>
      <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
      </CORSRule>
    </CORSConfiguration>

IAM User: gecko-generated-sources-upload

IAM Policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:ListBucket"],
          "Resource": ["arn:aws:s3:::gecko-generated-sources"]
        },
        {
          "Effect": "Allow",
          "Action": ["s3:GetObject", "s3:PutObject", "s3:PutObjectAcl"],
          "Resource": ["arn:aws:s3:::gecko-generated-sources/*"]
        }
      ]
    }

Secret name: project/releng/gecko/build/level-3/gecko-generated-sources-upload

I'm sure I got something wrong, so let me know when it doesn't work :)
Flags: needinfo?(dustin)
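A sketch of how a level-3 build task might consume that secret through the Taskcluster proxy to construct an S3 client. It assumes the task has the taskcluster-proxy feature enabled and the matching secrets scope, and the key names inside the secret payload are assumptions:

    import boto3
    import requests

    SECRET = "project/releng/gecko/build/level-3/gecko-generated-sources-upload"

    resp = requests.get("http://taskcluster/secrets/v1/secret/" + SECRET)
    resp.raise_for_status()
    creds = resp.json()["secret"]  # payload layout is an assumption

    s3 = boto3.client(
        "s3",
        region_name="us-west-2",
        aws_access_key_id=creds["AWS_ACCESS_KEY_ID"],          # assumed key name
        aws_secret_access_key=creds["AWS_SECRET_ACCESS_KEY"],  # assumed key name
    )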
I'm testing this out in bug 1259832 by manually creating tasks in the task creator; if it works, we can close this bug. Thanks!
Assignee: nobody → dustin
Hooray, it works!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED