Closed
Bug 1494750
Opened 6 years ago
Closed 6 years ago
Move Taskcluster runnable jobs handling client-side
Categories
(Tree Management :: Treeherder, enhancement, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: KWierso)
References
Details
(Keywords: leave-open)
Attachments
(2 files)
As part of buildbot cleanup (bug 1443251), the Treeherder runnable jobs API now only returns taskcluster runnable jobs:
https://github.com/mozilla/treeherder/pull/4071
Unlike the now-removed buildbot runnable jobs support, these are not stored in Treeherder's DB and are instead fetched from:
`https://public-artifacts.taskcluster.net/{task_id}/0/public/runnable-jobs.json.gz`
There is very little processing that occurs server-side (and no caching), so I think we should just move this to the frontend to save the additional HTTP round-trip.
See:
https://github.com/mozilla/treeherder/blob/ffaa2e4b2af200d3ae91e3e6ecdc2d4e4f093e84/treeherder/webapp/api/runnable_jobs.py#L8-L21
https://github.com/mozilla/treeherder/blob/ffaa2e4b2af200d3ae91e3e6ecdc2d4e4f093e84/treeherder/etl/runnable_jobs.py#L16-L61
(Whilst we'll have to leave some of the functionality in place server-side, since it's also used by SETA, this will allow for a reasonable amount of clean-up, since the two cases only partially overlap.)
Comment 1 • 6 years ago (Assignee)
Out of curiosity, how would one do the equivalent of https://github.com/mozilla/treeherder/commit/9cb37712473b0691a03aacf37110c7b24ddfc6ab in JS on the frontend? I've got the following patch applied, but the response still seems to be a gzipped blob: `response.json()` throws an error because the response isn't JSON, and `response.blob()` returns a blob of size 187525 and type application/octet-stream.
diff --git a/ui/job-view/pushes/Push.jsx b/ui/job-view/pushes/Push.jsx
index 9eae61619..995ac3997 100644
--- a/ui/job-view/pushes/Push.jsx
+++ b/ui/job-view/pushes/Push.jsx
@@ -316,6 +316,17 @@ class Push extends React.Component {
try {
const decisionTaskId = await getGeckoDecisionTaskId(push.id, repoName);
+
+ const url = `https://queue.taskcluster.net/v1/task/${decisionTaskId}/runs/0/artifacts/public/runnable-jobs.json.gz`
+ let headers = new Headers();
+ headers.append("x-taskcluster-skip-cache", "true");
+
+ var json = await fetch(url, headers)
+ .then(function(response) {
+ return response.text();
+ });
+ console.log(json);
+
const jobList = await RunnableJobModel.getList(repoName, {
decision_task_id: decisionTaskId,
});
------------------------------------
How do I get that back into JSON?
Flags: needinfo?(emorley)
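One possible answer to the question in comment 1, sketched under the assumption that the runtime provides the Compression Streams API (`DecompressionStream`), which is newer than this thread; comment 22 mentions the pako library as an alternative. The names `gunzipToJson` and `gzipJson` are hypothetical, and `gzipJson` exists only to build sample data. Separately, `fetch` takes its options as an init object, so the headers in the patch would need to be passed as `fetch(url, { headers })` rather than as the second argument directly.

```javascript
// Decode a gzipped JSON payload (for example the result of response.blob())
// into a JavaScript value. Assumes the runtime has DecompressionStream.
async function gunzipToJson(blob) {
  const decompressed = blob
    .stream()
    .pipeThrough(new DecompressionStream('gzip'));
  // Response is used here only as a convenient way to read a stream as text.
  const text = await new Response(decompressed).text();
  return JSON.parse(text);
}

// Hypothetical helper: gzip a value so the decoder can be exercised
// without a network request.
async function gzipJson(value) {
  const source = new Blob([JSON.stringify(value)]).stream();
  const compressed = source.pipeThrough(new CompressionStream('gzip'));
  return new Response(compressed).blob();
}
```

With a real response, the call would look something like `gunzipToJson(await response.blob())`.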
Comment 2 • 6 years ago (Assignee)
I switched from public-artifacts.taskcluster.net to queue.taskcluster.net after this conversation: https://gist.github.com/KWierso/d255be39f564f57c56758a46c3f86302
Comment 3 • 6 years ago (Assignee)
I wonder if it'd be easier to just have Taskcluster publish that artifact as plain JSON instead of gzipped JSON, like every other JSON artifact, as https://bugzilla.mozilla.org/show_bug.cgi?id=1423215#c15 says...
full-task-graph.json is 64MB and compresses to 1.4MB, while runnable-jobs.json is 2.6MB and compresses to 188KB.
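To put those sizes in perspective, the compression ratios can be worked out directly. This is a small sketch that uses only the figures quoted in comment 3 and assumes 1 MB = 1000 KB; the function name is hypothetical.

```javascript
// Ratio of uncompressed size to compressed size, both in the same unit (KB).
function compressionRatio(uncompressedKb, compressedKb) {
  return uncompressedKb / compressedKb;
}

// Figures quoted in comment 3, converted to KB (assuming 1 MB = 1000 KB).
const fullTaskGraphRatio = compressionRatio(64000, 1400); // roughly 46x
const runnableJobsRatio = compressionRatio(2600, 188); // roughly 14x

console.log(fullTaskGraphRatio.toFixed(1), runnableJobsRatio.toFixed(1));
```

On these numbers, publishing runnable-jobs.json uncompressed means a file about 2.4 MB larger than the gzipped one, far smaller than the gap for full-task-graph.json.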
Comment 5 • 6 years ago (Assignee)
Okay, I emailed Dustin after posting my previous comment. We'll see how it goes. :)
Comment 6 • 6 years ago (Assignee)
Dustin didn't seem opposed to just uploading runnable-jobs.json as uncompressed json.
The Taskcluster change would just be dropping the ".gz" from https://hg.mozilla.org/mozilla-central/file/tip/taskcluster/taskgraph/decision.py#l173, but all of the consumers (Treeherder backend, SETA, eventually the Treeherder frontend, others?) would probably want to support both JSON and gz for a while, until the older gz artifacts expire. The Taskcluster change would probably want to be uplifted to all actively maintained trees.
On Treeherder's side, this change would mostly be undoing https://github.com/mozilla/treeherder/commit/9cb37712473b0691a03aacf37110c7b24ddfc6ab#diff-21690b89ebf4f1dde1eb5aa5a3d258a8R136 (or leaving it in place as a fallback in case the uncompressed version can't be found?) and then dropping the ".gz" from https://github.com/mozilla/treeherder/blob/ffaa2e4b2af200d3ae91e3e6ecdc2d4e4f093e84/treeherder/etl/runnable_jobs.py#L12, right?
And for the interim where artifacts might be gz or not-gz, try to fetch the not-gz version and if it fails, tack on a ".gz" and try retrieving it once more? Is there anywhere else SETA-specific that would need a change?
I'd be happy to try doing this work as either part of this bug or as a prerequisite, just want to make sure I know all/most of the spots that this would touch.
Flags: needinfo?(emorley)
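The interim fetch order proposed in comment 6 (try the uncompressed artifact, then tack on ".gz" and retry once) could be sketched as follows. This is a minimal sketch and not Treeherder's actual code: `fetchRunnableJobs`, `fetchJson`, and `fetchGzipJson` are hypothetical names, and the two fetchers are passed in so the logic can be exercised without a network.

```javascript
// Try the uncompressed artifact first; if that fails, append ".gz" and
// retry once. fetchJson(url) and fetchGzipJson(url) are caller-supplied
// async functions (hypothetical) that resolve to parsed JSON or reject on
// failure, for example on a 404.
async function fetchRunnableJobs(baseUrl, fetchJson, fetchGzipJson) {
  try {
    return await fetchJson(baseUrl);
  } catch (error) {
    // Older pushes only have the gzipped artifact.
    return fetchGzipJson(`${baseUrl}.gz`);
  }
}
```

Because the fallback is attempted only once, a push with neither artifact still surfaces an error to the caller instead of retrying indefinitely.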
Comment 8 • 6 years ago
Comment 9 • 6 years ago (Assignee)
This should work. I did a test push to Try with the uncompressed runnable-jobs.json artifact, and everything worked correctly with it; it also still works for older pushes that only have runnable-jobs.json.gz.
So this should be free to land at any point: until the Taskcluster change lands, everything will just fall through to the gzip fallback. Then I can land the Taskcluster change to actually start publishing the non-gzip artifacts, and every new push will start using them.
Once all of the old gzip artifacts expire, we could drop the fallback, though it doesn't really hurt anything to have it sitting there.
Once the non-gzip artifacts are in use, I can actually make the change to fetching runnable-jobs directly from the frontend.
Assignee: nobody → wkocher
Keywords: leave-open
Updated • 6 years ago (Assignee)
Comment 10 • 6 years ago
Updated • 6 years ago (Assignee)
Attachment #9029093 - Flags: review?(emorley)
Updated • 6 years ago (Assignee)
Attachment #9029113 - Flags: feedback?(emorley)
Comment 11 • 6 years ago (Assignee)
No rush on looking at either of these, Ed. Just want to make sure I pushed them somewhere safer than my local clone of treeherder. :)
And to recap, order of things:
First: Land part 1.
Second: Make sure part 1 is deployed to prod and sticks.
Third: Land Bug 1423215.
Fourth: (Way in the future once the gz artifacts expire, or after I put in a UI-side fallback to the gz artifacts) Land part 2.
Comment 12 • 6 years ago (Reporter)
Comment on attachment 9029093 [details]
Link to GitHub pull-request: https://github.com/mozilla/treeherder/pull/4332
(Managed to take a quick look at this before heading out on PTO)
Looks good! I wonder if it's worth changing the mozilla-central part so it initially creates both file formats? This would mean:
* it could land 1st, allowing easier testing of the Treeherder changes
* less chance of breaking any other tooling that relies on the .gz file - and as such presumably less pushback when asking to uplift to beta/esr etc
Attachment #9029093 - Flags: review?(emorley) → feedback+
Comment 13 • 6 years ago (Reporter)
Comment on attachment 9029113 [details]
Link to GitHub pull-request: https://github.com/mozilla/treeherder/pull/4333
Left some comments :-)
Attachment #9029113 - Flags: feedback?(emorley) → feedback+
Comment 14 • 6 years ago (Assignee)
Okay, so I fixed up your feedback for each PR.
On the server-side PR, I added some logging to NewRelic when the gzipped artifacts are fetched, so we can see how often that happens and we can remove the gzip fallback when that drops to zero.
For the UI-side PR, I also added a fallback to attempt to fetch the runnable-jobs.json.gz from the server-side API (since I'm having trouble unpacking the gzipped artifacts in frontend JS).
So... with those changes made, either of these PRs can independently land first.
We may not need the server-side PR at all, though the NewRelic logging would probably be good to include if we only land the UI-side PR so we can tell when the fallback stops being used.
I'll see about having the taskcluster patch write both versions of the artifact and getting that uplifted ASAP, but I think we're good to land some or all of the patches in here.
(No idea if Treeherder logging will record all of the 404 responses for the attempts to fetch the potentially non-existent uncompressed artifacts before trying the fallback, or if that would trigger alerts.)
Comment 15 • 6 years ago (Assignee)
And to follow up on that, bug 1423215 just landed the new version of my patch that uploads both the gzipped and the uncompressed runnable-jobs.json artifact, decoupling it completely from the PRs in this bug.
Comment 16 • 6 years ago (Assignee)
That patch has been uplifted to release and esr60, and should be part of the upcoming m-c > beta merge, so will be everywhere we care about within a week or so.
Again, that's now decoupled from these Treeherder patches, and each of the PRs has fallbacks to the old gzip versions in the case where the uncompressed runnable-jobs.json is not available, so either/both can land at any point:
If neither is landed, both artifacts are uploaded and Treeherder will only try to fetch the gzip version via TH backend.
If only the TH backend PR lands, Treeherder will attempt to fetch the uncompressed artifact via the backend. Failing that, it will attempt to fetch the gzip version via the backend. NewRelic will log all attempts at fetching the gzip version for tracking purposes.
If only the UI-side patch lands, Treeherder will attempt to fetch the uncompressed artifact directly via the UI. Failing that, the UI will ask the backend to fetch the gzip version. No NewRelic logging will occur.
If both patches land, Treeherder will attempt to fetch the uncompressed artifact directly via the UI. Failing that, the UI will ask the backend to fetch the gzip version. NewRelic will log all attempts at fetching the gzip version for tracking purposes.
Up to you, Ed, on how you want to proceed. The UI-side patch will also get me to a position to hopefully implement a new "Add New Jobs" mode similar to `mach try fuzzy` where you can search for new jobs to run on the push.
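The four landing combinations listed in comment 16 can be summarised as a small decision function. This is an illustrative sketch only: `fetchPlan` and its field names are hypothetical and not part of Treeherder.

```javascript
// Given which patches have landed, describe where each fetch happens and
// whether gzip fetches are logged to NewRelic.
function fetchPlan({ backendPatch, uiPatch }) {
  if (uiPatch) {
    return {
      first: 'uncompressed via UI',
      fallback: 'gzip via backend',
      // The NewRelic logging is part of the backend patch.
      gzipLogged: backendPatch,
    };
  }
  if (backendPatch) {
    return {
      first: 'uncompressed via backend',
      fallback: 'gzip via backend',
      gzipLogged: true,
    };
  }
  // Neither patch landed: only the gzip artifact is fetched, via the backend.
  return { first: 'gzip via backend', fallback: null, gzipLogged: false };
}
```

For example, `fetchPlan({ backendPatch: false, uiPatch: true })` describes the UI-only case, where the fallback still goes through the backend but nothing is logged.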
Updated • 6 years ago (Assignee)
Attachment #9029093 - Flags: review?(emorley)
Updated • 6 years ago (Assignee)
Attachment #9029113 - Flags: review?(emorley)
Comment 17 • 6 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/f74c948e78e1e55b7fff9fdfccf3a2e8e702a797
Bug 1494750 - Use the uncompressed runnable-jobs.json file in the backend (#4332)
Taskcluster now outputs the runnable jobs file in two formats, the
original gzipped version, plus an uncompressed version that is
compatible with client-side usage.
This changes the backend to first try and use the uncompressed version,
before falling back to the compressed one as before.
Once Treeherder fully supports the uncompressed version and enough time
has passed such that older Try pushes are no longer being used, then
Taskcluster will be made to no longer output the compressed version.
In a later PR (#4333), Treeherder's UI will fetch the runnable jobs file
directly, meaning these backend parts are used only by SETA.
Updated • 6 years ago (Reporter)
Attachment #9029093 - Flags: review?(emorley) → review+
Comment 18 • 6 years ago (Reporter)
Comment on attachment 9029113 [details]
Link to GitHub pull-request: https://github.com/mozilla/treeherder/pull/4333
I've left some comments on the PR :-)
Attachment #9029113 - Flags: review?(emorley) → review-
Comment 19 • 6 years ago (Assignee)
Comment on attachment 9029113 [details]
Link to GitHub pull-request: https://github.com/mozilla/treeherder/pull/4333
Rebased and comments addressed.
Attachment #9029113 - Flags: review- → review?(emorley)
Updated • 6 years ago (Reporter)
Attachment #9029113 - Flags: review?(emorley) → review+
Comment 20 • 6 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/e69b3f01804de91028cbe7b4e263f4977ff45699
Bug 1494750 - Fetch runnable-jobs.json directly from the UI (#4333)
Comment 21 • 6 years ago (Reporter)
I'll file new bugs for the cleanup to both Treeherder and Taskcluster configs, which will need to wait a bit before landing.
Comment 22 • 5 years ago
I used pako to decompress a gzip file, in case it helps anyone in the future (since KWierso pointed me to this bug):
https://github.com/mozilla/treeherder/pull/6094/files#diff-27afdf1baa888ea21beb4b2ea894e999R45
Updated • 3 years ago
Component: Treeherder: Job Triggering & Cancellation → TreeHerder