Closed Bug 1525038 Opened 6 years ago Closed 4 years ago

Add artifact info to listTaskGroup() endpoint

Categories

(Taskcluster :: Services, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: coop, Unassigned)

Releng is trying to measure and track the aggregate size of artifacts associated with releases. They are currently having to bend over backwards (well, chain a bunch of look-ups together at least) to look up artifact size, etc., given a taskgroup.

Can we add basic information about artifacts to the information returned by the listTaskGroup() endpoint?

Alternately (and this is why I've cc-ed Hassan), would this be something better accomplished by our graphql setup?

For some context, we're also trying to measure the infra cost associated with every push, not only releases. The primary costs are the compute time and storage. We have a handle on compute time by looking at the task status metadata reported by listTaskGroup(), but are currently missing metadata about artifacts produced.
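As an illustration of the compute-time side, here is a minimal sketch that sums run durations from task status entries. It assumes each entry has the shape {"status": {"runs": [...]}} and that each run carries ISO 8601 "started" and "resolved" timestamps; the function name compute_seconds and the sample data are hypothetical, not part of any Taskcluster client.

```python
from datetime import datetime


def _parse(ts):
    # Assumption: timestamps look like "2019-02-04T10:00:00.000Z".
    # Replace the trailing "Z" so older Python versions can parse it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def compute_seconds(tasks):
    """Sum the wall-clock duration of every finished run, in seconds."""
    total = 0.0
    for entry in tasks:
        for run in entry["status"].get("runs", []):
            # Runs that never started or have not resolved are skipped.
            if "started" in run and "resolved" in run:
                delta = _parse(run["resolved"]) - _parse(run["started"])
                total += delta.total_seconds()
    return total


if __name__ == "__main__":
    # Hypothetical sample data standing in for a listTaskGroup() response.
    sample = [
        {"status": {"taskId": "task-a", "runs": [
            {"started": "2019-02-04T10:00:00.000Z",
             "resolved": "2019-02-04T10:05:30.000Z"},
        ]}},
    ]
    print(compute_seconds(sample))
```

Nothing comparable can be computed for storage, because the status entries say nothing about artifacts.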

(In reply to Chris Cooper [:coop] pronoun: he from comment #0)

Alternately (and this is why I've cc-ed Hassan), would this be something better accomplished by our graphql setup?

I don't think we should implement that extra information in the GraphQL gateway but rather in the client directly. The main purpose of the gateway is to call the relevant client methods in order to gather and filter the responses before sending them back to the client.

Component: Queue → Services
Blocks: 1585578

In Triage, hassan noted that once both object service and postgres are finished, this will be something that can be determined with an ad-hoc db query that joins queue tables and object-service tables.

Depends on: 1436478

I suspect that the information required is the artifact size, which we don't have, in the absence of the object service. The "chain" coop mentions in the first comment is, I'm guessing:

  • list tasks in taskgroup
  • list artifacts for each task
  • decipher the redirects for the artifact to an S3 bucket and object name
  • get S3 metadata about the object

Including artifact metadata in the listTaskGroup() response only helps with the first two steps, which are, I suspect, the easiest bits.
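The four steps listed above could be sketched roughly as follows. This is only an illustration of the shape of the chain, not what Releng actually runs: the three callables are hypothetical stand-ins (for listing the tasks in a taskgroup, listing a task's artifacts, and resolving an artifact to its S3 object size), passed in so that the sketch makes no claims about any client library's signatures.

```python
from typing import Callable, Dict, Iterable, Optional


def artifact_bytes_per_task(
    task_group_id: str,
    list_task_ids: Callable[[str], Iterable[str]],
    list_artifact_names: Callable[[str], Iterable[str]],
    artifact_size: Callable[[str, str], Optional[int]],
) -> Dict[str, int]:
    """Return the total artifact size in bytes for each task in a taskgroup.

    All three callables are assumptions made for this sketch:
      list_task_ids(task_group_id)     -> task IDs            (step 1)
      list_artifact_names(task_id)     -> artifact names      (step 2)
      artifact_size(task_id, name)     -> bytes, or None if the size
                                          cannot be determined (steps 3-4)
    """
    sizes: Dict[str, int] = {}
    for task_id in list_task_ids(task_group_id):
        total = 0
        for name in list_artifact_names(task_id):
            size = artifact_size(task_id, name)
            if size is not None:  # skip artifacts with unknown size
                total += size
        sizes[task_id] = total
    return sizes


if __name__ == "__main__":
    # In-memory fakes; a real version would call the queue and S3 here.
    fake_tasks = {"group-1": ["task-a", "task-b"]}
    fake_artifacts = {"task-a": ["public/build.zip", "public/log.txt"],
                      "task-b": []}
    fake_sizes = {("task-a", "public/build.zip"): 1000,
                  ("task-a", "public/log.txt"): 24}

    per_task = artifact_bytes_per_task(
        "group-1",
        lambda group: fake_tasks[group],
        lambda task: fake_artifacts[task],
        lambda task, name: fake_sizes.get((task, name)),
    )
    print(per_task, sum(per_task.values()))
```

Only the first two callables would get simpler if listTaskGroup() returned artifact names; the size lookup would still be a separate request per artifact.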

I think the ad-hoc query is probably the best plan here. Secondarily, getting the object service set up would at least make all of this data visible in the API without querying S3 directly. So, I don't think we should modify the listTaskGroup method.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX