Bug 1151806 (Closed) · Opened 10 years ago · Closed 9 years ago

Add chunking to treeherder-client / ETL to keep POSTs under the 30s Heroku request time limit

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P2)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: camd)

References

Details

Attachments

(1 file, 1 obsolete file)

(deleted), text/x-github-pull-request; mdoglio: review+
Breaking this out of the overall Heroku bug (bug 1145606), since it is camd's Treeherder deliverable for this quarter: https://docs.google.com/a/mozilla.com/document/d/1U3VXk7K5iTmZvqqtX4sc-Znhch9ZAD98Vl1OyWRqZwo/edit

Heroku enforces a 30 second cutoff for requests to web nodes (https://devcenter.heroku.com/articles/request-timeout). Our current ETL process submits data to the publicly accessible API on the web nodes, quite often in big chunks due to builds-4hr etc. We'll likely hit the 30s limit unless we do one of:

1) Chunk the submissions to the API.
2) Make the ETL layer bypass the web-accessible API (eg make the model/DB updates internally).
3) Be more intelligent about the amount of busywork we repeat (eg switch to builds-2hr, or use memcached to keep track of ingested jobs, so we don't continually re-insert the builds-4hr jobs list when only a small percentage of it is new each time).
No longer depends on: 1151803
Use memcache to keep track of which jobs we've already ingested from pending.js, running.js or build4hr.js, so we can check that prior to ingestion rather than always relying on the failover of ON DUPLICATE KEY. This will reduce DB traffic and speed ingestion up. One memcache key per repo; add the job_guid to the list only on successful ingestion.
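The dedup idea above can be sketched roughly as follows. This is a hypothetical illustration, not Treeherder's actual code: `SimpleCache` stands in for a memcached client, and the key format and helper names are made up.

```python
# A minimal sketch of the memcache-based dedup idea. SimpleCache, the
# key format, and the helper names are illustrative assumptions, not
# Treeherder's actual implementation.

class SimpleCache:
    """In-memory stand-in for a memcached client (get/set only)."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def filter_new_jobs(cache, repo, jobs):
    """Return only jobs whose job_guid hasn't been ingested for this repo."""
    seen = cache.get("ingested-job-guids:%s" % repo) or set()
    return [job for job in jobs if job["job_guid"] not in seen]


def record_ingested(cache, repo, jobs):
    """Record job_guids only after successful ingestion (one key per repo)."""
    key = "ingested-job-guids:%s" % repo
    seen = cache.get(key) or set()
    seen.update(job["job_guid"] for job in jobs)
    cache.set(key, seen)
```

Checking the set before ingesting means ON DUPLICATE KEY becomes a safety net rather than the normal path, which is where the DB traffic saving comes from.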
Chunking can be added as a param in the th_client. We can specify the chunk size in our settings file, potentially with separate chunk sizes for pending, running, builds-4hr, and even resultsets. So the code change will be primarily in th_client, but then OAuthLoaderMixin will need to pass the param from the settings. Mauro also mentioned changing our timeout to match the Heroku limit, so that when we deploy to the existing staging env we'll know we're good.
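The chunking described above amounts to splitting one large submission into several smaller POSTs. A minimal sketch, assuming a hypothetical `submit` callable in place of the client's real POST machinery:

```python
def chunked(items, chunk_size):
    """Yield successive slices of at most chunk_size items."""
    for i in range(0, len(items), chunk_size):
        yield items[i:i + chunk_size]


def submit_in_chunks(jobs, chunk_size, submit):
    """Issue one POST (here, one `submit` call) per chunk, so no single
    request has to process the whole batch inside Heroku's 30s window.
    `submit` is a stand-in for the client's actual POST method."""
    for chunk in chunked(jobs, chunk_size):
        submit(chunk)
```

With `chunk_size` read from settings, the caller (e.g. OAuthLoaderMixin) only needs to thread the value through to the client.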
Above are just some notes from chatting with mdoglio about this task. Sorry they're a bit choppy. :)
Depends on: 1096878
Status: NEW → ASSIGNED
Attached file Ingestion Chunking PR (obsolete) (deleted) —
Attachment #8606000 - Flags: review?(mdoglio)
Attachment #8606000 - Attachment description: PR → Ingestion Chunking PR
When this lands, we'll want to double-check that it wasn't the cause of the memory usage spikes seen in bug 1164888 comment 2, which were due to either that bug or this one. Perhaps Mauro's idea of using a generator (https://github.com/mozilla/treeherder/pull/533#discussion_r30598047) will help avoid this? :-)
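The generator idea referenced above would avoid materializing every chunk (or a copy of the whole batch) in memory at once. A sketch of what that might look like, using `itertools.islice` to consume the source lazily; the function name is illustrative, not the actual PR code:

```python
from itertools import islice


def iter_chunks(iterable, chunk_size):
    """Lazily yield lists of up to chunk_size items. Only one chunk is
    held in memory at a time, so the full job list never needs to be
    materialized or copied up front."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk
```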
Attachment #8606000 - Flags: review?(mdoglio) → review+
Depends on: 1167091
Something that occurred to me: the "too many requests" errors are presumably us hitting the API rate limits we put in place for things like Taskcluster (though it does seem right that we hit them too; it makes sense for us to do so). To decide what value to set the chunk size to, I think it would help to know what a typical batch size would be if we weren't using chunking at all (likely for builds-4hr, since that's the worst-case file). ie: if previously we'd been submitting up to 10,000 jobs at once, then perhaps the 150-job chunk size we have after the followup https://github.com/mozilla/treeherder/commit/e9d127f7eee4a3dbffee6b421e293adcf46fcc52 is still a case of "one extreme to the other"? (So we could, say, set it to 500 or 1000 jobs and avoid the timeouts on Heroku, while also not increasing the number of requests ten-fold.)
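The trade-off in the comment above is easy to quantify: request count is just the batch size divided by the chunk size, rounded up. A tiny sketch (the 10,000-job figure is the hypothetical worst case from the comment, not a measured number):

```python
import math


def request_count(total_jobs, chunk_size):
    """Number of POSTs needed to submit total_jobs in chunk_size batches."""
    return math.ceil(total_jobs / chunk_size)


# For a hypothetical 10,000-job builds-4hr batch:
#   chunk_size 150  -> 67 requests
#   chunk_size 500  -> 20 requests
#   chunk_size 1000 -> 10 requests
```

So moving from 150 to 500 or 1000 cuts the request count by 3-7x while each chunk stays small enough to finish well inside the 30s limit.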
Summary: Get ETL layer working with Heroku → Add chunking to treeherder-client / ETL to keep POSTs under the 30s Heroku request time limit
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/942e314361ed3fbe90e58780f7bc983cbd9edd4a
Revert "Bug 1151806 - Implement chunking for job ingestion"

This reverts commit e71e78156555baada7f6d60d291188387ce96b12. That commit caused pending and running jobs to be put into the objectstore, which in turn caused their completed versions not to be ingested.
Attached file fixed PR after backout (deleted) —
Attachment #8609401 - Flags: review?(mdoglio)
Attachment #8606000 - Attachment is obsolete: true
mdoglio hasn't actually marked this r+, but he said on a Vidyo chat today that it was.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Attachment #8609401 - Flags: review?(mdoglio) → review+