Closed Bug 1097090 Opened 10 years ago Closed 10 years ago

Refactor the API for jobs retrieval & start using last_modified for improved performance

Categories

(Tree Management :: Treeherder: API, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mdoglio, Assigned: mdoglio)

References

Details

Attachments

(7 files)

The resultset endpoint is currently abused to serve most of the data requested by the UI, no matter it belongs to the resultset table or the job table.
Assignee: nobody → mdoglio
Status: NEW → ASSIGNED
Component: Treeherder → Treeherder: API
Priority: -- → P3
Blocks: 1101040
Priority: P3 → P1
Attached file PR 329 on treeherder-service (deleted) —
Attachment #8551781 - Flags: review?(cdawson)
Attachment #8551781 - Flags: feedback?(emorley)
Attached file PR 340 on Treeherder UI (deleted) —
Attachment #8551782 - Flags: review?(cdawson)
Attachment #8551782 - Flags: feedback?(emorley)
Attachment #8551781 - Flags: review?(cdawson) → review+
Comment on attachment 8551781 [details] PR 329 on treeherder-service f- just until we figure out the best plan for the platform order mappings.
Attachment #8551781 - Flags: feedback?(emorley) → feedback-
Attachment #8551782 - Flags: feedback?(emorley) → feedback-
Comment on attachment 8551782 [details] PR 340 on Treeherder UI Just marking this as minus till you address the few comments. But I like you as a person! :)
Attachment #8551782 - Flags: review?(cdawson) → review-
Attachment #8551782 - Flags: review- → review+
Commit pushed to master at https://github.com/mozilla/treeherder-service https://github.com/mozilla/treeherder-service/commit/119407ad8ecaa787abd629891645b8ae38796098 Bug 1097090 - jobs endpoint refactoring Main changes: - removed the full parameter on the jobs endpoint, since in both cases the data returned had similar shape/size but slightly different set of attributes. - removed the exclusion_state parameter in favour of exclusion_profile. The latter allows to specify which profile to apply to the jobs; by default it will use the default profile and can be disabled using exclusion_profile=false - the data is now returned as a flat list instead of a triple nested structure. As a result the jobs endpoint is now much faster to return data and it allows to easily do basic operations like filtering, sorting, and pagination. Also, it will allow to implement attribute selection with a minimal effort. - removed the debug parameter in favour of a more explicit return_type (dict|list) that allows a consumer to specify the type of structure expected for the results (dict by default) - the resultset endpoint doesn't return jobs anymore but only resultsets.
I just noticed a regression on staging: the ui is not displaying new jobs, although they are coming down the wire.
Attached file Github PR #355 on treeherder-ui (deleted) —
Attachment #8559170 - Flags: review?(emorley)
Comment on attachment 8559170 [details] Github PR #355 on treeherder-ui Commented on the PR :-)
Attachment #8559170 - Flags: review?(emorley)
Comment on attachment 8559170 [details] Github PR #355 on treeherder-ui I've only had a quick glance, since tbh I'm not going to notice anything wrong without spending an hour sitting down and paging all this in, and it looks roughly good to me - and we need to unblock pushing to prod :-)
Attachment #8559170 - Flags: review+
I fixed a couple of other things and I don't see any other issues. I'll merge it.
Commits pushed to master at https://github.com/mozilla/treeherder-ui https://github.com/mozilla/treeherder-ui/commit/cb9e520b476616efe45df85db43f231cace08a11 Bug 1097090 - fix regression on jobs update https://github.com/mozilla/treeherder-ui/commit/b5fec5253d553c7219cbeb848155a5481b8f5972 Bug 1097090 - fix jobs pagination This is to cover those cases where we have more than 2000 jobs either on a single push or on a periodic update; it also sync the number of jobs requested to the limit imposed by the service (again, 2000) https://github.com/mozilla/treeherder-ui/commit/a658402bfdeb0639fc102e1c5a515a72058ebd89 Bug 1097090 - Fix update of the job list stored
I had to roll back the production push, since there were a few more regressions: * Filtering using a string or substring from the buildername doesn't return any jobs: https://treeherder.mozilla.org/#/jobs?repo=fx-team&filter-searchStr=Rev5 * When viewing eg https://treeherder.mozilla.org/#/jobs?repo=fx-team - new pushes appear, but running/pending jobs do not, unless the page is refreshed.
Attachment #8560627 - Flags: review?(emorley)
Attachment #8560630 - Flags: review?(mdoglio)
Note: I can't reproduce the part where you don't get new pending/running jobs on fx-team. I've been sitting on it for an hour and get the same as prod.
Attachment #8560627 - Flags: review?(emorley) → review+
Attachment #8560630 - Flags: review?(mdoglio) → review+
Commits pushed to master at https://github.com/mozilla/treeherder-service https://github.com/mozilla/treeherder-service/commit/38ccdb9c31ba0d0debd36cdba6b5bdb4d836533b Bug 1097090 - Re-add ref_data_name to jobs endpoints This field is required to filter in the ui by buildername. https://github.com/mozilla/treeherder-service/commit/88fc50e26765bb67faed19d6f9026ac055d7c6e0 Merge pull request #369 from mozilla/add-ref-data-name-to-jobs-endpoints Bug 1097090 - Re-add ref_data_name to jobs endpoints
Commits pushed to master at https://github.com/mozilla/treeherder-ui https://github.com/mozilla/treeherder-ui/commit/12040f5f60ec2b9500df5ec5c14c5ed8beea2a08 Bug 1097090 - Fix filter by buildername This had a delay in it due to not triggering the digest. And the code was overly heavyweight with a directive. This is much lighter and faster. https://github.com/mozilla/treeherder-ui/commit/dfc1c53797000c4dd978fbded4fa4001576e3343 Merge pull request #360 from mozilla/fix-delay-filter-by-buildername Bug 1097090 - Fix filter by buildername
I cannot verify the second issue in comment 19
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Unfortunately had to roll back the prod push (to 'production' branch) <RyanVM|sheriffduty> ugh, my inbound tab just got stuck in another "won't update itself" state <RyanVM|sheriffduty> so how much long do we tolerate these known regressions before we revert again? <camd> RyanVM|sheriffduty: by won't update itself, you mean not getting new jobs, right? (just making sure) <RyanVM|sheriffduty> yeah, new results not coming in <camd> or is it the starring thing <camd> ok... <RyanVM|sheriffduty> and in the case of a duplicated job, the oddball behavior not fully resolving itself <RyanVM|sheriffduty> i.e. when you star one, typically one of them goes away and one stays behind <RyanVM|sheriffduty> and then on the next data refresh, the other goes away <camd> RyanVM|sheriffduty: anything in your javascript console? just looking for something to go on <RyanVM|sheriffduty> nope <camd> RyanVM|sheriffduty: did you already refresh? <RyanVM|sheriffduty> yup <RyanVM|sheriffduty> and enjoyed the big pile of new failures that suddenly appeared <camd> I'm going to keep a tab open on it and try to reproduce. Perhaps I can tell something from the http requests that go out. <camd> RyanVM|sheriffduty: I'm loathe to revert until I understand a little better what's happening. but if it gets too shitty for you, then I will. <RyanVM|sheriffduty> i know mauro was able to reproduce the duplicated jobs issue <RyanVM|sheriffduty> not sure where he got with investigating, though <camd> true, yeah. I think he said he was working on a fix, iirc from the scrollback... <mdoglio|afk> I haven't gone too far yet, I'll keep working on it tomorrow morning <camd> mdoglio|afk: I wonder if the updating thing may be that a "last fetched" date gets set wrong somehow... <camd> hopefully I can reproduce that and see something in the url params... <mdoglio|afk> Yeah the date in the url params is human readable <•edmorley> RyanVM|sheriffduty: can you see if this repros on stage? If so, we can rollback the deploy and test on stage. I mean stage should display the issue, if not, we need to fix stage armenzg_brb → armenzg <•edmorley> s/fix stage/make stage _actually_ representative/ <RyanVM|sheriffduty> edmorley: https://treeherder.allizom.org/#/jobs?repo=mozilla-central shows dupe jobs on tip <RyanVM|sheriffduty> not sure about the rest, but those were visible immediately on load <•edmorley> RyanVM|sheriffduty: would you like me to roll back prod then - I'm happy to <RyanVM|sheriffduty> *sigh* I'd be OK if you guys wanted to work more on fixing it before reverting, but I know it's late <•edmorley> RyanVM|sheriffduty: it seems like we just need to get sheriffs (or us, but looking very closely next time) to check stage before we deploy again <RyanVM|sheriffduty> I think the tabs getting stuck issue is serious enough to warrant rolling back, though :( — RyanVM|sheriffduty just realized that another one of his was stuck <•edmorley> RyanVM|sheriffduty: it's 9pm, neither Mauro or I will be able to look at it today, let's just roll back. We have a shiny new stage, let's use it :-) <RyanVM|sheriffduty> yeah, agreed
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1131133
Just to update those following the bugs: today we tried to deploy master again, but unfortunately the sheriffs noticed that jobs were not being updated in the UI, unless the page was refreshed. The deploy was reverted for this and bug 1133910. Both Mauro and myself were unable to repro on stage - but Ryan was able to repro using this URL: https://treeherder.allizom.org/#/jobs?repo=mozilla-central&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=pending&filter-resultStatus=running&filter-classifiedState=unclassified I'm guessing there's some interaction with the filters going on.
Blocks: 1133837
Summary: move the jobs retrieval from the result-set endpoint to the jobs endpoint → Refactor the API for jobs retrieval & start using last_modified for improved performance
Attached file Github PR #373 on treeherder-ui (deleted) —
Attachment #8565983 - Flags: review?(emorley)
Comment on attachment 8565983 [details] Github PR #373 on treeherder-ui :-)
Attachment #8565983 - Flags: review?(emorley) → review+
Attached file Github PR #375 on treeherder-ui (deleted) —
Attachment #8566631 - Flags: review?(emorley)
Attachment #8566631 - Flags: review?(emorley) → review+
Commits pushed to master at https://github.com/mozilla/treeherder-ui https://github.com/mozilla/treeherder-ui/commit/053be84cebcc0ef05b0140b5d5c802359dd9fc5d Bug 1097090 - Set UTC timezone on lastModified date objects https://github.com/mozilla/treeherder-ui/commit/e64e0e100ed5ef0e88f7aa43c997a2328e0f6552 Merge pull request #375 from mozilla/fix-last-modified-tz Bug 1097090 - Set UTC timezone on lastModified date objects
Depends on: 1134916
Depends on: 1135093
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Depends on: 1136869
Resolution: --- → FIXED
Depends on: 1139941
Depends on: 1141063
Commits pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/808de144d7314c10ef707bb65852b748435b479a Bug 1097090 - jobs endpoint refactoring https://github.com/mozilla/treeherder/commit/34ff2f6a28478eee30570820ecf00b235630847e Bug 1097090 - fix exclusion profile toggle https://github.com/mozilla/treeherder/commit/9a62a5b31f628bc252ae110f92abe81051d0a617 Bug 1097090 - fix exclusion_profile querystring parameter https://github.com/mozilla/treeherder/commit/2ba244279b7230d864b603f5e190a446aa37ba4f Bug 1097090 - fix resultset push_timestamp in similar jobs panel https://github.com/mozilla/treeherder/commit/a1739b7ac8abe6183e9b9f977d275b94d38f2276 Bug 1097090 - fix regression on jobs update https://github.com/mozilla/treeherder/commit/fbb52f1dc8438ff7be984fd1c2571ff1e13b95d3 Bug 1097090 - fix jobs pagination This is to cover those cases where we have more than 2000 jobs either on a single push or on a periodic update; it also sync the number of jobs requested to the limit imposed by the service (again, 2000) https://github.com/mozilla/treeherder/commit/c27df3cd9d54543424eeb22f0fd40bd5b1a9e292 Bug 1097090 - Fix update of the job list stored https://github.com/mozilla/treeherder/commit/480b627bdbf6a750b064b3ff4062b5e1f0bf1073 Bug 1097090 - Fix filter by buildername This had a delay in it due to not triggering the digest. And the code was overly heavyweight with a directive. This is much lighter and faster. https://github.com/mozilla/treeherder/commit/2628873656f51a93c8e79fe77b70f767df569f0c Merge pull request #360 from mozilla/fix-delay-filter-by-buildername Bug 1097090 - Fix filter by buildername https://github.com/mozilla/treeherder/commit/ecf201be4d7e82b1952654e24e58ddbfbba8a482 Bug 1097090 - fix pagination config in JobsModel https://github.com/mozilla/treeherder/commit/b1edb9e24880435fcecd7cf7b1caeb82b63cb64b Bug 1097090 - use config.fetch_all to activate job pagination https://github.com/mozilla/treeherder/commit/91df94cc8089034d5db8374e5a9c7b2e14a36360 Bug 1097090 - update tests to use the new job flat structure https://github.com/mozilla/treeherder/commit/2c12c00d1284dc66881ab4a18487f69e8a33f5f6 Bug 1097090 - add tests for ThJobModel.get_list https://github.com/mozilla/treeherder/commit/813119fd2a11002917fd69947009ba6bb1529a37 Merge pull request #373 from mozilla/fix-fetch-all-param Bug 1097090 - use config.fetch_all to activate job pagination https://github.com/mozilla/treeherder/commit/398e563a2d3d43b13447cefadc525642beb85eb3 Bug 1097090 - Set UTC timezone on lastModified date objects https://github.com/mozilla/treeherder/commit/d509e7553d100eb19c03640badff849e5704d204 Merge pull request #375 from mozilla/fix-last-modified-tz Bug 1097090 - Set UTC timezone on lastModified date objects https://github.com/mozilla/treeherder/commit/5429e1912f37776632c0c5f49b64cf578d4c6dfe Remove now obsolete API option that was removed in bug 1097090.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: