Closed Bug 1306709 Opened 8 years ago Closed 8 years ago

Move SETA to Treeherder

Categories

(Tree Management Graveyard :: Treeherder: SETA, defect)

Product:

Component:

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: armenzg)

References

Details

Attachments

(5 files, 6 obsolete files)

SQL statement 8 years ago Armen [:armenzg] (deleted), text/plain	jmaher : feedback+	Details
Diff showing current changes 8 years ago Armen [:armenzg] (deleted), text/plain	wlach : feedback+	Details
[treeherder] armenzg:seta > mozilla:master 8 years ago GitHub Autolander Bot (deleted), text/x-github-pull-request		Details
ouija/seta refactoring PR 8 years ago Armen [:armenzg] (deleted), text/x-github-pull-request	jmaher : review+	Details
[treeherder] armenzg:seta_rewrite > mozilla:master 8 years ago GitHub Autolander Bot (deleted), text/x-github-pull-request	jmaher : review+ wlach : review+ emorley : review+	Details
compare.py - Compare SETA's production endpoint with localhost's 8 years ago Armen [:armenzg] (deleted), text/x-python		Details
Task list 8 years ago Armen [:armenzg] (deleted), text/plain		Details
Commit for preseed support 8 years ago Armen [:armenzg] (deleted), text/plain		Details
Commits view 8 years ago Armen [:armenzg] (deleted), text/plain		Details
[treeherder] armenzg:sample_data > mozilla:master 8 years ago GitHub Autolander Bot (deleted), text/x-github-pull-request		Details
Commit with feedback from emorley 8 years ago Armen [:armenzg] (deleted), text/plain		Details

Armen [:armenzg]

Assignee

Description

•

8 years ago

I will be looking into turning SETA into a django app. We will need to define the models and views. As well as writing tests for it. I will start by setting up Treeherder locally, grabbing some Buildbot data from a push and adjust URLConf to point to the SETA app.

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Assignee: nobody → armenzg

Armen [:armenzg]

Assignee

Comment 1

•

8 years ago

I've gathered some notes in here: https://public.etherpad-mozilla.org/p/seta_in_treeherder WIP: https://github.com/mozilla/treeherder/compare/master...armenzg:seta

Armen [:armenzg]

Assignee

Comment 2

•

8 years ago

Attached file SQL statement (deleted) — Details

No rush on the feedback. My current django models.py generates this. Which changes would you like to see? jmaher: could you please attach the definition for job priorities? I don't think I completely set up ouija/seta locally. The tables are named starting with "seta_" to make it easier to determine which app they're associated to. We don't have to use it. I've also changed `branch` for `repo_name`. In taskrequests I've removed `id` and made `repo_name` the PK. I have this definition for taskrequests [1] CREATE TABLE `taskrequests` ( `id` int(11) NOT NULL AUTO_INCREMENT, `branch` varchar(128) NOT NULL, `counter` int(11) NOT NULL, `datetime` datetime NOT NULL, `reset_delta` int(11) NOT NULL, PRIMARY KEY (`id`), KEY `ix_taskrequests_branch` (`branch`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Attachment #8798063 - Flags: feedback?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 3

•

8 years ago

Comment on attachment 8798063 [details] SQL statement this looks correctly, is there something missing which is why you are asking about me to update job priorities?

Attachment #8798063 - Flags: feedback?(jmaher) → feedback+

Armen [:armenzg]

Assignee

Comment 4

•

8 years ago

I wanted an sql dump to compare. I'm following the same structure as models.py from ouija so it should be about the same.

Joel Maher ( :jmaher ) (UTC -8)

Comment 5

•

8 years ago

this is a new table for the taskcluster scenario- we do not have this table on alertmanager or mysql technically.

Armen [:armenzg]

Assignee

Comment 6

•

8 years ago

Attached file Diff showing current changes (obsolete) (deleted) — Details

Armen [:armenzg]

Assignee

Comment 7

•

8 years ago

Comment on attachment 8798531 [details] Diff showing current changes * Enabled a seta django app inside of Treeherder * Enabled seta specific url configurations * Models' migration * Added some models and tests for them ** Not all tables from SETA will move forward ** TaskRequests has no data on seta-dev on Heroku for me to check for data * Create a couple of hello world views

Attachment #8798531 - Flags: feedback?(wlachance)

GitHub Autolander Bot

Comment 8

•

8 years ago

Attached file [treeherder] armenzg:seta > mozilla:master (obsolete) (deleted) — Details

William Lachance (:wlach)

Comment 9

•

8 years ago

Comment on attachment 8798531 [details] Diff showing current changes In the future could you attach the pull request instead of the diff? Follow the procedure here: http://globau.wordpress.com/2013/10/21/github-pull-requests-and-bugzilla/ Anyhoo, good start. :) See PR for comments.

Attachment #8798531 - Flags: feedback?(wlachance) → feedback+

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Attachment #8798531 - Attachment is obsolete: true

Armen [:armenzg]

Assignee

Comment 10

•

8 years ago

Hi jmaher, From working on the code, I believe that the columns testtype, buildtype and platform fromjob priority is composite unique index. Is this accurate? I would like to update the models to reflect this.

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 11

•

8 years ago

yes, this is correct!

Flags: needinfo?(jmaher)

Armen [:armenzg]

Assignee

Comment 12

•

8 years ago

Attached file ouija/seta refactoring PR (deleted) — Details

This change refactors the code to make it easier to hack. It also enforces the composite unique index to prevent bugs in the code. DO NOT MERGE. I want to rebase the code and squash the 3 commits into one single refactoring commit.

Attachment #8801857 - Flags: review?(jmaher)

Attachment #8801857 - Flags: feedback?

Armen [:armenzg]

Assignee

Comment 13

•

8 years ago

FYI I rebased and merge those 3 commits. There's no reason to wait.

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

8 years ago

Attachment #8801857 - Flags: review?(jmaher) → review-

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

8 years ago

Attachment #8801857 - Flags: feedback?

Armen [:armenzg]

Assignee

Comment 14

•

8 years ago

Comment on attachment 8801857 [details] ouija/seta refactoring PR Please do not merge if r+. Please let me edit the commits to bring it back to only 3 commits.

Attachment #8801857 - Flags: review- → review?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

8 years ago

Attachment #8801857 - Flags: review?(jmaher) → review+

Armen [:armenzg]

Assignee

Comment 15

•

8 years ago

wlach, I would like to gather few days worth of jobs that have failed and starred. I'm only interested on the ones "fixed by commit" but I want to have all of them to test against real data. How do I go about this?

William Lachance (:wlach)

Comment 16

•

8 years ago

(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #15) > wlach, I would like to gather few days worth of jobs that have failed and > starred. > I'm only interested on the ones "fixed by commit" but I want to have all of > them to test against real data. > > How do I go about this? Hmm, unfortunately this is more difficult than it should be at present. :( The data is distributed between the master treeherder database and the per-project ones. So for you to do this locally, we'd need to dump out the tables for those data ranges and import them into the vagrant instance. This will be a lot easier in a couple weeks, after bug 1178641 is fixed (it would be maybe a 3-4 hour job now, perhaps reduced to 1-2 then) Maybe the easiest thing would be for us to rebase the heroku prototype against a snapshot of what we have in production (I think we want to do this anyway), and then have you test your code against there (deploying to heroku is basically a one click operation). Does that sound reasonable?

Armen [:armenzg]

Assignee

Comment 17

•

8 years ago

Yes! That works for me. How long will this take? (just so I can prioritize)

Joel Maher ( :jmaher ) (UTC -8)

Comment 18

•

8 years ago

could you use the Ouija database for this?

Armen [:armenzg]

Assignee

Comment 19

•

8 years ago

(In reply to Joel Maher ( :jmaher) from comment #18) > could you use the Ouija database for this? For the final product I don't want this. For now, for the prototype, I think I could use the failures endpoint from the seta-dev instance.

William Lachance (:wlach)

Comment 20

•

8 years ago

(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #17) > Yes! That works for me. > How long will this take? (just so I can prioritize) I am hoping to have most of bug 1178641 wrapped up by early November.

GitHub Autolander Bot

Comment 21

•

8 years ago

Attached file [treeherder] armenzg:seta_rewrite > mozilla:master (deleted) — Details

Armen [:armenzg]

Assignee

Comment 22

•

8 years ago

Comment on attachment 8798533 [details] [treeherder] armenzg:seta > mozilla:master https://github.com/mozilla/treeherder/pull/1957

Attachment #8798533 - Attachment filename: Github pull request #1900 → Github pull request #1957

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Attachment #8798533 - Attachment is obsolete: true

Armen [:armenzg]

Assignee

Comment 23

•

8 years ago

wlach: do you have time to meet on Tuesday/Wednesday? The way that SETA currently works is by having certain scripts run on certain schedule. There are mainly two concepts involved. 1) Create job priorities based on runnable api info (either with high value or low value) 2) Analyze failures (those marked as fixed by commit) in the last X days and update the jobs priority table I am currently working on porting the script for #1, however, I want to know if it is possible to run a command like on a schedule: > ./manage.py update_jobs_priority This script currently works by downloading the latest data from the runnable jobs api. It then determines which jobs are new and inserts them with a give set of defaults (including an expiration data). On the first run, all jobs are inserted with a different set of defaults. I could change how the logic of this script works to skip runnable jobs completely. To do this, I would need to be able to be notified when a new job is detected by Treeherder and create a job priority for it. Another way would be that every time a gecko decision task completes we update a runnable jobs table (to be created). If you prefer we can stick to running the command on a schedule and follow up with further optimizations later on. ---------- Without knowing enough about #2, I believe we would not need to run #1 on a schedule if we implement #2 like this: Once a day we analyze failures and update the jobs priority table as we would normally do. If a job has never been added we add it as a high value job priority. I might switch gears to this for a bit and see what is possible. For now I will query allizom.mozilla.org for failures data until there is a way for me to get a TH DB dump.

Armen [:armenzg]

Assignee

Comment 24

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master Since we're moving to django rest framework, I assume I don't need to keep treeherder/seta/{views.py,urls.py}. Do I even want a seta app enabled? Should models.py move somewhere else? I've also added some of the data for you to inspect [1] How can I run the tests quickly? ./runtests.sh runs everything. jmaher: I'm going to call the end point seta/high-value-jobs; works for you? [1] mysql> select * from seta_jobpriority where buildsystem='taskcluster' limit 5; +------+----------------------------+-----------+----------+----------+---------+-------------+-----------------+ | id | testtype | buildtype | platform | priority | timeout | buildsystem | expiration_date | +------+----------------------------+-----------+----------+----------+---------+-------------+-----------------+ | 1380 | xpcshell-1 | opt | linux64 | 5 | 5400 | taskcluster | NULL | | 1381 | reftest-6 | opt | linux64 | 5 | 5400 | taskcluster | NULL | | 1382 | jittest-6 | opt | linux64 | 5 | 5400 | taskcluster | NULL | | 1383 | mochitest-browser-chrome-6 | debug | linux64 | 5 | 5400 | taskcluster | NULL | | 1384 | xpcshell-2 | debug | linux64 | 5 | 5400 | taskcluster | NULL | +------+----------------------------+-----------+----------+----------+---------+-------------+-----------------+ 5 rows in set (0.00 sec) mysql> select * from seta_jobpriority limit 5; +----+----------------------+-----------+-------------+----------+---------+-------------+-----------------+ | id | testtype | buildtype | platform | priority | timeout | buildsystem | expiration_date | +----+----------------------+-----------+-------------+----------+---------+-------------+-----------------+ | 1 | web-platform-tests-1 | debug | windows8-64 | 5 | 5400 | buildbot | NULL | | 2 | web-platform-tests-3 | debug | windows8-64 | 5 | 5400 | buildbot | NULL | | 3 | web-platform-tests-2 | debug | windows8-64 | 5 | 5400 | buildbot | NULL | | 4 | web-platform-tests-5 | debug | windows8-64 | 5 | 5400 | buildbot | NULL | | 5 | web-platform-tests-4 | debug | windows8-64 | 5 | 5400 | buildbot | NULL | +----+----------------------+-----------+-------------+----------+---------+-------------+-----------------+ 5 rows in set (0.00 sec)

Attachment #8804853 - Flags: feedback?(wlachance)

Joel Maher ( :jmaher ) (UTC -8)

Comment 25

•

8 years ago

high-value-jobs is probably incorrect, technically we return the low value jobs by default and in practice.

Armen [:armenzg]

Assignee

Comment 26

•

8 years ago

I have a seta/v1/low-value-jobs endpoint which accepts build_system_type={buildbot,taskcluster}. Depending on the parameter it will show either Buildbot jobs or TaskCluster ones. I now have to focus on return the same data as our current production endpoints: > e.g. reftest-13 vs desktop-test-linux64-pgo/opt-reftest-13 > e.g. web-platform-tests-e10s vs Windows 7 VM 32-bit mozilla-inbound debug test marionette-e10s

William Lachance (:wlach)

Comment 27

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master Good progress!

Attachment #8804853 - Flags: feedback?(wlachance) → feedback+

Armen [:armenzg]

Assignee

Comment 28

•

8 years ago

I'm not requesting for feedback in this comment. I've pushed an administrative command which talks to alertmanager to import test failures. The current PR can be bootstrapped by simply calling this command: > ./manage.py analyze_failures I'm now going to fix this endpoint to return the data properly: http://local.treeherder.mozilla.org/api/seta/v1/low-value-jobs/?build_system_type=taskcluster I'm waiting on wlach's change where he will be unifying the Treeherder databases. At that point I could start using the new models. The following steps would be to write unit tests and look into the cerlery beats to schedule the analysis of failures. I also want to add data validators to the SETA models to ensure integrity.

Armen [:armenzg]

Assignee

Comment 29

•

8 years ago

Attached file compare.py - Compare SETA's production endpoint with localhost's (obsolete) (deleted) — Details

Hi jmaher, I've been able to modify the end point to return TaskCluster's data as returned in production. I've also modified the endpoint to allow to optionally pass priority as a value. This means that we can: * set no priority --> all jobs * priority={1,5} --> high or low value jobs Keeping the priority flag would obviously make us desire to rename the endpoint to something that does not say "low" in it. That's a discussion for later. The reason I want to bring this up is that I've noticed that the production data only gives 4 jobs with priority 1 out of a total of 670 jobs. IIUC it would mean that we only run 4 TaskCluster jobs per push when SETA is enabled. In other words, there might be a bug in the current production system and I can reproduce it on my local prototype. STR: git clone https://github.com/armenzg/treeherder.git cd treeherder git checkout seta_rewrite vagrant up vagrant ssh ./manage.py runserver Open http://localhost:8000/api/seta/v1/low-value-jobs/?build_system_type=taskcluster&priority=5

Armen [:armenzg]

Assignee

Comment 30

•

8 years ago

I also would like to discuss if we can shift away from testtype (manipulated data) and instead use ref_data_name and other values already in Treeherder. I think most of SETA's code deals with trying to keep these tied (besides fetching data from endpoints).

Comment 31

•

8 years ago

(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #30) > I also would like to discuss if we can shift away from testtype (manipulated > data) and instead use ref_data_name and other values already in Treeherder. Hey Armen, I think it's a great idea because all these stuff are fetch(directly or indirectly fetch from Treeherder).

Joel Maher ( :jmaher ) (UTC -8)

Comment 32

•

8 years ago

while I would like to have data from ref_data_name, we need to make sure that it is properly handled. If we have a bunch of random names, then SETA becomes ineffective. Can you give an example of linux64 opt mochitest-1 e10s ref_data_name for TC and BB? Possibly that will help. regarding the 4 jobs for priority1, I do not see that: http://seta.herokuapp.com/data/setadetails/?taskcluster=1&priority=1

Armen [:armenzg]

Assignee

Comment 33

•

8 years ago

Buildbot: [1] * "ref_data_name": "Ubuntu VM 12.04 x64 mozilla-inbound opt test mochitest-e10s-1", * "build_platform": "linux64", * "platform_option": "opt", TaskCluster: [2] * "ref_data_name": "desktop-test-linux64/opt-mochitest-e10s-1", * "build_platform": "linux64", * "platform_option": "opt", [1] { "option_collection_hash": "102210fe594ee9b33d82058545b1ed14f4c8206e", "ref_data_name": "Ubuntu VM 12.04 x64 mozilla-inbound opt test mochitest-e10s-1", "build_platform": "linux64", "job_type_name": "Mochitest e10s", "job_group_id": 9, "platform_option": "opt", "machine_platform_os": "linux", "machine_platform_id": 9, "job_coalesced_to_guid": null, "platform": "linux64", "result": "runnable", "state": "runnable", "job_group_name": "Mochitest e10s", "job_type_description": "multiprocess - integration tests", "machine_platform_architecture": "x86_64", "job_type_symbol": "1", "job_group_description": "", "build_system_type": "buildbot", "build_platform_id": 9, "build_architecture": "x86_64", "job_type_id": 152, "job_group_symbol": "M-e10s", "build_os": "linux" }, [2] { "state": "runnable", "job_group_name": "Mochitests executed by TaskCluster with e10s", "job_type_description": "Mochitest plain run", "ref_data_name": "desktop-test-linux64/opt-mochitest-e10s-1", "build_platform": "linux64", "job_type_symbol": "1", "build_system_type": "taskcluster", "job_type_name": "desktop-test-linux64/opt-mochitest-e10s-1", "platform_option": "opt", "job_group_symbol": "tc-M-e10s", "result": "runnable", "job_coalesced_to_guid": null, "platform": "linux64" },

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Depends on: 1178641

Armen [:armenzg]

Assignee

Comment 34

•

8 years ago

SETA needs to return ref_data_name information to Buildbot and TaskCluster. IIUC the runnable jobs api works is the following: * If a decision task ID is provided go and interact with the gecko decision task to determine what to return * For Buildbot jobs go and get the data from the runnable_job table It seems that the TC jobs are not kept on that table. I can create an administrative command that could keep the TaskCluster jobs up-to-date on that table (add and remove old jobs). I would like to know if we could call such command every time a gecko decision completes (for certain repositories) or on a schedule (every 4 hours would be sufficient).

Joel Maher ( :jmaher ) (UTC -8)

Comment 35

•

8 years ago

unless we depend more heavily on preseed.json or other override mechanisms not yet thought up, there is no need to keep the jobs up to date. But in the spirit of not coding assumptions in, we should work to update the table as frequently as possible keeping in mind our gecko decision task should be faster and we shouldn't be bogging down treeherder servers or locking the database.

Armen [:armenzg]

Assignee

Comment 36

•

8 years ago

wlach: I have this wrt to adding a task; I want to call: > ./manage.py analyze_failures Should I move most of the logic of the administrative command into a module? That way we can use it for the task as well as the administrative command; what do you think?

Flags: needinfo?(wlachance)

Armen [:armenzg]

Assignee

Comment 37

•

8 years ago

This URL: https://github.com/mozilla/treeherder/commit/c17cb0d4f68ebad60356cfda3c0f5f25f7764048

Armen [:armenzg]

Assignee

Comment 38

•

8 years ago

I forgot to mention that yesterday I added Buildbot support. I'm now focused on adding support taskcluster requests.

William Lachance (:wlach)

Comment 39

•

8 years ago

(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #36) > wlach: I have this wrt to adding a task; I want to call: > > ./manage.py analyze_failures > > Should I move most of the logic of the administrative command into a module? > That way we can use it for the task as well as the administrative command; > what do you think? Yep, this is a great idea.

Flags: needinfo?(wlachance)

Armen [:armenzg]

Assignee

Comment 40

•

8 years ago

I further improved the end point to support TaskRequest and other logic from the original source code. I'm focusing on writing tests for it.

Armen [:armenzg]

Assignee

Comment 41

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master This change creates a scheduled beat task that analyzes failures for the last 90 days. It then goes an adjusts the job priority table. This also adds an end point that returns what to schedule for Buildbot and for TaskCluster. My main focus while this is reviewed is to add more tests. Specially testing the endpoint added. Once wlach's work lands I will look into changing our usage of getting failures from alertamanager. Some of this code could also go away if we refactored the runnable jobs API to actually keep TaskCluster's data inside the runnable jobs table.

Attachment #8804853 - Flags: review?(wlachance)

Attachment #8804853 - Flags: review?(jmaher)

Attachment #8804853 - Flags: feedback+

Joel Maher ( :jmaher ) (UTC -8)

Comment 42

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master pretty close- a few issues left in the PR

Attachment #8804853 - Flags: review?(jmaher) → review-

Armen [:armenzg]

Assignee

Comment 43

•

8 years ago

Attached file Task list (deleted) — Details

This task list helps me keep an eye of big picture requirements.

William Lachance (:wlach)

Comment 44

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master Cancelling review for now.

Attachment #8804853 - Flags: review?(wlachance)

Armen [:armenzg]

Assignee

Comment 45

•

8 years ago

In my last changes I have worked on testing the endpoint which the Gecko decision task and Buildbot would be hitting. I've reworked the test infrastructure to store the job priorities directly in the test table. This make the tests slower but it makes writing tests easier and be more realistic. Also the fixtures are derived from a single data fixture. I've also squashed all the commits and rebased. Tomorrow and during the time in Hawaii I will focus on switching off requesting failures data from allizom. We will query instead directly from Treeherder. After that, we will need to add support for preseed.

Armen [:armenzg]

Assignee

Comment 46

•

8 years ago

jmaher: how does the setadetails endpoint know for Buildbot if to return low value jobs or not? I'm looking at this code (used in alertmanager): https://github.com/mozilla/ouija/blob/production-alertmanager/src/server.py#L438-L501 Is the consumer (the Buildbot master) in charge of keeping track of the counter and timeout?

Joel Maher ( :jmaher ) (UTC -8)

Comment 47

•

8 years ago

buildbot sends in inactive=1 (low value) in a url param and that is how we determine what to do. buildbot does its own coalescing and will call seta when it determines there is a need for coalescing.

Armen [:armenzg]

Assignee

Comment 48

•

8 years ago

I've addressed all comments by wlach from the review last Thursday. I will now be focusing on obtaining annotated failures directly from Treeherder.

Armen [:armenzg]

Assignee

Comment 49

•

8 years ago

My latest commit makes SETA to stop using alertmanager for failures[*]. We now use TH's JobNote model. We don't do 90 days but all job notes (which expires after 4 months). The remaining milestones are: * stop using Seta on Heroku's API * add preseed support jmaher: I've reduced the scope to everything needed to make TaskCluster work with SETA on Treeherder. Could you please review the stretch goals and see if there's any that you believe is required before going live with this? https://github.com/armenzg/treeherder/issues/1#issuecomment-263397421 I should have the PR ready for review early next week. Also, this can be tested locally with the STR [1] [*] I discovered that we still use more API of Heroku's SETA [1] STR: git clone https://github.com/armenzg/treeherder.git cd treeherder git checkout seta_rewrite vagrant up vagrant ssh ./manage.py runserver Open http://localhost:8000/api/project/mozilla-inbound/seta/v1/job-priorities/?build_system_type=taskcluster Open http://localhost:8000/api/seta/v1/failures-fixed-by-commit/ On another tab: vagrant ssh ./manage.py ingest_push mozilla-inbound d70bb3fba851 ./manage.py ingest_push mozilla-inbound 5a5d50943bbc Open http://localhost:8000/#/jobs?repo=mozilla-inbound Annotate few jobs with "fixed by commit" and the latest revision ./manage.py update_runnable_jobs ./manage.py analyze_failures

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 50

•

8 years ago

I think the stretch goals are nice- obviously all of them would be good to have, and I don't think we should stop working on SETA until that is the case. A few things I care about are the unittests- what coverage do we have? Also, what is the strategy for dealing with fixed_by_commit='' ?

Flags: needinfo?(jmaher)

Armen [:armenzg]

Assignee

Comment 51

•

8 years ago

I'm ignoring blank fixed_by_commit since I _assume_ it polutes the results. I don't know the right syntax to make the treeherder/etl/seta.py be include but exclude all other files in that directory, thus, running the report separately [1] from the main one [2]. 50% coverage for everything under treeherder/seta and 90% for treeherder/etl/seta.py [1] vagrant ~/treeherder $ coverage run --source=treeherder/etl -m py.test tests/seta ... treeherder/etl/seta.py 97 8 18 4 90% [2] vagrant ~/treeherder $ coverage run --source=treeherder/seta -m py.test tests/seta vagrant ~/treeherder $ coverage report Name Stmts Miss Branch BrPart Cover ----------------------------------------------------------------------------------------------- treeherder/seta/__init__.py 0 0 0 0 100% treeherder/seta/analyze_failures.py 38 12 10 1 65% treeherder/seta/common.py 9 0 2 0 100% treeherder/seta/job_priorities.py 71 11 28 7 78% treeherder/seta/management/__init__.py 0 0 0 0 100% treeherder/seta/management/commands/__init__.py 0 0 0 0 100% treeherder/seta/management/commands/analyze_failures.py 11 11 0 0 0% treeherder/seta/management/commands/update_runnable_jobs.py 8 8 0 0 0% treeherder/seta/models.py 60 16 16 1 67% treeherder/seta/runnable_jobs.py 28 15 4 0 41% treeherder/seta/seta.py 119 105 52 0 8% treeherder/seta/tasks.py 4 4 0 0 0% treeherder/seta/update_job_priority.py 75 10 22 3 85% ----------------------------------------------------------------------------------------------- TOTAL 423 192 134 12 50%

Joel Maher ( :jmaher ) (UTC -8)

Comment 52

•

8 years ago

one way to validate this is looking at the p1 tasks compared to heroku- we could adjust it to support 4 months of data (or hack treeherder seta to use 90 days) and compare the output- ideally it would be identical. The coverage data is looking great- I think our goal was 60% coverage.

Armen [:armenzg]

Assignee

Comment 53

•

8 years ago

I've fixed a small regression and I've fixed the last call to any other SETA/ouija server (AFAICT). I'm now moving on to the preseed support.

Armen [:armenzg]

Assignee

Comment 54

•

8 years ago

Attached file Commit for preseed support (obsolete) (deleted) — Details

jmaher: I've landed preseed support, however, I have difficult understanding some of the intentions of the expiration_date field. Why do we set it a year from now? Why do we clear the expiration a year later? Thinking of the talos rules, what do we gain by setting an expiration date or not? In general, what is the meaning for the values expiration date can have? A) a date is set - analyze failures will _not_ change the values of priority/timeout B) a date is not set (null) - analyze failures could change the values of priority/timeout Should we have a value for "never expire this job priority"? (e.g. set expiration_date to '*' or 'forever'. On a wider note, I don't think we need this preseed support. We could get away with allowing you to login and insert new job priorities or modify existing ones.

Attachment #8820092 - Flags: feedback?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 55

•

8 years ago

preseed.json is a hack, but the concept serves a purpose to allow us to override things temporarily or longer term. I am glad you brought up the question about setting the expiration for one year- that is something which was added in trying to support taskcluster and while it will work, there are probably many issues- anything that has an expiration of one year should be never expires. Instead of throwing away preseed support right now, I would prefer that we propose a better solution using in-tree configs or some other in-tree file- and I see that changing a lot in the world of per manifest reporting and scheduling.

Joel Maher ( :jmaher ) (UTC -8)

Comment 56

•

8 years ago

Comment on attachment 8820092 [details] Commit for preseed support this looks sane. One thing to consider on changing the date for preseed no expiration is that the jobpriorities field might be datetime and we could have restrictions on the data we put in there or odd behaviors of the default value we pull for null.

Attachment #8820092 - Flags: feedback?(jmaher) → feedback+

Armen [:armenzg]

Assignee

Comment 57

•

8 years ago

Comment on attachment 8820092 [details] Commit for preseed support https://github.com/mozilla/treeherder/pull/1957/commits/22cda37c35a49af5eb7a75a293b47f294f8b33a5

Attachment #8820092 - Flags: feedback+ → feedback?

Armen [:armenzg]

Assignee

Comment 58

•

8 years ago

I'm now using the 2100 as the expiration date.

Armen [:armenzg]

Assignee

Comment 59

•

8 years ago

Comment on attachment 8820092 [details] Commit for preseed support >https://github.com/mozilla/treeherder/pull/1957/commits/66deb3ac3ad7d04338bcfd0c7a2c6661202326a2

Attachment #8820092 - Attachment is obsolete: true

Attachment #8820092 - Flags: feedback?

Armen [:armenzg]

Assignee

Comment 60

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master This is ready to be reviewed.

Attachment #8804853 - Flags: review?(wlachance)

Attachment #8804853 - Flags: review?(jmaher)

Attachment #8804853 - Flags: review-

Joel Maher ( :jmaher ) (UTC -8)

Comment 61

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master I think this is really close- I would like to address the comments left on the PR before making this an r+. Great job on getting this into a reviewable state and so complete!

Attachment #8804853 - Flags: review?(jmaher) → review-

Armen [:armenzg]

Assignee

Comment 62

•

8 years ago

Thanks jmaher! I will look into it. wlach: I addressed what we discussed in my last commit https://github.com/mozilla/treeherder/pull/1957/commits/f7c1bcc55657c80fc6380206eb106a3cd547291b

William Lachance (:wlach)

Comment 63

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master A few small things I would like changed but didn't see any huge red flags (though I confess I don't understand seta all that well). I'd say we're pretty close to being able to merge this and seeing how it works. Please re-r? when you think you have something ready to go.

Attachment #8804853 - Flags: review?(wlachance) → review-

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Blocks: 1286358

Armen [:armenzg]

Assignee

Comment 64

•

8 years ago

I've addressed most of jmaher's feedback. I will be tackling wlach's feedback soon. Current coverage is at 64%. vagrant ~/treeherder $ coverage run --source=treeherder/seta,treeherder/etl/seta.py -m py.test tests/seta && coverage html -i && coverage report -i ============================================================================================================ test session starts ============================================================================================================= platform linux2 -- Python 2.7.12, pytest-3.0.3, py-1.4.31, pluggy-0.4.0 Django settings: tests.settings (from ini file) rootdir: /home/vagrant/treeherder, inifile: pytest.ini plugins: django-3.0.0 collected 16 items tests/seta/test_analyze_failures.py . tests/seta/test_high_value_jobs.py . tests/seta/test_job_priorities.py . tests/seta/test_models.py ..... tests/seta/test_update_job_priority.py ........ ========================================================================================================= 16 passed in 19.00 seconds ========================================================================================================= Name Stmts Miss Branch BrPart Cover ----------------------------------------------------------------------------------------------- treeherder/etl/seta.py 57 1 18 1 97% treeherder/seta/__init__.py 0 0 0 0 100% treeherder/seta/analyze_failures.py 32 10 10 1 64% treeherder/seta/common.py 9 0 2 0 100% treeherder/seta/high_value_jobs.py 72 30 36 4 50% treeherder/seta/job_priorities.py 69 9 26 7 81% treeherder/seta/management/__init__.py 0 0 0 0 100% treeherder/seta/management/commands/__init__.py 0 0 0 0 100% treeherder/seta/management/commands/analyze_failures.py 11 11 0 0 0% treeherder/seta/management/commands/load_preseed.py 8 8 0 0 0% treeherder/seta/management/commands/update_runnable_jobs.py 8 8 0 0 0% treeherder/seta/models.py 60 16 16 1 67% treeherder/seta/preseed.py 39 39 20 0 0% treeherder/seta/runnable_jobs.py 33 6 6 2 74% treeherder/seta/tasks.py 7 7 0 0 0% treeherder/seta/update_job_priority.py 76 10 22 3 85% ----------------------------------------------------------------------------------------------- TOTAL 481 155 156 19 64%

Armen [:armenzg]

Assignee

Comment 65

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master I've addressed all raised concerns. I've squashed older commits but left the ones since you had a look. I need to look at a test failure on Travis and the fix should be a small delta.

Attachment #8804853 - Flags: review?(wlachance)

Attachment #8804853 - Flags: review?(jmaher)

Attachment #8804853 - Flags: review-

William Lachance (:wlach)

Comment 66

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master Please let me know when travis is happy so I can test this on stage before we actually merge the PR. Congrats on getting this large chunk of work completed!

Attachment #8804853 - Flags: review?(wlachance) → review+

Armen [:armenzg]

Assignee

Comment 67

•

8 years ago

Thank you Will! As I was suspecting we have an _intermittent_ orange. I've been suspecting of this for a while but I now have proof of it. The fixture has 39643b5073cfb9473042884bfd3ced0289b3d7dd twice while the produced data has 773fa0d78b75d1d129646649ff6c5896c8e4edbf twice. wlach: should I ignore this test failure? https://travis-ci.org/mozilla/treeherder/jobs/186172233 (Pdb) pp(failures_fixed_by_commit) {u'you look like a man-o-lantern': [(u'1824f3350e2152d3272de09c8ec9b103cd3ad667', u'debug', u'osx-10-6'), (u'2c057fb1cd2b6f8bd74121238d0287063d8f5562', u'opt', u'b2g-device-image'), (u'39643b5073cfb9473042884bfd3ced0289b3d7dd', u'debug', u'b2g-emu-jb'), (u'39643b5073cfb9473042884bfd3ced0289b3d7dd', u'debug', u'b2g-emu-jb'), (u'58cab069cf08211159774de948094dd963fb9d44', u'debug', u'osx-10-7'), (u'8d63b645c221e45ccf4a2fbc087fa0088535042c', u'debug', u'windowsxp'), (u'93e11a88ea92cfa4d930072b22afca002c53f249', u'opt', u'b2g-device-image'), (u'df822a71f7727bf3cb6d3b2aeba970d8a6d33ea1', u'opt', u'b2g-emu-jb'), (u'f2e06a487b304c16843fe53782a5888e5582be6b', u'debug', u'windows7-32'), (u'fe762310e2f921b93df828ac68ee958f19e2d759', u'debug', u'windows8-32')]} (Pdb) pp(data) {u'you look like a man-o-lantern': [(u'1824f3350e2152d3272de09c8ec9b103cd3ad667', u'debug', u'osx-10-6'), (u'2c057fb1cd2b6f8bd74121238d0287063d8f5562', u'opt', u'b2g-device-image'), (u'58cab069cf08211159774de948094dd963fb9d44', u'debug', u'osx-10-7'), (u'773fa0d78b75d1d129646649ff6c5896c8e4edbf', u'debug', u'b2g-emu-jb'), (u'773fa0d78b75d1d129646649ff6c5896c8e4edbf', u'debug', u'b2g-emu-jb'), (u'8d63b645c221e45ccf4a2fbc087fa0088535042c', u'debug', u'windowsxp'), (u'93e11a88ea92cfa4d930072b22afca002c53f249', u'opt', u'b2g-device-image'), (u'b386afd1afbc09b5f7f13af8607a65a87e105002', u'opt', u'b2g-emu-jb'), (u'f2e06a487b304c16843fe53782a5888e5582be6b', u'debug', u'windows7-32'), (u'fe762310e2f921b93df828ac68ee958f19e2d759', u'debug', u'windows8-32')]}

Joel Maher ( :jmaher ) (UTC -8)

Comment 68

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master great work Armen!

Attachment #8804853 - Flags: review?(jmaher) → review+

Armen [:armenzg]

Assignee

Comment 69

•

8 years ago

Attached file Commits view (obsolete) (deleted) — Details

Hi Will, signatures were not being generated properly inside of the tests because in the sample data we were using 'buildername' instead of 'reference_data_name'. I describe this in the commit message. I addressed the data fixture properly. No code changes. I don't know why the signatures were not bothering me more. https://github.com/mozilla/treeherder/pull/1957/commits/95a4c849434b8182fb3b3199ede6428bf7f54ea9 Do you want me to fix this commit in a different PR? Or simply land 2 commits, one for SETA and one for the sample data fix? I'm now going to look into a bunch of test failures that touching the data is causing :/

Attachment #8821647 - Flags: feedback?(wlachance)

Armen [:armenzg]

Assignee

Comment 70

•

8 years ago

Comment on attachment 8821647 [details] Commits view Hi Will, This is ready to go to stage. All tests are green: https://travis-ci.org/mozilla/treeherder/builds/187054914 I worked a bit more on the job_data.txt and adjusted the tests that depend on it. The first commit shows everything you've already reviewed. The second commit shows the sample data and test changes. The third commit shows a fixture change. This commit should be folded with the first. In other words, commit 1 & 3 would be squashed.

Attachment #8821647 - Attachment description: Fix sample data → Commits view

Attachment #8821647 - Flags: feedback?(wlachance) → review?(wlachance)

GitHub Autolander Bot

Comment 71

•

8 years ago

Attached file [treeherder] armenzg:sample_data > mozilla:master (deleted) — Details

Treeherder GitHub Bugbot

Comment 72

•

8 years ago

Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/3a128e455dc2e301f77064b6ca02b431e1cdc6e7 Bug 1306709 - Fix sample data to make signatures be buildernames rather than a hash (#2044) _load_jobs() from treeherder/model/derived/jobs.py will add a signature hash to signatures instead of the buildername if we don't use 'reference_data_name' for the buildername. This change does the following: * Replace 'buildername' for 'reference_data_name' * Replace 'b2g26_v1_2' for 'release'

William Lachance (:wlach)

Comment 73

•

8 years ago

Comment on attachment 8821647 [details] Commits view As discussed on irc, the fixture/test changes are fine, we've merged them in. In the future, please don't attach github commits to bugs manually. Autolander will do that for you. :)

Attachment #8821647 - Flags: review?(wlachance)

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Attachment #8808981 - Attachment is obsolete: true

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Attachment #8821647 - Attachment is obsolete: true

Armen [:armenzg]

Assignee

Comment 74

•

8 years ago

What happened today: * We merge to master the code which fixes TH's BB sample data * We rebased the branch and squash the small fixture fix * We deployed the code to Heroku's treeherder-stage app * We noticed a DoesNotExist error * wlach gave me write access to the treeherder repo and taught me how to deploy changes * It seems that job notes on production can have no associated job (very odd - bug 1326090) * A fix has been commited on the seta_rewrite branch * jmaher noticed some issues in the priorities for the API What's next? * In January I will address the issue from jmaher & the comments from emorley * Once these are addressed I don't think there will be anymore issues before merging to masters * Look into enabling consuming this data from Buildbot and TaskCluster * Smaller items listed in https://github.com/armenzg/treeherder/issues/1#issuecomment-263397421

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Blocks: 1326102

Armen [:armenzg]

Assignee

Comment 75

•

8 years ago

Today I will be focusing on addressing emorley's concern. I will address the issue raised by jmaher in a follow up PR as I'm more interested seeing this merged and running in production. I want to polish the product for bugs that show up with real production data.

Armen [:armenzg]

Assignee

Comment 76

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master 2nd commit for emorley - addressing your feedback 3rd commit for wlach - adding back eleven jobs with notes fixture

Attachment #8804853 - Flags: review?(wlachance)

Attachment #8804853 - Flags: review?(emorley)

Attachment #8804853 - Flags: review+

William Lachance (:wlach)

Comment 77

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master lgtm!

Attachment #8804853 - Flags: review?(wlachance) → review+

Armen [:armenzg]

Assignee

Comment 78

•

8 years ago

I would like to file a new bug with a new PR for any discovered issues on stage. Unless I hear any objections I will be filing it sometime today.

Armen [:armenzg]

Assignee

Comment 79

•

8 years ago

I've created a Papertrail alert for treeherder-stage to let me know of any issues. "((api AND seta) OR "seta ") AND (error OR warning OR traceback)" I hope I got the syntax right and I will add the same in production if it works as expected.

Armen [:armenzg]

Assignee

Comment 80

•

8 years ago

Attached file Commit with feedback from emorley (obsolete) (deleted) — Details

The branch is currently messed up. Here's the commit that I want emorley to look at. I will fix "seta_rewrite" to contain the 3 commits I want merged (We will squash them before merging). "seta_followup" contains all commits.

Attachment #8823435 - Flags: review?(emorley)

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Attachment #8804853 - Flags: review?(emorley)

Armen [:armenzg]

Assignee

Comment 81

•

8 years ago

Comment on attachment 8823435 [details] Commit with feedback from emorley My apologies for the unnecesary noise. I've fixed the branch.

Attachment #8823435 - Attachment is obsolete: true

Attachment #8823435 - Flags: review?(emorley)

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Attachment #8804853 - Flags: review?(emorley)

Ed Morley [:emorley]

Comment 82

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master Great work with this! I've left some comments on the PR - the ones that I'd like to see addressed (or at least discussed if not changing): * Why load_preseed runs every 24 hours and not just on deploy (https://github.com/mozilla/treeherder/pull/1957#discussion_r94605121) * Using fetch_json rather than make_request (https://github.com/mozilla/treeherder/pull/1957#discussion_r94604223) * Switching self-API calls to ORM usages to avoid regressing bug 1211414 (https://github.com/mozilla/treeherder/pull/1957#discussion_r94604390) * Clarifying why behaviour varies according to User-Agent and possibly switching to an API parameter/different endpoint name instead (https://github.com/mozilla/treeherder/pull/1957#discussion_r94606474) * Clarifying what happens to taskcluster/buildbot CI uptime if there are issues with accessing the SETA Treeherder API endpoint (https://github.com/mozilla/treeherder/pull/1957#discussion_r94606474) Almost there! :-)

Attachment #8804853 - Flags: review?(emorley) → review-

Armen [:armenzg]

Assignee

Comment 83

•

8 years ago

I've filed a follow up bug 1328618.

Armen [:armenzg]

Assignee

Comment 84

•

8 years ago

I've moved most of the logic behind the runnable jobs API into treeherder/etl/runnable_jobs.py (formerly known as treeherder/etl/allthethings.py): https://github.com/mozilla/treeherder/pull/1957/commits/6a3efb12293e3e98d8a2e96f2a6a9083a6337c64 Tomorrow I will spend some time to completely decouple it from SETA and address your remaining comments in the PR.

Armen [:armenzg]

Assignee

Comment 85

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master I've addressed all feedback. The following commits can be squashed with the main SETA commit: * Load preseed.json on start up rather than on a schedule * Stop calling runnable jobs API but use list_runnable_jobs * The Gecko decision task should call the API with increase_counter The other commits can land separatedely. I can fix the commits when I get an r+.

Attachment #8804853 - Flags: review- → review?(emorley)

Ed Morley [:emorley]

Comment 86

•

8 years ago

Comment on attachment 8804853 [details] [treeherder] armenzg:seta_rewrite > mozilla:master Awesome! Thank you for your persistence :-)

Attachment #8804853 - Flags: review?(emorley) → review+

Treeherder GitHub Bugbot

Comment 87

•

8 years ago

Commits pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/d40b042baea2a0fc96c9dcd865d10de08e94694e Bug 1306709 - Move runnable jobs api logic into the runnable_jobs module Until now, the only way to get runnable jobs information is by querying the Treeherder runnable jobs api. After this change, Treeherder modules won't need to call the API but import the same function the API calls (list_runnable_jobs()). https://github.com/mozilla/treeherder/commit/a62081f823deca9366c62d37a331e9910215aca6 Bug 1306709 -Job objects can now determine the platform option We have few places that try to calculate the platform_option of a job. This change adds a method to the Job class and an OptionCollectionManager to help it. https://github.com/mozilla/treeherder/commit/ed8e39221da464af01dadc238886da077fe9f9ca Bug 1306709 - Add SETA to Treeherder

Armen [:armenzg]

Assignee

Comment 88

•

8 years ago

This has now been merged to master which means that treeherder-stage has the changes live (as long as anyone deploying their own branch is based off master). I will be looking at treeherder-stage for valid results and follow up with any follow ups via bug 1328618.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

William Lachance (:wlach)

Updated

•

8 years ago

Depends on: 1330015

Ed Morley [:emorley]

Updated

•

8 years ago

Depends on: 1335151

Ed Morley [:emorley]

Updated

•

8 years ago

Component: Treeherder → Treeherder: SETA

Updated

•

4 years ago

Product: Tree Management → Tree Management Graveyard

You need to log in before you can comment on or make changes to this bug.