Closed Bug 1214531 Opened 9 years ago Closed 9 years ago

Autophone - submitting Talos jobs to Treeherder causes 500 Server errors.

Categories

(Testing Graveyard :: Autophone, defect)

defect
Not set
normal

Tracking

(firefox44 affected)

RESOLVED FIXED

People

(Reporter: bc, Assigned: jmaher)

Attachments

(1 file)

We have been getting 500 Server Errors when submitting Talos jobs to Treeherder. It appears to be related to the job_group_id, as some of the errors returned a response {"detail": "job_group_id"}.

The two Talos jobs use the following:

    job_name = Autophone Tp4m
    job_symbol = tpn
    group_name = Autophone
    group_symbol = A

    [treeherder]
    job_name = Autophone Tsvg
    job_symbol = svg
    group_name = Autophone
    group_symbol = A

emorley: Do we need to register these job/group names and symbols before we begin submitting jobs to Treeherder?
Flags: needinfo?(emorley)
This is presumably something weird with the job signatures/job groups - Cameron knows this code better than I - could you take a look? :-)
Flags: needinfo?(emorley) → needinfo?(cdawson)
There's a conflict with the symbols/names:

Jobs:

    id    job_group_id  symbol  name
    153   7             tpn     "Talos tp nochrome"
    4284  24            tpn     "Autophone Tp4m"

Groups:

    id  symbol  name
    7   T       "Talos Performance"
    24  A       Autophone

Though this SHOULD just cause it to choose the wrong group (T). Not sure yet where the 500 is coming from. I'm trying to build a test for that.
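To illustrate why the shared "tpn" symbol could resolve to the wrong group, here is a minimal sketch using the rows quoted in the comment above. The two lookup functions are hypothetical and are not Treeherder's actual ingestion code; they only show that a lookup keyed on the symbol alone is ambiguous, while one keyed on symbol and name is not.

```python
# Rows copied from the comment above. The lookup helpers below are
# hypothetical illustrations, not Treeherder's real code.
JOB_TYPES = [
    {"id": 153, "job_group_id": 7, "symbol": "tpn", "name": "Talos tp nochrome"},
    {"id": 4284, "job_group_id": 24, "symbol": "tpn", "name": "Autophone Tp4m"},
]
GROUPS = {7: ("T", "Talos Performance"), 24: ("A", "Autophone")}


def group_by_symbol_only(symbol):
    """Return the group of the first job type whose symbol matches."""
    for row in JOB_TYPES:
        if row["symbol"] == symbol:
            return GROUPS[row["job_group_id"]]
    return None


def group_by_symbol_and_name(symbol, name):
    """Return the group of the job type matching both symbol and name."""
    for row in JOB_TYPES:
        if row["symbol"] == symbol and row["name"] == name:
            return GROUPS[row["job_group_id"]]
    return None
```

With these rows, the symbol-only lookup for "tpn" lands on the Talos group, while adding the name "Autophone Tp4m" selects the Autophone group.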
Flags: needinfo?(cdawson)
Interesting. The entry for "svg" in the job_type table has a job_group_id of NULL. This really shouldn't be possible. It should have the id for the "?" group if there was no group. So I'm trying to create a test case that can cause this, but no luck yet.

Here's what I'm seeing on stage, however:

    id    job_group_id  symbol  name
    4343  NULL          ym1-60  "MSE YouTube Playback Medium1 60min"
    4344  NULL          svg     "Autophone Tsvg"
    4345  NULL          m-e10s  "MSE Video Playback (e10s)"

Interesting that they're all consecutive ids. Hrm...
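A check for such orphaned rows could look roughly like the sketch below. It assumes a job_type table with only the four columns shown in the listing above, and it uses an in-memory SQLite database as a stand-in for the real Treeherder database; the function name and the reduced schema are assumptions made for illustration.

```python
import sqlite3


def find_ungrouped_job_types(conn):
    """Return (id, symbol, name) for job types with a NULL job_group_id."""
    return conn.execute(
        "SELECT id, symbol, name FROM job_type "
        "WHERE job_group_id IS NULL ORDER BY id"
    ).fetchall()


# Stand-in database seeded with rows quoted in this bug (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_type ("
    "id INTEGER PRIMARY KEY, job_group_id INTEGER, symbol TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO job_type VALUES (?, ?, ?, ?)",
    [
        (4284, 24, "tpn", "Autophone Tp4m"),
        (4343, None, "ym1-60", "MSE YouTube Playback Medium1 60min"),
        (4344, None, "svg", "Autophone Tsvg"),
        (4345, None, "m-e10s", "MSE Video Playback (e10s)"),
    ],
)
ungrouped = find_ungrouped_job_types(conn)
```

Here `ungrouped` holds the three consecutive rows listed above, and the "tpn" row with a valid group is excluded.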
On production, the only record that has a NULL job_group_id is "X19" [TC] Mulet XPCShell. I went ahead and fixed that record on stage.

Perhaps it was a fluke that data was being ingested right when we pushed new code and it got vexed? Really hard to say. I'm not sure what group the other symbols should belong to, so I didn't fix those.

Ahh, data integrity errors. What fun...
Lastly, there are 2 other changes which allow submission to allizom:

1) Running "./manage.py add_perf_framework autophone" on the server.
2) Adjusting the JSON object sent from Autophone to have "framework: {name: autophone}" instead of what was defined in https://bugzilla.mozilla.org/show_bug.cgi?id=1175295#c3 as "framework: autophone".

I still have to verify that svg and rck3 work. It sounds like the investigation into the data integrity that camd did will help reduce one more headache!
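The second change amounts to wrapping the framework name in an object. A minimal sketch of that adjustment follows; the helper name and the idea of normalizing a dict in place of editing Autophone's submission code directly are assumptions, and only the "framework" key shape comes from the comment above.

```python
def normalize_framework(perf_data):
    """Return a copy of perf_data whose "framework" is {"name": ...}.

    Hypothetical helper: converts the older string form
    ("framework": "autophone") to the object form
    ("framework": {"name": "autophone"}). Other keys are left untouched.
    """
    framework = perf_data.get("framework")
    if isinstance(framework, str):
        return {**perf_data, "framework": {"name": framework}}
    return perf_data


old_style = {"framework": "autophone"}
new_style = normalize_framework(old_style)
```

Data that already uses the object form passes through unchanged, so the helper is safe to apply more than once.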
Blocks: 1170685
Assignee: bob → jmaher
Attachment #8675044 - Flags: review?(bob)
Attachment #8675044 - Flags: review?(bob) → review+
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
pulled on the server though not activated yet.
Product: Testing → Testing Graveyard