Closed Bug 1306707 Opened 8 years ago Closed 2 years ago

Remove support for 12 character ('short') SHA revisions

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: wlach, Unassigned)

References

(Blocks 2 open bugs)

Details

Attachments

(5 files, 2 obsolete files)

I'd like us to consider unsupporting submitting <40 character revisions. Is there any reason we shouldn't? We have a bunch of code associated with keeping the short and long revision fields in sync that I'd like not to have to rewrite when moving stuff into the main db (bug 1306674).
Do we have any submitters who are giving us 12 character revisions but not 40? (or 12 then 40 later?)
Flags: needinfo?(emorley)
Flags: needinfo?(cdawson)
There are still some job types (eg l10n jobs) that use the 12 character SHA. Though: (a) I think we should just give them a hard deadline and then leave it up to them (b) Even if we have to support 12 character jobs, perhaps we can remove support from eg push ingestion / the resultset API etc New Relic attribute recording is likely a good way forwards here.
Flags: needinfo?(emorley)
(In reply to Ed Morley [:emorley] from comment #2) > There are still some job types (eg l10n jobs) that use the 12 character SHA. > ... Ok, that's annoying but I guess we'll just live with it in the short term. At least the business logic to handle this will be easier to write with the Django ORM.
Flags: needinfo?(cdawson)
I think we should at least identify which jobs still are using 12 characters and file bugs for them. The l10n jobs (and I imagine any/most other culprits) are in builds-4hr - I'll have a shot at analysing now.
Even builds-pending/builds-running contain 12 character SHAs :-( Builds-pending: { "pending": { "try": { "1a855f92cc45": [ { "submitted_at": 1475064907, "id": 125865687, "buildername": "Windows 10 64-bit try debug test mochitest-gl-1" }, Builds-running (at least it has the full SHA in there): { "running": { "autoland": { "cdf154426881": [ { "submitted_at": 1475263520, "buildername": "Android armv7 API 15+ autoland build", "start_time": 1475263723, "number": 320, "claimed_by_name": "buildbot-master74.bb.releng.usw2.mozilla.com:/builds/buildbot/build1/master", "request_ids": [ 126258716 ], "last_heartbeat": 1475264067, "id": 127438343, "revision": "cdf154426881032934224b0ad60bc366afebff41" } ],
Attached file Builds-4hr jobs with 12 char SHA on 2016-09-29 (obsolete) (deleted) —
Main candidates are: * PGO builds plus their corresponding tests * Various jobs run on comm-* (though these are < tier-3) * Desktop+Android l10n nightlies
Summary: Consider unsupporting submitting 12 character ('short') revisions → Remove support for 12 character ('short') SHA revisions
Depends on: 1306718
Depends on: 1306720
Depends on: 1306722
Depends on: 1306723
Some places are still using 12 character SHAs in UI URLs too. eg on: https://hg.mozilla.org/mozilla-central/rev/af6e01b8574b "[default view]" -> https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=87cd291d2db6 "[failures only]" -> https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=87cd291d2db6&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception (the perfherder links on that page use 40 character SHAs however) Unfortunately we can't use server logs to see how many people are doing this, since the fragment part of the URL is not sent in the HTTP request. Think we'll have to resort to GitHub code search etc.
Depends on: 1306834
Depends on: 1306835
Depends on: 1306837
Depends on: 1306839
Will, I still think we can simplify things on our side somewhat, ie we can remove support for 12 character SHAs from: * Resultsets ingestion via our API * Resultsets ingestion via json-pushes polling * Resultsets ingestion via Pulse (if not already done; I can't find the schema for it) * [DONE] Jobs ingestion via Pulse (see pulse-job.yml) * The `revision` table (since I'm pretty sure for buildbot job ingestion we only need short_revision on the result_set table, right?) Similarly, whilst the revision hash removal (bug 1257602) is blocked on some older repos (though we should double-check which - given B2G EOL and ESRs only run taskcluster as tier <1 so can probably just have those jobs hidden), we can probably remove support from many other places.
I mean we could even remove `short_revision` from the `result_set` table if we were prepared to do LIKE queries against `long_revision` for these legacy buildbot job types and/or UI URLs that are still using 12 character SHAs.
Blocks: 1306674
Attachment #8796693 - Attachment is obsolete: true
Checking again shows many cases fixed. Left are some PGO jobs that are instances of bug 1313274, plus aurora instances of bug 1306722 (needs backporting).
Attachment #8796694 - Attachment is obsolete: true
Depends on: 1340265
Now that bug 1306718 is deployed, builds-{pending,running} use the full 40 character SHAs. This just leaves the self-serve triggered builds (eg sheriffs triggering PGO runs but only pasting in 12 character SHAs), which is bug 1313274. An alternative and possibly quicker fix for that is to make it easier to copy the full SHA from Treeherder (bug 1340265).
Assignee: wlachance → emorley
Blocks: 1178227
I'm not going to wait on bug 1313274 or bug 1340265 any longer. Let's just do this.
Priority: -- → P2
Depends on: 1427311
Status: NEW → ASSIGNED
Blocks: 1498644
No longer depends on: 1340265
Assignee: emorley → cdawson

After deploying, New Relic reported instances of:

treeherder.model.models:MultipleObjectsReturned: get() returned more than one Push -- it returned 2!

and

treeherder.model.models:DoesNotExist: Push matching query does not exist.

I believe this is since repository is no longer in a few of the queries, eg:
https://github.com/mozilla/treeherder/pull/4534/files#diff-d1c18844859a68f2ac3f6aa32603cdbeL54

I've rolled back the prod deploy for now - master will need a fix before we deploy again.

Flags: needinfo?(cdawson)

This new PR should fix the issue.

Flags: needinfo?(cdawson)
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Depends on: 1525789

Actually, reopening since the API still supports 12 character revisions and their usage is being tracked via:
https://github.com/mozilla/treeherder/pull/4534/files#diff-d1c18844859a68f2ac3f6aa32603cdbeR272

The API now does the optimal MySQL query if provided with a 40 character SHA, however the fallback should be removed to prevent people from accidentally relying on it (eg when writing tools that comment on bugs etc).

When doing so it may be worth making the API explicitly return a JSON error message and HTTP 400 for SHAs that are not 40 characters (and also ensuring it's bubbled up to the jobs-view and Perfherder UIs) to prevent confusing UX from 404s from seemingly valid SHAs.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

This would be good to get done. But I'm not going to get to it anytime soon.

Assignee: cdawson → nobody
Priority: P2 → P3

cleaning up old bugs we haven't fixed

Status: REOPENED → RESOLVED
Closed: 6 years ago2 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: