Closed Bug 1310386 Opened 8 years ago Closed 8 years ago

Perfherder's signatures should always be unique based on user-facing properties

Categories

(Tree Management :: Perfherder, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: rwood)

References

Details

Attachments

(2 files, 1 obsolete file)

Bug 1309294 took days to recover from. I think I've fixed our signature generation algorithm to be less brittle but between this and other problems I've become convinced we need: (1) To ensure referential integrity of perfherder signatures across their actual *properties* (test name, platform, etc.) (2) To de-emphasize the use of signature hashes to identify signatures In principle (1) should be easy, except we currently use a json field for some additional signature properties (e10s, shell) which can't be indexed. I think we can move to a simple char field for these without any issue though. For (2), I'd like to move to a model where we re-use an old signature hash for an old signature if none of the significant properties have actually changed. i.e. we use it once to identify what's coming in if it's new, otherwise don't touch it. I'd love to just remove the signature hashes altogether (they've never been that useful), but that's a lot harder.
Summary: Perfherder's signatures should always be unique for properties → Perfherder's signatures should always be unique based on user-facing properties
May I suggest adding a table to explicitly map signatures to the compound keys they represent? Call this the "signature table". This table will require an extra join, but will serve as a step in the direction of [2]. * Multiple signatures can map to the same properties (and provide backwards compatibility with old links). * Using a join, it can be used to perform signature lookup, as the current API requires. * It will remove signatures from the main table, preparing for the eventual removal of signatures. Maybe this is already part of your plan. As I think about this more: PerfHerder's *signature table* may be more complex than a simple signature<->compound key lookup. The realized signature table is still possible, but may have a bit of opaque logic in the background that makes it feel unclean. There may be a better way: PerfHerder's crash signatures are just one way of defining a signature, if we reify how PerfHerder arrives at its signatures, and we call them "rules", then we can use a more general rule system instead of an explicit table, or hashing technique. I am proposing that crash-stats use an identical rule system [3]: A reified-rule system that allows us (and our machines) to manage multiple crash "signatures" across disparate systems based on more than just compound keys. These signatures are no longer strings, rather mini JSON documents that clearly describe what the signature represents. PerfHerder need not implement, or import, a general rule system like crash-stats: PerfHerder signatures have a common pattern; they can be implemented in code, limited to handling the particular class of rules it actually uses. Please forgive my naivety when it comes to PerfHerder's important properties, and their names, but an example will help here: Here is an example signature (a rule), which is the equality operator on a block of property:value pairs: > {"eq":{ > "suite":"kraken", > "build_platform":"win32", > "build_options":"debug", > "branch":"mozilla-inbound" > }} This rule is data, so we can name it ("kraken on windows debug on inbound"), and we can give it an id (the hash value Perfherder uses) and even add alias ids if we want (previous hash values used for that same rule). The rule can be translated to SQL, and provides hints to a possible database index required to make it fast. Primarily, I am suggesting making PerfHerder signatures explicit so it is easier for both internal and external automation to use. [3] https://bugzilla.mozilla.org/show_bug.cgi?id=1280658#c12
Hey Kyle, Perfherder's signatures are pretty simple. The properties that distinguish each one are limited: framework id (talos, awsy, etc), platform id, option collection id, suite name, test name, extra labels. You could *almost* fit these properties into 40 characters (the current length of the hash) with no loss of information. In theory, test/suite names could exceed that length, but not often. So I'm not sure if a generalized solution is really necessary here: I'd rather just let MySQL take care of ensuring uniqueness via a unique index. That said, I kind of like the idea of pulling the signatures into a separate table as long as we have to maintain them, as that might make the system easier to reason about. I don't think the extra join is a big deal at all. Will have to think about it. :) I think the new signature hashing algorithm in Perfherder doesn't have any huge bugs in it, so we're not in a huge rush.
Hey Rob, would this be something you're interested in working on? (when you're back from pto, of course)
Flags: needinfo?(rwood)
(In reply to William Lachance (:wlach) from comment #4) > Hey Rob, would this be something you're interested in working on? (when > you're back from pto, of course) Sure!
Assignee: wlachance → rwood
Status: NEW → ASSIGNED
Flags: needinfo?(rwood)
Attachment #8815307 - Flags: review?(wlachance)
Comment on attachment 8815307 [details] [treeherder] rwood-moz:bug1310386-part1 > mozilla:master This looks fine, however, I'd prefer to land this as part of a larger patch series.
Attachment #8815307 - Flags: review?(wlachance)
(In reply to William Lachance (:wlach) from comment #7) > Comment on attachment 8815307 [details] > [treeherder] rwood-moz:bug1310386-part1 > mozilla:master > > This looks fine, however, I'd prefer to land this as part of a larger patch > series. Ok, wasn't sure if it was best to land the DB changes alone first or not. So shall I include the changes to the data ingestion code here too then? And then land that, then part 2 will be a patch for the existing data migration, is that the right approach here? Thanks for the feedback!
Flags: needinfo?(wlachance)
(In reply to Robert Wood [:rwood] from comment #8) > (In reply to William Lachance (:wlach) from comment #7) > > Comment on attachment 8815307 [details] > > [treeherder] rwood-moz:bug1310386-part1 > mozilla:master > > > > This looks fine, however, I'd prefer to land this as part of a larger patch > > series. > > Ok, wasn't sure if it was best to land the DB changes alone first or not. So > shall I include the changes to the data ingestion code here too then? And > then land that, then part 2 will be a patch for the existing data migration, > is that the right approach here? Thanks for the feedback! I think we can get away with a single commit which incorporates: * the db schema changes * ingestion changes to update or create signatures with the new property * a migration script to update legacy data Thanks!
Flags: needinfo?(wlachance)
Attachment #8815307 - Attachment is obsolete: true
Attachment #8821260 - Flags: review?(wlachance)
Comment on attachment 8821260 [details] [treeherder] rwood-moz:bug1310386 > mozilla:master Looks really good, see PR for some requested changes.
Attachment #8821260 - Flags: review?(wlachance)
Comment on attachment 8821260 [details] [treeherder] rwood-moz:bug1310386 > mozilla:master Thanks Will. PR updated, added second commit to update the webapi accordingly.
Attachment #8821260 - Flags: review?(wlachance)
Comment on attachment 8821260 [details] [treeherder] rwood-moz:bug1310386 > mozilla:master Looks good! One issue that needs to be fixed. We need to land this with some care, let's do that together sometime this week when we both have time. Let me know when you've fixed the one issue I found and we can schedule something.
Attachment #8821260 - Flags: review?(wlachance) → review+
Comment on attachment 8825468 [details] [treeherder] rwood-moz:bug1310386-part2 > mozilla:master This is the final part, removing the deprecated extra_properties field from the PerformanceSignatures table, and add the new extra_options field to the fields ensuring the signature is unique.
Attachment #8825468 - Flags: review?(wlachance)
Comment on attachment 8825468 [details] [treeherder] rwood-moz:bug1310386-part2 > mozilla:master Looks great, thanks Rob!
Attachment #8825468 - Flags: review?(wlachance) → review+
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/1ef230624aac01aabfa45fe63a2ff8c97a6a0961 Bug 1310386 - part two: remove extra_properties, and use extra_options to make signature unique (#2075)
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Depends on: 1331745
Depends on: 1336706
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: