Closed
Bug 1277575
Opened 8 years ago
Closed 8 years ago
Autoclassification exceptions "Duplicate entry '4744784-1752-1' for key 'failure_match_failure_line_id_7aeee577_uniq'"
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jgraham, Assigned: jgraham)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
This should only happen if we try to create multiple failure matches for a single line using the same matcher with the same classified failure. That could happen if the matching process aborts halfway through for some reason and we go through a retry in which the same matches are created. The simplest solution might just be to wrap matching in a transaction, so that if any part of it fails the whole process fails.
https://rpm.newrelic.com/accounts/677903/applications/5585473/filterable_errors#/show/556198-80638235-2817-11e6-b947-b82a72d22a14/stack_trace?top_facet=transactionUiName&primary_facet=error.class&barchart=barchart&filters=%5B%7B%22key%22%3A%22error.message%22%2C%22value%22%3A%22failure_match_failure_line_id_7aeee577_uniq%22%2C%22like%22%3Atrue%7D%5D&_k=ouo5xc
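A minimal sketch of the transaction idea, assuming a hypothetical match_errors() pass; the function, related-name, and helper names here are illustrative only, not Treeherder's actual API:

from django.db import transaction

def match_errors(job):
    # Wrapping the whole matching pass in an atomic block means a failure
    # part-way through rolls back every FailureMatch created so far, so a
    # Celery retry starts from a clean slate instead of hitting the unique
    # constraint on the failure line / matcher / classified failure.
    with transaction.atomic():
        # The "failure_lines" related name and create_matches() helper are
        # assumptions for illustration.
        for failure_line in job.failure_lines.filter(best_classification=None):
            create_matches(failure_line)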
Updated•8 years ago
Blocks: treeherder-nr-exceptions
Comment 1•8 years ago
This is occurring quite a lot on the new stage Heroku instance (more so than elsewhere, strangely):
https://rpm.newrelic.com/accounts/677903/applications/14179733/traced_errors/e6f381-e89cd98b-291b-11e6-b947-b82a72d22a14/similar_errors?original_error_id=e6f381-e89cd98b-291b-11e6-b947-b82a72d22a14
Comment 2•8 years ago
(In reply to James Graham [:jgraham] from comment #0)
> This should only happen if we try to create multiple failure matches for a
> single line using the same matcher with the same classified failure. That
> could happen if the matching process aborts half way through for some
> reason, and we go through a retry in which the same matches are created.
Ah, I think the reason for more of these on Heroku is bug 1277726. However, there are likely other causes for a job being interrupted, so it's worth using a transaction even if that gets fixed.
Comment 3•8 years ago
It looks like this may have contributed to bug 1278433 (since the repeated Celery task, combined with up to 80s failure_line SELECTs, caused the DB node's disk to be pegged at 100%).
Please can you take a look soon?
Flags: needinfo?(james)
Updated•8 years ago
Comment 4•8 years ago
Assignee
Updated•8 years ago
Flags: needinfo?(james)
Attachment #8760684 - Flags: review?(emorley)
Updated•8 years ago
Attachment #8760684 - Flags: review?(emorley) → review+
Comment 5•8 years ago
Commits pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/ea35e0dde612d3ae78d6329c64ae390ab2ea9763
Bug 1277575 - Don't fail the task when autoclassification encounters a duplicate FailureMatch
Refactor the code slightly by forward-porting the changes from the
elasticsearch branch that cause us to optimise out further matching when
we find a match that is considered good enough. Also wrap the creation
of a FailureMatch in a try/except block so that we don't fail if there
is an existing FailureMatch for the same line/classification which could
happen if the task previously failed for an unrelated reason. This gives
higher granularity than just putting the entire task in a transaction.
https://github.com/mozilla/treeherder/commit/688ed058dd7588867ff58f28c6a88ba840004332
Merge pull request #1569 from mozilla/autoclassify_update_robustness
Bug 1277575 - Don't fail the task when autoclassification encounters a duplicate FailureMatch
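A rough sketch of the approach described in the commit above: catch the duplicate-key error per match rather than wrapping the whole task in a transaction. The FailureMatch field names and import path are assumptions based on the error message in comment 0, not Treeherder's exact code:

from django.db import IntegrityError, transaction

# Import path assumed for illustration.
from treeherder.model.models import FailureMatch

def create_match(failure_line, matcher, classified_failure, score):
    try:
        # The inner atomic block keeps the surrounding transaction usable if
        # the INSERT violates the unique constraint on
        # (failure_line, matcher, classified_failure), e.g. after a previous,
        # partially-completed attempt.
        with transaction.atomic():
            FailureMatch.objects.create(
                failure_line=failure_line,
                matcher=matcher,
                classified_failure=classified_failure,
                score=score,
            )
    except IntegrityError:
        # The match already exists from an earlier run; treat it as done
        # rather than failing the whole autoclassification task.
        pass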
Comment 6•8 years ago
James, please don't forget to close out your bugs when they land :-)
(At least until we have a more sensible GitHub webhook to do this for us)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED