Closed Bug 1183396 • Opened 9 years ago • Closed 9 years ago

Analyze recent unlisted validation results

Categories: addons.mozilla.org Graveyard :: Add-on Validation (defect)
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: kmag; Assigned: mstriemer
Whiteboard: [june-launch][validator-phase-1][metrics]
We need to collect data on the recent results of the unlisted validator over the past month so that we know where to focus our efforts. In particular we need:

* Over the entire time period:
  - The number of add-ons affected by each error ID with a signing_severity property.
  - For the most common errors, the value of the `context` property for the messages.

* Per day, and totals for the entire time period:
  - The number of add-ons which:
    a) Were submitted for automated signing.
    b) Passed validation on initial submission.
    c) Failed validation on initial submission, but resubmitted a modified version and passed.
    d) Failed validation on submission and were submitted for manual review.
  - The mean/quartiles for the number of submissions before passing automated review (see the sketch after this list).
  - The mean/quartiles for the number of submissions before submitting for manual review.
  - The mean/quartiles for the number of messages of each signing severity per add-on.
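
The mean/quartile aggregates are straightforward once the per-add-on counts exist. A minimal sketch, assuming a hypothetical `submission_counts` list that already holds, per add-on, the number of submissions it took before passing automated review:

    # Mean and quartiles over per-add-on submission counts.
    # `submission_counts` is hypothetical illustrative data, not real metrics.
    import statistics

    submission_counts = [1, 1, 2, 3, 1, 5, 2]

    mean = statistics.mean(submission_counts)
    # quantiles(n=4) returns the three cut points (Q1, median, Q3)
    q1, median, q3 = statistics.quantiles(submission_counts, n=4)

    print(f"mean={mean:.2f} quartiles=({q1}, {median}, {q3})")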
Updated • 9 years ago
Assignee: nobody → mstriemer

Updated • 9 years ago
Whiteboard: [june-launch] → [june-launch][validator-phase-1]

Comment 1 (Assignee) • 9 years ago
(In reply to Kris Maglione [:kmag] from comment #0)

...snip...

> * Per day, and totals for the entire time period:
>   - The number of add-ons which:
>     a) Were submitted for automated signing.

This sounds good, but we should also track pass/fail.

>     b) Passed validation on initial submission.

This sounds misleading: how do we know it really is the initial submission and not just the first in the timeframe?

>     c) Failed validation on initial submission, but resubmitted a modified version and passed.

Same as above.

>     d) Failed validation on submission and were submitted for manual review.

Is this needed in addition to "number of submissions before submitting for manual review"?

If we limit our scope to a day (or any period, really), then we'll see on Monday a submission that fails (failed_first_submission += 1), and then on Wednesday, after the developer has spent two days fixing bugs, we'll see a pass (passed_first_submission += 1), and we won't see any "failed, but resubmitted and passed".

This makes me think that we should do most of our tracking on an add-on-by-add-on basis instead of day-to-day. Perhaps we can do these things with a sliding window of 30 days, so we can see it improve as data falls off the end and new submissions get added in.

> - The mean/quartiles for number of submissions before passing automated review.

I'm assuming this is for new and updated add-ons. So for the following sequence (F: fail, P: pass): "FFPPFP" we'd say the # of submissions before passing is 3, 1, 2. (A sketch of this counting rule follows at the end of this comment.)

> - The mean/quartiles for number of submissions before submitting for manual review.

How do we know if the add-on was submitted for manual review? Can you submit for manual review if you pass? How do we track this (x/M: submitted for manual review)? "FFP(P/M)FP(F/M)"

> - The mean/quartiles for number of messages of each signing severity per add-on.

Should be straightforward.

My proposal:

For each day: # submitted, # passed, # failed, # manual reviews.
For each 30-day sliding window: # of submissions before passing, # of submissions before manual review, # of messages by severity.
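
A sketch of the counting rule in the "FFPPFP" example: each 'P' closes out a run of submissions, and the run length is that pass's attempt count. The helper name is hypothetical, not part of olympia:

    def attempts_before_passing(history):
        """Return the number of submissions consumed by each passing attempt."""
        counts, run = [], 0
        for outcome in history:
            run += 1
            if outcome == "P":
                counts.append(run)
                run = 0
        # A trailing run of failures is not counted until it eventually passes.
        return counts

    assert attempts_before_passing("FFPPFP") == [3, 1, 2]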
Comment 2 • 9 years ago
Commit pushed to master at https://github.com/mozilla/olympia

https://github.com/mozilla/olympia/commit/81a97dc1f0d808b13f096444a1c238e170536c98
Scripts to analyze validation results (bug 1183396)

Fixes #613
Comment 3 (Reporter) • 9 years ago
Sorry for the absurd delay. I kept getting sidetracked before I could finish replying.

(In reply to Mark Striemer [:mstriemer] from comment #1)

> If we limit our scope to a day (or any period really) then we'll see on
> Monday a submission that fails (failed_first_submission += 1) and then on
> Wednesday when the developer spent two days fixing bugs we'll see a pass
> (passed_first_submission += 1) and we won't see any failed, but resubmitted
> and passed.
>
> This makes me think that we should do most of our tracking on an
> addon-by-addon basis instead of day-to-day. Perhaps we can do these things
> but with a sliding window of 30 days. So we can see it improve as data falls
> off the end and new submissions get added in.

I was intending that we track what happens to versions submitted within the sliding window, but taking into account the entire history of the add-on. For a given version submitted in the given sliding window:

* If an add-on passed validation at all in the sliding window, its previous attempt count is the total number of previous submissions with the same add-on ID and version number, regardless of timeframe. There is the possibility that a previous submission with the same version number will have been approved, but I'd suggest ignoring that possibility for the moment.

* It passed validation on initial submission if no version with the same add-on ID and version number has been previously submitted, regardless of timeframe.

* An unlisted add-on submitted for review shows up as:

    Addon.status in (amo.STATUS_LITE, amo.STATUS_UNREVIEWED) (8, 1)
    File.status = amo.STATUS_UNREVIEWED (1)
    File.is_signed = False

  Once they've been reviewed, they'll have an activity log entry of either amo.LOG.REJECT_VERSION or amo.LOG.PRELIMINARY_VERSION. Unfortunately, we don't have log entries for when add-ons were submitted for review, which complicates things. (A sketch of the status query above follows at the end of this comment.)

Given the above, the breakdown of versions in the window is:

* Passed validation on initial submission: the first item above, where the previous attempt count is 0.
* Failed validation on initial submission, but passed: the first item above, where the previous attempt count is greater than 0.
* Failed validation and were submitted for manual review: any unlisted, preliminary version reviewed within the time period, or submitted within the time period and awaiting review, as defined above.

That leaves versions which have yet to pass auto-validation or be submitted for manual review, which we should probably track in a similar way.

> > - The mean/quartiles for number of submissions before passing automated
> > review.
>
> I'm assuming this is for new and updated addons. So for the following
> sequence (F: fail, P: pass): "FFPPFP" we'd say # of submissions before
> passing is 3,1,2.

Ideally, I'd like two separate sets of numbers: one for all new versions, and one for all new add-ons. The numbers should be as above; the only difference is that for new add-ons, we'd only take into account the add-on ID, and ignore the version number. But for a start, just tracking new versions should be fine.

> > - The mean/quartiles for number of submissions before submitting for
> > manual review.
>
> How do we know if the addon was submitted for manual review? Can you submit
> for manual review if you pass? How do we track this (x/M: submitted for
> manual review)? "FFP(P/M)FP(F/M)"

As above. Ideally, we should add more logging to simplify this at some point.

> My proposal:
>
> For each day: # submitted, # passed, # failed, # manual reviews.

That sounds like it would be useful information.

> For each 30-day sliding window: # of submissions before passing, # of
> submissions before manual review, # of messages by severity

I'd prefer a 7-day sliding window, with previous submissions tracked as above. Otherwise, this sounds good.
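
The status criteria above translate fairly directly into a Django ORM query. A sketch, assuming olympia's models and constants; the import paths, the `is_listed` filter, and the `versions__files` relation names are assumptions based on this comment, not verified against the codebase of that era:

    # Hedged sketch: unlisted add-ons awaiting manual review, per the
    # status criteria quoted in comment 3.
    from olympia import amo  # import path is an assumption
    from olympia.addons.models import Addon  # import path is an assumption

    awaiting_manual_review = (
        Addon.objects
        .filter(
            is_listed=False,
            status__in=(amo.STATUS_LITE, amo.STATUS_UNREVIEWED),  # (8, 1)
            versions__files__status=amo.STATUS_UNREVIEWED,        # (1)
            versions__files__is_signed=False,
        )
        .distinct()  # the file join can yield duplicate add-on rows
    )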
Comment 4 • 9 years ago
> For each day: # submitted, # passed, # failed, # manual reviews.
I have fairly complicated and probably non-optimal SQL queries for this data. Let me know if you need them.
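
The SQL itself isn't shown here. As a rough illustration of the per-day counts, a Django aggregation sketch; the `FileUpload` import path and the boolean `passed_auto_validation` field are assumptions for illustration (olympia stores validation results as JSON, which is why the real queries are hand-written SQL):

    # Hedged sketch: per-day submitted/passed/failed counts via the ORM.
    from django.db.models import Count, Q
    from django.db.models.functions import TruncDate

    from olympia.files.models import FileUpload  # import path is an assumption

    daily_counts = (
        FileUpload.objects
        .annotate(day=TruncDate('created'))
        .values('day')
        .annotate(
            submitted=Count('id'),
            passed=Count('id', filter=Q(passed_auto_validation=True)),
            failed=Count('id', filter=Q(passed_auto_validation=False)),
        )
        .order_by('day')
    )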
Comment 5 (Assignee) • 9 years ago
Daily automated and manual pass/fail rate for add-ons with is_listed=0 in the last 30 days. https://amo-review-reports.herokuapp.com/
Updated • 9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Updated • 9 years ago
Whiteboard: [june-launch][validator-phase-1] → [june-launch][validator-phase-1][metrics]
Comment 6 • 9 years ago
This patch starts collecting some stats so we can use the graphite dashboard (http://dashboard.mktadm.ops.services.phx1.mozilla.com/graphite?site=addons&graph=all-responses).

https://github.com/mozilla/olympia/pull/753
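
For context, counters of this kind would look something like the following with django-statsd, which olympia uses; the metric names here are illustrative, not the ones the PR actually adds:

    # Hedged sketch: incrementing graphite-backed counters per submission.
    from django_statsd.clients import statsd

    def record_unlisted_submission(passed):
        statsd.incr('devhub.unlisted_submission')  # hypothetical metric name
        statsd.incr('devhub.unlisted_submission.passed' if passed
                    else 'devhub.unlisted_submission.failed')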
Comment 7 • 9 years ago
Commits pushed to master at https://github.com/mozilla/olympia

https://github.com/mozilla/olympia/commit/fe2e300de9170aefbbcfea2c9dca9960e8e2edc1
Automated and manual unlisted pass/fail daily counts (bug 1183396)

https://github.com/mozilla/olympia/commit/749d2304428cdfbdca4bbaffea998f46361fc8eb
Merge pull request #716 from mstriemer/more-reporting-1183396
Automated and manual unlisted pass/fail daily counts (bug 1183396)
Updated • 9 years ago
Product: addons.mozilla.org → addons.mozilla.org Graveyard