Closed
Bug 1260824
Opened 9 years ago
Closed 9 years ago
Autophone - submit s1s2 data to perfherder
Categories
(Testing Graveyard :: Autophone, defect)
Testing Graveyard
Autophone
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bc, Assigned: bc)
References
Details
Attachments
(21 files, 5 obsolete files)
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
feedback+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
feedback+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bc
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bc
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
wlach
:
feedback+
|
Details |
(deleted),
patch
|
bc
:
review+
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details |
In today's mobile testing meeting, we discussed getting Autophone to begin submitting performance data to perfherder. Initially we can do this while also submitting the data to phonedash while we evaluate the usefulness of the test data on perfherder.
This will work with the perfherder's grouping of the data by platform as long as we keep the 1-1 mapping between Android version and device type.
The original work to get the autophone talos tests to submit to perfherder was in bug 1192069.
Comment 1•9 years ago
|
||
we should first decide on what data we want to submit, getting it into treeherder should be easy.
We will just have to make sure each platform maps to a device- so the nexus 6p's will be "android 6.0 ...", etc. No simple way to distinguish 6p-1, 6p-2, etc.
right now we are looking at s1/s2 only. lets list out the platforms and subtests or variants we care about.
Assignee | ||
Comment 2•9 years ago
|
||
We have a 1-1 mapping from device type to android version now. We will just need to make sure to preserve that for the devices submitting to perfherder going forward.
s1s2 is the only performance test we have other than the talos tests so it is the goal here.
I don't see the usefulness of trying to limit anything at this stage. The interesting part will be if submitting the mean instead of all of the data points for each measurement will allow us to see regressions. Another possibility is to submit all of the data points but perfherder might not support that much data.
We will find out what appears to be useful by reporting to phonedash and perfherder simultaneously and comparing the results.
Comment 3•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #2)
> We have a 1-1 mapping from device type to android version now. We will just
> need to make sure to preserve that for the devices submitting to perfherder
> going forward.
>
> s1s2 is the only performance test we have other than the talos tests so it
> is the goal here.
>
> I don't see the usefulness of trying to limit anything at this stage. The
> interesting part will be if submitting the mean instead of all of the data
> points for each measurement will allow us to see regressions. Another
> possibility is to submit all of the data points but perfherder might not
> support that much data.
How many datapoints are we talking about? Generally we take the median, not the mean, in Talos as that's more resilient to outliers. Kyle can also give helpful advice on this.
Assignee | ||
Comment 4•9 years ago
|
||
The raw data consists of first and second page loads x start and stop times x number of iterations for each test... 2 * 2 * 8 = 32.
We currently have 6 variations of s1s2 (local, remote) x (blank, twitter, nytimes).
Comment 5•9 years ago
|
||
I humbly suggest Autophone submits all data to Perfherder: Then Perfherder can change its ingestion to use mean, or median, or all samples, or some other statistic, without impacting the submission code.
Comment 6•9 years ago
|
||
(In reply to Kyle Lahnakoski [:ekyle] from comment #5)
> I humbly suggest Autophone submits all data to Perfherder: Then Perfherder
> can change its ingestion to use mean, or median, or all samples, or some
> other statistic, without impacting the submission code.
Perfherder currently relies on the client to send an aggregated summary of each test, and to be honest, I'm not keen on changing this. I'm already pretty busy with trying to maintain the existing codebase without adding in a per-framework ETL framework into the mix. I think it's better to delegate that responsibility to the test framework maintainers, though of course I'm happy to give advice.
If no one has a better idea, I'd suggest starting with median and going from there, that's the closest thing we have to a best practice.
Assignee | ||
Comment 7•9 years ago
|
||
add median choice to existing graph.
Attachment #8741803 -
Flags: review?(jmaher)
Assignee | ||
Comment 8•9 years ago
|
||
fix tooltip revision for release branches.
Attachment #8741804 -
Flags: review?(jmaher)
Assignee | ||
Comment 9•9 years ago
|
||
remove unused errorbar vars in handlers.py
Attachment #8741805 -
Flags: review?(jmaher)
Assignee | ||
Comment 10•9 years ago
|
||
Not ready for review.
Comment 11•9 years ago
|
||
Comment on attachment 8741803 [details] [diff] [review]
bug-1260824-add-median.patch
Review of attachment 8741803 [details] [diff] [review]:
-----------------------------------------------------------------
nice and simple!
Attachment #8741803 -
Flags: review?(jmaher) → review+
Comment 12•9 years ago
|
||
Comment on attachment 8741804 [details] [diff] [review]
bug-1260824-tooltip.patch
Review of attachment 8741804 [details] [diff] [review]:
-----------------------------------------------------------------
good fix.
Attachment #8741804 -
Flags: review?(jmaher) → review+
Updated•9 years ago
|
Attachment #8741805 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 13•9 years ago
|
||
Deployed to phonedash-dev for testing. phonedash-dev has a copy of production data up through 2016-04-09.
http://phonedash-dev.allizom.org/all.html#/2016-04-03/2016-04-09
Attachment #8741806 -
Attachment is obsolete: true
Attachment #8743304 -
Flags: review?(jmaher)
Assignee | ||
Comment 14•9 years ago
|
||
The previous patch didn't hook up the filter for the product. When attempting to fix that it turned out that the jquery selectors containing periods, e.g. #org.mozilla.fennec, didn't work. jgraham posits it was due to the periods being treated as class selectors. Since product and repo are really the same thing, I've dropped product altogether as a filter.
Attachment #8743304 -
Attachment is obsolete: true
Attachment #8743304 -
Flags: review?(jmaher)
Attachment #8743386 -
Flags: review?(jmaher)
Comment 15•9 years ago
|
||
Comment on attachment 8743386 [details] [diff] [review]
bug-1260824-all-v3.patch
Review of attachment 8743386 [details] [diff] [review]:
-----------------------------------------------------------------
a few nits to consider, nothing critical
::: html/scripts/allapp.js
@@ +348,5 @@
> + vars: series.vars,
> + label: plot_series_key,
> + data: [],
> + data_hash: {},
> + counts: [],
nit: these are justified to match counts:, but data_hash breaks that, maybe just leave them normal with a single space
@@ +412,5 @@
> + }
> + if (a.label < b.label) {
> + return -1;
> + }
> + if (a.label > b.label) {
can this be an else if?
@@ +508,5 @@
> + start = next_day(start);
> + if (start < end) {
> + getData(start, end, params, day+1);
> + }
> + else {
can this be |} else {|, bring it on the same line? I see mixed styles in this code, I prefer same line, but at the very least pick one for the entire file and project.
@@ +523,5 @@
> + $.makeArray($('#controls select').each(function(i, e) { params[e.name] = e.value; }));
> + var startdatestr = $('#startdate').attr('value');
> + var enddatestr = $('#enddate').attr('value');
> +
> + var hash = '#/' + startdatestr + '/' + enddatestr;
I could see a need for making the url params more typical rather than just a date. startDate=xx-xx-xxx&endDate=xx-xx-xxxx&...
Something to consider later, it would help with cut/paste urls for bugs.
@@ +532,5 @@
> +
> + if (load != "load") {
> + makePlot(params, all_data);
> + }
> + else {
another case of else on the new line.
::: html/scripts/plot.js
@@ +145,5 @@
> + url: '',
> + count: count
> + };
> + if (typeof(valueerr) != 'undefined') {
> + params.valueerr = '±' + Math.floor(valueerr);
what is plusmn?
::: server/handlers.py
@@ +298,5 @@
> + startdate = tz_pac.localize(startdate)
> + enddate = tz_pac.localize(enddate)
> + # convert the datetimes to utc.
> + startdate = startdate.astimezone(tz_utc)
> + enddate = enddate.astimezone(tz_utc)
this is confusing as we assing startdate/enddate 3 times, but reading it a few times I understand what is going on. I don't know if simplifying to a more complex line of code would help...more of a comment than a request.
Attachment #8743386 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 16•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #15)
> Comment on attachment 8743386 [details] [diff] [review]
> bug-1260824-all-v3.patch
>
> Review of attachment 8743386 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> a few nits to consider, nothing critical
>
> ::: html/scripts/allapp.js
> @@ +348,5 @@
> > + vars: series.vars,
> > + label: plot_series_key,
> > + data: [],
> > + data_hash: {},
> > + counts: [],
>
> nit: these are justified to match counts:, but data_hash breaks that, maybe
> just leave them normal with a single space
>
Ok.
> @@ +412,5 @@
> > + }
> > + if (a.label < b.label) {
> > + return -1;
> > + }
> > + if (a.label > b.label) {
>
I normally don't do the else if thingy if the earlier branch was an early return.
> can this be an else if?
>
> @@ +508,5 @@
> > + start = next_day(start);
> > + if (start < end) {
> > + getData(start, end, params, day+1);
> > + }
> > + else {
>
> can this be |} else {|, bring it on the same line? I see mixed styles in
> this code, I prefer same line, but at the very least pick one for the entire
> file and project.
>
Ok. I'm not a fan of } else {, but will change the project to match this style.
> @@ +523,5 @@
> > + $.makeArray($('#controls select').each(function(i, e) { params[e.name] = e.value; }));
> > + var startdatestr = $('#startdate').attr('value');
> > + var enddatestr = $('#enddate').attr('value');
> > +
> > + var hash = '#/' + startdatestr + '/' + enddatestr;
>
> I could see a need for making the url params more typical rather than just a
> date. startDate=xx-xx-xxx&endDate=xx-xx-xxxx&...
>
> Something to consider later, it would help with cut/paste urls for bugs.
>
The whole Router thing I inherited from the original makes this troublesome. I didn't like losing the bookmarkability but didn't spend any time thinking about it. I could probably work around it though and add the query string parameters. Let me think about it.
> @@ +532,5 @@
> > +
> > + if (load != "load") {
> > + makePlot(params, all_data);
> > + }
> > + else {
>
> another case of else on the new line.
>
Ok.
> ::: html/scripts/plot.js
> @@ +145,5 @@
> > + url: '',
> > + count: count
> > + };
> > + if (typeof(valueerr) != 'undefined') {
> > + params.valueerr = '±' + Math.floor(valueerr);
>
> what is plusmn?
>
This is the +- thingy in the tooltip.
> ::: server/handlers.py
> @@ +298,5 @@
> > + startdate = tz_pac.localize(startdate)
> > + enddate = tz_pac.localize(enddate)
> > + # convert the datetimes to utc.
> > + startdate = startdate.astimezone(tz_utc)
> > + enddate = enddate.astimezone(tz_utc)
>
> this is confusing as we assing startdate/enddate 3 times, but reading it a
> few times I understand what is going on. I don't know if simplifying to a
> more complex line of code would help...more of a comment than a request.
Yeah. I barely understand the timezone stuff. This was a clone from the original data api. I think it would be worth revisiting but not in this bug.
Assignee | ||
Comment 17•9 years ago
|
||
* address review comments (querystring deferred)
* fix stddev calc in allapp.js
* fix binning thinko
* fix phone options when only one of a phone type is available.
* make sure values are sorted before getting median in handlers.py
* default valuetype to median in all.html
* add throbbertime metric (throbberstop - throbberstop)
Attachment #8743952 -
Flags: review?(jmaher)
Assignee | ||
Comment 18•9 years ago
|
||
I've been running variations of this patch against treeherder.allizom.org and phonedash-dev.allizom.org and finding/fixing issues.
This probably collects too many measurements. Feedback requested.
Note: phonedash-dev.allizom.org *appears* to be dropping some of the data points without adding them to the database. I'm currently submitting the latest runs to my local instance. I'll upload to phonedash-dev later.
Ignore the runs before today: https://treeherder.allizom.org/perf.html
Attachment #8743954 -
Flags: feedback?(jmaher)
Comment 19•9 years ago
|
||
Comment on attachment 8743952 [details] [diff] [review]
bug-1260824-all-v4.patch
Review of attachment 8743952 [details] [diff] [review]:
-----------------------------------------------------------------
thanks for the updated patch!
Attachment #8743952 -
Flags: review?(jmaher) → review+
Comment 20•9 years ago
|
||
Comment on attachment 8743954 [details] [diff] [review]
bug-1260824-perfherder-s1s2-v1.patch
Review of attachment 8743954 [details] [diff] [review]:
-----------------------------------------------------------------
just a suggestion about the name to use.
::: tests/s1s2test.py
@@ +286,5 @@
> + perfherder_suite = PerfherderSuite(name=testname, value=utils.geometric_mean(test_values))
> + for metric_key in metric_keys:
> + for cache_key in cache_keys:
> + cache_name = cache_names[cache_key]
> + subtest_name = "%s %s median" % (metric_key, cache_name)
I am not sure if we want 'median' as part of the test name
@@ +288,5 @@
> + for cache_key in cache_keys:
> + cache_name = cache_names[cache_key]
> + subtest_name = "%s %s median" % (metric_key, cache_name)
> + perfherder_suite.add_subtest(subtest_name, perfherder_values[metric_key][cache_key]['median'])
> + subtest_name = "%s geometric mean" % metric_key
same here for geometric mean
Attachment #8743954 -
Flags: feedback?(jmaher) → feedback+
Assignee | ||
Comment 21•9 years ago
|
||
instead of local-twitter throbberstart second median opt
local-twitter throbberstart second opt
but what about the ones that are essentially binned and that i'm using geometric mean on? Would calling them summary be too confusing? As in:
instead of local-blank throbberstart geometric mean opt
local-blank throbberstart summary opt
or should i leave these binned values out?
Flags: needinfo?(jmaher)
Comment 22•9 years ago
|
||
as discussed on irc, lets do this for staging, using 'binned' as the keyword.
Flags: needinfo?(jmaher)
Assignee | ||
Comment 23•9 years ago
|
||
https://github.com/markrcote/phonedash/commit/d702aa2639d1158e4e7f00205fbbe8e704003521
https://github.com/markrcote/phonedash/commit/671a528b4c62f47587e29b8dc8c1aebda860459e
https://github.com/markrcote/phonedash/commit/8bc64bcb1e72d9bcc6b41154bf41868704fe8ab0
https://github.com/markrcote/phonedash/commit/2b5fc7ffde1ec534885e9e8bdca9620b80e6503c
deployed 2016-04-21 18:37 PDT
new page available at http://phonedash.mozilla.org/all.html
Assignee | ||
Comment 24•9 years ago
|
||
checking my results locally I found that not all data points are being retrieved by api/s1s2/alldata. Looking into it.
Assignee | ||
Comment 25•9 years ago
|
||
When reviewing data, I realized that alldata wasn't returning every measurement for a run. I made a mistake in thinking runstamp was set by Autophone when in reality it is set by Phonedash when it saves the data. This corrects the order of how the rows are presented and included rejected in the runs key.
Attachment #8744436 -
Flags: review?(jmaher)
Comment 26•9 years ago
|
||
Comment on attachment 8744436 [details] [diff] [review]
bug-1260824-fix-alldata-collection.patch
Review of attachment 8744436 [details] [diff] [review]:
-----------------------------------------------------------------
the big old rubber stamp comes out!
Attachment #8744436 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 27•9 years ago
|
||
deployed to phonedash-dev:
Example graph with an apparent regression:
http://phonedash-dev.allizom.org/all.html#/2016-03-20/2016-03-26/binning=repo-phonetype-phoneid-test_name&rejected=norejected&errorbars=noerrorbars&errorbartype=standarderror&valuetype=median&local-blank=on&throbberstart=on&first=on&fx-team=on&mozilla-inbound=on&nexus-4=on&nexus-4-1=on&nexus-4-2=on&nexus-4-3=on&nexus-4-4=on&nexus-4-5=on&nexus-4-6=on&nexus-5=on&nexus-5-1=on&nexus-5-2=on&nexus-5-3=on&nexus-5-4=on&nexus-5-5=on&nexus-6=on&nexus-6-1=on&nexus-6-2=on&nexus-6p-6=on&nexus-9=on&nexus-9-1=on&nexus-9-2=on&nexus-s=on&nexus-s-3=on&nexus-s-8=on&nexus-s-9=on&samsung-gs3=on&samsung-gs3-3=on
Now with progressbar notification and a different layout. Note you can select x-ranges, drag tooltips, bookmark and reload urls.
If you load the base page on phonedash-dev, there will be no data displayed since there is no data for today. If you expand the date range to include 4/9/2016 the page will reload but the checkboxes for the new data will not be checked automatically.
I'll finish this off later today.
Attachment #8745241 -
Flags: feedback?(jmaher)
Assignee | ||
Comment 28•9 years ago
|
||
PS. I submitted mozilla-inbound for 2016-03-23T01:30:00 to 2016-03-23T09:40:00 with my gs3 and 6p reporting to treeherder staging. The data was submitted to my local phonedash so you can't see it on phonedash-dev yet, but the perfherder data was submitted to treeherder staging. Hopefully there was enough data for alerting. Not sure if the data is clean enough to generate an alert probably at least partially due to the fact that I was using my workstation at the same time and I think it has affected at least the remote measurements.
Comment 29•9 years ago
|
||
I see some s1s2 data on treeherder.allizom.org:
https://treeherder.allizom.org/perf.html#/graphs?series=%5Bmozilla-inbound,fe6d0c64ee0f4cfbb7b4b073423163d223f8fb05,1%5D
that is cool to see, I just added the summary data there- but it appears there is a lot of choices.
Looking at android 4, I see more summary data and on things like local-blank and remote-blank, I can see many revisions:
https://treeherder.allizom.org/perf.html#/graphs?series=%5Bmozilla-inbound,d109471419b3f69da4dae10a05699f07343c48b1,1%5D&series=%5Bmozilla-inbound,a08c494bbfb265619c075fb53dd1854800ceac37,1%5D&series=%5Bmozilla-inbound,31af9529d5771ce861e18e88b2ec721621fd5a8c,1%5D&series=%5Bmozilla-inbound,cd8f7d8edc66ccea3f351060c129f18d329d091c,1%5D
it appears that the android 4 data came in two parts- maybe we adjusted the number of tests or test names and that forced us to generate a new "signature" causing two different graphs. Having the data split between a couple of signatures, does result in no alerts- although I am not sure we would end up with alerts- maybe a different signature or two.
Comment 30•9 years ago
|
||
Comment on attachment 8745241 [details] [diff] [review]
WIP bug-1260824-selections-tooltips-drag-querystring.patch
Review of attachment 8745241 [details] [diff] [review]:
-----------------------------------------------------------------
overall this looks good- there are a lot of changes as you warned me of! looking at this on phonedash-dev looks great and seems fairly easy to understand. +1 for making shareable links work:)
::: html/scripts/allapp.js
@@ +857,2 @@
> }).init('/');
> + // });
can we just delete this line that is commented out?
Attachment #8745241 -
Flags: feedback?(jmaher) → feedback+
Assignee | ||
Comment 31•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #29)
> I see some s1s2 data on treeherder.allizom.org:
Yeah, we changed the names by removing median, adding binning to some of the binned results which resulted in multiple names for many things. It seems impossible to do any date ranged except those permitted in the select box so it is difficult to pick the exact date range as far as I know.
I think the new names are in affect for after Thursday Noon: https://treeherder.allizom.org/#/jobs?repo=mozilla-inbound&revision=d2670d2a634a3ee685da69395d43a18ebc965575
I think this shows a partial view of the summary tests:
<https://treeherder.allizom.org/perf.html#/graphs?timerange=604800&series=%5Bmozilla-inbound,a08c494bbfb265619c075fb53dd1854800ceac37,1%5D&series=%5Bmozilla-inbound,d109471419b3f69da4dae10a05699f07343c48b1,1%5D&series=%5Bmozilla-inbound,72346dd0ddf4c4d6a3094b3f41ef97b1e17acfdb,1%5D&series=%5Bmozilla-inbound,4ed63842d23b513721c5ab65e1fda64289a800f1,1%5D&series=%5Bmozilla-inbound,336c083daa3cff0d82515bc1645f9576d03e1fb2,1%5D&series=%5Bmozilla-inbound,d1d19256902dfd166cfbac390b5623c9ff4bae5b,1%5D&series=%5Bmozilla-inbound,606758e66355f38cb3ab87ea18c9adef779546e6,1%5D&series=%5Bmozilla-inbound,4f6caa88ea415fa44ee65ff83c4c1b1884419ece,1%5D&series=%5Bmozilla-inbound,31af9529d5771ce861e18e88b2ec721621fd5a8c,1%5D&series=%5Bmozilla-inbound,0b4d56abd4c3579748e3846f4eb27179ac44e73b,1%5D&series=%5Bmozilla-inbound,cd8f7d8edc66ccea3f351060c129f18d329d091c,1%5D&series=%5Bmozilla-inbound,213311b66166247a829098ed46af16cb2ac3f5be,1%5D&series=%5Bmozilla-inbound,7aa02785947d9a106ba8a3cbca85e8e49ce7d201,1%5D&series=%5Bmozilla-inbound,ea6f4e8ad202e747291e5336c92e5de63f63bdc7,1%5D>
Assignee | ||
Comment 32•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #30)
> Comment on attachment 8745241 [details] [diff] [review]
> WIP bug-1260824-selections-tooltips-drag-querystring.patch
>
> can we just delete this line that is commented out?
done.
ekyle ran into a non-standard use of exponent &&. changed to use Math.pow not **
I added a patch to translate old phonedash urls to new versions, and remove the remove unnecessary comment and another. Deployed to phonedash-dev.
try this:
http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstart/local-blank/norejected/2016-03-21/2016-03-24/notcached/noerrorbars/standarderror/notry/median
http://phonedash-dev.allizom.org/all.html#/org.mozilla.fennec/throbberstart/local-blank/norejected/2016-03-21/2016-03-24/notcached/noerrorbars/standarderror/notry/median
Then select the phones you want and click Apply.
If this passes muster, I would like to do one more patch before landing which would make this the default page and get rid of the old style page altogether. Is that ok with you?
Attachment #8745880 -
Flags: review?(jmaher)
Comment 33•9 years ago
|
||
this isn't working for me:
http://phonedash-dev.allizom.org/all.html#/2016-03-21/2016-03-24/binning=repo-phonetype-phoneid-test_name-cached_label-metric&rejected=norejected&errorbars=noerrorbars&errorbartype=standarderror&valuetype=median&local-blank=on&first=on&mozilla-inbound=on&nexus-4=on&nexus-4-1=on&nexus-4-2=on&nexus-4-5=on&nexus-4-6=on&nexus-5=on&nexus-5-1=on&nexus-5-2=on&nexus-5-3=on&nexus-5-4=on&nexus-6=on&nexus-6-1=on&nexus-6-2=on&nexus-9=on&nexus-9-1=on&nexus-9-2=on&nexus-s=on&nexus-s-3=on&nexus-s-8=on&nexus-s-9=on
I clicked the second link and tried adding phones, then clicking apply. I would like a sanity check if there is no data displayed and no phones selected to then have a message similar to "there are no phones selected, that might be why there is no data"
Comment 34•9 years ago
|
||
Comment on attachment 8745880 [details] [diff] [review]
bug-1260824-translate-urls.patch
Review of attachment 8745880 [details] [diff] [review]:
-----------------------------------------------------------------
::: html/scripts/allapp.js
@@ +801,5 @@
> + * where querystring is the url encoded values of the form elements.
> + */
> +
> + var parts = document.location.hash.split('/');
> + if (parts && parts.length == 12) {
are we guaranteed to have all old style urls having parts.length=12
Attachment #8745880 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 35•9 years ago
|
||
No. Some of the older urls will have different sets of parameters. Rather than try to cover all of the bases in the url translation, I will review the outstanding bugs and post new urls to the relevant graphs.
Assignee | ||
Comment 36•9 years ago
|
||
Oh, what the hell. There aren't that many url versions in phonedash. I'll work up the complete set this afternoon after I get back and we'll have all of the versions covered.
As part of the replacement of index.html with all.html I would also like to add a bit of help text which will be displayed instead of the singularly unhelpful "No data" message we currently display.
Assignee | ||
Comment 37•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #33)
>
> I clicked the second link and tried adding phones, then clicking apply. I
> would like a sanity check if there is no data displayed and no phones
> selected to then have a message similar to "there are no phones selected,
> that might be why there is no data"
I must have left out the metric in the link:
http://phonedash-dev.allizom.org/all.html#/2016-03-21/2016-03-24/binning=repo-phonetype-phoneid-test_name-cached_label-metric&rejected=norejected&errorbars=noerrorbars&errorbartype=standarderror&valuetype=median&local-blank=on&throbberstart=on&first=on&mozilla-inbound=on&nexus-4=on&nexus-4-1=on&nexus-4-2=on&nexus-4-5=on&nexus-4-6=on&nexus-5=on&nexus-5-1=on&nexus-5-2=on&nexus-5-3=on&nexus-5-4=on&nexus-6=on&nexus-6-1=on&nexus-6-2=on&nexus-9=on&nexus-9-1=on&nexus-9-2=on&nexus-s=on&nexus-s-3=on&nexus-s-8=on&nexus-s-9=on
Note I am loading a copy of my local database so we can see the nexus-6p and samsung-gs3 testing for this range so phonedash-dev will be unresponsive until it finishing importing.
I think updating the No data message and including a Help link will cover this. It has been worrying me as well.
Comment 38•9 years ago
|
||
sounds like a plan!
Assignee | ||
Comment 39•9 years ago
|
||
I reviewed the existing bugs and found 4 versions of the urls with various parts.
http://phonedash-dev.allizom.org/all.html#/org.mozilla.fennec/throbberstart/local-blank/2013-07-15/2013-07-22/notcached/noerrorbars/standarderror
http://phonedash-dev.allizom.org/all.html#/org.mozilla.fennec/throbberstart/local-blank/norejected/2014-08-06/2014-08-20/notcached/noerrorbars/standarderror
http://phonedash-dev.allizom.org/all.html#/org.mozilla.fennec/throbberstop/remote-blank/norejected/2015-10-06/2015-10-07/notcached/noerrorbars/standarderror/notry
http://phonedash-dev.allizom.org/all.html#/org.mgozilla.fennec/throbberstop/local-blank/norejected/2015-04-12/2015-04-13/notcached/errorbars/standarderror/notry/mean
While updating, I found a couple of issues related to checking the metric and the repo(s), expanded the repos for each product and added a way to default all of the phones from a translated url to be automatically checked.
Deployed to phonedash-dev for your viewing pleasure.
Attachment #8746360 -
Flags: review?(jmaher)
Assignee | ||
Comment 40•9 years ago
|
||
I realized I can use the same trick to help people if their first landing page doesn't have data. If the first page load has no data and they then change the dates, force the new options to be checked when the new data is loaded. /me drops phone. ;-)
Comment 41•9 years ago
|
||
Comment on attachment 8746360 [details] [diff] [review]
bug-1260824-translate-urls-v2.patch
Review of attachment 8746360 [details] [diff] [review]:
-----------------------------------------------------------------
nice solution!
Attachment #8746360 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 42•9 years ago
|
||
One final patch ( i hope ) to handle no data situations:
* reinitializes META when initializing.
* optimizes some property accesses getDataPoints.
* added some console.time and console.timeEnd calls for use in performance
profiling.
* removed the filtering of build times with the current selection in
getDataPoints.
* clear the dynamically generated checkboxes and legend when no data is
displayed.
* when the document.location.hash is changed, the Router will fire setControls.
Since we modify the document.location.hash in getDataPoints after we have
already scheduled a plot, added the ability to suppress the redundant plot
request due to the hash modification.
Attachment #8746955 -
Flags: review?(jmaher)
Assignee | ||
Comment 43•9 years ago
|
||
this removes the old version of phonedash's app and replaces it with the new 'all' version. I haven't done the help page yet, but want to get this landed on phonedash production so I can start using it to triage results. I'll do a follow up patch at a more convenient time to add the help.
Both patches have been deployed to phonedash-dev.
Attachment #8746956 -
Flags: review?(jmaher)
Assignee | ||
Comment 44•9 years ago
|
||
Comment on attachment 8746955 [details] [diff] [review]
bug-1260824-force-checkboxes-v1.patch
Obsoleting this since I broke bookmarking.
Attachment #8746955 -
Attachment is obsolete: true
Attachment #8746955 -
Flags: review?(jmaher)
Assignee | ||
Updated•9 years ago
|
Attachment #8746956 -
Attachment is obsolete: true
Attachment #8746956 -
Flags: review?(jmaher)
Comment 45•9 years ago
|
||
Comment on attachment 8746955 [details] [diff] [review]
bug-1260824-force-checkboxes-v1.patch
Review of attachment 8746955 [details] [diff] [review]:
-----------------------------------------------------------------
what is here is nice
::: html/scripts/allapp.js
@@ +610,5 @@
>
> setPlotHeight();
>
> + if (console) {
> + console.time('Plotting...');
can we make this "Phonedash: Plotting...", likewise other console messages
Attachment #8746955 -
Flags: review+
Comment 46•9 years ago
|
||
Comment on attachment 8746956 [details] [diff] [review]
bug-1260824-replace-original.patch
Review of attachment 8746956 [details] [diff] [review]:
-----------------------------------------------------------------
just some nits, but this is a good patch, rs=me
::: html/scripts/app.js
@@ +255,5 @@
> + }
> + INITIALIZING = false;
> +
> + if (console) {
> + console.time('getDataPoints: all_series...');
as the other patch, lets add a phonedash flag here and all other console messages.
@@ +488,3 @@
> }
> if (a.label > b.label) {
> + return +1;
why +1 and -1? is +1 == 1 ?
Attachment #8746956 -
Flags: review+
Assignee | ||
Comment 47•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #45)
> Comment on attachment 8746955 [details] [diff] [review]
> bug-1260824-force-checkboxes-v1.patch
>
> Review of attachment 8746955 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> what is here is nice
>
> ::: html/scripts/allapp.js
> @@ +610,5 @@
> >
> > setPlotHeight();
> >
> > + if (console) {
> > + console.time('Plotting...');
>
> can we make this "Phonedash: Plotting...", likewise other console messages
Try this out for yourself in Developer Tools -> Performance in the Waterfall. The text is used as the label for timed used in a section of code. It is usually pretty short and truncated. If we go with a Phonedash: prefix, I think the Waterfall will show mostly the Phonedash: part of the label. I'll look again to make sure it isn't already shortened too much in the Waterfall, but do please check it out since providing the labels and times for the Performance Waterfall is the sole purpose for the output.
I considered removing the console.time, console.timeEnd since I only wanted them when investigating the performance. I decided to leave them in for others to use if they were interested in helping improve the graph performance.
(In reply to Joel Maher (:jmaher) from comment #46)
> Comment on attachment 8746956 [details] [diff] [review]
> bug-1260824-replace-original.patch
>
> Review of attachment 8746956 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> just some nits, but this is a good patch, rs=me
>
> ::: html/scripts/app.js
> @@ +255,5 @@
> > + }
> > + INITIALIZING = false;
> > +
> > + if (console) {
> > + console.time('getDataPoints: all_series...');
>
> as the other patch, lets add a phonedash flag here and all other console
> messages.
>
See above.
> @@ +488,3 @@
> > }
> > if (a.label > b.label) {
> > + return +1;
>
> why +1 and -1? is +1 == 1 ?
It's redundant but it's always been that way.
https://github.com/markrcote/phonedash/blob/master/html/scripts/app.js#L120
Comment 48•9 years ago
|
||
ok, I get the concern with too much text being hard to figure out- my main goal for the prefix was to understand where the message was coming from.
Assignee | ||
Comment 49•9 years ago
|
||
carrying forward r+ with minor tweaks.
Attachment #8747593 -
Flags: review+
Assignee | ||
Comment 51•9 years ago
|
||
I've updated Mark's repo and gotten the mozilla repo up to date. I'll start treating the mozilla version as the official repo from now on though phonedash.mozilla.org and phonedash-dev.allizom.org still pull from mark's repo by default.
merge
https://github.com/mozilla/phonedash/commit/34943e3ef8277f80b1f3680b73c39db64c28949f
https://github.com/mozilla/phonedash/commit/52e2473fb7491a45f2498200d75b8013b9f1393e
https://github.com/mozilla/phonedash/commit/674da92d42dc3e58ba808ff3521c4e57d50bcc1d
https://github.com/mozilla/phonedash/commit/ec54f3e55582c4347976dd57dcf8cbc92b651edf
https://github.com/mozilla/phonedash/commit/120be1b22b95d1122328dfadf6642cc47ccfecba
https://github.com/mozilla/phonedash/commit/33d657e3e02e2b8e1aed7b0e8a64f330fa087e50
deployed to phonedash.mozilla.org and phonedash-dev.allizom.org and autophone-{1,2,3}.
Assignee | ||
Updated•9 years ago
|
Blocks: autophone-tier2
Assignee | ||
Comment 53•9 years ago
|
||
From what I can tell, we have the following properties that can be set for the summary and subtests being submitted to perfherder.
boolean shouldAlert # if false don't generate alerts for the test
default True
int alertThreshold # percentage change
default 2
int minBackWindow # minimum backward looking window in # of builds
default 12
int maxBackWindow # maximum backward looking window in # of builds
default 24
int foreWindow # forward looking window in # of builds.
default 12
We should load these possible values from the test's config file. e.g. https://github.com/mozilla/autophone/blob/master/configs/s1s2-blank-local.ini . The [treeherder] section is probably the most appropriate.
It may be the case that a single global value is inappropriate for all device classes. For example, we might want to set different values depending on the device class. It isn't clear to me the best approach to support per device class settings however.
wlach, do have any concerns or suggestions about customizing these properties? Will the 'signatures' change if these properties are changed in the future?
Flags: needinfo?(wlachance)
Assignee | ||
Comment 54•9 years ago
|
||
actually, i think a separate [perfherder] section would be more appropriate.
Comment 55•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #53)
> From what I can tell, we have the following properties that can be set for
> the summary and subtests being submitted to perfherder.
> ...
> wlach, do have any concerns or suggestions about customizing these
> properties? Will the 'signatures' change if these properties are changed in
> the future?
The default settings work well in most cases, I wouldn't change them unless it's shown to be needed. To answer your question though, no, the signatures won't change if these properties are altered.
Flags: needinfo?(wlachance)
Assignee | ||
Comment 56•9 years ago
|
||
Right. No need to fiddle the dials unless necessary. I'm just trying to make the adaptations possible in the future without having to write more code.
And we'll probably need to have separate values for each metric: throbberstart, throbberstop and throbbertime.
Assignee | ||
Comment 57•9 years ago
|
||
I am in the process of completing a test run on mozilla-beta with my local devices reporting to treeherder staging for the range 2016-02-22T00:00:00 to 2016-03-16T00:00:00. There is what appears to me to be a significant improvement for throbberstart for the 2016-03-07 08:02:31 merge.
This isn't quite completed yet. The devices on are 2016-03-14 12:04:02 builds atm.
You can see the phonedash-dev graph at:
http://phonedash-dev.allizom.org/#/2016-02-20/2016-03-16/binning=repo-phonetype-phoneid-test_name-cached_label-metric&rejected=norejected&errorbars=noerrorbars&errorbartype=standarderror&valuetype=median&local-nytimes=on&throbberstart=on&first=on&mozilla-beta=on&nexus-6p-6=on&samsung-gs3-3=on
and the perfherder graphs for local-nytimes throbberstart first visit at
https://treeherder.allizom.org/perf.html#/graphs?timerange=7776000&series=%5Bmozilla-beta,47c9411524c6b1d75b3c9079f34c34e8112ea89d,1%5D&series=%5Bmozilla-beta,b2a8741e0c6dc8d909dccf861f91b3aec3ca8440,1%5D&series=%5Bmozilla-beta,58b18f2720ec016758f6b3af9b85e3a40ca99bc3,1%5D&series=%5Bmozilla-beta,fc2622b7d1af66a1564ee76ed42f9e276e95669a,1%5D
I would have thought that perfherder would generate an improvement alert for the merge on 2016-03-07 but so far I don't see anything at
https://treeherder.allizom.org/perf.html#/alerts?status=-1&framework=3
Do I just not have enough data? Is it too noisy? Am I doing something else wrong?
Flags: needinfo?(wlachance)
Comment 58•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #57)
> Do I just not have enough data? Is it too noisy? Am I doing something else
> wrong?
To keep computation costs reasonable, we don't generate alerts for data more than 2 weeks old:
https://github.com/mozilla/treeherder/blob/706014481e7b9e0e55335a04f0b9ff75924b082b/treeherder/perf/alerts.py#L18
Flags: needinfo?(wlachance)
Assignee | ||
Comment 59•9 years ago
|
||
crap. Ok, so I guess we'll have to test live. thanks!
Assignee | ||
Comment 60•9 years ago
|
||
We will have to adjust our windows for beta and and release. For '2016-04-18' and '2016-05-02' I have 20 mozilla-beta builds and 15 mozilla-release builds which isn't enough with the default back and fore windows.
Comment 61•9 years ago
|
||
one trick is to retrigger (do >1 job) for each build- that will get the 12+ data points much faster.
Comment 62•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #59)
> ****. Ok, so I guess we'll have to test live. thanks!
If you'd like I could run the alert generation stuff manually on your series to see what the results would have been...
Assignee | ||
Comment 63•9 years ago
|
||
(In reply to William Lachance (:wlach) from comment #62)
> (In reply to Bob Clary [:bc:] from comment #59)
> > crap. Ok, so I guess we'll have to test live. thanks!
>
> If you'd like I could run the alert generation stuff manually on your series
> to see what the results would have been...
That would be totally AWESOME! Thanks!
Assignee | ||
Comment 64•9 years ago
|
||
Attachment #8748104 -
Flags: review?(jmaher)
Comment 65•9 years ago
|
||
Comment on attachment 8748104 [details] [diff] [review]
bug-1260824-perfherder-s1s2-v2.patch
Review of attachment 8748104 [details] [diff] [review]:
-----------------------------------------------------------------
this looks really straightforward, nice patch!
Attachment #8748104 -
Flags: review?(jmaher) → review+
Comment 66•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #63)
> (In reply to William Lachance (:wlach) from comment #62)
> > (In reply to Bob Clary [:bc:] from comment #59)
> > > ****. Ok, so I guess we'll have to test live. thanks!
> >
> > If you'd like I could run the alert generation stuff manually on your series
> > to see what the results would have been...
>
> That would be totally AWESOME! Thanks!
Ok results here:
http://nbviewer.jupyter.org/url/people.mozilla.org/%7Ewlachance/autophone%20perfalerts.ipynb
The output of the alert generation algorithm is a "t value" indicating our confidence that there was a regression (that's the first value in each tuple in the output). To reduce the number of false positives we have to deal with, we set a pretty high threshold for this (7) by default.
It looks like using the default settings only the series `local-nytimes throbberstart second opt` (fc2622b7d1af66a1564ee76ed42f9e276e95669a) showed a regression. In other cases we weren't quite confident enough to be sure.
You can play with the software by installing the following in a virtualenv:
* treeherder-client
* ipython[notebook]
* pip install -e treeherder/treeherder/perfalert # local install of performance alert generator
Assignee | ||
Comment 67•9 years ago
|
||
simple script to generate data for autophone.
Attachment #8748729 -
Flags: feedback?(wlachance)
Assignee | ||
Comment 68•9 years ago
|
||
alerts.py mozilla-inbound 60 days
Assignee | ||
Comment 69•9 years ago
|
||
alert.py mozilla-beta 60 days
Assignee | ||
Comment 70•9 years ago
|
||
Two things come to mind:
1. It is not clear to me that the binned measurements provide any value.
2. Currently we are only reporting the first major digit of the Android version. So all 4.x.x is reported as 4.0, etc. It should report at least the first decimal, so 4.0.4->4.0, 4.2.2 -> 4.2, 4.4.2 > 4.4, etc.
Thoughts?
Assignee | ||
Updated•9 years ago
|
Attachment #8748729 -
Attachment mime type: text/x-python → text/plain
Assignee | ||
Comment 71•9 years ago
|
||
fixed a hard coded 7 in the test on whether to print.
Attachment #8748729 -
Attachment is obsolete: true
Attachment #8748729 -
Flags: feedback?(wlachance)
Attachment #8748761 -
Flags: feedback?(wlachance)
Comment 72•9 years ago
|
||
Comment on attachment 8748761 [details]
alerts.py
This looks good! At some point we should add a wrapper to perfherder client to access this API:
https://treeherder.mozilla.org/api/performance/framework/
It would be added to this file:
https://github.com/mozilla/treeherder/blob/master/treeherder/client/thclient/perfherder.py
If you'd be willing to file a bug and attach a PR, I'd be eternally grateful. :) Otherwise I'm sure I'll get to it eventually.
Attachment #8748761 -
Flags: feedback?(wlachance) → feedback+
Assignee | ||
Comment 73•9 years ago
|
||
sure. will do.
Assignee | ||
Comment 74•9 years ago
|
||
final patch with binned measurements removed. carrying forward r+.
Attachment #8749761 -
Flags: review+
Assignee | ||
Comment 75•9 years ago
|
||
Assignee | ||
Comment 76•9 years ago
|
||
Assignee | ||
Comment 77•9 years ago
|
||
I ran a test for mozilla-release from 2016-01-01 to 2016-05-03.
https://treeherder.allizom.org/perf.html#/graphs?timerange=31536000&series=%5Bmozilla-release,1a000f992e97cade46c54e8d492716f1319d9047,1%5D&series=%5Bmozilla-release,4ed63842d23b513721c5ab65e1fda64289a800f1,1%5D&series=%5Bmozilla-release,322cb76e7673ed5f0ba86b136792c0fdf4aa4151,1%5D&series=%5Bmozilla-release,7fdd81419a8aa8b33e5f063037fe60d2f083c35f,1%5D&series=%5Bmozilla-release,ff2a6a538cd5d68ece026c561d7e93c621df5072,1%5D
Assignee | ||
Comment 78•9 years ago
|
||
python alerts.py --host=treeherder.allizom.org --repo=mozilla-release --interval=one_year > /tmp/treeherder.allizom-mozilla-release.txt
Assignee | ||
Comment 79•9 years ago
|
||
python phonedash-alerts.py --phonedash-host=phonedash-dev.allizom.org --start=2016-01-01 --end=2016-05-06 --product=org.mozilla.firefox --repo=mozilla-release > /tmp/phonedash-dev-mozilla-release.txt
Assignee | ||
Comment 80•9 years ago
|
||
You can see the graph on phonedash-dev at:
http://phonedash-dev.allizom.org/#/2016-01-01/2016-05-06/binning=repo-phonetype-phoneid-test_name-cached_label-metric&rejected=norejected&errorbars=errorbars&errorbartype=standarderror&valuetype=median&local-nytimes=on&throbberstart=on&throbberstop=on&first=on&mozilla-release=on&nexus-6p=on&nexus-6p-6=on&nexus-one=on&nexus-one-3=on&samsung-gs3=on&samsung-gs3-3=on
It will take quite a while to load that full range from a t1.micro so be patient.
Assignee | ||
Comment 81•9 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 82•9 years ago
|
||
deployed 2016-05-06 12:55
Updated•3 years ago
|
Product: Testing → Testing Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•