Closed
Bug 1168643
Opened 10 years ago
Closed 9 years ago
Automate the test cases for Telemetry data QA
Categories
(Toolkit :: Telemetry, defect, P4)
Toolkit
Telemetry
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: areinald.bug, Assigned: areinald.bug)
References
(Blocks 1 open bug)
Details
(Whiteboard: [measurement:client])
Attachments
(2 files, 9 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
application/x-sh
|
Details |
Write tests to execute plan described in bug 1156718.
Assignee | ||
Updated•10 years ago
|
Attachment #8610896 -
Flags: feedback?(ato)
Assignee | ||
Updated•10 years ago
|
Version: unspecified → Trunk
Assignee | ||
Comment 2•10 years ago
|
||
(In reply to André Reinald from comment #0)
> Write tests to execute plan described in bug 1156718.
This bug is about translating the test plan from plain English (bug 1156718) to Python-Marionette, so it can be machine executed repeatedly instead of requiring manual QA.
Comment 3•10 years ago
|
||
Comment on attachment 8610896 [details]
test-telemetry.py
I’m not really sure what you want me to give feedback on here?
The program implements functions for starting and stopping Firefox, which the Marionette test runner already supplies. There are also various other libraries such as mozprofile, to provision profiles, and mozrunner, to manage the binary lifetime.
Attachment #8610896 -
Flags: feedback?(ato)
Assignee | ||
Comment 4•10 years ago
|
||
WIP
Comment 5•10 years ago
|
||
Comment on attachment 8613978 [details] [diff] [review]
bugzilla-1168643-1.patch
Review of attachment 8613978 [details] [diff] [review]:
-----------------------------------------------------------------
That looks like a good start.
I don't know about marionette, but this seems to implement some things manually that should already be provided (as pointed out above).
#automation might offer some help or point to examples.
Installing addons seems like a more common task too, maybe that functionality already exists for marionette?
::: .gitignore
@@ +52,5 @@
> .settings/
>
> # Python virtualenv artifacts.
> +bin/
> +_virtualenv/
Why is that change needed now?
That seems like it should be at least in a different patch.
Comment 6•10 years ago
|
||
Did the test plan look ok to you?
Was there anything standing out or missing?
Comment 7•10 years ago
|
||
If this test is meant to run in CI you will need to integrate it with structural logging (provided by the mozlog.structured module).
It also seems to rely rather heavily on system specific characteristics (such as the path to profiles, binaries, &c.) which isn’t ideal.
It spins up the pingserver on a static port, which might conflict with existing used ports on the system. It’s better to bind to port 0 and query the assigned port later when needed.
There is use of local IO through a temporary folder that is removed after the program is done that doesn’t look safe at all.
As mentioned earlier you should also consider reusing existing libraries in mozbase, such as for profile management, preferences writing, extension injection and so on.
Generally the program should conform to the PEP8 coding style and not rely on hard coded waits because we need programs to be reproducible regardless of the system it runs on.
I hope this is useful.
Assignee | ||
Comment 8•10 years ago
|
||
(In reply to Andreas Tolfsen (:ato) from comment #7)
> I hope this is useful.
I first wanted to have a few tests coded and running, as a proof of concept that marionette could automate the test plan. I'm well aware that there is lots of room for improvement, and this will be part 2. I'll try to reach you on IRC, so I have more interactive help. Thanks Andreas.
Assignee | ||
Comment 9•10 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #6)
> Did the test plan look ok to you?
> Was there anything standing out or missing?
I don't yet feel qualified enough to point out if there is anything missing in the test plan itself. You've all been working in this area much longer that myself, so I basically have to trust you. I guess that if there are inconsistencies remaining, they will show up while I'm "translating" English to Python.
Assignee | ||
Comment 10•10 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #5)
> > # Python virtualenv artifacts.
> > +bin/
> > +_virtualenv/
>
> Why is that change needed now?
> That seems like it should be at least in a different patch.
bin is not needed.
_virtualenv is for my own setup (if absent, we get a much larger patch).
I will remove that from the final patch.
Assignee | ||
Comment 11•9 years ago
|
||
WIP
solved issues with xpi installation
starting every test from a new profile
tests are enumerated on command line
added lz4 decompression and json parsing
Attachment #8613978 -
Attachment is obsolete: true
Assignee | ||
Comment 12•9 years ago
|
||
WIP
Added sleep after launch to allow ping sending.
Added relaunch to allow previously unsent pings.
Added manual launch (FF + ping server, with same options).
Testing ping coherence after automated and manual tests:
assert (receivedPings + savedPings == archivedPings)
It seems that in some cases (TBD) pings get lost if server is not running.
Attachment #8617300 -
Attachment is obsolete: true
Comment 13•9 years ago
|
||
(In reply to André Reinald from comment #12)
> It seems that in some cases (TBD) pings get lost if server is not running.
That's interesting, there may be a bug lurking around in our code there then. It would be great if the conditions triggering those cases could be pinned down!
Assignee | ||
Comment 14•9 years ago
|
||
WIP:
- integrated ping server
- same file
- different process -> different thread
- clean up and factorization
- split ARCH5 test into ARCH5 (no expiring pings) and ARCH6 (expiring pings).
- implemented above + ARCH4.
Attachment #8623018 -
Attachment is obsolete: true
Assignee | ||
Comment 15•9 years ago
|
||
WIP:
- code factorization
- portable path constructions
- more meaningful messages
- more thorough tests
- archiving of ping folders after each test for later examination
ARCH1 is failing:
- pings are sent and saved even when telemetry is disabled
Attachment #8630176 -
Attachment is obsolete: true
Assignee | ||
Comment 16•9 years ago
|
||
WIP:
- GEN1 test failing: ping is sent on addon installation but it contains the addon (it should not)
Attachment #8635502 -
Attachment is obsolete: true
Comment 17•9 years ago
|
||
Thanks André! As we discussed over Vidyo:
- ARCH test should not really test if pings get sent to the server or not. Moreover, due to the introduction of the "UNIFIED" telemetry pref the "toolkit.telemetry.enabled" pref behaves a bit differently from before. Please see the in-tree docs [1] for an up-to-date overview of the expected behaviour.
- The QA document I wrote a while back contains the wrong preference name: "toolkit.telemetry.telemetry.enabled", but it really is just "toolkit.telemetry.enabled"
- I'll check and manually confirm the GEN1 test.
[1] - https://ci.mozilla.org/job/mozilla-central-docs/Tree_Documentation/toolkit/components/telemetry/telemetry/preferences.html
Comment 18•9 years ago
|
||
(In reply to André Reinald from comment #16)
> Created attachment 8636100 [details] [diff] [review]
> bugzilla-1168643-6.patch
>
> WIP:
> - GEN1 test failing: ping is sent on addon installation but it contains the
> addon (it should not)
I can't seem to be able to reproduce this issue: would you mind double checking that manually?
Assignee | ||
Comment 19•9 years ago
|
||
Marionette stopped working. Python issue I guess. Will investigate this week (though I'm on PTO officially, I want to understand). I wanted to re-run the automated tests with unchanged delays see if they are the cause.
Assignee | ||
Comment 20•9 years ago
|
||
WIP:
- ARCH6 became obsolete because changes in ping sending policy (no more expiring)
- GEN1 still failing
- GEN3 added
Fixed bugs and improved code.
New reference doc at:
https://gecko.readthedocs.org/en/latest/toolkit/components/telemetry/telemetry/index.html
Attachment #8636100 -
Attachment is obsolete: true
Comment 21•9 years ago
|
||
Comment on attachment 8643009 [details] [diff] [review]
bugzilla-1168643-7.patch
Review of attachment 8643009 [details] [diff] [review]:
-----------------------------------------------------------------
This is coming along, nice.
One major request that i have is to put focus on landing a minimal version of this in the tree first and make sure it can be run straight from it (i.e. no hard-coded locations or anything).
That way we would be easily able to reproduce issues and could start using it as a sanity check before checkin etc.
We could also start to add test cases too.
Also, what do you think about breaking the work down into multiple bugs with smaller scope, to make it a bit more manageable?
::: toolkit/components/telemetry/tests/marionette/test-telemetry.py
@@ +16,5 @@
> +import stat
> +import string
> +
> +
> +FIREFOX_BINARY_HOME_RELATIVE = os.path.join('Programmes', 'mozilla-central', 'obj-firefox.noindex', 'dist', 'Nightly.app', 'Contents', 'MacOS', 'firefox')
Can we launch this from the obj dir, so we don't need to put Firefox in a special location?
@@ +26,5 @@
> +USER_PROFILE_PATH = os.path.join(HOME_PATH, 'TelemetryTestProfile')
> +SAVED_PINGS_PATH = os.path.join(USER_PROFILE_PATH, 'saved-telemetry-pings')
> +DATAREPORTING_PATH = os.path.join(USER_PROFILE_PATH, 'datareporting')
> +ARCHIVED_PINGS_PATH = os.path.join(DATAREPORTING_PATH, 'archived', date.today().strftime('%Y-%m'))
> +XPI_PATH = os.path.join(HOME_PATH, 'Desktop', 'restartless.xpi')
We generate a restartless.xpi for our unit tests, you could just use that one and avoid depending on paths on your machine.
@@ +51,5 @@
> + ('setIntPref', 'toolkit.telemetry.min-subsession-length', '10'),
> + ('setIntPref', 'toolkit.telemetry.tick.interval', '15'),
> + ('setIntPref', 'toolkit.telemetry.tick.idle-interval', '120'),
> + ('setIntPref', 'toolkit.telemetry.scheduler.midnight-tolerance', '900'),
> + ('setIntPref', 'toolkit.telemetry.scheduler.coalesce-threshold', '120')
Can you put the patch that implements those prefs up too?
@@ +123,5 @@
> + else:
> + plainData = postData
> +
> + jsonData = json.loads(plainData)
> + fileName = jsonData["id"] + '.json'
It would be nice to prefix the filenames with a running counter or the current timestamp.
That way, if we ever need to inspect them manually, they naturally sort in received-order.
@@ +187,5 @@
> +def readPing(fileName, compressed):
> + file = open(fileName, 'rb')
> + if compressed:
> + file.seek(8)
> + jsonData = lz4.decompress(file.read())
Ah, cool, its as simple as that?
Good find.
@@ +202,5 @@
> + pingId = pingJson['id']
> + diskPings[pingId] = pingJson
> + return diskPings
> +
> +def nestedEqual(a, b):
This is for deep-equal comparisons of objects?
I think most functions could use a short comment on what they are supposed to do.
@@ +226,5 @@
> + if isinstance(a, set):
> + return a == b
> + return a == b
> +
> +def altern(cond, valYes, valNo):
We don't need that, Python has:
valYes if cond else valNo
@@ +253,5 @@
> +
> +def findAddon(ping, label):
> + addonFound = False
> + try:
> + addons = ping[u'payload'][u'info'][u'addons'].split(',')
This is an old field in Telemetry that is left in for backwards compability for now.
We should instead look at: ping.environment.addons.activeAddons/.activePlugins/etc.
See: https://gecko.readthedocs.org/en/latest/toolkit/components/telemetry/telemetry/environment.html
@@ +301,5 @@
> +
> +def installXPI(client, xpiPath):
> + print ' Installing xpi: ' + xpiPath
> + client.set_context(client.CONTEXT_CHROME)
> + scriptString = str(FILE_INSTANCE_JS + '\n'
I think we should move all the JavaScript into a separate JS file, then import it using import_script:
http://marionette-client.readthedocs.org/en/latest/reference.html#marionette_driver.marionette.Marionette.import_script
Its pretty hard to read and maintain as embedded scripts.
Attachment #8643009 -
Flags: feedback+
Assignee | ||
Comment 22•9 years ago
|
||
Not yet taking into account previous comments by Georg, but a step to make the script runnable on other machines (as talked with Stuart). Constant paths are read from environment vars.
Attachment #8643009 -
Attachment is obsolete: true
Assignee | ||
Comment 23•9 years ago
|
||
Shell script which sets up environment vars.
Explanations to setup in script comments.
Assignee | ||
Comment 24•9 years ago
|
||
This patch allows telemetry delays to be adjusted by prefs rather than hardcoded. Current marionette telemetry tests rely on shorter delays than the default ones.
Attachment #8610896 -
Attachment is obsolete: true
Comment 25•9 years ago
|
||
Comment on attachment 8644447 [details] [diff] [review]
telemetry-delays-prefs.patch
Review of attachment 8644447 [details] [diff] [review]:
-----------------------------------------------------------------
We should also document the prefs in toolkit/components/telemetry/docs/preferences.rst
... probably in a separate "testing" section.
I've left some comments on changing pref names etc., but otherwise this looks good.
::: toolkit/components/telemetry/TelemetryController.jsm
@@ +51,5 @@
>
> const PING_FORMAT_VERSION = 4;
>
> // Delay before intializing telemetry (ms)
> +const TELEMETRY_DELAY = Preferences.get("toolkit.telemetry.delay", 60) * 1000;
This pref should be more explicit about what the delay is for, e.g.:
toolkit.telemetry.initDelay
::: toolkit/components/telemetry/TelemetrySession.jsm
@@ +43,5 @@
>
> const ENVIRONMENT_CHANGE_LISTENER = "TelemetrySession::onEnvironmentChange";
>
> const MS_IN_ONE_HOUR = 60 * 60 * 1000;
> +const MIN_SUBSESSION_LENGTH_MS = Preferences.get("toolkit.telemetry.min-subsession-length", 10 * 60) * 1000;
Pref-naming is usually camelCase.
@@ +75,5 @@
> // Maximum number of content payloads that we are willing to store.
> const MAX_NUM_CONTENT_PAYLOADS = 10;
>
> // Do not gather data more than once a minute
> +const TELEMETRY_INTERVAL = Preferences.get("toolkit.telemetry.interval", 60) * 1000;
This is odd and only used for the cycle collector - do we need to change it?
@@ +80,2 @@
> // Delay before intializing telemetry (ms)
> +const TELEMETRY_DELAY = Preferences.get("toolkit.telemetry.delay", 60) * 1000;
toolkit.telemetry.initDelay
@@ +82,4 @@
> // Delay before initializing telemetry if we're testing (ms)
> const TELEMETRY_TEST_DELAY = 100;
> // Execute a scheduler tick every 5 minutes.
> +const SCHEDULER_TICK_INTERVAL_MS = Preferences.get("toolkit.telemetry.tick.interval", 5 * 60) * 1000;
toolkit.telemetry.scheduler.tickInterval
@@ +88,1 @@
> // The maximum number of times a scheduled operation can fail.
toolkit.telemetry.scheduler.idleTickInterval
@@ +88,5 @@
> // The maximum number of times a scheduled operation can fail.
> const SCHEDULER_RETRY_ATTEMPTS = 3;
>
> // The tolerance we have when checking if it's midnight (15 minutes).
> +const SCHEDULER_MIDNIGHT_TOLERANCE_MS = Preferences.get("toolkit.telemetry.scheduler.midnight-tolerance", 15 * 60) * 1000;
toolkit.telemetry.scheduler.midnightTolerance
@@ +93,4 @@
>
> // Coalesce the daily and aborted-session pings if they are both due within
> // two minutes from each other.
> +const SCHEDULER_COALESCE_THRESHOLD_MS = Preferences.get("toolkit.telemetry.scheduler.coalesce-threshold", 2 * 60) * 1000;
We don't need this - i killed the coalescing logic in bug 1187339, apparently i missed removing this const though.
@@ +97,5 @@
>
> // Seconds of idle time before pinging.
> // On idle-daily a gather-telemetry notification is fired, during it probes can
> // start asynchronous tasks to gather data.
> +const IDLE_TIMEOUT_SECONDS = Preferences.get("toolkit.telemetry.idle.timeout", 5) * 60;
toolkit.telemetry.idleTimeout
Attachment #8644447 -
Flags: feedback+
Assignee | ||
Comment 26•9 years ago
|
||
Comment on attachment 8644447 [details] [diff] [review]
telemetry-delays-prefs.patch
Attached updated telemetry-delays-prefs.patch to bug 1197292, and added dependency accordingly. Marking current attachment in this bug as obsolete.
Attachment #8644447 -
Attachment is obsolete: true
Updated•9 years ago
|
Priority: -- → P3
Whiteboard: [measurement:client]
Updated•9 years ago
|
Priority: P3 → P4
Comment 27•9 years ago
|
||
The work here was completed but efforts have moved to bug 1191324.
You need to log in
before you can comment on or make changes to this bug.
Description
•