Make it really easy to run only tests in try
Categories
(Developer Infrastructure :: Try, enhancement, P2)
Tracking
(Not tracked)
People
(Reporter: nalexander, Unassigned)
References
(Depends on 2 open bugs, Blocks 1 open bug)
Details
I need to iterate on tests. Some of that iteration must happen in automation. That means I need to do a bunch of try builds, most of which are only tests. All those builds produce a "build" to target. So, because I have relatively rich knowledge of automation, I will use try_task_config.json and re-use the build task.
Is there a better way? Can we pave this cowpath so that we just use a Nightly or whatever rather than requiring this rigmarole, which 9/10 engineers won't know?
Comment 1•5 years ago (Reporter)
ahal: tomprince: I figure y'all will know of better ways if they exist. Can you dupe this to an existing feature request?
Comment 2•5 years ago
I'll be honest, I wasn't really sure what you were talking about, but I guess you mean the existing_tasks parameter. Yes that's a good idea and I don't know of any other bugs on file for that.
Comment 3•5 years ago
We use existing_tasks for our release promotion action, and it works well for us. I absolutely think being able to schedule a bunch of tests, and base them on existing tasks, is a good path forward.
I imagine this will also help us schedule n test runs against existing builds when we're trying to track down intermittents, for example.
Comment 4•5 years ago
I've got some support for doing this for staging releases (mach try release) here and here. Something similar could be done for the other mach try selectors, though it would require a little bit of work to switch them to using the v2 try_task_config.json format.
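For illustration, here is a minimal sketch of generating such a config from Python. It assumes the v2 layout is a top-level "version": 2 plus a "parameters" object, and that existing_tasks maps task labels to the IDs of already-finished tasks. The helper name, the label, and the task ID are placeholders made up for this example. Selecting which test tasks to run is not shown.

```python
import json


def make_try_task_config(existing_tasks):
    """Build a v2-style try_task_config dict (assumed layout, not authoritative)."""
    return {
        "version": 2,
        "parameters": {
            # Assumption: task label -> ID of a completed task to re-use.
            "existing_tasks": dict(existing_tasks),
        },
    }


if __name__ == "__main__":
    # Hypothetical label and task ID, for illustration only.
    config = make_try_task_config({"build-linux64/opt": "PLACEHOLDER_TASK_ID"})
    print(json.dumps(config, indent=2, sort_keys=True))
```

Writing the printed JSON to try_task_config.json is the manual step the mach try selectors would need to automate.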
Updated•5 years ago
Comment 5•5 years ago
If we are iterating on tests, that requires us to have the test packages generated, which is part of the build step. I really like the idea of re-using a build, but it seems a bit more difficult to get all the other binaries/packages.
Comment 6•5 years ago (Reporter)
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #5)
> If we are iterating on tests, that requires us to have the test packages generated, which is part of the build step. I really like the idea of re-using a build, but it seems a bit more difficult to get all the other binaries/packages.
I have a little experience with this now. I've done two things:
- make Raptor tasks (on Win 10 releng-hardware) do source checkouts. It's too slow to use, for two reasons: the hg clone has no cache and therefore takes 8-20 minutes. The disk itself is very slow, so the hg checkout part also appears to take 5-20 minutes.
- make Raptor tasks take mozharness.zip from a Linux build rather than the Windows build, and make the mozharness framework download testing/raptor/raptor/** from hg.mozilla.org as an archive. This avoids a ridiculous 40-50 minute Windows build to zip up mozharness.zip and target.raptor.tests.tar.gz (which is not platform specific anyway).
The former should be better than it actually is. The latter is needlessly complicated because the way we create test archives is needlessly complicated.
Based on my experiences, I can only say that this whole area needs significant investment, because the development experience is abysmal: without my work-arounds a single cycle would take more than 60 minutes, and that's with dedicated Win 10 releng-hardware devices!
(See https://treeherder.mozilla.org/#/jobs?repo=try&revision=ce9d9941a6b758dee38a69906fa2e55d5c938b24 for a try push with those work-arounds layered in.)
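As a rough sketch of the second work-around, the archive download boils down to building a URL for one subdirectory at one revision. This assumes the hgweb-style layout <repo>/archive/<rev>.<format>/<path>. The helper name is hypothetical, and the repository and revision in the usage note are only examples.

```python
from urllib.parse import quote


def hg_archive_url(repo_url, revision, subdir, fmt="zip"):
    """Return an hgweb-style archive URL for one subdirectory of a revision.

    Assumes the <repo>/archive/<rev>.<fmt>/<path> layout; performs no network access.
    """
    repo_url = repo_url.rstrip("/")
    subdir = subdir.strip("/")
    # quote() keeps "/" intact by default, so nested paths survive.
    return f"{repo_url}/archive/{quote(revision)}.{fmt}/{quote(subdir)}"
```

For example, hg_archive_url("https://hg.mozilla.org/mozilla-central", "tip", "testing/raptor/raptor") builds a URL for just the Raptor harness directory, which a task could fetch instead of doing a full clone.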
Comment 7•5 years ago
What are you testing though? With a static raptor.zip, it would be like a static talos.zip or mochitest.zip file. If you are cleaning up test manifests, trying to fix a test case, or just testing newly written tests, we still need to package things. Possibly this is where the source clone comes from.
I do artifact builds on try and on windows those are <20 minutes, so the time from push to job starting is <30 minutes, here is an example from last week:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5f3d60d8caec1d5e61e57cd583a535660ec85aaf
I think in the case for testing new pages where the recorded pages live outside of the tree, then yes- a workaround like this makes sense to reuse existing .zip files.
Comment 8•5 years ago (Reporter)
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #7)
> What are you testing though? With a static raptor.zip, it would be like a static talos.zip or mochitest.zip file. If you are cleaning up test manifests, trying to fix a test case, or just testing newly written tests, we still need to package things. Possibly this is where the source clone comes from.
I'm working on the actual Raptor harness, so it's not a static target.raptor.tests.tar.gz. That part is doing what VCS is supposed to do (deliver an updated version of code), but my experience is that both solutions are doing it badly right now. The source clone is in an intermediate Windows build task, and even such an artifact build is very slow.
> I do artifact builds on try and on windows those are <20 minutes, so the time from push to job starting is <30 minutes, here is an example from last week:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=5f3d60d8caec1d5e61e57cd583a535660ec85aaf
I was seeing consistent 40-50 minutes yesterday, 50%+ of which is cloning. I can link but what's the point? Some are fast, some are slow -- the structural problem remains.
> I think in the case for testing new pages where the recorded pages live outside of the tree, then yes- a workaround like this makes sense to reuse existing .zip files.
My end game for new page recordings will pull them from tasks (and/or the TC index), avoiding both the need for VCS and explicit package management like we do for mozharness.zip and friends. Were I bumping recorded pages as part of this loop, it would be as bad as what we have now -- there's no difference between Python code and in-tree recordings at this time.
Comment 9•5 years ago
Just commenting to make a few statements..
Seems like there are two issues here that can be tackled independently (and are each useful on their own).
1. It should be possible to somehow define existing_tasks in the |mach try| interface (likely by pointing to a previous push).
2. Tests should use a srcdir checkout rather than tests.zip.
The former is comparatively easy. Though I'd insist that we make it generic enough to work for non-test related use cases (which will add some complexity).
The latter is something we've wanted to do for at least 8 years now but is pretty complicated and never gets prioritized.
Comment 10•5 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #9)
> 2. Tests should use a srcdir checkout rather than tests.zip.
> The latter is something we've wanted to do for at least 8 years now but is pretty complicated and never gets prioritized.
Could (2) instead be,
2a. Allow for tests to download the tests.zip from an alternative location, other than from the same task as the build?
2b. Create a task that creates tests.zip on Try
2c. Point your Try test at the existing build, and at the tests.zip from the task in (2b)?
We could even change this to be the default behavior on Central:
push ------> build ------------> test
     \-----> test_zip_task ---/
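The effect of that graph shape can be sketched as a toy model: if the build is satisfied by an existing task, only test_zip_task and the test still need to run. This is not how the real task graph optimizer works. The function name and graph encoding are assumptions for illustration, and the task ID is a placeholder.

```python
def tasks_to_run(graph, target, existing_tasks):
    """Return the labels that still need to run to produce `target`.

    graph: label -> list of dependency labels.
    existing_tasks: label -> ID of a finished task to re-use.
    """
    needed = set()
    stack = [target]
    while stack:
        label = stack.pop()
        if label in needed or label in existing_tasks:
            # Re-used tasks are not re-run and their dependencies are not walked.
            continue
        needed.add(label)
        stack.extend(graph.get(label, []))
    return needed


# Toy version of the diagram: test depends on both build and test_zip_task.
GRAPH = {
    "test": ["build", "test_zip_task"],
    "build": [],
    "test_zip_task": [],
}

if __name__ == "__main__":
    # Placeholder task ID standing in for an existing build.
    print(sorted(tasks_to_run(GRAPH, "test", {"build": "PLACEHOLDER_TASK_ID"})))
```

With no existing tasks all three labels are scheduled; with an existing build, the expensive build step drops out while the test archive is still regenerated from the push.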
Comment 11•5 years ago
Comment #10 could be combined with moving some or all test scheduling out of the push graph logic, and into some external service. This external scheduling could allow for:
- test coalescing scheduling logic to be elsewhere,
- test backfill scheduling logic to be elsewhere,
- scheduling certain tests early, and more tests based on the results (e.g. only run B tests if A tests fail; only run C tests if A tests pass),
possibly more.
Allowing for scheduling tests against any existing build would allow for
- running unit test changes against the last known good (e.g., previously shipped nightly or beta), to reduce variables and noise,
- running talos changes against the last known good build. This would especially help reduce performance noise,
- reducing the number of builds we schedule for test-only changes,
possibly more.
We should probably allow for both in-tree and out-of-tree scheduling, rather than removing all test logic from the tree.
Comment 12•5 years ago
Bug 1628981 isn't a hard blocker, but it will allow us to create test zips on Try while pointing at another existing build (Nightly? Release? random other Try build?) as the build to test.
Comment 13•5 years ago
This bug is out of scope for the 'smart-scheduling' project; cost-reduction seems appropriate.
Updated•2 years ago