[meta] Test overhead reduction
Categories
(Testing :: General, task, P3)
Tracking
(Not tracked)
People
(Reporter: egao, Unassigned)
References
(Depends on 3 open bugs, Blocks 1 open bug)
Details
(Keywords: meta)
This meta-bug deals specifically with the test overhead side of the test/task efficiencies project. For more information on the overall project, please refer to the parent bug.
3. Reduce task overhead
- Can some tasks be combined?
- Improve setup/teardown efficiency
All bugs that block this bug will be related to the overhead work.
Ideas that have not been fully formulated will be saved in this top-level comment.
Ideas
virtualenv
what if we pre-package a virtualenv? Lots of time is spent creating the virtualenv, installing dependencies, and downloading necessary files.
quick check with https://taskcluster-artifacts.net/L9mCyDr5RI2tnohPimuVGA/0/public/logs/live_backing.log
- begin time: 17:32:31
- end time: 17:33:57
- time spent creating virtualenv: 0:01:26
- what if most (if not all) of the pip packages were preinstalled, stored somewhere, and downloaded to the machine? It would have to be faster than downloading and installing the individual packages.
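A rough sketch of the pre-packaging idea (the paths and archive URL are illustrative; note that a plain virtualenv hard-codes its creation path, so it must be restored to the same location or relocated):

```
# Packaging step (hypothetical; run once whenever the archive is refreshed):
python -m virtualenv /builds/worker/venv
/builds/worker/venv/bin/pip install -r requirements.txt
tar -C /builds/worker -czf venv.tar.gz venv
# ...upload venv.tar.gz to whatever storage is chosen...

# Restore step (hypothetical; run per test task instead of building the venv):
curl -L -o venv.tar.gz "$VENV_ARCHIVE_URL"   # $VENV_ARCHIVE_URL is a placeholder
tar -C /builds/worker -xzf venv.tar.gz
. /builds/worker/venv/bin/activate
```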
pip
what can we do to speed up the pip installation process? This route may be more feasible than re-packaging the virtualenv, as it would require only incremental changes to the current system (not an overhaul).
quick check with https://taskcluster-artifacts.net/L9mCyDr5RI2tnohPimuVGA/0/public/logs/live_backing.log
- each pip install takes anywhere between 1 and 15 seconds (longer ones install multiple dependencies)
- there are numerous pip commands
- what if we parallelized the pip installation process? Is this feasible?
- yes, this is feasible. The approach in https://stackoverflow.com/questions/11021130/parallel-pip-install downloads each package in parallel, then installs sequentially. This avoids read-write concurrency while still being faster than a sequential download-and-install. I have done a comparison:
| normal method | concurrent download, sequential install | concurrent download, concurrent install |
|---|---|---|
| real 0m6.392s | real 0m3.831s | real 0m5.679s |
| user 0m3.151s | user 0m7.389s | user 0m3.080s |
| sys 0m1.971s | sys 0m2.357s | sys 0m1.213s |
concurrent download, sequential install PoC: time /bin/sh -c 'cat requirements.txt | xargs -t -n1 -P9 pip download -q -d dist && pip install --no-index --find-links=dist -r requirements.txt'
concurrent download, concurrent install PoC: time xargs -n1 -P8 pip install --no-cache-dir < requirements.txt
normal method: time pip install -r requirements.txt
Tests were conducted on a MacBook Pro (3.5 GHz Core i7, 16 GB DDR3).
Contents of the example requirements.txt:
six
pytest
pyyaml
simplejson
websocket-client
pexpect
nose
requests
- what if we installed some of the common dependencies of the various packages first? Perhaps this would cut down time later on.
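Purely as a sketch of that last idea (the actual shared-dependency set would need to be measured; the packages named here are assumptions):

```
# Hypothetical: install build tooling and widely shared dependencies first,
# so that later installs find them already satisfied.
pip install --upgrade pip setuptools wheel
pip install six requests
pip install -r requirements.txt   # the remaining packages
```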
android reftest startup
something with the Android reftest startup is consuming almost a minute, for uncertain reasons:
[task 2019-04-29T18:14:51.602Z] 18:14:51 INFO - REFTEST INFO | Running with e10s: False
[task 2019-04-29T18:14:52.112Z] 18:14:52 INFO - launch_application: am start -W -n org.mozilla.fennec_aurora/org.mozilla.gecko.BrowserApp -a android.intent.action.VIEW --es env8 R_LOG_LEVEL=6 --es args "-no-remote -profile /sdcard/tests/reftest/profile/" --es env3 MOZ_DISABLE_NONLOCAL_CONNECTIONS=1 --es env2 R_LOG_VERBOSE=1 --es env1 XPCOM_DEBUG_BREAK=stack --es env0 MOZ_CRASHREPORTER=1 --es env7 MOZ_CRASHREPORTER_NO_REPORT=1 --es env6 R_LOG_DESTINATION=stderr --es env5 MOZ_CRASHREPORTER_SHUTDOWN=1 --es env4 MOZ_IN_AUTOMATION=1
[task 2019-04-29T18:15:03.542Z] 18:15:03 INFO - remoteautomation.py | Application pid: 821
[task 2019-04-29T18:15:36.900Z] 18:15:36 INFO - REFTEST INFO | Reading manifest http://10.0.2.2:8854/tests/testing/crashtest/crashtests.list
[task 2019-04-29T18:15:36.901Z] 18:15:36 INFO - REFTEST INFO | Dumping JSON representation of sandbox
[task 2019-04-29T18:15:36.901Z] 18:15:36 INFO - REFTEST INFO | {"isDebugBuild":false,"xulRuntime":{"widgetToolkit":"android","OS":"Android","XPCOMABI":"arm-eabi-gcc3"},"smallScreen":false,"d2d":false,"dwrite":false,"gpuProcess":false,"azureCairo":false,"azureSkia":true,"skiaContent":true,"azureSkiaGL":false,"contentSameGfxBackendAsCanvas":true,"layersGPUAccelerated":true,"d3d11":false,"d3d9":false,"layersOpenGL":true,"webrender":false,"layersOMTC":true,"advancedLayers":false,"layerChecksEnabled":true,"retainedDisplayList":true,"Android":true,"cocoaWidget":false,"gtkWidget":false,"qtWidget":false,"winWidget":false,"is64Bit":false,"transparentScrollbars":false,"AndroidVersion":18,"AddressSanitizer":false,"webrtc":true,"retainedDisplayLists":true,"compareRetainedDisplayLists":false,"skiaPdf":false,"release_or_beta":false,"http":{"userAgent":"Mozilla/5.0 (Android 4.3.1; Mobile; rv:68.0) Gecko/68.0 Firefox/68.0","appName":"Mozilla","appVersion":"5.0","platform":"Android 4.3.1","oscpu":"Linux armv7l","misc":"rv:68.0"},"haveTestPlugin":false,"windowsDefaultTheme":false,"nativeThemePref":false,"gpuProcessForceEnabled":false,"prefs":{},"browserIsRemote":false,"asyncPan":true,"usesRepeatResampling":false,"verify":false,"serviceWorkerE10s":false}
[task 2019-04-29T18:16:08.250Z] 18:16:08 INFO - REFTEST INFO | Running chunk 1 out of 4 chunks. tests 1-941/3725
- why is remoteautomation.py pid detection taking 30 seconds?
- why is dumping the sandbox configuration also taking approximately 30 seconds?
android mozharness download
something is causing mozharness on Android tests to take a long time to download.
quick check with https://taskcluster-artifacts.net/dRr86h91TmqDtd_NFVu-kQ/0/public/logs/live_backing.log
[task 2019-04-29T18:10:01.624Z] + curl --fail -o mozharness.zip --retry 10 -L https://queue.taskcluster.net/v1/task/ZRyfMlZRQ0mGVPbVaLnA6w/artifacts/public/build/mozharness.zip
[task 2019-04-29T18:10:01.630Z] % Total % Received % Xferd Average Speed Time Time Time Current
[task 2019-04-29T18:10:01.630Z] Dload Upload Total Spent Left Speed
[task 2019-04-29T18:10:01.630Z]
[task 2019-04-29T18:10:02.454Z] 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
[task 2019-04-29T18:10:02.454Z] 100 222 100 222 0 0 269 0 --:--:-- --:--:-- --:--:-- 269
[task 2019-04-29T18:10:02.454Z] 100 222 100 222 0 0 269 0 --:--:-- --:--:-- --:--:-- 269
[task 2019-04-29T18:10:03.815Z]
[task 2019-04-29T18:10:04.816Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0
[task 2019-04-29T18:10:05.817Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:03 --:--:-- 0
[task 2019-04-29T18:10:06.819Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:04 --:--:-- 0
[task 2019-04-29T18:10:07.820Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:05 --:--:-- 0
[task 2019-04-29T18:10:08.821Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:06 --:--:-- 0
[task 2019-04-29T18:10:09.822Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:07 --:--:-- 0
[task 2019-04-29T18:10:10.823Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:08 --:--:-- 0
[task 2019-04-29T18:10:11.824Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:09 --:--:-- 0
[task 2019-04-29T18:10:12.826Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:10 --:--:-- 0
[task 2019-04-29T18:10:13.827Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:11 --:--:-- 0
[task 2019-04-29T18:10:14.828Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:12 --:--:-- 0
[task 2019-04-29T18:10:15.829Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:13 --:--:-- 0
[task 2019-04-29T18:10:16.830Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:14 --:--:-- 0
[task 2019-04-29T18:10:17.018Z] 0 0 0 0 0 0 0 0 --:--:-- 0:00:15 --:--:-- 0
[task 2019-04-29T18:10:17.018Z] 100 241 100 241 0 0 15 0 0:00:16 0:00:15 0:00:01 57
[task 2019-04-29T18:10:17.138Z]
[task 2019-04-29T18:10:17.138Z] 100 2281k 100 2281k 0 0 147k 0 0:00:15 0:00:15 --:--:-- 147k
Download speed appears to be very slow at ~147 KB/s.
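One mitigation worth investigating (a sketch using standard curl flags, not what automation currently does) is to abort stalled transfers early so that --retry can re-attempt rather than letting a slow connection crawl to completion:

```
# Hypothetical: give up if the transfer averages under 10 KB/s for 30 seconds.
curl --fail -L --retry 10 \
     --speed-limit 10240 --speed-time 30 \
     -o mozharness.zip "$MOZHARNESS_URL"   # $MOZHARNESS_URL is a placeholder
```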
test-linux compiz issue
an issue with compiz complaining about stacking windows that it does not own consumes a lot of precious time.
quick analysis of https://taskcluster-artifacts.net/LDowPchDTJCNcHNUkp8kxw/0/public/logs/live_backing.log
[task 2019-04-29T17:56:07.297Z] 17:56:07 INFO - Fetch https://queue.taskcluster.net/v1/task/Hr49YPvqT82IbHAV--2CYw/artifacts/public/build/target.common.tests.tar.gz into memory
[task 2019-04-29T17:56:08.573Z] 17:56:08 INFO - Content-Length response header: 39769688
[task 2019-04-29T17:56:08.574Z] 17:56:08 INFO - Bytes received: 39769688
[task 2019-04-29T17:56:09.627Z] compiz (core) - Warn: Attempted to restack relative to 0x1400006 which is not a child of the root window or a window compiz owns
[task 2019-04-29T17:56:21.743Z] 17:56:21 INFO - Downloading and extracting to /builds/worker/workspace/build/tests these dirs * from https://queue.taskcluster.net/v1/task/Hr49YPvqT82IbHAV--2CYw/artifacts/public/build/target.mochitest.tests.tar.gz
[task 2019-04-29T17:56:21.743Z] 17:56:21 INFO - retry: Calling fetch_url_into_memory with args: (), kwargs: {'url': u'https://queue.taskcluster.net/v1/task/Hr49YPvqT82IbHAV--2CYw/artifacts/public/build/target.mochitest.tests.tar.gz'}, attempt #1
[task 2019-04-29T17:56:21.744Z] 17:56:21 INFO - Fetch https://queue.taskcluster.net/v1/task/Hr49YPvqT82IbHAV--2CYw/artifacts/public/build/target.mochitest.tests.tar.gz into memory
The amount of time not spent executing tests appears to vary from platform to platform.
further chunk reductions
other tests that have not yet been targeted for a chunk revisit are:
- reftests
- mochitest-chrome
extraction of mozinfo.json on windows7-32 platforms
for some reason, the download and extraction of mozinfo.json from the web-platform-tests archive takes minutes on windows7-32:
18:00:52 INFO - Reading from file Z:\task_1556559288\build\target.test_packages.json
18:00:52 INFO - Downloading packages: [u'target.web-platform.tests.tar.gz', u'target.common.tests.tar.gz'] for test suite categories: ['web-platform']
18:00:52 INFO - Downloading and extracting to Z:\task_1556559288\build\tests these dirs mach, bin/*, config/*, mozbase/*, marionette/*, tools/*, web-platform/*, mozpack/*, mozbuild/* from https://queue.taskcluster.net/v1/task/PSe05NHCSmSUzDB0cuSHsQ/artifacts/public/build/target.web-platform.tests.tar.gz
18:00:52 INFO - retry: Calling fetch_url_into_memory with args: (), kwargs: {'url': u'https://queue.taskcluster.net/v1/task/PSe05NHCSmSUzDB0cuSHsQ/artifacts/public/build/target.web-platform.tests.tar.gz'}, attempt #1
18:00:52 INFO - Fetch https://queue.taskcluster.net/v1/task/PSe05NHCSmSUzDB0cuSHsQ/artifacts/public/build/target.web-platform.tests.tar.gz into memory
18:01:24 INFO - Content-Length response header: 54812554
18:01:24 INFO - Bytes received: 54812554
18:06:22 INFO - Adding 'mozinfo.json' for extraction from common.tests archive
18:06:22 INFO - Downloading and extracting to Z:\task_1556559288\build\tests these dirs mach, bin/*, config/*, mozbase/*, marionette/*, tools/*, web-platform/*, mozpack/*, mozbuild/*, mozinfo.json from https://queue.taskcluster.net/v1/task/PSe05NHCSmSUzDB0cuSHsQ/artifacts/public/build/target.common.tests.tar.gz
A similar issue is not observed on windows10-64 platforms. However, on each platform the extraction time is consistent across runs:
- windows7: approx. 5 minutes
- windows10: approx. 30 seconds
Comment 1•6 years ago
from comment 0:
virtualenv - seems hard, but will yield wins on ALL test jobs; I think worth investigating
pip - similar to virtualenv; maybe we don't need to do this if we can download a pre-packaged virtualenv
android-reftest-startup - when we are running in fewer chunks on android 7.0, this will have less impact; I vote for holding off on this unless we see similar issues on other platforms
android mozharness download - I suspect mozharness.zip doesn't change much. Is this a download at bitbar or on emulators? Either way we should ensure the artifact download is the same speed on all platforms; maybe we need file caching at bitbar or packet.net. Probably too hard to work on, but more investigation would help quantify the problem.
compiz window stacking - this could be another 15-second win for 50%+ of the jobs we run; I vote for working on this as well.
Comment 2•6 years ago
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #1)
android-reftest-startup- when we are running in fewer chunks on android 7.0, this will have less impact, I vote for holding off on this unless we see similar issues on other platforms
More generally, I wouldn't worry about any optimizations for Android 4.3, since that platform has a limited life span now.
Comment 3•6 years ago
Also, be wary of timestamps in Android test logs: keep in mind that the test harness periodically polls the device and dumps results. Cross-reference with the logcat timestamps to see what is happening in real-time on the device.
Comment 4•6 years ago
Simon did some digging into the Android startup, and discovered that the manifest parsing/chunking happens inside the emulator. This is particularly slow in debug builds.
Comment 5•6 years ago
re: pip
In some cases we're running pip twice for the same requirements file: the first time with dependency resolution disabled, and the second time with it enabled. We also have multiple requirements.txt files that each get processed.
I wonder if we could combine the requirements file and do a single pip invocation that installs all of our requirements?
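A rough sketch of that idea (the file names are illustrative, and the naive de-duplication assumes plain one-package-per-line files with no pip directives):

```
# Hypothetical: merge the per-harness requirements files and install once.
cat mozbase_requirements.txt marionette_requirements.txt \
  | sort -u > combined_requirements.txt
pip install -r combined_requirements.txt
```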
Comment 6•6 years ago
one other idea we had was to move away from .tar.bz2 to .tar.zst or .tar.xz. Either is significantly faster than .tar.bz2 to decompress.
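A quick way to compare on an existing archive (a sketch; assumes zstd is available on the worker):

```
# Recompress a bzip2 test archive with zstd, then time decompression of each.
bzcat target.common.tests.tar.bz2 | zstd -o target.common.tests.tar.zst
time tar -xjf target.common.tests.tar.bz2             # bzip2 path (slow)
time zstd -dc target.common.tests.tar.zst | tar -x    # zstd path (much faster)
```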
Comment 7•6 years ago
This needs more testing, but it may be possible to avoid the functools install:
Reporter
Comment 8•6 years ago
Lots of great ideas - thanks!
Comment 9•6 years ago
(In reply to Chris AtLee [:catlee] from comment #4)
Simon did some digging into the Android startup, and discovered that the manifest parsing/chunking happens inside the emulator. This is particularly slow in debug builds.
The main culprits here, and their average time spent generating their chunk:

| Task Name | Task Count | Mean Delay |
|---|---|---|
| test-android-em-4.3-arm7-api-16/debug-jsreftest | 100 | 00:18:07 |
| test-android-em-4.3-arm7-api-16/debug-reftest | 56 | 00:09:17 |
| test-android-em-4.3-arm7-api-16/opt-jsreftest | 40 | 00:05:32 |
| test-android-em-4.3-arm7-api-16/opt-reftest | 28 | 00:02:49 |
Reporter
Comment 10•6 years ago
(In reply to Simon Fraser [:sfraser] ⌚️GMT from comment #9)
(In reply to Chris AtLee [:catlee] from comment #4)
Simon did some digging into the Android startup, and discovered that the manifest parsing/chunking happens inside the emulator. This is particularly slow in debug builds.
The main culprits here, and their average time spent generating their chunk (see the table in comment 9)
Would these no longer be a concern once we remove the Android 4.3 emulator tests? Or does the delay in manifest parsing also occur on newer, hardware-based tests?
Comment 11•6 years ago
Right, they would no longer be a concern. On Android x86 7.0, reftest manifest parsing is fast (5 to 10 seconds). And we don't usually run reftests on hardware.
Comment 12•6 years ago
unrelated to "overhead", web-platform-tests have some tests that are |expected: TIMEOUT|. I often see these as 3-minute timeouts - could we not skip these, or put them in a different job that runs once/day on limited platforms? I see 460 .ini files with TIMEOUT in them; that would be a lot of runtime saved, given that we run wpt on just about every config.
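For reference, a count like that can be reproduced with a grep over the wpt metadata (a sketch; the metadata path is assumed):

```
# Count wpt expectation files that contain an expected TIMEOUT.
grep -rl --include="*.ini" "expected: TIMEOUT" testing/web-platform/meta | wc -l
```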
Reporter
Comment 13•6 years ago
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #12)
unrelated to "overhead", web-platform-tests have some tests that are |expected: TIMEOUT|. I often see these as 3 minute timeouts- could we not skip these or put them in a different job that runs once/day on limited platforms? I see 460 .ini files with TIMEOUT in it, that would be a lot of runtime saved given that we run wpt on just about every config.
This would be addressed under the item Reduce task timeouts for this project. I'll go ahead and create a meta bug for that now.
Comment 14•6 years ago
one random idea for pip/virtualenv might be to use virtualenv-clone or look for similar tools.
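virtualenv-clone copies an existing environment and rewrites the absolute paths baked into it; a minimal sketch of how it might be used here (the paths are illustrative):

```
pip install virtualenv-clone
# Clone a pre-built template venv into the per-task workspace, fixing up the
# embedded interpreter and activation paths along the way.
virtualenv-clone /builds/worker/template-venv /builds/worker/workspace/venv
```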
Reporter
Comment 15•6 years ago
Some proof-of-concept work for virtualenv can be seen in this push: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=db724848d1564eeafb1b1b07e1bcd2310c376fd9
I have two avenues to explore.
Approach 1
When the docker container that runs the tests (eg. desktop1604-test) is generated, have a step to:
- create the virtualenv
- install every possible dependency required by all suites of tests
I would envision it slotting in after node is installed, here: https://searchfox.org/mozilla-central/source/taskcluster/docker/desktop1604-test/Dockerfile#35
The idea is that when the tests pull down the docker container and do further setup for the test, the virtualenv is already present and merely needs to be invoked.
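A minimal sketch of that image-build step (the requirements file and paths are assumptions), expressed as the shell a Dockerfile RUN instruction would execute:

```
# Hypothetical image-build step: pre-create the test virtualenv so that test
# tasks only need to activate it.
python -m virtualenv /builds/worker/venv
/builds/worker/venv/bin/pip install -r /builds/worker/test_requirements.txt
```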
Pros
- eliminate the approximately 1:30 spent setting up the virtual environment per chunk (linux64/opt figures, typical test)
- reduce the frequency of virtualenv generation (it currently runs per chunk; this would eliminate that)
Cons
- if the generated docker image grows much larger, the increased bandwidth requirement will reduce the savings
- depending on how often the docker test images are regenerated, this may not offer tangible savings
Approach 2
Run a separate task at specified frequency to:
- create a virtualenv
- archive into a tarball
- upload to storage
This would be run at a weekly frequency (for example), and the test harness would be rewritten to simply download and extract the virtualenv.
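A sketch of what the periodic task might run (the suite name, requirements file, and storage endpoint are all placeholders):

```
# Hypothetical weekly packaging task for a suite-specific virtualenv.
SUITE=mochitest
python -m virtualenv "venv-$SUITE"
"venv-$SUITE/bin/pip" install -r "${SUITE}_requirements.txt"
tar -czf "venv-$SUITE.tar.gz" "venv-$SUITE"
# Upload to whatever storage is chosen ($STORAGE_URL is a placeholder):
curl -X PUT --data-binary "@venv-$SUITE.tar.gz" "$STORAGE_URL/venv-$SUITE.tar.gz"
```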
Pros
- can customize the virtualenv to the specific test suite, which reduces transfer size and time
- can be run even less frequently than docker image regeneration, at much less overhead
Cons
- would involve modifying a critical portion of mozharness and/or test environment setup code
- need to work out a storage solution for the archived virtualenv
- each test will require network transfer, which (depending on the size of compressed virtualenv) may reduce savings
Comment 16•6 years ago
Could we do something like your "Approach 2", except make them like toolchain tasks? The decision task would look at the relevant files in-tree to figure out if the virtualenv archives need to be rebuilt.
Reporter
Comment 17•6 years ago
(In reply to Chris AtLee [:catlee] from comment #16)
Could we do something like your "Approach 2", except make them like toolchain tasks? The decision task would look at the relevant files in-tree to figure out if the virtualenv archives need to be rebuilt.
:catlee - thanks for the feedback - interesting that you prefer the second approach. I was personally leaning towards the first approach; I rationalized that it would eliminate the bandwidth and time required for each chunk to download the generated virtualenv, not to mention that it would be much easier to implement.
As for implementing the second approach, my experience tinkering with and adding new tasks is limited. Is it possible to run an arbitrary script (eg. create a virtualenv) using an existing docker image, and how would I interface with whatever is used to store the compressed archives? Those are the two challenging areas for me.
Comment 18•6 years ago
I think the first approach is nice, but it works only for tests that run in docker? That wouldn't help us with macos or windows tests.
Reporter
Comment 19•6 years ago
Right - it escaped my attention that macosx and windows tests do not run in a container.
I will create a new bug to continue investigation of this task. It would be nice to have this done and save ourselves quite a bit of money and time.
Comment 20•3 years ago
The bug assignee didn't log in to Bugzilla in the last 7 months.
:ahal, could you have a look please?
For more information, please visit auto_nag documentation.
Comment 21•3 years ago
Lots of stuff got fixed here; going to call it done. No one's looking at this specifically anymore, so the tracking bug probably isn't necessary.