Closed Bug 1405987 Opened 7 years ago Closed 7 years ago

it takes about an hour to setup a new machine with a docker image

Categories

(Taskcluster Graveyard :: Docker Images, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: jmaher, Unassigned)

References

(Blocks 1 open bug)

Details

I was looking at some new jobs and evaluating their runtime. 3 out of 10 jobs took longer than I thought they should, and on further investigation the time is going to the docker image download and uncompress.
Here is a log: https://public-artifacts.taskcluster.net/VRfTAZlNS3Gb4QylG-90rg/0/public/logs/live_backing.log
Here is >1 hour of time used in the log file:
[taskcluster 2017-10-03 19:43:51.469Z] Download Progress: 97.85%
[taskcluster 2017-10-03 19:43:53.429Z] Downloaded artifact successfully.
[taskcluster 2017-10-03 19:43:53.429Z] Downloaded 1566.006 mb
[taskcluster 2017-10-03 19:43:53.430Z] Decompressing downloaded image
[taskcluster 2017-10-03 19:55:11.199Z] Loading docker image from downloaded archive.
[taskcluster 2017-10-03 20:49:42.840Z] Image 'public/image.tar.zst' from task 'bhzq2365TzWKfzL6R-4ZZQ' loaded. Using image ID sha256:02a69b4551bea290b1d3e814414d40fce9e1cdff080b2fd731e492de24a97d16.
I recall this taking ~20 minutes before; I suspect we have changed a lot since then.
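To see where the hour goes, the timestamps quoted in the log can be subtracted phase by phase. The following is only a rough sketch: the phase labels and the `phase_durations` helper are illustrative names, not part of Taskcluster, and it assumes each log line is written when the phase it names begins.

```python
from datetime import datetime

# Timestamps copied from the log excerpt in this bug (UTC, trailing "Z" dropped).
# The labels are illustrative; they are not emitted by Taskcluster.
EVENTS = [
    ("decompress started", "2017-10-03 19:43:53.430"),
    ("docker load started", "2017-10-03 19:55:11.199"),
    ("image loaded", "2017-10-03 20:49:42.840"),
]


def phase_durations(events):
    """Return (phase label, seconds) for each pair of consecutive events."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    parsed = [(name, datetime.strptime(ts, fmt)) for name, ts in events]
    return [
        (f"{a} -> {b}", (tb - ta).total_seconds())
        for (a, ta), (b, tb) in zip(parsed, parsed[1:])
    ]


for label, seconds in phase_durations(EVENTS):
    print(f"{label}: {seconds / 60:.1f} min")
```

Under that assumption, decompression accounts for roughly 11 minutes and the docker image load for roughly 55 minutes, so the load step dominates.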
I think bug 1349261 will help a bit here. That being said, 1 hour to import a Docker image is absurd. I wonder if this has to do with bug 1305174?
Blocks: fastci
Depends on: 1349261
I did chat with :garndt about this and he mentioned that this is only the case on the m1.medium machines (which, oddly enough, run all the mochitests, so 1/2 the load). I have moved everything but browser-chrome and asan-devtools over to the default instance size of m3.large (or c3.large?). To add to the mix, linux64-debug takes ~700 minutes to run all the browser-chrome tests, and win10-debug takes ~150 minutes. I suspect it is less now that I have moved most of the mochitests over to the default instance type.
Check out bug 1281241 for getting tests to run on multi-core machines.
m<N>.medium machines are not very powerful. Given the amount of processing the Docker image import performs, I'm not surprised that it takes >10 minutes to import a large image on a medium instance.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
Product: Taskcluster → Taskcluster Graveyard