Closed Bug 1759864 Opened 3 years ago Closed 3 years ago

GCP worker-pool configs define SSDs from the wrong zone

Categories

(Release Engineering :: Firefox-CI Administration, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ahal, Assigned: masterwayz)

Attachments

(1 file)

Our GCP worker pools define a list of regions to use (currently just us-central1). The zones used within each region are then defined here:
https://hg.mozilla.org/ci/ci-configuration/file/0df1eb2b6e985b623152060139f4ebb701dfd021/environments.yml#l116

This appears to work, but I noticed that the SSDs are always set to the first zone in the list. For example, here's the config for the zone us-central1-b:

              {
                  "capacityPerInstance": 1,
                  "disks": [
                      {
                          "autoDelete": true,
                          "boot": true,
                          "initializeParams": {
                              "diskSizeGb": 20,
                              "sourceImage": "projects/taskcluster-imaging/global/images/fxci-level1-gcp-0y9sk1q1wfj5d06y0nea"
                          },
                          "type": "PERSISTENT"
                      },
                      {
                          "autoDelete": true,
                          "initializeParams": {
                              "diskType": "zones/us-central1-a/diskTypes/local-ssd"
                          },
                          "interface": "NVME",
                          "type": "SCRATCH"
                      },
                      {
                          "autoDelete": true,
                          "initializeParams": {
                              "diskType": "zones/us-central1-a/diskTypes/local-ssd"
                          },
                          "interface": "NVME",
                          "type": "SCRATCH"
                      },
                      {
                          "autoDelete": true,
                          "initializeParams": {
                              "diskType": "zones/us-central1-a/diskTypes/local-ssd"
                          },
                          "interface": "NVME",
                          "type": "SCRATCH"
                      },
                      {
                          "autoDelete": true,
                          "initializeParams": {
                              "diskType": "zones/us-central1-a/diskTypes/local-ssd"
                          },
                          "interface": "NVME",
                          "type": "SCRATCH"
                      }
                  ],
                  "machineType": "zones/us-central1-b/machineTypes/n2-custom-16-73728",
                  "minCpuPlatform": "Intel Cascadelake",
                  "networkInterfaces": [
                      {
                          "accessConfigs": [
                              {
                                  "type": "ONE_TO_ONE_NAT"
                              }
                          ]
                      }
                  ],
                  "region": "us-central1",
                  "scheduling": {
                      "automaticRestart": false,
                      "onHostMaintenance": "terminate",
                      "preemptible": true
                  },
                  "workerConfig": {
                      "capacity": 1,
                      "deviceManagement": {
                          "hostSharedMemory": {
                              "enabled": false
                          },
                          "kvm": {
                              "enabled": false
                          }
                      },
                      "shutdown": {
                          "afterIdleSeconds": 900,
                          "enabled": true
                      }
                  },
                  "zone": "us-central1-b"
              },

Notice that every diskType value points at us-central1-a, even though the zone and machineType for this config are us-central1-b. This seems to be happening because we're hardcoding us-central1-a into the SSD configs directly in worker-pools.yml.
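
To make the mismatch easy to spot across many configs, here is a minimal sketch of a check, assuming each launch config is a plain dict shaped like the JSON above. The function name `mismatched_disk_zones` and the trimmed `example` dict are hypothetical and not part of ci-configuration.

```python
import re

# Matches the zone segment of a GCP resource path such as
# "zones/us-central1-a/diskTypes/local-ssd".
ZONE_RE = re.compile(r"^zones/([^/]+)/")


def mismatched_disk_zones(launch_config):
    """Return the diskType strings whose zone differs from launch_config["zone"]."""
    zone = launch_config["zone"]
    mismatches = []
    for disk in launch_config.get("disks", []):
        disk_type = disk.get("initializeParams", {}).get("diskType")
        if disk_type is None:
            continue  # e.g. the boot disk above, which sets no diskType
        match = ZONE_RE.match(disk_type)
        if match and match.group(1) != zone:
            mismatches.append(disk_type)
    return mismatches


# Trimmed-down version of the us-central1-b config shown above (hypothetical subset).
example = {
    "zone": "us-central1-b",
    "machineType": "zones/us-central1-b/machineTypes/n2-custom-16-73728",
    "disks": [
        {"boot": True, "initializeParams": {"diskSizeGb": 20}, "type": "PERSISTENT"},
        {
            "initializeParams": {"diskType": "zones/us-central1-a/diskTypes/local-ssd"},
            "interface": "NVME",
            "type": "SCRATCH",
        },
    ],
}

print(mismatched_disk_zones(example))
```

An empty list would mean every disk that declares a diskType agrees with the config's zone.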

I'm not sure if this is a problem or not, but I'm assuming (without looking it up) that we'll want to use disks from the same zone we're running tasks in.
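
One way to keep disks in the same zone as the instance is to template the zone into the disk definition instead of hardcoding it. The following is only a sketch under that assumption: the `{zone}` placeholder and the `disks_for_zone` helper are hypothetical, and the change that actually landed in ci-configuration may work differently.

```python
import copy


def disks_for_zone(disk_templates, zone):
    """Return a copy of disk_templates with "{zone}" replaced in each diskType."""
    disks = copy.deepcopy(disk_templates)  # leave the shared templates untouched
    for disk in disks:
        params = disk.get("initializeParams", {})
        if "diskType" in params:
            params["diskType"] = params["diskType"].replace("{zone}", zone)
    return disks


# Hypothetical template for one local SSD, mirroring the SCRATCH disks above.
ssd_templates = [
    {
        "autoDelete": True,
        "initializeParams": {"diskType": "zones/{zone}/diskTypes/local-ssd"},
        "interface": "NVME",
        "type": "SCRATCH",
    }
]

for zone in ["us-central1-a", "us-central1-b"]:
    print(zone, disks_for_zone(ssd_templates, zone)[0]["initializeParams"]["diskType"])
```

Expanding the template once per zone means each generated config carries a diskType that matches its own zone and machineType.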

Assignee: nobody → mgoossens
Severity: -- → S3
Status: NEW → ASSIGNED
Priority: -- → P3
Pushed by mgoossens@mozilla.com:
https://hg.mozilla.org/ci/ci-configuration/rev/51f31139faaa
Fix GCP worker-pool configs define SSDs from the wrong zone r=ahal

Should be all fixed now per the most recent FxCI worker pool config.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED