Closed Bug 1639662 Opened 4 years ago Closed 3 years ago

Setup R8 minis pool for QA testing

Categories

(Infrastructure & Operations :: RelOps: Hardware, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dividehex, Unassigned)

QA needs a pool of 10 R8 Mac minis running Mojave in MDC1 for performance testing. I've set up the following hardware in MDC1:

t-mojave-r8-031.test.releng.mdc1.mozilla.com
t-mojave-r8-032.test.releng.mdc1.mozilla.com
t-mojave-r8-033.test.releng.mdc1.mozilla.com
t-mojave-r8-034.test.releng.mdc1.mozilla.com
t-mojave-r8-035.test.releng.mdc1.mozilla.com
t-mojave-r8-036.test.releng.mdc1.mozilla.com
t-mojave-r8-037.test.releng.mdc1.mozilla.com
t-mojave-r8-038.test.releng.mdc1.mozilla.com
t-mojave-r8-039.test.releng.mdc1.mozilla.com
t-mojave-r8-040.test.releng.mdc1.mozilla.com
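
For anyone scripting against this pool, the host list above can be regenerated from its naming pattern. This is only a minimal sketch: the template and the 031-040 range are copied from the list above, while the names `HOST_TEMPLATE` and `qa_pool_hosts` are made up for illustration and are not part of any releng tooling.

```python
# Naming pattern and numeric range copied from the host list above.
# HOST_TEMPLATE and qa_pool_hosts are hypothetical names for this sketch.
HOST_TEMPLATE = "t-mojave-r8-{num:03d}.test.releng.mdc1.mozilla.com"


def qa_pool_hosts(first: int = 31, last: int = 40) -> list[str]:
    """Return the fully qualified hostnames for the QA pool, inclusive."""
    return [HOST_TEMPLATE.format(num=n) for n in range(first, last + 1)]


for host in qa_pool_hosts():
    print(host)
```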

WorkerType is: gecko-t-osx-1014-r8-qa

https://firefox-ci-tc.services.mozilla.com/provisioners/releng-hardware/worker-types/gecko-t-osx-1014-r8-qa

I have a lot of tests running here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b132052bca38cc6c0ea4eda2e0a42a67d8a5ee7a&selectedTaskRun=dZKHHGrDRROS8GDSRVViZA-0

Many unittests are problematic. If we decide to do the performance pool first, those will possibly be green and we can move forward. Either way, since we are only updating half of the machines, we need to be intentional and have a split pool of r7 and r8. That would work for either unittests or perftests.

:davehunt, there are a lot of changes to the osx values for perf tests: most of them are positive, with a few regressions. Either way, this isn't a reflection on our product, just an infra change. Can you look over the changes and determine whether there is anything we should investigate or change before going forward?

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=1d1ed5ebec8f2879f13ca387c117546ee3c118e3&newProject=try&newRevision=baeb45aa5c5e311c27b1d58da88a5a207f7d8dad&framework=10

Flags: needinfo?(dave.hunt)

:dividehex, what is the resolution of the mac minis? I am looking into some unittest failures and wonder whether the problem is resolution, or focus/background apps. I think this is important because if we need to change things for unittests, we should also change them in the imaging process for the perf tests. I believe it is currently one large pool with the same imaging for unittest and perf, and I would prefer to keep the same image where possible. I don't want that to detract from deploying the perf tests onto the new machines now.

Flags: needinfo?(jwatkins)

(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #2)

:davehunt, there are a lot of changes to the osx values for perf tests: most of them are positive, with a few regressions. Either way, this isn't a reflection on our product, just an infra change. Can you look over the changes and determine whether there is anything we should investigate or change before going forward?

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=1d1ed5ebec8f2879f13ca387c117546ee3c118e3&newProject=try&newRevision=baeb45aa5c5e311c27b1d58da88a5a207f7d8dad&framework=10

Most regressions look like they're related to the request replay rates from mitmproxy, which could be due to pages loading faster and not something concerning.

The one concern I do have is the JetStream2 regression, although I wonder if this is actually an improvement as the score appears to be higher on the new hardware. :bebe could you check if we're marking JetStream2 results correctly?

Flags: needinfo?(dave.hunt) → needinfo?(fstrugariu)

I took a look at the JetStream 2 tests and they are marked as lower_is_better = true.

opening the test locally it states:

JetStream 2 is a JavaScript and WebAssembly benchmark suite focused on the most advanced web applications. It rewards browsers that start up quickly, execute code quickly, and run smoothly. For more information, read the in-depth analysis. Bigger scores are better.

Opened Bug 1640882 - Update jetstream 2 alert marking
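
To illustrate why the marking matters, here is a minimal sketch of how a `lower_is_better` flag flips the label on the same change. It is not the actual Perfherder comparison code; `classify_change` is a hypothetical helper and the scores are made-up numbers, not values from the compare view.

```python
def classify_change(old: float, new: float, lower_is_better: bool) -> str:
    """Label a change in a benchmark value as an improvement or a regression."""
    if new == old:
        return "no change"
    got_lower = new < old
    # The change is good only when its direction matches the preferred one.
    return "improvement" if got_lower == lower_is_better else "regression"


# Hypothetical JetStream 2 scores, chosen only for illustration.
old_score, new_score = 100.0, 120.0

# Marked as lower_is_better = true, a higher score is labelled a regression.
print(classify_change(old_score, new_score, lower_is_better=True))

# With the marking flipped, the same change is labelled an improvement.
print(classify_change(old_score, new_score, lower_is_better=False))
```

Since JetStream 2 states that bigger scores are better, the second labelling is the one that matches the benchmark's own definition.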

Flags: needinfo?(fstrugariu)

So overall we are good to go. :dividehex, if you want to schedule creating a pool of these, we can set it up for perf tests (unittests still need attention). If this pool has 100 devices we will have enough capacity. Could we stand up 50 at once, get that running smoothly, and then add another 50?

We have migrated to R8s.

Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(jwatkins)
Resolution: --- → FIXED