Closed
Bug 1431161
Opened 7 years ago
Closed 7 years ago
run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests
Categories
(Testing :: Talos, defect)
Tracking
(firefox60 fixed, firefox61 fixed)
RESOLVED
FIXED
mozilla61
People
(Reporter: jmaher, Assigned: rwood)
References
Details
(Whiteboard: [PI:March])
Attachments
(2 files)
Bug 1431161 - run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests;
(deleted),
text/x-review-board-request
|
jmaher
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
pmoore
:
review+
|
Details |
As we migrate to new hardware, we will not be installing Windows 7 as an option on it. This means we will lose 32 bit coverage, and there is still a need to ensure we don't have regressions in 32 bit builds.
Once we have windows10 running on the new hardware and not in buildbot, we can easily turn on 32 bit binaries testing on the windows10-64 os/hardware. This is not needed for every push, so we will only run this on autoland.
My thoughts are that we will do:
* autoland only
* only bisect/investigate if there is a 5% regression on 32 bit and that regression is not also seen on 64 bit windows10
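The triage rule proposed above can be sketched as a small predicate. This is only a minimal sketch: the function name, its parameters, and treating 5% as an inclusive threshold are assumptions made for illustration, not part of any Talos or Perfherder API.

```python
def should_investigate_win32(win32_regression_pct, seen_on_win64, threshold_pct=5.0):
    """Return True when a win32 regression is worth bisecting on its own.

    Hypothetical helper: investigate only when the regression reaches the
    threshold and the 64 bit windows10 results do not already show it.
    """
    if seen_on_win64:
        # The 64 bit alert already covers it; no separate win32 work needed.
        return False
    return win32_regression_pct >= threshold_pct
```

With this rule, a 6% regression seen only on win32 would be investigated, while the same regression also reported on win64 would be left to the win64 alert.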
Reporter | ||
Updated•7 years ago
|
Whiteboard: [PI:February]
Reporter | ||
Comment 1•7 years ago
|
||
let's make this bug track the work to make this official.
here is a patch that I have been using to test:
https://hg.mozilla.org/try/rev/a104cd781adbf49649662bdfeb73f417b221eb4c
I suspect it won't be backwards compatible, we should:
* fix windows7 xperf jobs to run properly (they run on a VM)
* ensure mozharness config changes are in the right files and complete
* consider splitting out reftest from that patch.
:rwood, could you pick up this work in the short term?
Assignee | ||
Comment 2•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #1)
> lets make this bug track the work to make this official.
>
> here is a patch that I have been using to test:
> https://hg.mozilla.org/try/rev/a104cd781adbf49649662bdfeb73f417b221eb4c
>
> I suspect it won't be backwards compatible, we should:
> * fix windows7 xperf jobs to run properly (they run on a VM)
> * ensure mozharness config changes are in the right files and complete
> * consider splitting out reftest from that patch.
>
> :rwood, could you pick up this work in the short term?
Sure... I don't quite understand though - what mozharness config changes do you mean? I've never worked on xperf or reftests, are they for that maybe? Thanks :)
I'll file dependent bugs:
- for the reftest part of your patch
- for porting xperf win7 tests to run on 32 bit builds on win 10
- for mozharness config for ?
Flags: needinfo?(rwood) → needinfo?(jmaher)
Assignee | ||
Comment 3•7 years ago
|
||
Oh I see, you have mozharness configs already in
testing/mozharness/configs/talos/windows_config.py
I'll use this bug for that patch.
Flags: needinfo?(jmaher)
Reporter | ||
Comment 4•7 years ago
|
||
yeah, you figured out the mozharness bits- currently win7 xperf runs on VM, it should remain on VM when this patch is ready for review- no need for another bug.
we have bug 1435844 for reftests- things are moving along!
Assignee | ||
Comment 5•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #4)
> yeah, you figured out the mozharness bits- currently win7 xperf runs on VM,
> it should remain on VM when this patch is ready for review- no need for
> another bug.
>
> we have bug 1435844 for reftests- things are moving along!
Ok thank you sir!
Assignee | ||
Comment 6•7 years ago
|
||
Assignee | ||
Comment 7•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → rwood
Status: NEW → ASSIGNED
Assignee | ||
Comment 8•7 years ago
|
||
In my try run (comment 7) I have the patch working except for the known xperf failure. It is running on the AWS VM.
Assignee | ||
Comment 9•7 years ago
|
||
Assignee | ||
Comment 10•7 years ago
|
||
Update: The win32 tests (on Win 10 host) should run on: ['mozilla-beta', 'mozilla-central', 'mozilla-inbound', 'autoland', 'try']
Also note, talos g2 (damp) fails consistently on "complicated.netmonitor". There's an intermittent bug open for that; however, the failure seems consistent on the new h/w, so I'm going to disable "complicated.netmonitor" in the damp test.
Comment 11•7 years ago
|
||
(In reply to Robert Wood [:rwood] from comment #10)
> Update: The win32 tests (on Win 10 host) should run on: ['mozilla-beta',
> 'mozilla-central', 'mozilla-inbound', 'autoland', 'try']
>
> Also note, talos g2 (damp) fails consistently on "complicated.netmonitor".
> There's an intermittent open for that however this seems consistent on the
> new h/w so I'm going to disable "complicated.netmonitor" from the damp test.
Could you do that only on windows?
(Services.appinfo.OS == "WINNT")
Comment 12•7 years ago
|
||
Also, I fixed some races around netmonitor DAMP test in bug 1419327.
It would be handy to check if it fixes this one?
Assignee | ||
Comment 13•7 years ago
|
||
Assignee | ||
Comment 14•7 years ago
|
||
(In reply to Alexandre Poirot [:ochameau] from comment #12)
> Also, I fixed some races around netmonitor DAMP test in bug 1419327.
> It would be handy to check if it fixes this one?
Thanks Alexandre, here's a try run on the new talos windows hardware, with your damp test patch imported from bug 1419327:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9bd6e1fa2b3e408a36a855977c0c627e9a5f19e7
Assignee | ||
Comment 15•7 years ago
|
||
Xperf (win7-opt) runs on aws (i.e. machine i-01579509c8d7db2cc) [1]:
:27:43 INFO - u'xperf-e10s': {u'pagesets_name': u'tp5n.zip',
15:27:43 INFO - u'talos_options': [u'--xperf_path',
15:27:43 INFO - u'"c:/Program Files/Microsoft Windows Performance Toolkit/xperf.exe"'],
15:28:39 INFO - Calling ['Z:\\task_1518015759\\build\\venv\\Scripts\\python', 'Z:\\task_1518015759\\build\\tests\\talos\\talos\\run_tests.py', '--branchName', 'try', '--suite', 'xperf-e10s', '--executablePath', 'Z:\\task_1518015759\\build\\application\\firefox\\firefox', '--symbolsPath', 'https://queue.taskcluster.net/v1/task/Sq8JNtALRYqEf5n56ewXew/artifacts/public/build/target.crashreporter-symbols.zip', '--title', 'i-01579509c8d7db2cc', '--webServer', 'localhost', '--webServer', 'localhost', '--webServer', 'localhost', '--webServer', 'localhost', '--log-tbpl-level=debug', '--log-errorsummary=Z:\\task_1518015759\\build\\blobber_upload_dir\\xperf-e10s_errorsummary.log', '--log-raw=Z:\\task_1518015759\\build\\blobber_upload_dir\\xperf-e10s_raw.log'] with output_timeout 3600
15:28:40 INFO - ERROR: xperf.exe cannot be found at the path specified
15:28:40 ERROR - Return code: 1
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=c3b5145e074c92a234249c6791ce20c9bfbccca5&selectedJob=160860154
Assignee | ||
Comment 16•7 years ago
|
||
The xperf.exe path is the same on the existing xperf job on win7 aws [1]
14:46:14 INFO - u'xperf-e10s': {u'pagesets_name': u'tp5n.zip',
14:46:14 INFO - u'talos_options': [u'--xperf_path',
14:46:14 INFO - u'"c:/Program Files/Microsoft Windows Performance Toolkit/xperf.exe"'],
I have no idea why it's failing on the taskcluster job (comment 15)
[1] https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=talos&selectedJob=160852600
Assignee | ||
Comment 17•7 years ago
|
||
Ahh, 'c:/Program Files/Microsoft Windows Performance Toolkit' is missing from the path in the tc job, maybe that's it
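One way to check that hypothesis is to test whether the toolkit directory is an entry in the job's PATH value. The helper below is a hedged sketch, not mozharness code: `dir_on_path` and `_norm` are hypothetical names, and the loose comparison (separator, trailing slash, and case insensitive) is an assumption about how Windows paths should be matched.

```python
XPERF_DIR = "c:/Program Files/Microsoft Windows Performance Toolkit"


def _norm(path):
    # Compare Windows paths loosely: unify separators, drop trailing
    # slashes, and ignore case.
    return path.replace("\\", "/").rstrip("/").lower()


def dir_on_path(directory, path_value, sep=";"):
    """Return True if `directory` is one of the entries in a PATH string."""
    wanted = _norm(directory)
    return any(_norm(entry) == wanted for entry in path_value.split(sep) if entry)
```

Passing the PATH string from the task's environment as `path_value` would show whether the directory is missing in the tc job but present on the existing win7 aws job.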
Assignee | ||
Comment 18•7 years ago
|
||
(In reply to Robert Wood [:rwood] from comment #14)
> (In reply to Alexandre Poirot [:ochameau] from comment #12)
> > Also, I fixed some races around netmonitor DAMP test in bug 1419327.
> > It would be handy to check if it fixes this one?
>
> Thanks Alexandre, here's a try run on the new talos windows hardware, with
> your damp test patch imported from bug 1419327:
>
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=9bd6e1fa2b3e408a36a855977c0c627e9a5f19e7
Still fails on 'complicated.netmonitor' unfortunately, so for now I'll disable that subtest inside DAMP on Windows only as Alexandre suggested in comment 11
Assignee | ||
Comment 19•7 years ago
|
||
Assignee | ||
Comment 20•7 years ago
|
||
Assignee | ||
Comment 21•7 years ago
|
||
Assignee | ||
Comment 22•7 years ago
|
||
Assignee | ||
Comment 23•7 years ago
|
||
Assignee | ||
Comment 24•7 years ago
|
||
Assignee | ||
Comment 25•7 years ago
|
||
Assignee | ||
Comment 26•7 years ago
|
||
Assignee | ||
Comment 27•7 years ago
|
||
Comment hidden (mozreview-request) |
Reporter | ||
Comment 29•7 years ago
|
||
mozreview-review |
Comment on attachment 8949562 [details]
Bug 1431161 - run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests;
https://reviewboard.mozilla.org/r/218916/#review224646
just need to sort out the win_worker_type_platform.
::: taskcluster/taskgraph/transforms/tests.py:928
(Diff revision 1)
> test['worker-type'] = MACOSX_WORKER_TYPES['macosx64']
> elif test_platform.startswith('win'):
> - win_worker_type_platform = WINDOWS_WORKER_TYPES[
> - test_platform.split('/')[0]
> - ]
> - if test.get('suite', '') == 'talos' and 'ccov' not in test['build-platform']:
> + # for talos xperf we want win7vm; all else on win10 hw
> + if test.get('suite', '') == 'talos' and "--suite=xperf-e10s" in \
> + test['mozharness']['extra-options']:
> + win_worker_type_platform = WINDOWS_WORKER_TYPES['windows7-32']
I am concerned that all other win7 unittests will have problems as well- here this is for talos xperf only. Maybe:
if suite == talos:
    wintype = win10
or something like:
if test['virtualization'] == hardware:
    wintype = win10
::: testing/mozharness/configs/talos/windows_vm_config.py:57
(Diff revision 1)
> "win64": "python3_x64.manifest",
> },
> "env": {
> # python3 requires C runtime, found in firefox installation; see bug 1361732
> - "PATH": "%(PATH)s;c:\\slave\\test\\build\\application\\firefox;"
> + "PATH": "%(PATH)s;c:\\slave\\test\\build\\application\\firefox;" \
> + "c:\\Program Files\\Microsoft Windows Performance Toolkit\\;"
is this needed?
::: testing/talos/talos/tests/devtools/addon/content/damp.html:15
(Diff revision 1)
> +// Bug 1400580 disable 'complicated.netmonitor' on Win
> +ChromeUtils.import("resource://gre/modules/Services.jsm");
> +var run_complicated_netmonitor = true;
> +if (Services.appinfo.OS == "WINNT") {
> + run_complicated_netmonitor = false;
> +}
scope creep, but I am fine with it in here.
Attachment #8949562 -
Flags: review?(jmaher) → review-
Reporter | ||
Comment 30•7 years ago
|
||
:markco can you update us in this bug with the total number of available windows moonshot machines?
Flags: needinfo?(mcornmesser)
Comment 31•7 years ago
|
||
Right now there are 29 that are ready to pick up tasks. If it helps, I can get additional machines stood up Friday morning. Is there a specific number currently needed?
I am planning Monday am at the latest to begin deploying the balance of the Windows nodes.
Flags: needinfo?(mcornmesser)
Assignee | ||
Comment 32•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8949562 [details]
Bug 1431161 - run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests;
https://reviewboard.mozilla.org/r/218916/#review224646
> I am concerned that all other win7 unittests will have problems as well- here this is for talos xperf only. Maybe:
> if suite == talos:
>     wintype = win10
>
>
> or something like:
> if test['virtualization'] == hardware:
>     wintype = win10
I'll try that thanks
> is this needed?
I thought it was but I'll take it out and try without thanks
> scope creep, but I am fine with it in here.
Yeah good point I'm going to file a separate bug for that and cc the test owner
Assignee | ||
Comment 33•7 years ago
|
||
I'm going to move the change to the DAMP test to its own Bug 1437028
Assignee | ||
Comment 34•7 years ago
|
||
Comment hidden (mozreview-request) |
Reporter | ||
Comment 36•7 years ago
|
||
mozreview-review |
Comment on attachment 8949562 [details]
Bug 1431161 - run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests;
https://reviewboard.mozilla.org/r/218916/#review224804
looking much better, but still a concern in the transform.
::: taskcluster/taskgraph/transforms/tests.py:929
(Diff revisions 1 - 2)
> - test['mozharness']['extra-options']:
> - win_worker_type_platform = WINDOWS_WORKER_TYPES['windows7-32']
> - else:
> win_worker_type_platform = WINDOWS_WORKER_TYPES['windows10-64']
> + else:
> + win_worker_type_platform = WINDOWS_WORKER_TYPES['windows7-32']
this won't work as we are forcing win10-vm jobs to run on win7. This transform can change the machine type we choose for all jobs, in this case unittests and talos perf tests.
Attachment #8949562 -
Flags: review?(jmaher) → review-
Assignee | ||
Comment 37•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #36)
> Comment on attachment 8949562 [details]
> Bug 1431161 - run windows 32 and 64 bit builds on windows10-64 hardware for
> talos performance tests;
>
> https://reviewboard.mozilla.org/r/218916/#review224804
>
> looking much better, but still a concern in the transform.
>
> ::: taskcluster/taskgraph/transforms/tests.py:929
> (Diff revisions 1 - 2)
> > - test['mozharness']['extra-options']:
> > - win_worker_type_platform = WINDOWS_WORKER_TYPES['windows7-32']
> > - else:
> > win_worker_type_platform = WINDOWS_WORKER_TYPES['windows10-64']
> > + else:
> > + win_worker_type_platform = WINDOWS_WORKER_TYPES['windows7-32']
>
> this won't work as we are forcing win10-vm jobs to run on win7. This
> transform can change the machine type we choose for all jobs, in this case
> unittests and talos perf tests.
I didn't know there were any Win 10 vm jobs, I thought all win 10 was h/w. Alright scratch that one.
Reporter | ||
Comment 38•7 years ago
|
||
all mochitest, xpcshell, web-platform-tests, etc. run on vm- only talos and reftest run on hardware- reftest is due to issues with the win10 vm, ideally it would only be talos on hardware.
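The split described here can be sketched as a lookup keyed on whether the test is marked as hardware, along the lines of the `test['virtualization']` suggestion in the earlier review. This is a minimal sketch under assumptions: the function name, the `'test-platform'` key, and the `'hardware'` / `'virtual'` values are hypothetical, and it is not the code that landed in taskcluster/taskgraph/transforms/tests.py.

```python
def windows_worker_platform(test):
    """Pick the key used to look up a Windows worker type.

    Tests marked as hardware go to the windows10-64 hardware pool;
    everything else keeps the platform it asked for, so VM jobs
    (including win7 xperf) are left untouched.
    """
    if test.get("virtualization") == "hardware":
        return "windows10-64"
    # e.g. "windows7-32/opt" -> "windows7-32"
    return test["test-platform"].split("/")[0]
```

Keying on the virtualization flag rather than on the suite name avoids forcing win10 VM unittests onto win7, which was the concern raised about the second diff revision.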
Assignee | ||
Comment 39•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #38)
> all mochitest, xpcshell, web-platform-tests, etc. run on vm- only talos and
> reftest run on hardware- reftest is due to issues with the win10 vm, ideally
> it would only be talos on hardware.
Ah, thanks, yeah I keep thinking this is only for talos and not *all* test jobs
Assignee | ||
Comment 40•7 years ago
|
||
Assignee | ||
Comment 41•7 years ago
|
||
Assignee | ||
Comment 42•7 years ago
|
||
Thanks for the feedback, ok I *think* I have it correct now :)
Comment hidden (mozreview-request) |
Reporter | ||
Comment 44•7 years ago
|
||
mozreview-review |
Comment on attachment 8949562 [details]
Bug 1431161 - run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests;
https://reviewboard.mozilla.org/r/218916/#review224884
excellent, we need to wait until machines are available. Can you do a try run with:
./mach try -b do -p win32,win64 -u all -t all
Attachment #8949562 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 45•7 years ago
|
||
Assignee | ||
Comment 46•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #44)
> Comment on attachment 8949562 [details]
> Bug 1431161 - run windows 32 and 64 bit builds on windows10-64 hardware for
> talos performance tests;
>
> https://reviewboard.mozilla.org/r/218916/#review224884
>
> excellent, we need to wait until machines are available. Can you do a try
> run with:
> ./mach try -b do -p win32,win64 -u all -t all
Thanks for the review, and right - we can't land it until the pool is ready. Good idea, thanks - landed '-u all -t all' on try (comment 45).
Comment hidden (mozreview-request) |
Assignee | ||
Comment 48•7 years ago
|
||
Rebased (and fixed conflicts) and landing on try again
Assignee | ||
Comment 49•7 years ago
|
||
Assignee | ||
Comment 50•7 years ago
|
||
Comment hidden (mozreview-request) |
Comment 52•7 years ago
|
||
Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a1711e96c622
run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests; r=jmaher
Comment 53•7 years ago
|
||
Backed out for talos performance test failures.
backout: https://hg.mozilla.org/integration/autoland/rev/878d64506602a00a86f0c8e2eb970909a0276949
push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=a1711e96c6227cb839be7dee8e6c0d6ec9ae1750&selectedJob=163523588
failure log: https://tools.taskcluster.net/groups/ZYqwCbsUR5ejHuvQBaHUGQ/tasks/ejHxjg34ToKgAxr9uY3N0w/runs/0/logs/public%2Flogs%2Flive_backing.log
[taskcluster 2018-02-21T19:37:47.587Z] TASK FAIL since the task payload is invalid. See errors:
[taskcluster 2018-02-21T19:37:47.587Z] - supersederUrl: Additional property supersederUrl is not allowed
[taskcluster 2018-02-21T19:37:47.588Z] Task not successful due to following exception(s):
[taskcluster 2018-02-21T19:37:47.588Z] Exception 1)
[taskcluster 2018-02-21T19:37:47.588Z] Validation of payload failed for task ejHxjg34ToKgAxr9uY3N0w
[taskcluster 2018-02-21T19:37:47.588Z]
Flags: needinfo?(rwood)
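The log above suggests the worker validates the task payload against a schema that rejects unknown properties, and that `supersederUrl` is not in that schema. A hedged sketch of that kind of check follows; the helper names and the `ALLOWED` set are hypothetical placeholders, not the real worker payload schema.

```python
# Hypothetical allowed keys for illustration only; the real schema is
# defined by the worker, not here.
ALLOWED = {"command", "maxRunTime", "artifacts"}


def unexpected_properties(payload, allowed):
    """Return payload keys that a strict schema would reject, sorted."""
    return sorted(set(payload) - set(allowed))


def strip_unexpected(payload, allowed):
    """Return a copy of the payload without the rejected keys."""
    return {key: value for key, value in payload.items() if key in allowed}
```

For a payload that carries `supersederUrl` next to otherwise valid keys, `unexpected_properties` would report just that one key, matching the single error in the log.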
Assignee | ||
Comment 54•7 years ago
|
||
Assignee | ||
Comment 55•7 years ago
|
||
Comment hidden (mozreview-request) |
Assignee | ||
Comment 57•7 years ago
|
||
Thanks Natalia, we have a potential fix under review
Flags: needinfo?(rwood)
Comment hidden (mozreview-request) |
Comment 59•7 years ago
|
||
mozreview-review |
Comment on attachment 8952844 [details]
Bug 1431161 - Temporarily turn off coalescing on new win tc h/w;
https://reviewboard.mozilla.org/r/222070/#review228352
Beautiful!
Attachment #8952844 -
Flags: review?(pmoore) → review+
Comment 60•7 years ago
|
||
Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0160e724e111
run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests; r=jmaher
https://hg.mozilla.org/integration/autoland/rev/213725db126c
Temporarily turn off coalescing on new win tc h/w; r=pmoore
Comment 61•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/0160e724e111
https://hg.mozilla.org/mozilla-central/rev/213725db126c
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox60:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Reporter | ||
Comment 62•7 years ago
|
||
and here is the "massive" set of wins we get:
https://treeherder.mozilla.org/perf.html#/alerts?id=11706
Comment 63•7 years ago
|
||
Backout by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5bc49a32f706
backout for win10 hardware failures. r=me
Comment 64•7 years ago
|
||
Backout by archaeopteryx@coole-files.de:
https://hg.mozilla.org/mozilla-central/rev/c9ec0c37349c
backout for win10 hardware failures. r=me a=backout CLOSED TREE
Updated•7 years ago
|
Status: RESOLVED → REOPENED
status-firefox60:
fixed → ---
Flags: needinfo?(rwood)
Resolution: FIXED → ---
Target Milestone: mozilla60 → ---
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(rwood)
Comment 65•7 years ago
|
||
Looks like this backout also disabled the Windows QR talos test jobs from getting run at all. I had just turned those on recently in bug 1440968 and they're not showing up on TreeHerder any more.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 68•7 years ago
|
||
Re-opened the review and rebased it, to be more prepared for when we try to land this again
Reporter | ||
Comment 69•7 years ago
|
||
:markco- any luck figuring out the disk space issues?
Flags: needinfo?(mcornmesser)
Reporter | ||
Updated•7 years ago
|
Whiteboard: [PI:February] → [PI:March]
Comment 70•7 years ago
|
||
The generic worker upgrade (Bug 1443589) will address this.
Depends on: 1443589
Flags: needinfo?(mcornmesser)
Comment 71•7 years ago
|
||
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4992808ab5a3
run windows 32 and 64 bit builds on windows10-64 hardware for talos performance tests. r=rwood
Comment 72•7 years ago
|
||
bugherder |
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
status-firefox61:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Reporter | ||
Updated•7 years ago
|
Whiteboard: [PI:March] → [PI:March][checkin-needed-beta]
Comment 73•7 years ago
|
||
bugherder uplift |
status-firefox60:
--- → fixed
Whiteboard: [PI:March][checkin-needed-beta] → [PI:March]
Updated•7 years ago
|