Closed Bug 1546414 Opened 6 years ago Closed 6 years ago

Compare test results for aws builds versus gce builds

Categories

(Testing :: General, task)

Version 3
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bc, Assigned: bc)

Linux x64 shippable opt and Windows 2012 x64 opt failed to build Bg

I was able to run tests using gce builds, with a few limitations, and was able to get preliminary results to work with. I will run additional tests with higher --rebuild factors to gain more confidence in the results.

See https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&author=bclary%40mozilla.com&fromchange=43f213fc29c130997a7aa5b3fa370fe57bc35dad&tochange=d75778a2c2cf96052e623d197ed9c288280e9919&searchStr=build for a set of 3 try runs:

  1. (tip) inbound tip with no patches using aws builds.
  2. (gcp) inbound tip with patches to make gce builds tier 1 to allow them to be used in running tier 1 and tier 2 tests. This allowed me to keep the test platform names identical across the build types. I was not able to test xpcshell due to the requirement that it use signed builds which are not available.
  3. (tip2) another tip with no patches to use to compare against the results of #1.

gce build results

Build Type                         Result
Linux debug                        Completed
Linux shippable opt                Completed
Linux x64 opt                      Completed
Linux x64 debug                    Completed
Linux x64 shippable opt            Busted
OS X Cross Compiled debug          Completed
OS X Cross Compiled shippable opt  Completed
Windows 2012 opt                   Completed
Windows 2012 debug                 Completed
Windows 2012 shippable opt         Completed
Windows 2012 x64 opt               Busted
Windows 2012 x64 debug             Completed
Windows 2012 x64 shippable opt     Completed
Android 4.0 API16+ debug           Completed
Android 4.2 x86 opt                Completed
Android 5.0 AArch64 opt            Completed
Android 5.0 x86-64 opt             Completed
Android 5.0 x86-64 debug           Completed

I used the following try fuzzy query to exclude tests which were not supported due to a lack of builds:
Fuzzy query='test-&query=!pgo !asan !nightly !valgrind !source !static-analysis !cargotest

As a rough comparison, the variability between aws and gce is similar to the variability between the two aws runs. I am currently working out how to reliably determine whether gce introduces any unique failures. Intermittent failures and crashes are making this a bit difficult, but I am making progress. I would expect asan and valgrind to highlight differences between the two build environments if they were available.

I will also provide a comparison of build times and talos and raptor results using perfherder compare once I have a handle on test failures.

AWS vs GCP Builds

Overall, the two sets of builds behaved similarly. The GCP builds suffered much higher rates of intermittent failures, which must be solved before promoting them.

There were minor differences in Raptor and Talos performance, but nothing which rose to significance.

It is difficult to determine whether GCP introduces any unique unittest failures with a limited set of runs because of intermittents; however, I did not notice any significant differences between the AWS and GCP results.

AWS vs GCP Build Comparison

Try Jobs

The try runs consisted of 5 builds for each platform using the queries:

Fuzzy query=!beetmover !instrumented !sign-and-push&query=build-android-aarch64-gcp/opt$ | build-android-api-16-gcp/debug$ | build-android-x86-gcp/opt$ | build-android-x86_64-gcp/debug$ | build-android-x86_64-gcp/opt$ | build-linux-gcp-shippable/opt$ | build-linux-gcp/debug$ | build-linux64-gcp/debug$ | build-linux64-gcp/opt$ | build-macosx64-gcp-shippable/opt$ | build-macosx64-gcp/debug$ | build-win32-gcp-shippable/opt$ | build-win32-gcp/debug$ | build-win32-gcp/opt$ | build-win64-gcp-shippable/opt$ | build-win64-gcp/debug$

Fuzzy query=!beetmover !instrumented !sign-and-push&query=build-android-aarch64/opt$ | build-android-api-16/debug$ | build-android-x86/opt$ | build-android-x86_64/debug$ | build-android-x86_64/opt$ | build-linux-shippable/opt$ | build-linux/debug$ | build-linux64/debug$ | build-linux64/opt$ | build-macosx64-shippable/opt$ | build-macosx64/debug$ | build-win32-shippable/opt$ | build-win32/debug$ | build-win32/opt$ | build-win64-shippable/opt$ | build-win64/debug$
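
For readers unfamiliar with the query syntax, the following is a minimal sketch of how such a pair of queries might select task labels. It assumes a simplified model of the fzf-style extended search syntax: space-separated terms are ANDed, `!term` excludes, a trailing `$` anchors to the end of the label, and `|` separates alternatives. It also assumes plain substring matching instead of fuzzy matching, and that a label must satisfy both queries. The function names and the last label are hypothetical; this is not the actual try selector implementation.

```python
def term_matches(term: str, label: str) -> bool:
    """Match a single search term against a task label (simplified model)."""
    if term.startswith("!"):
        # "!term" keeps only labels that do NOT match the term.
        return not term_matches(term[1:], label)
    if term.endswith("$"):
        # "term$" anchors the match to the end of the label.
        return label.endswith(term[:-1])
    # Assumption: plain substring match instead of true fuzzy matching.
    return term in label


def query_matches(query: str, label: str) -> bool:
    """Terms are ANDed; terms joined by "|" form a group of alternatives."""
    groups: list[list[str]] = []
    join_previous = False
    for token in query.split():
        if token == "|":
            join_previous = True
        elif join_previous and groups:
            groups[-1].append(token)
            join_previous = False
        else:
            groups.append([token])
            join_previous = False
    return all(any(term_matches(t, label) for t in group) for group in groups)


def select(queries: list[str], labels: list[str]) -> list[str]:
    """Assumption: a label must satisfy every query (intersection)."""
    return [l for l in labels if all(query_matches(q, l) for q in queries)]


queries = [
    "!beetmover !instrumented !sign-and-push",
    "build-linux64-gcp/opt$ | build-win64-gcp/debug$",
]
labels = [
    "build-linux64-gcp/opt",
    "build-linux64/opt",
    "build-win64-gcp/debug",
    "build-linux64-gcp/opt-instrumented",  # hypothetical label
]
selected = select(queries, labels)
print(selected)
```

In this model the trailing `$` is what keeps a label with an extra suffix out of the selection, and the `-gcp` part of each alternative is what separates the GCP build labels from the AWS ones.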

I had to retrigger several times to get 5 good builds due to intermittent build failures for:

GCP Windows 2012 debug 1 out of 6
GCP Windows 2012 x64 debug 2 out of 7
GCP Windows 2012 x64 shippable opt 1 out of 6

Build times are difficult to compare directly using Perfherder compare, since the build platforms are different and the runs used a mix of machine types with different capabilities, but a combined table makes the differences easier to see.

Test Platform             AWS                                                    GCP                                                           # Runs (AWS / GCP)
android-4-0-armv7-api16   debug        taskcluster-c5d.4xlarge  690.78 ± 4.91    debug gcp        taskcluster-n1-highcpu-64   1739.01 ± 14.23  5 / 5
android-4-2-x86           opt          taskcluster-c5d.4xlarge  1113.82 ± 18.90  opt gcp          taskcluster-n1-highcpu-64   2166.03 ± 7.95   5 / 5
android-5-0-aarch64       opt          taskcluster-c5d.4xlarge  1106.41 ± 25.83  opt gcp          taskcluster-n1-highcpu-64   1929.02 ± 14.87  5 / 5
android-5-0-x86_64        debug        taskcluster-c5d.4xlarge  839.66 ± 15.27   debug gcp        taskcluster-n1-highcpu-64   1631.86 ± 16.82  5 / 5
android-5-0-x86_64        opt          taskcluster-c5d.4xlarge  1530.45 ± 20.39  opt gcp          taskcluster-n1-highcpu-64   2690.90 ± 4.06   5 / 5
linux32                   debug        taskcluster-c5d.4xlarge  995.93 ± 6.94    debug gcp        taskcluster-n1-highcpu-64   2121.98 ± 2.90   5 / 5
linux32-shippable         opt nightly  taskcluster-c5d.4xlarge  5074.89 ± 2.10   opt gcp nightly  taskcluster-n1-highcpu-64   5207.08 ± 8.67   5 / 5
linux64                   debug        taskcluster-c5d.4xlarge  751.40 ± 6.54    debug gcp        taskcluster-n1-highcpu-64   1940.50 ± 7.47   5 / 5
linux64                   opt          taskcluster-c5d.4xlarge  748.20 ± 43.76   opt gcp          taskcluster-n1-highcpu-64   2479.22 ± 10.31  5 / 5
osx-cross                 debug        taskcluster-c5d.4xlarge  863.46 ± 4.16    debug gcp        taskcluster-n1-highcpu-64   1980.53 ± 10.48  5 / 5
osx-shippable             opt nightly  taskcluster-c5d.4xlarge  3654.78 ± 2.41   opt gcp nightly  taskcluster-n1-highcpu-64   3446.69 ± 7.27   5 / 5
windows2012-32            debug        taskcluster-c4.4xlarge   2831.28 ± 29.04  debug gcp        taskcluster-n1-standard-32  2225.79 ± 4.28   4 / 4
windows2012-32            debug        taskcluster-c5.4xlarge   2796.28          debug gcp        taskcluster-n1-highcpu-32   2320.11          1 / 1
windows2012-32            opt          taskcluster-c4.4xlarge   2942.47 ± 26.13  opt gcp          taskcluster-n1-standard-32  3472.71          2 / 1
windows2012-32            opt          taskcluster-c5.4xlarge   2097.70 ± 6.92   opt gcp          taskcluster-n1-highcpu-32   3499.35 ± 8.42   3 / 4
windows2012-32-shippable  opt nightly  taskcluster-c4.4xlarge   5884.66 ± 0.55   opt gcp nightly  taskcluster-n1-standard-32  6011.25 ± 0.49   2 / 2
windows2012-32-shippable  opt nightly  taskcluster-c5.4xlarge   4845.59 ± 1.61   opt gcp nightly  taskcluster-n1-highcpu-32   6443.58 ± 3.30   3 / 3
windows2012-64            debug        taskcluster-c4.4xlarge   2681.36 ± 29.91  debug gcp        taskcluster-n1-highcpu-32   2295.43 ± 8.96   5 / 2
windows2012-64            debug        taskcluster-c4.4xlarge   2681.36 ± 29.91  debug gcp        taskcluster-n1-standard-32  2302.83 ± 4.51   5 / 4
windows2012-64-shippable  opt nightly  taskcluster-c4.4xlarge   6343.70 ± 1.05   opt gcp nightly  taskcluster-n1-highcpu-32   6513.37 ± 4.51   5 / 3
windows2012-64-shippable  opt nightly  taskcluster-c4.4xlarge   6343.70 ± 1.05   opt gcp nightly  taskcluster-n1-standard-32  6746.24 ± 9.60   5 / 2
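
One way to read the table is as a GCP-to-AWS ratio of the mean build times. The sketch below does that for a few rows copied from the table; the function and variable names are hypothetical, and the ratio is only a rough indicator because the machine types differ between the two providers.

```python
# (test platform, AWS mean build time, GCP mean build time),
# copied from the table above.
build_times = [
    ("linux64 debug", 751.40, 1940.50),
    ("linux64 opt", 748.20, 2479.22),
    ("osx-shippable opt nightly", 3654.78, 3446.69),
    ("windows2012-32 debug (c4.4xlarge vs n1-standard-32)", 2831.28, 2225.79),
]


def gcp_to_aws_ratio(aws_time: float, gcp_time: float) -> float:
    """Return how many times longer (or shorter) the GCP build took."""
    return gcp_time / aws_time


ratios = {name: gcp_to_aws_ratio(aws, gcp) for name, aws, gcp in build_times}
for name, ratio in ratios.items():
    print(f"{name}: GCP/AWS = {ratio:.2f}")
```

A ratio above 1 means the GCP build took longer than the AWS build for that row; a ratio below 1 means it was faster.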

AWS vs GCP Perftest Comparison

Try Jobs

The try run consisted of 5 runs for each test with Fuzzy query=test-android-aarch64/opt-raptor | test-android-aarch64/opt-talos | test-android-x86/opt-raptor | test-android-x86/opt-talos | test-android-x86_64/opt-raptor | test-android-x86_64/opt-talos | test-linux-shippable/opt-raptor | test-linux-shippable/opt-talos | test-linux64/opt-raptor | test-linux64/opt-talos | test-macosx64-shippable/opt-raptor | test-macosx64-shippable/opt-talos | test-win32-shippable/opt-raptor | test-win32-shippable/opt-talos | test-win32/opt-raptor | test-win32/opt-talos | test-win64-shippable/opt-raptor | test-win64-shippable/opt-talos&query=!firefox-ui .

I haven't compared the run times for the tests for AWS vs GCP.

Raptor

Perfherder Raptor Comparison

Important differences

Test                               Platform               Base vs New                    Delta  Confidence  # Runs
raptor-tp6-bing-firefox opt        windows7-32            59.93 ± 4.35 < 64.69 ± 2.45    7.95%  3.49 (med)  5 / 5
raptor-tp6-ebay-chromium opt       windows7-32-shippable  139.64 ± 2.24 < 150.71 ± 3.04  7.92%  3.60 (med)  4 / 3
raptor-tp6-yahoo-mail-firefox opt  linux64                204.51 ± 1.47 < 213.84 ± 2.22  4.56%  3.72 (med)  5 / 5
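
As a sanity check on the Delta column, the sketch below recomputes it under the assumption that Delta is the relative change of the new mean over the base mean. The function name is hypothetical. Because the means shown above are already rounded, the recomputed values can differ from the listed ones in the last digit.

```python
def delta_percent(base: float, new: float) -> float:
    """Relative change of new over base, in percent (assumed definition)."""
    return (new - base) / base * 100.0


# (test, base mean, new mean, listed delta in percent), from the rows above.
rows = [
    ("raptor-tp6-bing-firefox", 59.93, 64.69, 7.95),
    ("raptor-tp6-ebay-chromium", 139.64, 150.71, 7.92),
    ("raptor-tp6-yahoo-mail-firefox", 204.51, 213.84, 4.56),
]

for test, base, new, listed in rows:
    print(f"{test}: recomputed {delta_percent(base, new):.2f}%, listed {listed:.2f}%")
```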

Tests without results

raptor-speedometer-geckoview pgo
raptor-speedometer-geckoview-power pgo
raptor-tp6-amazon-firefox opt
raptor-tp6-amazon-firefox pgo
raptor-tp6-apple-firefox pgo
raptor-tp6-binast-instagram-firefox opt
raptor-tp6-bing-firefox pgo
raptor-tp6-docs-firefox pgo
raptor-tp6-ebay-firefox pgo
raptor-tp6-ebay-mitm-404-recordings-202-chromium opt
raptor-tp6-ebay-mitm-404-recordings-202-firefox opt
raptor-tp6-ebay-mitm-404-recordings-404-chromium opt
raptor-tp6-ebay-mitm-404-recordings-404-firefox opt
raptor-tp6-facebook-firefox pgo
raptor-tp6-google-firefox pgo
raptor-tp6-google-mail-firefox pgo
raptor-tp6-imdb-firefox pgo
raptor-tp6-imgur-firefox pgo
raptor-tp6-instagram-firefox pgo
raptor-tp6-microsoft-firefox pgo
raptor-tp6-paypal-firefox pgo
raptor-tp6-pinterest-firefox pgo
raptor-tp6-reddit-firefox pgo
raptor-tp6-sheets-firefox pgo
raptor-tp6-slides-firefox pgo
raptor-tp6-tumblr-firefox pgo
raptor-tp6-twitter-firefox pgo
raptor-tp6-wikia-firefox pgo
raptor-tp6-wikipedia-firefox pgo
raptor-tp6-wikipedia-mitm-404-recordings-202-chromium opt
raptor-tp6-wikipedia-mitm-404-recordings-202-firefox opt
raptor-tp6-wikipedia-mitm-404-recordings-404-chromium opt
raptor-tp6-wikipedia-mitm-404-recordings-404-firefox opt
raptor-tp6-yahoo-mail-firefox pgo
raptor-tp6-yahoo-news-firefox pgo
raptor-tp6-yandex-firefox pgo
raptor-tp6-youtube-firefox pgo
raptor-tp6m-aframeio-animation-fennec-cold opt
raptor-tp6m-aframeio-animation-geckoview pgo
raptor-tp6m-allrecipes-fennec-cold opt
raptor-tp6m-allrecipes-geckoview pgo
raptor-tp6m-amazon-fennec-cold opt
raptor-tp6m-amazon-fennec-cold-live opt
raptor-tp6m-amazon-fennec-cold-live pgo
raptor-tp6m-amazon-geckoview pgo
raptor-tp6m-amazon-search-fennec-cold opt
raptor-tp6m-amazon-search-fennec-cold-live opt
raptor-tp6m-amazon-search-fennec-cold-live pgo
raptor-tp6m-amazon-search-geckoview pgo
raptor-tp6m-bbc-fennec-cold opt
raptor-tp6m-bbc-fennec-cold-live opt
raptor-tp6m-bbc-fennec-cold-live pgo
raptor-tp6m-bbc-geckoview pgo
raptor-tp6m-bing-fennec-cold opt
raptor-tp6m-bing-fennec-cold-live opt
raptor-tp6m-bing-fennec-cold-live pgo
raptor-tp6m-bing-geckoview pgo
raptor-tp6m-bing-restaurants-fenix-cold-live-live opt
raptor-tp6m-bing-restaurants-fennec-cold opt
raptor-tp6m-bing-restaurants-fennec-cold-live opt
raptor-tp6m-bing-restaurants-fennec-cold-live pgo
raptor-tp6m-bing-restaurants-geckoview pgo
raptor-tp6m-booking-fennec-cold opt
raptor-tp6m-booking-fennec-cold-live opt
raptor-tp6m-booking-fennec-cold-live pgo
raptor-tp6m-booking-geckoview pgo
raptor-tp6m-cnn-ampstories-fennec-cold opt
raptor-tp6m-cnn-ampstories-fennec-cold-live opt
raptor-tp6m-cnn-ampstories-fennec-cold-live pgo
raptor-tp6m-cnn-ampstories-geckoview pgo
raptor-tp6m-cnn-fennec-cold opt
raptor-tp6m-cold-amazon-geckoview opt
raptor-tp6m-cold-facebook-geckoview opt
raptor-tp6m-cold-google-geckoview opt
raptor-tp6m-ebay-kleinanzeigen-fenix-cold-live-live opt
raptor-tp6m-ebay-kleinanzeigen-fennec-cold opt
raptor-tp6m-ebay-kleinanzeigen-fennec-cold-live opt
raptor-tp6m-ebay-kleinanzeigen-fennec-cold-live pgo
raptor-tp6m-ebay-kleinanzeigen-geckoview pgo
raptor-tp6m-ebay-kleinanzeigen-search-fennec-cold opt
raptor-tp6m-ebay-kleinanzeigen-search-fennec-cold-live opt
raptor-tp6m-ebay-kleinanzeigen-search-fennec-cold-live pgo
raptor-tp6m-ebay-kleinanzeigen-search-geckoview pgo
raptor-tp6m-espn-fennec-cold opt
raptor-tp6m-espn-geckoview pgo
raptor-tp6m-facebook-cristiano-fennec-cold opt
raptor-tp6m-facebook-cristiano-geckoview pgo
raptor-tp6m-facebook-fennec-cold opt
raptor-tp6m-facebook-fennec-cold-live opt
raptor-tp6m-facebook-fennec-cold-live pgo
raptor-tp6m-facebook-geckoview pgo
raptor-tp6m-google-fennec-cold opt
raptor-tp6m-google-fennec-cold-live opt
raptor-tp6m-google-fennec-cold-live pgo
raptor-tp6m-google-geckoview pgo
raptor-tp6m-google-maps-fennec-cold opt
raptor-tp6m-google-maps-fennec-cold-live opt
raptor-tp6m-google-maps-fennec-cold-live pgo
raptor-tp6m-google-maps-geckoview pgo
raptor-tp6m-google-restaurants-geckoview pgo
raptor-tp6m-imdb-fennec-cold opt
raptor-tp6m-imdb-geckoview pgo
raptor-tp6m-instagram-fennec-cold opt
raptor-tp6m-instagram-fennec-cold-live opt
raptor-tp6m-instagram-fennec-cold-live pgo
raptor-tp6m-instagram-geckoview pgo
raptor-tp6m-jianshu-fennec-cold opt
raptor-tp6m-jianshu-fennec-cold-live opt
raptor-tp6m-jianshu-fennec-cold-live pgo
raptor-tp6m-jianshu-geckoview pgo
raptor-tp6m-microsoft-support-fennec-cold opt
raptor-tp6m-microsoft-support-fennec-cold-live opt
raptor-tp6m-microsoft-support-fennec-cold-live pgo
raptor-tp6m-microsoft-support-geckoview pgo
raptor-tp6m-reddit-fennec-cold opt
raptor-tp6m-reddit-fennec-cold-live opt
raptor-tp6m-reddit-fennec-cold-live pgo
raptor-tp6m-reddit-geckoview pgo
raptor-tp6m-stackoverflow-fennec-cold opt
raptor-tp6m-stackoverflow-fennec-cold-live opt
raptor-tp6m-stackoverflow-fennec-cold-live pgo
raptor-tp6m-stackoverflow-geckoview pgo
raptor-tp6m-web-de-fennec-cold opt
raptor-tp6m-web-de-geckoview pgo
raptor-tp6m-wikipedia-fennec-cold opt
raptor-tp6m-wikipedia-fennec-cold-live opt
raptor-tp6m-wikipedia-fennec-cold-live pgo
raptor-tp6m-wikipedia-geckoview pgo
raptor-tp6m-youtube-fennec-cold opt
raptor-tp6m-youtube-fennec-cold-live opt
raptor-tp6m-youtube-fennec-cold-live pgo
raptor-tp6m-youtube-geckoview pgo
raptor-tp6m-youtube-watch-fennec-cold opt
raptor-tp6m-youtube-watch-fennec-cold-live opt
raptor-tp6m-youtube-watch-fennec-cold-live pgo
raptor-tp6m-youtube-watch-geckoview pgo
raptor-unity-webgl-geckoview pgo

Talos

Talos Perfherder Comparison

Important Differences

None

Tests without results

tp5n main_normal_fileio opt e10s stylo
tp5n main_normal_netio opt e10s stylo
tp5n main_startup_fileio opt e10s stylo
tp5n main_startup_netio opt e10s stylo
tp5n mainthread_readbytes opt e10s stylo
tp5n mainthread_readcount opt e10s stylo
tp5n mainthread_writebytes opt e10s stylo
tp5n mainthread_writecount opt e10s stylo
tp5n nonmain_normal_fileio opt e10s stylo
tp5n nonmain_normal_netio opt e10s stylo
tp5n nonmain_startup_fileio opt e10s stylo
tp5n time_to_session_store_window_restored_ms opt e10s stylo

AWS vs GCP Unittest Comparison

Try Jobs

The try runs consisted of 5 runs for each test with Fuzzy query='test-&query=!xpcshell !raptor !talos !pgo !asan !nightly !valgrind !source !static-analysis !cargotest !linux64-shippable !linux64-gcp-shippable !windows2012-64/opt !windows2012-64-gcp/opt.

I haven't compared the run times for the tests for AWS vs GCP.

The build platforms were:

Linux debug
Linux shippable opt
Linux x64 opt
Linux x64 debug
OS X Cross Compiled debug
OS X Cross Compiled shippable opt
Windows 2012 opt
Windows 2012 debug
Windows 2012 shippable opt
Windows 2012 x64 opt
Windows 2012 x64 debug
Windows 2012 x64 shippable opt
Android 4.0 API16+ debug
Android 4.2 x86 opt
Android 5.0 AArch64 opt
Android 5.0 x86-64 opt
Android 5.0 x86-64 debug

The results were "fairly" consistent; however, 5 runs are not sufficient to experience all of the intermittents. I used a signature of test-status | test to do a coarse comparison of which test failures occurred only in aws or only in gcp. I haven't tried to break the results down by test platform here.

Out of 4414 AWS test failures and 4483 GCP test failures, 64 unique test failures occurred only in AWS and 133 occurred only in GCP.
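
The coarse comparison can be sketched as a set difference over test-status | test signatures. This is a minimal illustration rather than the script that produced the numbers; the function names are hypothetical, and the entry marked as hypothetical stands in for a failure seen in both environments.

```python
from typing import Iterable

Failure = tuple[str, str]  # (test status, test name)


def signatures(failures: Iterable[Failure]) -> set[str]:
    """Collapse failures into unique "test-status | test" signatures."""
    return {f"{status} | {test}" for status, test in failures}


def compare(aws: Iterable[Failure], gcp: Iterable[Failure]) -> tuple[set[str], set[str]]:
    """Return (signatures only in AWS, signatures only in GCP)."""
    aws_sigs, gcp_sigs = signatures(aws), signatures(gcp)
    return aws_sigs - gcp_sigs, gcp_sigs - aws_sigs


aws_failures = [
    ("TEST-FAIL", "dom/tests/mochitest/general/test_performance_now.html"),
    ("TEST-UNEXPECTED-FAIL", "hypothetical/common_test.html"),  # hypothetical
    ("TEST-UNEXPECTED-FAIL", "hypothetical/common_test.html"),  # repeated failure
]
gcp_failures = [
    ("TEST-FAIL", "devtools/client/debugger/test/mochitest/browser_dbg-worker-scopes.js"),
    ("TEST-UNEXPECTED-FAIL", "hypothetical/common_test.html"),  # hypothetical
]

only_aws, only_gcp = compare(aws_failures, gcp_failures)
print(sorted(only_aws))
print(sorted(only_gcp))
```

Using sets means repeated failures of the same test count once, which matches the idea of counting unique failures rather than total failures.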

Tests which only failed in AWS

Test Status Test
TEST-FAIL dom/tests/mochitest/general/test_focus_legend_noparent.html
TEST-FAIL dom/tests/mochitest/general/test_paste_selection.html
TEST-FAIL dom/tests/mochitest/general/test_performance_now.html
TEST-FAIL /fetch/sec-metadata/window-open.tentative.https.sub.html
TEST-FAIL testing\marionette\harness\marionette_harness\tests\unit\test_crash.py TestCrash.test_unexpected_crash
TEST-FAIL testing\marionette\harness\marionette_harness\tests\unit\test_expectedfail.py TestFail.test_fails
TEST-UNEXPECTED-CRASH /beacon/beacon-error.sub.window.html
TEST-UNEXPECTED-CRASH dom/ipc/tests/test_process_error.xul
TEST-UNEXPECTED-CRASH /domparsing/outerhtml-02.html
TEST-UNEXPECTED-CRASH /feature-policy/payment-disabled-by-feature-policy.https.sub.html
TEST-UNEXPECTED-CRASH /payment-handler/idlharness.https.any.html
TEST-UNEXPECTED-CRASH /payment-request/payment-request-id-attribute.https.html
TEST-UNEXPECTED-CRASH /payment-request/payment-response/onpayerdetailchange-attribute.https.html
TEST-UNEXPECTED-CRASH /service-workers/service-worker/update-after-oneday.https.html
TEST-UNEXPECTED-CRASH /webxr/xrRigidTransform_constructor.https.html
TEST-UNEXPECTED-ERROR /cors/remote-origin.htm
TEST-UNEXPECTED-ERROR /css/css-fonts/font-display/font-display-feature-policy-reporting.tentative.html
TEST-UNEXPECTED-ERROR dom/ipc/tests/test_process_error.xul
TEST-UNEXPECTED-ERROR /html/semantics/embedded-content/media-elements/track/track-element/track-remove-insert-ready-state.html
TEST-UNEXPECTED-ERROR /payment-request/allowpaymentrequest/setting-allowpaymentrequest.https.sub.html
TEST-UNEXPECTED-ERROR /payment-request/payment-is-showing.https.html
TEST-UNEXPECTED-ERROR /payment-request/payment-request-canmakepayment-method-protection.https.html
TEST-UNEXPECTED-ERROR /payment-request/payment-request-ctor-currency-code-checks.https.html
TEST-UNEXPECTED-ERROR /payment-request/PaymentRequestUpdateEvent/updatewith-method.https.html
TEST-UNEXPECTED-ERROR testing/marionette/harness/marionette_harness/tests/unit/test_profile_management.py TestSwitchProfileWithoutWorkspace.test_replace_with_external_profile
TEST-UNEXPECTED-ERROR testing/marionette/harness/marionette_harness/tests/unit/test_profile_management.py TestSwitchProfileWithWorkspace.test_new_named_profile
TEST-UNEXPECTED-ERROR /wasm/webapi/invalid-code.any.sharedworker.html
TEST-UNEXPECTED-FAIL automation.py
TEST-UNEXPECTED-FAIL browser/components/resistfingerprinting/test/browser/browser_spoofing_keyboard_event.js
TEST-UNEXPECTED-FAIL browser/modules/test/browser/browser_UsageTelemetry_uniqueOriginsVisitedInPast24Hours.js
TEST-UNEXPECTED-FAIL /css/css-animations/Element-getAnimations.tentative.html
TEST-UNEXPECTED-FAIL /css/css-display/display-contents-dynamic-list-001-inline.html
TEST-UNEXPECTED-FAIL /css/css-grid/grid-items/grid-items-sizing-alignment-001.html
TEST-UNEXPECTED-FAIL /css/cssom-view/scroll-behavior-subframe-root.html
TEST-UNEXPECTED-FAIL devtools/client/aboutdebugging-new/test/browser/browser_aboutdebugging_serviceworker_fetch_flag.js
TEST-UNEXPECTED-FAIL devtools/client/aboutdebugging-new/test/browser/browser_aboutdebugging_serviceworker_not_compatible.js
TEST-UNEXPECTED-FAIL dom/base/test/test_blocking_image.html
TEST-UNEXPECTED-FAIL dom/canvas/test/webgl-conf/generated/test_2_conformance__glsl__misc__large-loop-compile.html
TEST-UNEXPECTED-FAIL dom/ipc/tests/test_process_error.xul
TEST-UNEXPECTED-FAIL dom/media/test/test_autoplay_policy_activation.html
TEST-UNEXPECTED-FAIL dom/media/test/test_background_video_no_suspend_not_in_tree.html
TEST-UNEXPECTED-FAIL dom/media/webaudio/test/test_convolverNodeChannelInterpretationChanges.html
TEST-UNEXPECTED-FAIL file:///Z:/task_1556692107/build/tests/reftest/tests/layout/reftests/forms/select/vertical-centering.html == file:///Z:/task_1556692107/build/tests/reftest/tests/layout/reftests/forms/select/vertical-centering-ref.html
TEST-UNEXPECTED-FAIL file:///Z:/task_1556695317/build/tests/reftest/tests/layout/reftests/font-face/variation-format-hint-1a.html == file:///Z:/task_1556695317/build/tests/reftest/tests/layout/reftests/font-face/variation-format-hint-1A-ref.html
TEST-UNEXPECTED-FAIL file:///Z:/task_1556695672/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes/float-contiguous-vlr-011.xht == file:///Z:/task_1556695672/build/tests/reftest/tests/layout/reftests/w3c-css/received/reference/ref-filled-green-100px-square.xht
TEST-UNEXPECTED-FAIL file:///Z:/task_1556695672/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes/float-contiguous-vlr-013.xht == file:///Z:/task_1556695672/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes/float-contiguous-vrl-012-ref.xht
TEST-UNEXPECTED-FAIL file:///Z:/task_1556695672/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes/float-contiguous-vrl-010.xht == file:///Z:/task_1556695672/build/tests/reftest/tests/layout/reftests/w3c-css/received/reference/ref-filled-green-100px-square.xht
TEST-UNEXPECTED-FAIL file:///Z:/task_1556695672/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes/float-contiguous-vrl-012.xht == file:///Z:/task_1556695672/build/tests/reftest/tests/layout/reftests/w3c-css/received/css-writing-modes/float-contiguous-vrl-012-ref.xht
TEST-UNEXPECTED-FAIL /media-source/mediasource-seek-beyond-duration.html
TEST-UNEXPECTED-FAIL mobile/android/components/extensions/test/mochitest/test_ext_browserAction_getPopup_setPopup.html
TEST-UNEXPECTED-FAIL ShutdownLeaks
TEST-UNEXPECTED-FAIL testing/marionette/harness/marionette_harness/tests/unit/test_switch_window_content.py TestSwitchToWindowContent.test_switch_tabs_with_focus_change
TEST-UNEXPECTED-FAIL toolkit/components/extensions/test/mochitest/test-oop-extensions/test_ext_cookies_expiry.html
TEST-UNEXPECTED-FAIL toolkit/components/pictureinpicture/tests/browser_toggleTransparentOverlay-1.js
TEST-UNEXPECTED-FAIL toolkit/components/printing/tests/browser_preview_print_simplify_non_article.js
TEST-UNEXPECTED-FAIL /webvtt/rendering/cues-with-video/processing-model/disable_controls_reposition.html
TEST-UNEXPECTED-NOTRUN /service-workers/service-worker/client-navigate.https.html
TEST-UNEXPECTED-TIMEOUT automation.py
TEST-UNEXPECTED-TIMEOUT dom/html/test/test_iframe_sandbox_navigation.html
TEST-UNEXPECTED-TIMEOUT dom/tests/mochitest/fetch/test_fetch_basic_sw_reroute.html
TEST-UNEXPECTED-TIMEOUT /fetch/sec-metadata/window-open.tentative.https.sub.html
TEST-UNEXPECTED-TIMEOUT /screen-capture/feature-policy.https.html
TEST-UNEXPECTED-TIMEOUT /service-workers/service-worker/client-navigate.https.html
TEST-UNEXPECTED-TIMEOUT toolkit/components/printing/tests/browser_preview_print_simplify_non_article.js (finished)

Tests which only failed in GCP

Test Status Test
TEST-FAIL devtools/client/debugger/test/mochitest/browser_dbg-worker-scopes.js
TEST-FAIL devtools/client/debugger/test/mochitest/browser_dbg-xhr-breakpoints.js
TEST-FAIL testing\marionette\harness\marionette_harness\tests\unit\test_crash.py TestCrash.test_unexpected_crash
TEST-FAIL testing\marionette\harness\marionette_harness\tests\unit\test_expectedfail.py TestFail.test_fails
TEST-UNEXPECTED-CRASH /encoding/legacy-mb-japanese/shift_jis/sjis-encode-form-errors-han.html?3001-4000
TEST-UNEXPECTED-CRASH /feature-policy/payment-default-feature-policy.https.sub.html
TEST-UNEXPECTED-CRASH /payment-request/payment-is-showing.https.html
TEST-UNEXPECTED-CRASH /payment-request/payment-request-canmakepayment-method-protection.https.html
TEST-UNEXPECTED-CRASH /payment-request/payment-request-ctor-currency-code-checks.https.html
TEST-UNEXPECTED-CRASH /payment-request/PaymentRequestUpdateEvent/updatewith-method.https.html
TEST-UNEXPECTED-CRASH /service-workers/service-worker/clients-get.https.html
TEST-UNEXPECTED-CRASH /shape-detection/shapedetection-cross-origin.sub.html
TEST-UNEXPECTED-CRASH /webxr/xrFrame_getPose.https.html
TEST-UNEXPECTED-ERROR /payment-handler/idlharness.https.any.html
TEST-UNEXPECTED-ERROR /payment-request/allowpaymentrequest/setting-allowpaymentrequest-timing.https.sub.html
TEST-UNEXPECTED-ERROR /payment-request/payment-request-id-attribute.https.html
TEST-UNEXPECTED-ERROR /payment-request/payment-response/onpayerdetailchange-attribute.https.html
TEST-UNEXPECTED-ERROR /resource-timing/resource_timing.worker.html
TEST-UNEXPECTED-ERROR testing/marionette/harness/marionette_harness/tests/unit/test_profile_management.py TestSwitchProfileWithWorkspace.test_new_random_profile_name
TEST-UNEXPECTED-FAIL /css/CSS2/bidi-text/bidi-box-model-007.xht
TEST-UNEXPECTED-FAIL /css/CSS2/floats/floats-wrap-top-below-inline-001r.xht
TEST-UNEXPECTED-FAIL /css/CSS2/generated-content/before-after-001.xht
TEST-UNEXPECTED-FAIL /css/CSS2/margin-padding-clear/margin-bottom-applies-to-009.xht
TEST-UNEXPECTED-FAIL /css/CSS2/margin-padding-clear/margin-right-applies-to-009.xht
TEST-UNEXPECTED-FAIL /css/CSS2/margin-padding-clear/padding-right-083.xht
TEST-UNEXPECTED-FAIL /css/CSS2/normal-flow/max-height-003.xht
TEST-UNEXPECTED-FAIL /css/CSS2/normal-flow/min-height-003.xht
TEST-UNEXPECTED-FAIL /css/CSS2/normal-flow/width-001.xht
TEST-UNEXPECTED-FAIL /css/css-backgrounds/border-image-width-005.xht
TEST-UNEXPECTED-FAIL /css/css-backgrounds/border-image-width-008.html
TEST-UNEXPECTED-FAIL /css/css-color/t422-rgba-clip-outside-device-gamut-b.xht
TEST-UNEXPECTED-FAIL /css/css-position/position-relative-table-tbody-top-absolute-child.html
TEST-UNEXPECTED-FAIL /css/css-values/ex-unit-001.html
TEST-UNEXPECTED-FAIL /css/vendor-imports/mozilla/mozilla-central-reftests/background/border-image-repeat-space-4.html
TEST-UNEXPECTED-FAIL devtools/client/debugger/test/mochitest/browser_dbg-xhr-breakpoints.js
TEST-UNEXPECTED-FAIL devtools/client/inspector/rules/test/browser_rules_custom.js
TEST-UNEXPECTED-FAIL dom/base/test/test_bug435425.html
TEST-UNEXPECTED-FAIL /dom/events/Event-timestamp-safe-resolution.html
TEST-UNEXPECTED-FAIL /fetch/http-cache/304-update.html
TEST-UNEXPECTED-FAIL /fetch/http-cache/cc-request.html
TEST-UNEXPECTED-FAIL file:///Z:/task_1556696155/build/tests/reftest/tests/layout/reftests/bugs/615121-1.html == file:///Z:/task_1556696155/build/tests/reftest/tests/layout/reftests/bugs/615121-1-ref.html
TEST-UNEXPECTED-FAIL /html/browsers/history/the-history-interface/history_go_no_argument.html
TEST-UNEXPECTED-FAIL /html/browsers/history/the-history-interface/history_go_to_uri.html
TEST-UNEXPECTED-FAIL /html/semantics/embedded-content/media-elements/track/track-element/track-cue-rendering-after-controls-removed.html
TEST-UNEXPECTED-FAIL /html/semantics/embedded-content/media-elements/track/track-element/track-cue-rendering-empty-cue.html
TEST-UNEXPECTED-FAIL /html/semantics/embedded-content/media-elements/track/track-element/track-cue-rendering-line-doesnt-fit.html
TEST-UNEXPECTED-FAIL /html/webappapis/timers/negative-setinterval.html
TEST-UNEXPECTED-FAIL http://10.0.2.2:8854/tests/dom/media/tests/crashtests/1185191.html
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/bug1423331-1.html
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/bug1423331-2.html
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/bug956530-1.html
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next1AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next1S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next1SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next2S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next2SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next3S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next3SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next4S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next4SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next5S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next5SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next6S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next6SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next7AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next7S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#next7SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev1AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev1S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev1SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev2S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev2SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev3S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev3SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev4S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev4SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev5S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev5SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev6S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev6SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev7AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev7S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-script-select.html#prev7SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next1AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next1S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next1SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next1SL
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next1SR
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next2S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next2SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next3S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next3SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next4S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next4SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next5S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next5SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next6S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next6SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next7AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next7S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next7SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#next8AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev1AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev1S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev1SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev1SL
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev1SR
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev2S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev2SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev3S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev3SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev4S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev4SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev5S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev5SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev6S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev6SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev7AD
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev7S_
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev7SA
TEST-UNEXPECTED-FAIL http://mochi.test:8888/tests/layout/base/tests/multi-range-user-select.html#prev8AD
TEST-UNEXPECTED-FAIL Last test finished
TEST-UNEXPECTED-FAIL layout/base/tests/test_reftests_with_caret.html
TEST-UNEXPECTED-FAIL /media-source/mediasource-config-change-mp4-av-framesize.html
TEST-UNEXPECTED-FAIL org.mozilla.geckoview.test.HistoryDelegateTest.onHistoryStateChange
TEST-UNEXPECTED-FAIL remote/test/browser/browser_runtime_executionContext.js
TEST-UNEXPECTED-FAIL remote/test/browser/browser_tabs.js
TEST-UNEXPECTED-FAIL remote/test/browser/browser_target.js
TEST-UNEXPECTED-FAIL /service-workers/service-worker/navigation-timing.https.html
TEST-UNEXPECTED-FAIL /service-workers/service-worker/postmessage.https.html
TEST-UNEXPECTED-PASS /2dcontext/shadows/2d.shadow.enable.y.html
TEST-UNEXPECTED-PASS /css/css-color/lab-007.html
TEST-UNEXPECTED-PASS /dom/events/Event-timestamp-safe-resolution.html
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED

thanks :bc, this is a lot of great information.

I see the build times need fixing (I think some of the caching needs have yet to be implemented), but assuming build times can be reduced by half across the board we are good. Odd that Windows builds ended up being faster in some instances.

For the perf results, among the missing tests this is the only one that isn't pgo, experimental (live, cold), or known broken (binast and tpn*):
raptor-tp6-amazon-firefox opt

I am not sure why that didn't run, but on the tip of central I don't see it run in the tp6-1 job where it is specified in-tree.

unittests:
Given this data:
4414 AWS test failures and 4483 GCP test failures.

the results are for all practical purposes identical - about a 1.6% increase in total failures seen on a small sample set. I would have been worried if the data was:
4414 AWS test failures and 4700 GCP test failures.
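
To make the comparison concrete, here is a minimal sketch of the arithmetic behind those two scenarios. The function and variable names are only illustrative; the failure counts are the ones quoted in this comment, and 4700 is the hypothetical "worrying" case, not a measured result.

```python
def relative_increase(baseline: int, candidate: int) -> float:
    """Return the percentage change going from baseline to candidate."""
    return (candidate - baseline) / baseline * 100.0


# Counts quoted in the comment above.
aws_failures = 4414
gcp_failures = 4483
# Hypothetical count used above as an example of a worrying result.
hypothetical_gcp_failures = 4700

print(f"observed:     {relative_increase(aws_failures, gcp_failures):.2f}%")
print(f"hypothetical: {relative_increase(aws_failures, hypothetical_gcp_failures):.2f}%")
```

The observed difference works out to 69 extra failures on a baseline of 4414, while the hypothetical case would be 286 extra failures, roughly four times larger.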

With the exception of build times and intermittent failures as bc noted, the builds themselves are stable.
