1769154 - run a subset of xpcshell tests with a conditioned profile

Assignee

Description

•

2 years ago

in looking at specific tests on mochitest and xpcshell, there are many failures. This bug is about xpcshell failures. I suspect most of these are because the profile has more data than what the test case input in.

dom/quota/test/xpcshell (ttung?)

test_initTemporaryStorage.js
test_storagePressure.js

browser/components/migration/tests/unit/test_MigrationUtils_timedRetry.js (dthayer?)

toolkit/components/places/tests/migration/ (mak?)

test_current_from_v53.js
test_current_from_v54.js
test_current_from_v61.js
test_current_from_v65.js
toolkit/components/satchel/test/unit/test_db_update_v5.js (mak?)

toolkit/components/url-classifier/tests/unit/test_dbservice.js (:dimi?)

security/manager/ssl/tests/unit/ (:keeler?)

test_ocsp_private_caching.js
test_db_format_pref_new.js
test_sss_readstate.js
test_sss_eviction.js
test_sss_readstate_empty.js
test_sss_readstate_garbage.js
test_sss_readstate_huge.js

browser/components/extensions/test/xpcshell/ (:mixedpuppy)

test_ext_bookmarks.js
test_ext_topSites.js
test_ext_urlbar.js
toolkit/mozapps/extensions/test/xpcshell/
test_XPIStates.js
test_shutdown_early.js
test_sideloads_after_rebuild.js
test_system_upgrades.js
test_systemaddomstartupprefs.js
test_update_isPrivileged.js
toolkit/components/extensions/test/xpcshell/
test_ExtensionStorageSync_migration_kinto.js
test_ext_schemas_manifest_permissions.js
test_ext_schemas_privileged.js
test_ext_startup_cache_telemetry.js
test_extension_permissions_migration.js

this is a new set of tests, so anything we skip we are not losing coverage, but I would like to know if anything should be split off as a separate bug or ideally not skip-if = condprof when we run the tests?

here is a try push with the tests failing:
https://treeherder.mozilla.org/jobs?repo=try&revision=b6c40be2b2f86f0f98b1a468f390de73b0303cb0&searchStr=xpcshell

Flags: needinfo?(shes050117)

Flags: needinfo?(mixedpuppy)

Flags: needinfo?(mak)

Flags: needinfo?(dothayer)

Flags: needinfo?(dlee)

Flags: needinfo?(dkeeler)

Dana Keeler (she/her) (use needinfo) (:keeler for reviews)

Comment 1

•

2 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #0)

security/manager/ssl/tests/unit/ (:keeler?)

test_ocsp_private_caching.js

This one could maybe be modified to account for the presence of unrelated data, but I don't know if it's worth it.

test_db_format_pref_new.js

test_sss_readstate.js

test_sss_eviction.js

test_sss_readstate_empty.js

test_sss_readstate_garbage.js

test_sss_readstate_huge.js

These all essentially rely on having a clean profile - it's probably best to skip them in this configuration.

Flags: needinfo?(dkeeler)

Shane Caraveo (:mixedpuppy)

Comment 2

•

2 years ago

I'm surprised there are not more extension test failures, we generally write xpcshell tests with a "clean room" expectation. I'd prefer that they are skipped, add a new bug for those extension tests and we'll have to deal with them at some point in the future.

Flags: needinfo?(mixedpuppy)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Updated

•

2 years ago

Depends on: 1769184

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 3

•

2 years ago

the second (and ideally last round) of tests that fail are here:
https://treeherder.mozilla.org/jobs?repo=try&revision=27cd0d824b29fd247e4b94d8f307589150595ed9

browser/components/migration/tests/unit/ (:gijs)

test_MigrationUtils_timedRetry.js
test_Chrome_passwords.js (windows only)
test_Chrome_passwords_emptySource.js (windows only)

browser/components/sessionstore/test/unit/ (:dao)

test_backup_once.js
test_migration_lz4compression.js
test_startup_nosession_async.js

from comment 0, I had :ttung and now realized that I should probably check with the triage owner (:jstutte)

this is a new set of tests, so anything we skip we are not losing coverage, but I would like to know if anything should be split off as a separate bug or ideally not skip-if = condprof when we run the tests?

Flags: needinfo?(shes050117)

Flags: needinfo?(jstutte)

Flags: needinfo?(gijskruitbosch+bugs)

Flags: needinfo?(dao+bmo)

Jens Stutte [:jstutte]

Comment 4

•

2 years ago

Looking at the failure of test_storagePressure.js I see an NS_ERROR_ABORT that apparently for simple DB can derive only from SDBConnection::CheckState.

One way to set AllowToClose is QuotaClient::InitiateShutdown which is called as part of the shutdown sequence. Is there any chance that we initiated shutdown in parallel while the test was still running?

Unfortunately the X-condprof-fis task seems not supported by pernosco in treeherder? Or am I overlooking something?

Flags: needinfo?(jstutte)

Jens Stutte [:jstutte]

Comment 5

•

2 years ago

(in any case :janv is the better expert here)

Flags: needinfo?(jvarga)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 6

•

2 years ago

thanks :jstutte, I am not sure what tasks work in Pernosco- for that test case it would be a good tool to see if this is a product issue or a test case issue.

:Gijs (he/him)

Comment 7

•

2 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #3)

browser/components/migration/tests/unit/ (:gijs)

test_MigrationUtils_timedRetry.js

test_Chrome_passwords.js (windows only)

test_Chrome_passwords_emptySource.js (windows only)

I don't really understand what we're trying to achieve here. That is, these tests fail with stuff like:

TEST-UNEXPECTED-FAIL | browser/components/migration/tests/unit/test_Chrome_passwords.js | test_importIntoEmptyDB - [test_importIntoEmptyDB : 295] There are no logins initially - 413 == 0

Sounds like that would be expected, because the profile isn't empty.

What are we hoping to achieve by running these tests against a "conditioned" profile? I can't tell from reading this bug and bug 1769184 and bug 1769097, so I don't know how to answer the needinfo.

Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 8

•

2 years ago

thanks :gijs, the goal is to find issues with our automation where a pre-existing profile might cause issues. If these tests would not be useful to run in that scenario, that would be good to know- likewise anything else in the related manifests. It seemed somewhat interesting to me, but I know that many tests are written with the assumption of a blank/empty profile.

Flags: needinfo?(jmaher)

:Gijs (he/him)

Comment 9

•

2 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #8)

thanks :gijs, the goal is to find issues with our automation where a pre-existing profile might cause issues. If these tests would not be useful to run in that scenario, that would be good to know- likewise anything else in the related manifests. It seemed somewhat interesting to me, but I know that many tests are written with the assumption of a blank/empty profile.

The windows-only failures for Chrome_* are not that interesting so I'd just disable for this new set of tests.

It's much harder to say what is going on with the timedRetry test. It looks like it may be sqlite-related. In bug 1760074 we're seeing the same failures on macOS debug, but this is Linux opt and looks to be deterministic (3 runs and they all failed) so I don't know if that's going to be helpful. In that bug Jari did some investigation, so perhaps Jari is interested or has ideas why it'd be failing here.

Flags: needinfo?(jjalkanen)

Dimi Lee [:dimi]

Comment 10

•

2 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #0)

toolkit/components/url-classifier/tests/unit/test_dbservice.js (:dimi?)

This testcase can be modified to fit the use case of conditioned profile without too much effort.
I can file another bug to fix this if this is how we want to proceed.

Flags: needinfo?(dlee)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 11

•

2 years ago

:dimi, if it is fairly simple, then please file a bug and let me know what I can do to help.

Dimi Lee [:dimi]

Comment 12

•

2 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #11)

:dimi, if it is fairly simple, then please file a bug and let me know what I can do to help.

Filed Bug 1769828.

Marco Bonardo [:mak]

Comment 13

•

2 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #0)

toolkit/components/places/tests/migration/ (mak?)

test_current_from_v53.js

test_current_from_v54.js

test_current_from_v61.js

test_current_from_v65.js

These are testing database upgrade, that means we are putting a specific places.sqlite in the profile folder. Before doing that, we check there's no existing places.sqlite as a way to check a db connection has noot been initialized yet. It may be possible to add an alternative way to do thatin code but it doesn't seem worth it. Running these tests on a conditioned profile wouldn't bring interesting advantages in my opinion.

toolkit/components/satchel/test/unit/test_db_update_v5.js (mak?)

This is pretty much the same, it's copying an old version db into the profile folder and initializing the component, then checking migration. My conclusion is the same.

Flags: needinfo?(mak)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 14

•

2 years ago

thanks :mak and :dimi

Jan Varga [:janv]

Comment 15

•

2 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #0)

test_storagePressure.js

The test depends on having no data stored for example.com. I guess a conditioned profile contains data for example.com in <profile>/storage, so the test fails because of that. We can either clear all data for example.com in the beginning of the test or just use a different (unique) origin.

Flags: needinfo?(jvarga)

Jari Jalkanen

Comment 16

•

2 years ago

(In reply to :Gijs (he/him) from comment #9)

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #8)

thanks :gijs, the goal is to find issues with our automation where a pre-existing profile might cause issues. If these tests would not be useful to run in that scenario, that would be good to know- likewise anything else in the related manifests. It seemed somewhat interesting to me, but I know that many tests are written with the assumption of a blank/empty profile.

The windows-only failures for Chrome_* are not that interesting so I'd just disable for this new set of tests.

It's much harder to say what is going on with the timedRetry test. It looks like it may be sqlite-related. In bug 1760074 we're seeing the same failures on macOS debug, but this is Linux opt and looks to be deterministic (3 runs and they all failed) so I don't know if that's going to be helpful. In that bug Jari did some investigation, so perhaps Jari is interested or has ideas why it'd be failing here.

When it comes to the timedRetry test, from the logs it appears that we are getting both the expected exception and the unexpected disk I/O error which gets handed over to the user after the retries have been exhausted. Just like previously seen on mac, this is working as expected.

The puzzle is, why is sqlite unable to access the disk and why is this reproducible?

:jmaher To understand the issue better, would it be possible to have pernosco enabled for this kind of job under the Artifacts and Tools menu?

From this Dockerfile it seems that our overlay file system might already be known to have performance challenges - could this contribute to the disk accessibility?

Also, /builds/worker/workspace seems to be the unavailable disk and something is mounted to it somehow, how sure are we that there is no contention on that resource?

Flags: needinfo?(jjalkanen)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Updated

•

2 years ago

Depends on: 1773763

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 17

•

2 years ago

I am back working on this bug after a few weeks AFK. here are the remaining test cases to sort out (most of these are new failures)

For reference we are running these tests in a new configuration (i.e. not reducing existing coverage) in hope to try and find new bugs. The new configuration will be a conditioned profile, one that has browsing history that is updated every day.

To answer the question above about the docker file and performance, I am not sure. Does this specific test case access resources on there whereas other test cases do not (or only briefly)?
browser/components/migration/tests/unit/xpcshell.ini

test_MigrationUtils_timedRetry.js - conversation above

The remaining tests haven't been mentioned above and I want to bring visibility to this (i.e. confirm it is not realistic to run with a conditioned profile, or fix the testcase or firefox!):

dom/localstorage/test/unit/xpcshell.ini (:jstutte)

test_unicodeCharacters.js

security/manager/ssl/tests/unit/xpcshell.ini (:keeler)

test_sss_savestate.js

toolkit/components/cleardata/tests/unit/xpcshell. (:pbz)

test_network_cache.js

toolkit/components/satchel/test/unit/xpcshell.ini (:serg)

test_notify.js

Flags: needinfo?(sgalich)

Flags: needinfo?(pbz)

Flags: needinfo?(jstutte)

Flags: needinfo?(dkeeler)

Dana Keeler (she/her) (use needinfo) (:keeler for reviews)

Comment 18

•

2 years ago

test_sss_savestate.js

Same as comment 1 - this essentially relies on having a clean profile, and should be skipped in this configuration.

Flags: needinfo?(dkeeler)

Sergey Galich [:serg]

Comment 19

•

2 years ago

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #17)

The remaining tests haven't been mentioned above and I want to bring visibility to this (i.e. confirm it is not realistic to run with a conditioned profile, or fix the testcase or firefox!):
...
toolkit/components/satchel/test/unit/xpcshell.ini (:serg)

test_notify.js

I've looked at test_notify.js and it doesn't feel like it sets invisible expectations on what data may already be in the profile. This line is deleting whatever is stored before test runs. Let me know if it fails with conditioned profile and we will fix it.

Flags: needinfo?(sgalich)

Paul Zühlcke [:pbz]

Comment 20

•

2 years ago

I suggest we skip test_network_cache.js since it assumes specific cache state in order to test (partitioned) cache clearing. Happy to discuss this further if you think otherwise!

Flags: needinfo?(pbz)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 21

•

2 years ago

thanks :pbz, I can add a note in the manifest and skip test_network_cache.js while running with a condprof.

:serg, regarding test_notify.js, the test is intermittent and today with 8 runs on try it seems to be passing ( https://treeherder.mozilla.org/jobs?repo=try&test_paths=toolkit%2Fcomponents%2Fsatchel%2Ftest%2Funit&revision=fb8081723f1c9345cd7241f5353af16b5eaf89dd ) I do double check these tests before needinfo and before landing, so in this case I think the problem has gone away!

I am also going to remove the skip-if on test_unicodeCharacters.js and test_MigrationUtils_timedRetry.js as they are not failing in recent try pushes. I tried a bunch of other tests and they are still failing.

Flags: needinfo?(jstutte)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 22

•

2 years ago

Attached file Bug 1769154 - Run some xpcshell tests with a conditioned profile. r=aryx! (deleted) — Details

Pulsebot

Comment 23

•

2 years ago

Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7b707ff8a7d1 Run some xpcshell tests with a conditioned profile. r=aryx,extension-reviewers,kmag

bszekely

Comment 24

•

2 years ago

Backed out for causing xpcshell failures on test_system_upgrades.js

Failure line: TEST-UNEXPECTED-FAIL | xpcshell.ini:toolkit/mozapps/extensions/test/xpcshell/test_system_upgrades.js | xpcshell return code: 0
Return code: 1

Push with failures

Failure log

Backout link

[task 2022-06-22T13:57:11.367Z] 13:57:11     INFO -  TEST-PASS | xpcshell.ini:toolkit/mozapps/extensions/test/xpcshell/test_upgrade.js | took 1550ms
[task 2022-06-22T13:57:11.372Z] 13:57:11     INFO -  Retrying tests that failed when run in parallel.
[task 2022-06-22T13:57:11.375Z] 13:57:11     INFO -  TEST-START | xpcshell.ini:toolkit/mozapps/extensions/test/xpcshell/test_system_upgrades.js
[task 2022-06-22T13:57:12.477Z] 13:57:12  WARNING -  TEST-UNEXPECTED-FAIL | xpcshell.ini:toolkit/mozapps/extensions/test/xpcshell/test_system_upgrades.js | xpcshell return code: 0
[task 2022-06-22T13:57:12.478Z] 13:57:12     INFO -  TEST-INFO took 1101ms
[task 2022-06-22T13:57:12.478Z] 13:57:12     INFO -  >>>>>>>
[task 2022-06-22T13:57:12.479Z] 13:57:12     INFO -  PID 23400 | [Parent 23400, Main Thread] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file /builds/worker/checkouts/gecko/toolkit/crashreporter/nsExceptionHandler.cpp:2972
[task 2022-06-22T13:57:12.480Z] 13:57:12     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2022-06-22T13:57:12.480Z] 13:57:12     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2022-06-22T13:57:12.481Z] 13:57:12     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2022-06-22T13:57:12.481Z] 13:57:12     INFO -  running event loop
[task 2022-06-22T13:57:12.482Z] 13:57:12     INFO -  xpcshell.ini:toolkit/mozapps/extensions/test/xpcshell/test_system_upgrades.js | Starting test_system_addon_upgrades
[task 2022-06-22T13:57:12.483Z] 13:57:12     INFO -  (xpcshell/head.js) | test test_system_addon_upgrades pending (2)
[task 2022-06-22T13:57:12.483Z] 13:57:12     INFO -  (xpcshell/head.js) | test run_next_test 0 finished (2)
[task 2022-06-22T13:57:12.483Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231704	addons.manager	DEBUG	Application has been upgraded"
[task 2022-06-22T13:57:12.484Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231709	addons.manager	DEBUG	Loaded provider scope for resource://gre/modules/addons/GMPProvider.jsm"
[task 2022-06-22T13:57:12.485Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231709	addons.manager	DEBUG	Starting provider: XPIProvider"
[task 2022-06-22T13:57:12.485Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231710	addons.xpi	DEBUG	startup"
[task 2022-06-22T13:57:12.485Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231711	addons.xpi	INFO	SystemAddonLocation directory is missing"
[task 2022-06-22T13:57:12.486Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231728	addons.xpi	INFO	Removing all system add-on upgrades."
[task 2022-06-22T13:57:12.486Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231729	addons.xpi	DEBUG	checkForChanges"
[task 2022-06-22T13:57:12.491Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231729	addons.xpi	DEBUG	Loaded add-on state: ${}"
[task 2022-06-22T13:57:12.492Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231732	addons.xpi	DEBUG	New add-on updates@test in app-system-defaults"
[task 2022-06-22T13:57:12.492Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231733	addons.xpi	DEBUG	scanForChanges changed: true, state: {}"
[task 2022-06-22T13:57:12.492Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231748	addons.xpi-utils	DEBUG	Synchronously loading the add-ons database"
[task 2022-06-22T13:57:12.492Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231749	addons.xpi-utils	DEBUG	Starting async load of XPI database /tmp/xpc-profile-bdy566xi/extensions.json"
[task 2022-06-22T13:57:12.493Z] 13:57:12     INFO -  "CONSOLE_MESSAGE: (info) 1655906231758	addons.xpi-utils	DEBUG	New add-on updates@test installed in app-system-defaults"
[task 2022-06-22T13:57:12.493Z] 13:57:12     INFO -  PID 23400 | [2022-06-22T13:57:12Z WARN  rkv::backend::impl_safe::environment] `load_ratio()` is irrelevant for this storage backend.
[task 2022-06-22T13:57:12.493Z] 13:57:12     INFO -  "Bootstrap method install 5 for updates@test version 1.0"
[task 2022-06-22T13:57:12.493Z] 13:57:12     INFO -  TEST-PASS | xpcshell.ini:toolkit/mozapps/extensions/test/xpcshell/test_system_upgrades.js | test_system_addon_upgrades - [test_system_addon_upgrades : 162] Should not have seen install method call for updates@test - true == true

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Assignee

Comment 25

•

2 years ago

I updated the patch to fix a missing line in the manifest and scheduled for relanding. Thanks for backing this out

Flags: needinfo?(jmaher)

Pulsebot

Comment 26

•

2 years ago

Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/93c78ec9fb88 Run some xpcshell tests with a conditioned profile. r=aryx,extension-reviewers,kmag

Norisz Fay [:noriszfay]

Comment 27

•

2 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/93c78ec9fb88

Status: NEW → RESOLVED

Closed: 2 years ago

status-firefox103: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 103 Branch

Treeherder Bug Filer

Updated

•

2 years ago

Regressions: 1776135

Andreea Pavel [:apavel]

Updated

•

2 years ago

Regressions: 1776351