Virtualenv import scope should behave consistently
Categories
(Firefox Build System :: Mach Core, enhancement)
Tracking
(firefox97 fixed)
Tracking | Status | |
---|---|---|
firefox97 | --- | fixed |
People
(Reporter: mhentges, Assigned: mhentges)
References
(Blocks 1 open bug)
Details
Attachments
(31 files, 5 obsolete files)
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details |
There's additional details and discussion in this bug.
TL;DR: Virtualenvs should have the same packages in scope, regardless of how they're used:
./mach <command-using-venv>
or<venv>/bin/python <script>
In both of these cases, they should have both their own requirements and Mach's requirements in-scope.
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Comment 1•3 years ago
|
||
Both of them are version 2018.4.16
, but our local one is vendored in a
pip
-compatible way (it includes a .dist-info
directory).
FWIW, we'll probably need to keep these two versions in-sync (CI
should be verifying this in virtualenv-compatibility tests) because:
certifi
is needed bysentry-sdk
, a Mach-global dependency- Web platform tests should use the version of
certifi
that
exists in the upstreamcertifi
repo.
Depends on D126281
Assignee | ||
Comment 2•3 years ago
|
||
Mach's import scope includes:
- Its
pth
entries - Its pip packages, which is either:
- The Mach virtualenv site-packages, if packages were "pip
installed" over the internet. - The system environment's site-packages, if installing packages
over the internet is disabled and Mach is grabbing packages from
the system Python instead.
- The Mach virtualenv site-packages, if packages were "pip
Command virtualenvs already had this import scope when they were
dynamically activated from an existing Mach process, but not when
they were used directly (e.g. by <venv>/bin/python <script>
).
By resolving that inconsistency, there was added risk of system packages
conflicting with command virtualenv packages. So, this patch also
includes behaviour to verify system<=>command-venv compatibility at
activation-time.
A few notes about this system-package-verification:
- It happens at virtualenv activation-time instead of build-time because
system packages may change after the virtualenv is built. - It takes roughly 1.5s, which is rough, but it should mainly occur in
CI, where it's acceptable. Devs usingMACH_USE_SYSTEM_PACKAGES
should unset the variable to avoid the time delay. - The algorithm works by asserting top-level requirements (e.g.
psutil>=5.4.2,<=5.8.0
), then runspip check
over the combined set
of packages that would be in-scope.
A side benefit of the discoveries in this patch is that installing
packages in virtualenvs was made faster by adding our vendored packages
to pthfiles beforehand. This makes pip install
not redundantly install
any packages that are already vendored.
As part of this, pth
references were made absolute instead of
relative (because assert_system_package_compatibility()
could operate
on a different Windows partition from topsrcdir
).
We had explicitly used relative paths before so that topsrcdir
could be moved without having unexpected pth
-related failures.
However, now that we're verifying pth
entries and rebuilding
virtualenvs if they don't line up, we shouldn't see any regressions.
Depends on D120402
Comment 4•3 years ago
|
||
Backed out 10 changesets (Bug 1712151, Bug 1724279, Bug 1730712, Bug 1717051, Bug 1723031, Bug 1731145) for causing failures on test_yaml.py
Log: https://treeherder.mozilla.org/logviewer?job_id=352859436&repo=autoland&lineNumber=925
Bustage: https://treeherder.mozilla.org/logviewer?job_id=352865415&repo=autoland&lineNumber=1148
Backout: https://hg.mozilla.org/integration/autoland/rev/b59ab731aecfcc3b9cb4e1388950b70039bf95ac
Assignee | ||
Updated•3 years ago
|
Comment 6•3 years ago
|
||
bugherder |
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 7•3 years ago
|
||
install_pip_package()
does some logic around not running pip if a
package is already installed.
However, this is redundant, because when building a venv, no packages
are installed.
Additionally, this was the constraint forcing us to populate the
virtualenv in a separate process from the one building it.
Assignee | ||
Comment 8•3 years ago
|
||
Now that packages are consistently installed in a dedicated
pip install
process, we can remove our funky re-execution of
virtualenv.py
.
Depends on D129294
Assignee | ||
Comment 9•3 years ago
|
||
Inline a variable used once, and ensure that old_env_variables
is
defined consistently before the finally
block runs to satisfy an
intellisense warning.
Also, ensure that a wide comment is shuffled underneath the line limit.
Depends on D129295
Assignee | ||
Comment 10•3 years ago
|
||
Mach already verifies the Python version during initialization.
Maintaining a second version check can be tough to keep up-to-date, as
we're already seeing due to the obsolete Python 2 check.
Depends on D120402
Assignee | ||
Comment 11•3 years ago
|
||
The optional log_handle
argument was only used by:
- Configure, but the output was always dumped to stdout and not
config.log
. The manual logging was used to handle encoding issues,
but those are likely invalidated by the Python 3 migration. ./mach doc
, where we were putting virtualenv setup into stderr,
which seems incorrect. The commit adding it doesn't explain why it's
the case, but I'm guessing it shouldn't be too risky to remove.
Depends on D129297
Assignee | ||
Comment 12•3 years ago
|
||
The base_python
was much more useful when we had both Python 2 and
Python 3 virtualenvs.
However, we're now using Python 3 across the board, and the need to have
a different Python 3 for "configure" vs for the rest of the build is
unlikely.
Depends on D129298
Updated•3 years ago
|
Assignee | ||
Comment 13•3 years ago
|
||
The virtualenv python is always Python 3, so inline the version_info()
function.
Depends on D129299
Assignee | ||
Comment 14•3 years ago
|
||
The command venv manager needs to be able to do ad-hoc pip
installations, while the Mach venv manager needs access to
requirements()
to manage its special activation process.
By splitting the class into two, we can now give each use case the
attention it deserves.
Updated•3 years ago
|
Assignee | ||
Comment 15•3 years ago
|
||
It's only used in one location, so the separate function is
unncessary.
Depends on D129324
Assignee | ||
Comment 16•3 years ago
|
||
There's a few places where we care about calculating important paths in
general Python virtualenvs.
We already had a bit of an abstraction with VirtualenvHelper
, but:
- The name wasn't super useful,
site-packages
dir calculation could be moved, and- The inheritance could be simplified by using "composition" instead.
Note: activate_this.py
doesn't apply to all Python virtualenvs (it's
only created by virtualenv
, but not venv
), so it shouldn't be in new
abstraction.
Depends on D129684
Assignee | ||
Comment 17•3 years ago
|
||
MozVirtualenvMetadata
only needs to know the virtualenv its operating
on - the knowledge of its internal metadata filename should not have to
be exposed.
Depends on D129685
Assignee | ||
Comment 18•3 years ago
|
||
create()
was only called from one place, and had more complex logic
than necessary.
Related to this change, check_call()
is used rather than asserting the
return code.
Depends on D129686
Assignee | ||
Comment 19•3 years ago
|
||
Rather than requiring that consumers remember to ensure()
before
calling activate()
, we can do so automatically during activation.
There isn't a valid use case in which we want obsolete virtualenvs to be
activateable.
Usages of VirtualenvManager
have been updated accordingly.
Depends on D129687
Assignee | ||
Comment 20•3 years ago
|
||
The information about the Python executable is now stored with other
details in the JSON metadata file we put in each virtualenv.
Depends on D129688
Assignee | ||
Comment 21•3 years ago
|
||
As of bug 1686168, __PYVENV_LAUNCHER__
is purged when Mach starts
running. Since VirtualenvManager
is only used by Mach, its internal
purging of the environment variable is redundant.
Depends on D129689
Assignee | ||
Comment 22•3 years ago
|
||
IIRC, this was added prematurely for an upcoming test due to a faulty
VCS commit split.
Let's temporarily remove this parameter, then re-add it when needed for
whatever test consumes it.
Depends on D129690
Assignee | ||
Comment 23•3 years ago
|
||
Sorry for the flip-flop on technique here :S
validate_environment_packages()
was originally run when checking if a
venv is up-to-date to ensure that ad-hoc pip installs didn't replace
needed packages with those of different versions.
However, since it was added, a few notes have come up:
- The case where requirements change isn't caught by this - that will
be caught by the cheap "a requirements file has changed on-disk"
check. - This is really slow, and doing it for most Mach commands is not worth
it (as evident by how theskip_pip_package_check
was already added
for the Mach virtualenv's use case). - If the tree as-is doesn't have (common) cases where ad-hoc
installations break an environment, then this check, though helpful,
isn't adding a significant amount of value considering its performance
cost.
However, these aren't to say that this won't be valuable in the future:
I'd like to reach a point where virtualenvs are considered "sealed" by
default: no ad-hoc pip installations are allowed.
However, add the ability to mark virtualenvs as unsealed/"allowing
ad-hoc pip installations". Then, re-add the pip package check, but only
for such flexible, unsealed virtualenvs.
Depends on D129691
Assignee | ||
Comment 24•3 years ago
|
||
There were a bunch of locations where we were doing path shenanigans
with requirements.pth/vendored
items.
There was a bit of complexity because we were specifically making each
pthfile
line be a relative path to support moving the Firefox
topsrcdir without causing issues.
However, now that we're more intelligent about checking if pthfile
lines are up-to-date (and since moving your topsrcdir will still require
re-generating the Mach virtualenv), this behaviour became less useful.
So, once this was simplified and all usages were happy with absolute
paths, we can safely generalize it back to MachEnvRequirements
and
reuse it everywhere.
Depends on D129692
Updated•3 years ago
|
Assignee | ||
Comment 25•3 years ago
|
||
Define an interface to fetch details about the external Python
environment.
Moves environment-validation logic from requirements.py
accordingly.
Depends on D129529
Assignee | ||
Comment 26•3 years ago
|
||
It's possible for the PYTHON3
config to point to a different Python
than that that is executing the configure scripts themselves.
This flexibility was needed for the Python 2->3 migration, but now that
it's complete, we can remove the extra configuration and just lean on
the Python interpreter used to run configure.
Depends on D129298
Assignee | ||
Comment 27•3 years ago
|
||
Now that the "build" virtualenv isn't created during configure, we can
remove the logic that tries to temporarily shelter pip from
configure's changes.
Assignee | ||
Comment 28•3 years ago
|
||
- The return value was never used, so
.run(check=True)
could change to
.check_call()
**kwargs
was always the same (stderr=subprocess.STDOUT
)`
Depends on D129684
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 29•3 years ago
|
||
As _run_pip()
is being removed from VirtualenvManager
in an upcoming
patch, its usages need to be removed. Besides, they're using an
"internal" function, which is a bit of a smell.
Note that this could have been solved by exposing a public run_pip()
function. However, I felt like that was worse because:
- Friction here is good as we try to migrate the codebase to embrace the
"requirements definition file" technique to install dependencies. - There could be confusion about the relationship between
install_pip_package()
(only works
if venv already activated) and_run_pip()
, which works "in general".
Comment 30•3 years ago
|
||
Comment 31•3 years ago
|
||
bugherder |
Comment 32•3 years ago
|
||
Comment on attachment 9248055 [details]
Bug 1730712: Add requirements.pths_as_absolute()
Revision D129693 was moved to bug 1739177. Setting attachment 9248055 [details] to obsolete.
Assignee | ||
Comment 33•3 years ago
|
||
The existing terminology had two issues:
VirtualenvManager
wasn't always associated with an on-disk
virtualenv
: for example, when running in automation, Mach
"activates" aVirtualenvManager
, updating its import scope,
but without ever creating an on-diskvirtualenv
.- An upcoming patch splits the
VirtualenvManager
class, pulling
"on-disk virtualenv-handling functions" from the project-wide
interface for managing Python's import scope.
After some good discussion with Ahal, I think we've struck
the terminology that handles this distinction well: we'll call
the "import scope"-handling part the "site", and we'll continue
to call on-disk virtualenvs (and their representative classes)
as, well, virtualenvs.
Depends on D130120
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 34•3 years ago
|
||
Comment 35•3 years ago
|
||
bugherder |
Comment 36•3 years ago
|
||
Comment 37•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/9ce726b8f7e5
https://hg.mozilla.org/mozilla-central/rev/3d3d73e5831d
https://hg.mozilla.org/mozilla-central/rev/9d7231ec989e
https://hg.mozilla.org/mozilla-central/rev/58299cc393fb
https://hg.mozilla.org/mozilla-central/rev/5444bc2d5456
https://hg.mozilla.org/mozilla-central/rev/2c801a0ee8b1
Comment 38•3 years ago
|
||
Assignee | ||
Comment 39•3 years ago
|
||
In an upcoming patch, command virtualenvs will only be able to
created if a Moz-managed site is currently active.
This is because being in an active site provides a few guarantees:
- We know that the Mach site is compatible with the system, (when using
the system python) - If we're adding "pth" references to the Mach site, we know it's been
created. - The "external python" is already defined: there's no guesswork as to
if "sys.executable" is the real "external python" or not.
Now, this does add some duplication: the "build" site is set up in
building.py
and configure.py
. I would've preferred to remove the
one in configure.py
, but that file is directly invoked by
configure.in
scripts, and I'd like them to fail gracefully if they're
still used and their $PYTHON3
is set to an unexpected executable.
Updated•3 years ago
|
Comment 40•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/8867abf5bb2f
https://hg.mozilla.org/mozilla-central/rev/5fc6520fe155
https://hg.mozilla.org/mozilla-central/rev/08ecc92e06e8
https://hg.mozilla.org/mozilla-central/rev/ff557ad3fd38
https://hg.mozilla.org/mozilla-central/rev/0dd570bbd878
https://hg.mozilla.org/mozilla-central/rev/6086617710c1
https://hg.mozilla.org/mozilla-central/rev/7a1f17f3f10b
https://hg.mozilla.org/mozilla-central/rev/cb50d288b729
https://hg.mozilla.org/mozilla-central/rev/d5dcd6536122
https://hg.mozilla.org/mozilla-central/rev/e8829be338ed
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 42•3 years ago
|
||
Assignee | ||
Comment 43•3 years ago
|
||
It's no longer necessary to explicitly grab the system Python handle to
access its packages.
Assignee | ||
Comment 44•3 years ago
|
||
Depends on D132081
Assignee | ||
Comment 45•3 years ago
|
||
Now that command virtualenvs are automatically containing Mach's
sys.path entries, they no longer need to be duplicated.
For compatibility's sake, all "common" requirements have
been moved to be "Mach" requirements. As time goes on
and we know which packages aren't needed for Mach itself,
we can move them back into the common (to commands) requirements
file, and eventually into only the virtualenvs that need them.
Depends on D132082
Updated•3 years ago
|
Comment 46•3 years ago
|
||
bugherder |
Assignee | ||
Comment 47•3 years ago
|
||
Even if a command site has its own comfy virtualenv, if Mach is using
the system packages then they'll still be in-scope for commands.
So, we still need to _assert_pip_check()
in case the command's
vendored packages conflict.
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 48•3 years ago
|
||
Writing a comment to track: there's still a remaining issue when there's multiple layers of Mach, where the top-level system packages get ignored - I think Mach is only keeping packages from the "parent" executable, and not including the paths inherited by the "parent" executable.
Assignee | ||
Comment 49•3 years ago
|
||
As of D127182, we now use a vendored version of pip
instead of
installing it into virtualenvs.
This means that, even if Mach isn't using a virtualenv, we //will// have
pip
in-scope once the Mach site is active.
So, remove the redundant "is pip available" check from
CommandSiteManager.
Depends on D132082
Assignee | ||
Comment 50•3 years ago
|
||
This patch resolves cases like the following:
- The system Python has
zstandard
. ./mach python --virtualenv psutil <script>
is run, addingpsutil
to the import scope.<script>
runs Mach a second time, and this time Mach needs to
importzstandard
.
The previous behaviour would add the "site-packages" of the //invoking//
Python interpreter, but ancestor packages would get dropped.
To rectify this issue, this patch changes "import inheritance" to keep
all of sys.path
, rather than just
<external-python>.all_site_packages_dirs()
.
Depends on D134200
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 51•3 years ago
|
||
pytest
requires pluggy<0.7,>=0.5
and py>=1.5.0
.
Since python-test
jobs use a Docker image that installs
tox_requirements
, and the system packages are used (due to zstandard
existing), we need to ensure compatibility.
Depends on D132082
Updated•3 years ago
|
Assignee | ||
Comment 52•3 years ago
|
||
It's no longer necessary to explicitly grab the system Python handle to
access its packages.
Depends on D134201
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 53•3 years ago
|
||
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 54•3 years ago
|
||
Port some sys.path
modifications to the centralized system.
Not only is this cleaner (ad-hoc global modifications are spicy), but
this also avoids "reordering" issues that can creep in during nested
calls, causing virtualenvs to be unnecessarily recreated.
Comment 55•3 years ago
|
||
bugherder |
Updated•3 years ago
|
Updated•3 years ago
|
Comment 56•3 years ago
|
||
Comment 57•3 years ago
|
||
Assignee | ||
Updated•3 years ago
|
Comment 58•3 years ago
|
||
Updated•3 years ago
|
Comment 59•3 years ago
|
||
bugherder |
Updated•3 years ago
|
Updated•3 years ago
|
Comment 60•3 years ago
|
||
Comment 61•3 years ago
|
||
bugherder |
Comment 62•3 years ago
|
||
Comment on attachment 9252375 [details]
WIP: Bug 1730712: symbols_archive.py should use zstandard directly
Revision D132081 was moved to bug 1755515. Setting attachment 9252375 [details] to obsolete.
Description
•