[meta] Centralize Python dependency definitions
Categories
(Firefox Build System :: Mach Core, task, P2)
Tracking
(Not tracked)
People
(Reporter: mhentges, Unassigned)
References
(Depends on 8 open bugs, Blocks 2 open bugs)
Details
(Keywords: meta)
We currently define Python dependencies in a bunch of different locations, sometimes vendored and sometimes from PyPI.
We should centralize this and manage constraints (such as the "build" mach command not permitting non-vendored dependencies) accordingly.
See "Phase 1" of the "Consistently Handle Python Dependencies" proposal.
Comment 1•3 years ago
Had minor Matrix chatter with :glandium and a good discussion with :ahal about some python-dependency constraints and how we can find a solution within them, which I'd like to document here for future reference.
Constraints:
- [main point of this ticket]: different Mach commands should be able to have different, conflicting Python requirements
- We can't vendor native Python package code because:
- We'd need to vendor wheels (space-inefficient), or
- We'd need to build packages before doing work in CI (time-inefficient, also have to set up Python build environment dependencies)
- We can't download Python packages from PyPI.org because:
- We don't want PyPI.org downtime to affect releases/chemspills
- Some downstream builders of Firefox don't allow network access during a build whatsoever
- We can't lock/pin python dependencies for all commands because:
- We install native python packages onto our CI workers' system python at worker-provision time (so, the worker can't be compatible with all worked-on revisions' lockfiles).
- In-process activation of virtualenvs is "risky", and has to be manually verified to be compatible. We verify compatibility only between the `mach` venv and each command venv. Activating from one command venv to another command venv is unsafe, since they may have conflicting packages.
  - Therefore, Mach must not be invoked with a command venv's Python, because then it would erroneously be doing `command-venv -> Mach venv -> another command-venv`, which is unsafe.
  - This is a new constraint that doesn't match existing CI expectations, and requires changes. (note: ahal, this is the cause of the failure of the task I sent to you today)
- [existing behaviour] It must be possible to run scripts in a new Python process with the current virtualenv (keeping in-tree `sys.path` entries)
- [ideal] A command virtualenv should behave identically between:
  - `./mach $command` (virtualenv lifecycle: `system python -> Mach venv -> $command-venv`)
  - `$objdir/_virtualenvs/$command/bin/python` (virtualenv lifecycle: `$command-venv`)
  - This currently isn't true when the Mach venv has dependencies that $command-venv doesn't have. For example:
    - `./mach $command` => `import glean_sdk` ✅
    - `$objdir/_virtualenvs/$command/bin/python -c "import glean_sdk"` ❌
Solutions:
- Virtualenv consistency:
  - On environments where we can download Python packages:
    - Create the Mach env as normal
    - Create the command venv, add the Mach env's `site-packages` to a pthfile, and add the Mach env's `sys.path` modifications to the command venv too
  - On environments where we cannot download Python packages:
    - For the Mach env: first, verify that the external Python package top-level requirements are met (e.g. `psutil>=5.4.0,<=5.8.0`, etc). Then, create the venv with in-tree `sys.path` entries. Add the system's `site-packages` to a pthfile.
    - For the command env: first, verify that the external Python package top-level requirements are met (e.g.: `zstandard==0.15.2`, etc). Then, create the command venv with in-tree `sys.path` entries, add the Mach env's `sys.path` modifications, then finally add the system `site-packages` to a pthfile.
  - Why?
    - This makes virtualenv-creation consistent: always use a Mach venv, always have a virtualenv for commands. (Right now in CI, the Mach virtualenv isn't created.)
    - Regardless of how a virtualenv is used, `import` will always behave consistently.
- Pinning packages:
  - For developer machines, Moz CI (non-build): talk to PyPI.org directly, use lockfiles
  - [discussion with :glandium upcoming] For Moz CI (build): use pypi.pub.b.m.o and lockfiles. This ensures we still don't have a dependency on PyPI for chemspills, while still allowing the embracing of lockfiles.
  - For downstream workers that don't have network access: warn that lockfiles aren't used, but continue operation after verifying each venv's top-level requirements (e.g. `psutil>=5.4.0,<=5.8.0`) against the system Python.
  - Why?
    - This embraces lockfiles everywhere possible while still supporting the downstream builders' network-less constraint.
    - The assertion of "top-level requirements" against the system Python is effective because Firefox developers will know how to determine their minimum/maximum library version bounds, while downstream will have sufficient information to simply change package versions on their workers without having to dig into Mach command source code.
    - Note: the assertion of "top-level requirements" is a bonus feature we can provide for downstream, but isn't necessarily required up-front.
- `./mach python` changes:
  - `--no-virtualenv` is obsolete (was needed for the `get_and_diffoscope` task in the past, but no longer). Remove it.
  - `--no-activate` is a stand-in for supporting the "get native Python packages from system Python" use case. This will be solved by the "virtualenv consistency" solution listed above (system `site-packages` in scope for command venvs).
  - `--requirements` is only used to download and install `zstandard`: replace this with a `--virtualenv` argument that uses a centralized virtualenv. Use the `mach` virtualenv by default.
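The "verify that top-level requirements are met" idea could be sketched as below. This is a simplified illustration: the naive numeric version parsing stands in for a real specifier parser (e.g. the `packaging` library), and `check_requirement` is a hypothetical helper, not Mach code.

```python
from importlib import metadata

def parse_version(version):
    # Naive parse: "5.4.0" -> (5, 4, 0). Real code should use a proper
    # version library; this ignores pre-release/dev segments entirely.
    return tuple(int(part) for part in version.split(".") if part.isdigit())

def check_requirement(name, minimum=None, maximum=None):
    """Check a top-level requirement like psutil>=5.4.0,<=5.8.0 against
    whatever is importable from the current (system) environment."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return False, "%s is not installed" % name
    v = parse_version(installed)
    if minimum is not None and v < parse_version(minimum):
        return False, "%s==%s is older than %s" % (name, installed, minimum)
    if maximum is not None and v > parse_version(maximum):
        return False, "%s==%s is newer than %s" % (name, installed, maximum)
    return True, "%s==%s is acceptable" % (name, installed)
```

A failing check would either abort (required package) or warn and continue (optional package), per the constraints above.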
Comment 2•3 years ago
> `./mach $command` => `import glean_sdk` ✅

Isn't that a special case because mach itself imports glean_sdk?

> For the Mach env: first, verify that the external Python package top-level requirements are met (e.g. `psutil>=5.4.0,<=5.8.0`, etc). Then, create the venv with in-tree `sys.path` entries. Add the system's `site-packages` to a pthfile.

It should be fine for mach native dependencies to be missing. The ones we have are psutil, glean_sdk and zstandard, and you can perfectly build with none of them installed. Note that adding the system's site-packages wholesale can be dangerous if different versions of what is available in-tree are also installed (at the very least, it would need to be guaranteed to come last). Also note that upper bounds can be a PITA.

> For the command env: first, verify that the external Python package top-level requirements are met (e.g.: `zstandard==0.15.2`, etc). Then, create the command venv with in-tree `sys.path` entries, add the Mach env's `sys.path` modifications, then finally add the system `site-packages` to a pthfile.

(note zstandard is actually optional)

> `--no-virtualenv` is obsolete (was needed for `get_and_diffoscope` task in the past, but no longer). Remove it.

It still is necessary if removing it prints an error message when psutil ends up being loaded. See bug 1620936.

> `--no-activate` is a stand-in for supporting the "get native python packages from system python" use case

Not that I remember. It's used to not set $PATH. See bug 1675707.

> `--requirements` is only used to download and install `zstandard`

The only uses I see in-tree for this are for psutil.
Comment 3•3 years ago
(In reply to Mike Hommey [:glandium] from comment #2)
> `./mach $command` => `import glean_sdk` ✅
>
> Isn't that a special case because mach itself imports glean_sdk?
BTW, there's a timely complication, as per Matrix, @chutten is trying to add a mach command that uses the in-tree version of glean_parser, which conflicts with mach loading the one from the glean_sdk install. It's a problem today because the command uses the mach virtualenv, but a command virtualenv wouldn't solve the problem since for anything that wants to use a different version of a module that mach itself already loads, the in-process activation of the command virtualenv is hindering.
Comment 4•3 years ago
> Isn't that a special case because mach itself imports glean_sdk?

That's exactly the case I'm referring to: code running in a command virtualenv should behave identically whether it's activated through the Mach venv, or is executed in a new Python process started from the command virtualenv directly. We resolve this by ensuring that the Mach venv's packages are also always in scope for command venvs.

> BTW, there's a timely complication, as per Matrix, @chutten is trying to add a mach command that uses the in-tree version of glean_parser, which conflicts with mach loading the one from the glean_sdk install. It's a problem today because the command uses the mach virtualenv, but a command virtualenv wouldn't solve the problem since for anything that wants to use a different version of a module that mach itself already loads, the in-process activation of the command virtualenv is hindering.

Exactly. No matter which way we turn this problem, the command venv won't be able to use a different version of `glean_parser` than Mach, because Mach imports it at the beginning of its process.
My solution is to embrace this: we're already validating that all command venvs are compatible with the Mach venv (this caught the `glean-parser` mismatch, btw), so it's good behaviour if command virtualenvs always contain Mach virtualenv packages.

> It should be fine for mach native dependencies to be missing. The ones we have are psutil, glean_sdk and zstandard, and you can perfectly build with none of them installed.
>
> (note zstandard is actually optional)

Yes, these packages are optional, but that doesn't mean that we, in Moz CI, aren't using them. And, when we are using, say, `psutil` during the build, we want to be using a compatible version.
So, to append details to make the plan here more explicit: for each external Python package dependency:
- If it's required: see the description in my original comment
- If it's optional:
  - If it's not installed in the system environment, log a warning (e.g.: telemetry won't be collected), continue
  - If it's installed in the system environment: assert that it matches loose requirements. If this assertion fails, then raise an error.

> Note that adding the system's site-packages wholesale can be dangerous if different versions of what is available in-tree are also installed (at the very least, it would need to be guaranteed to come last)

It is dangerous, but we're already doing it today in some cases.
Since the system site-packages will indeed be last in the `sys.path`, all shadowed packages won't be imported, which is good. The only risk is that, if some Python script has some conditional `import some_unexpected_package` behaviour, that could cause issues if that package is in the system Python environment and doesn't have its version validated. This is unlikely, though, especially since a package containing such behaviour should explicitly describe that dependency.

> Also note that upper bounds can be a PITA.

Can you elaborate?

> It still is necessary if removing it prints an error message when psutil ends up being loaded. See bug 1620936.

It used to print an error because we used to try to build psutil from vendored source when we built virtualenvs. We don't anymore.

> Not that I remember. It's used to not set $PATH. See bug 1675707.

It's used to not set $PATH in order to support the "get native Python packages from system Python" use case (in this case, `zstandard`). Once virtualenvs include a `sys.path` entry to the system Python (when we can't download packages from a PyPI repo), it'll be OK to use a command virtualenv directly.

> The only uses I see in-tree for this are for psutil.

Right, `psutil`, not `zstandard`. Mixed those up, thanks.
Comment 5•3 years ago
Hmm, there is one potential failure case with native packages coming from the system, though.
Imagine this scenario:
- A top-level Mach requirement is `glean-sdk>=36.0.0,<=40.0.0`
- In the build virtualenv, we vendor `glean-parser==3.6.0` (compatible with `glean-sdk==40.0.0`, not `glean-sdk==36.0.0`)
- On machines that download pinned dependencies, we have `glean-sdk==40.0.0` and `glean-parser==3.6.0`. ✅
- On some CI workers, we have `glean-sdk==36.0.0`. However, when we run the `build` command, we have `glean-sdk==36.0.0` in scope, but our version of `glean-parser` (`3.6.0`) is too new for it. It fails at runtime.
  - If we flipped this around and used `glean-parser` from the system (`2.5.0`, as is compatible with `glean-sdk==36.0.0`), then the build would fail because it expects `3.6.0`, not `2.5.0`.
We have three options here:
- Determine a way to verify that the "flexible requirements" are all compatible with vendored packages in-scope. This may be non-trivial
- Don't allow flexible requirements; ensure that all external Python packages depended on by the `mach` or `build` venvs are optional requirements.
- Accept this risk, since it will only affect downstream builders and our CI (until if/when it embraces pypi.pub and pinning)
Comment 6•3 years ago
Hmm, thinking about this more, when using packages from the system, simply asserting that the top-level package meets requirements is insufficient.
This is because transitive dependencies may conflict with vendored requirements. For example:
Imagine a virtualenv with the requirements:
- pypi: `native_package==1.0`
- vendored: `shared_package` (is `==1.0`)
The dependencies of `native_package==1.0` are:

    native_package==1.0
            |
            v
    middle_package>=1.0,<=2.0
       ==1.0 /       \ ==2.0
            /         \
    shared_package==1.0   shared_package==2.0
In CI, we have a test for each virtualenv to check "are the specified pypi and vendored packages compatible?"
- Essentially (for this venv) this does `pip install native_package==1.0 AND shared_package==1.0`
- However, this asserts that there exists a combination of `$(vendored packages) + $(pypi packages and dependencies)` that is compatible, but NOT that every combination is compatible

So, for this example, it would find that:
- `native_package==1.0`
- `middle_package==1.0`
- `shared_package==1.0`

are compatible ✅
However, if a system Python environment has the following installed:
- `native_package==1.0`
- `middle_package==2.0`
- `shared_package==2.0`

When we run the command, the following packages are in scope:
- `native_package==1.0` (from system)
- `middle_package==2.0` (from system)
- `shared_package==1.0` (from vendored)

But `middle_package==2.0` is NOT compatible with `shared_package==1.0`!
Simply asserting `native_package==1.0` is not sufficient to confirm compatibility.
There's a new solution here, though, that allows us to still depend on system packages:
- In addition to checking that top-level package requirements are met, do:
  - For each vendored package, assert that it doesn't conflict with the system environment. So, it should either:
    - Not be installed in the system environment, or
    - Be the exact same version (`==`) as the one installed in the system environment
  - Finally, run `pip check` on the system environment to ensure that the system environment is internally consistent.
This should have us covered. Unfortunately, it will cause some additional work for system environments, because when vendored packages shared by the system environment change versions, the system environment will need to be updated - even if it has a compatible version. This is because we're doing a `==` check instead of a "compatible" check, but doing a "compatible" check requires knowledge of all packages in scope, and `pip check` doesn't allow us to hack our vendored packages into the scope. So, we'd have to create a brand new virtualenv, install vendored packages plus system packages (without network/hitting PyPI!), then run `pip check` inside of it, which is a lot harder.
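The "== or absent" rule could be expressed as a small helper like the following toy sketch (the function and the name-to-version mappings are illustrative, not Mach's data structures):

```python
def vendored_conflicts(vendored, system):
    """Return names of vendored packages that conflict with the system
    environment: installed there, but at a different version.

    Both arguments map package name -> version string."""
    return sorted(
        name
        for name, version in vendored.items()
        if name in system and system[name] != version
    )
```

With the example above, vendoring `shared_package==1.0` while the system has `shared_package==2.0` would be flagged as a conflict.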
Comment 7•3 years ago
An additional constraint and associated solution is worth documenting here.
Right now in CI, we don't differentiate between tasks that can fetch packages from a PyPI repo and tasks that can't. Rather, it's done implicitly (and inconsistently, heh) - the general trend is that `./mach build` shouldn't download pip packages (it does in our CI, but A: it shouldn't and B: downstream, it doesn't have to). All Mach commands that aren't needed for a chemspill (IIRC, only `./mach build` is) should be able to fetch from PyPI.
Anyways, this means that, in the early part of Mach initialization, we don't know whether we'll have access to PyPI or whether we need to access pip packages via the system. We can't lean on the `MOZ_AUTOMATION` environment variable because it's even used when we are allowed to talk to external PyPI repos.
So, instead, I propose that we embrace the `MACH_USE_SYSTEM_PYTHON` environment variable in our CI. When it's set, Mach will:
- Restrict the downloading of pip packages (for any venv, whether it's the "mach" venv, a command venv, or otherwise)
- Perform the system-package-assertions listed in the previous comment.
| | MACH_USE_SYSTEM_PYTHON=`1` | MACH_USE_SYSTEM_PYTHON `<UNSET>` |
|---|---|---|
| Mach venv | Add Mach pths to sys.path | Create Mach virtualenv |
| Mach venv | Check system packages match pypi: requirements | Install pypi: requirements to Mach virtualenv |
| Mach venv | Do pip check with system pip | Scrub system site-packages from sys.path |
| Mach venv | Check that system packages are compatible with vendored: requirements | Activate Mach virtualenv |
| Command venv | Create command virtualenv | Create command virtualenv |
| Command venv | Add Mach pths to command virtualenv pthfile | Add Mach pths to command virtualenv pthfile |
| Command venv | Check system packages match pypi: requirements | Install pypi: requirements to command virtualenv |
| Command venv | Check that system packages are compatible with vendored: requirements | Add Mach virtualenv's site-packages to command virtualenv pthfile |
| Command venv | Add system site-packages to pthfile | Activate command virtualenv |
| Command venv | Activate command virtualenv | |
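The `MACH_USE_SYSTEM_PYTHON` gate itself is simple; a minimal sketch of the decision (illustrative only - the real logic lives in Mach's virtualenv machinery):

```python
import os

def pip_install_allowed(environ=None):
    """pip downloads are disallowed for every venv (the Mach venv,
    command venvs, or otherwise) whenever MACH_USE_SYSTEM_PYTHON is set;
    packages must then come from the system environment instead."""
    environ = os.environ if environ is None else environ
    return "MACH_USE_SYSTEM_PYTHON" not in environ
```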
So far so good.
However, now let's consider the current situation where we aren't specifically using `MACH_USE_SYSTEM_PYTHON=1` in CI for tasks that can't hit `pip`.
We want to push forward our virtualenv consistency without causing breakage in the interim.
First, my proposed solution doesn't reach the same level of stability/guarantees as above, so I'm going to prime your thoughts by listing the downsides of the approach that we've been using so far:
- Due to command virtualenvs not containing Mach's scope, we've had to do workarounds to access packages only installed to the system.
- As mentioned above, a command virtualenv behaves differently (has different packages in scope for `import`) depending on whether it is activated via `./mach <command>`, or whether `<command-venv>/bin/python` is used directly.
- In cases where we're doing `./mach <command>` with a `pip install`, we've been at risk of system packages conflicting with the installed packages.
The CI-without-`MACH_USE_SYSTEM_PYTHON` solution I'm leaning towards looks like the following:
- When `MOZ_AUTOMATION` isn't set, create virtualenvs and install from PyPI "as normal". This is identical to the above solution when `MACH_USE_SYSTEM_PYTHON` isn't set.
- When `MOZ_AUTOMATION=1`, gate whether or not we can access PyPI based on the command. `build` will force us to use system packages, while others (e.g.: `doc`) will allow PyPI downloading.

As mentioned above, since we don't know the command at Mach-initialization time, we're forced to use system packages for the Mach env.
| Command virtualenv | `build` | something else, e.g.: `doc` |
|---|---|---|
| Mach venv | Add Mach pths to sys.path | <- identical |
| Mach venv | Check system packages match pypi: requirements | <- identical |
| Mach venv | Do pip check with system pip | <- identical |
| Mach venv | Check that system packages are compatible with vendored: requirements | <- identical |
| Command venv | Create command virtualenv | Create command virtualenv |
| Command venv | Add Mach pths to command virtualenv pthfile | Add Mach pths to command virtualenv pthfile |
| Command venv | Check system packages match pypi: requirements | Install pypi: requirements to command virtualenv |
| Command venv | Check that system packages are compatible with vendored: requirements | Activate command virtualenv |
| Command venv | Add system site-packages to pthfile | |
| Command venv | Activate command virtualenv | |
Notes and downsides with this approach:
- When `MOZ_AUTOMATION=1` and the command allows fetching from PyPI (e.g.: `./mach doc`):
  - We aren't able to verify that the command's `pypi:` dependencies are compatible with the system packages.
  - The virtualenv will behave differently between `./mach <command>` and `<command-venv>/bin/python <script>`.
    - For `./mach <command>`, system packages will still be in scope. This makes sense, because, for example, Mach's glean handle is in scope, and it depends on the native `glean-sdk` installed-to-system package.
    - For `<command-venv>/bin/python <script>`, system packages will not be in scope. This is to reduce the window in which incompatible system packages and virtualenv packages are imported at the same time. This also matches existing behaviour.
The benefits of implementing this approach in CI are:
- The `./mach build` virtualenv will behave more consistently and be easier to use
- `./mach <command>` will behave similarly to before, without making any downsides worse, and
- We'll be in a more stable, well-defined position when we do the small migration to `MACH_USE_SYSTEM_PYTHON`.
Comment 8•3 years ago
Side-discovery here: `pip` respects `sys.path` (and therefore `pth` files as documented in `site`), so it automatically picks up all packages with a `.dist-info` or `.egg-info` directory.
So, we don't need to get fancy with how we verify requirements with the system Python. Instead, we can:
- Create a temporary virtualenv
- Add a `pth` file with:

      import sys; sys.path.extend([$vendored_package_paths])
      import sys; sys.path.extend([$system_site_paths])

- Finally, run `pip check` within that virtualenv.
This is a slightly additive solution on top of the one described in this comment:

> For each vendored package, assert that it doesn't conflict with the system environment. So, it should either [not be installed] or be the exact same version (==) as the one installed in the system environment

Instead, now if a package exists both as a vendored package and as installed in the system environment, it doesn't have to be an `==` match, but rather just compatible. Great!
Additionally, there's less work that we have to do - we just entirely lean on `pip`.
The downside is that it can't see packages that were copied in manually - those without `.dist-info` or `.egg-info`.
However, there are relatively few of these packages, especially since we ported all possible `third_party/python/*` packages to use `pip` to vendor.
Comment 12•3 years ago
It's been 10 months, so it's worth summarizing the improvements that we've seen from this work so far:
Summary of landed improvements
- The `mach` virtualenv is now kept up-to-date
  - No more `./mach create-mach-environment`
  - No more implicit, confusing failures due to glean/zstandard/psutil being out-of-date, having to manually `./mach bootstrap|create-mach-environment` to fix
- The Python import scope is now consistent
  - It's no longer confusing whether a package is available or not (are you in the Mach process? A subprocess? Running through `make`?). Now, it strictly depends on what "virtualenv" the running Mach command is using.
  - CI-provided packages are no longer of an unknown version, failing implicitly: they are asserted against version specifiers
    - Easier for Python devs: just change the virtualenv manifest version specifier, and affected jobs will fail in obvious ways: "package A doesn't meet version specifier B"
    - Easier for devs changing CI: if a package is missing or the wrong version, Mach tells them exactly what package-requirements contract needs to be fulfilled
  - No more implicit package incompatibility (CI system packages <=> vendored packages <=> `pip`-installed packages)
- Vendored packages are now tracked by `pip`, guaranteed to be compatible with each other and the environment
- Virtualenv manifests (`python/sites/<site>.txt`) now support PyPI packages in addition to vendored packages
- Documentation updated for Mach's interaction with Python packages
- Out-of-date virtualenvs are no longer implicitly used (also see this patch)
- `./mach python` arguments simplified
arguments simplified- And, of course, as with any nontrivial work, code quality was significantly improved:
- We no longer re-invoke a Mach module in a new process just to install
pip
packages - Tests were added to validate in CI that vendored packages always match their manifest file (
third_party/python/requirements.txt
), and that Mach command virtualenvs are compatible with Mach's own dependencies - We no longer have separate implementations for parsing virtualenv manifest files, it's now shared
- A bunch of places where the
sys.path
is manually modified have been cleaned up sys.path
-/virtualenv-management code has been untangled nicely (though, unfortunately, complexity is still roughly the same due to added features around import consistency).
- We no longer re-invoke a Mach module in a new process just to install
What's next?
- Use lockfiles for `pip`-installed packages
- Port more usages to the centralized dependency-management system
- Consider how to integrate the centralized system with Mozharness, which has its own, pypi.pub.build.mozilla.org-specific way of handling dependencies
- And more, according to the dependencies of this ticket