... can be found here: https://github.com/vjmp/pipbacktracking and the issue can be found here: pypa/pip#11480
Installing an (internally conflicting) "requirements.txt" which has lots of
transitive dependencies, with a simple command like pip install -r requirements.txt,
where the "very simple" looking content of "requirements.txt" is:
pywebview[qt]==3.6.2
rpaframework==15.6.0
This takes a long time to fail, on the order of 4+ hours.
Note that this specific example applies only to Linux environments. But I think the problem is general: "old, previously working" requirement sets can get "rotten" over time, as the dependency "future" takes a "wrong" turn. This is because the resolver works from latest to oldest, and even one or a few versions of some required dependencies can derail the resolver into backtracking "mode".
Here are some things that make this a problem for Robocorp customers.
- machines executing "pip install" can be fast or slow (or very slow)
- pip version can be anything old or new (backward compatible generic usage)
- pip environment setup time is "billable" time, so "fail fast" is cheaper in monetary terms than "fail 4+ hours later on total environment build failure"
- automation is setting up environment, not humans
- our tooling for automation is rcc, which is also used here to make this failure a repeatable process
- the general context for automation is RPA (robotic process automation), so processes should be repeatable and reliable and not break, even as time passes
Currently the happy path works (fast enough), but if you derail the resolver onto an unbeaten
path, then resolution takes a long time, because in the pip source
https://github.com/pypa/pip/blob/main/src/pip/_internal/resolution/resolvelib/resolver.py#L91
there is a magic internal variable try_to_avoid_resolution_too_deep = 2000000
which causes a very long search until it fails.
When a package, like rpaframework below, has around 100 dependencies
in its dependency tree, even happy path resolution takes 100+ rounds of pip
dependency resolution to find a solution. When backtracking, (just one) processor
becomes 100% busy with backtracking work.
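To make "rounds" and "latest to oldest" more concrete, here is a toy backtracking resolver. It is only a sketch and not pip's actual algorithm (pip delegates to resolvelib); the package names, the INDEX layout, and the resolve function are all made up for illustration. It tries the newest version first, counts one round per candidate it pins, and gives up once a round limit is exceeded.

```python
class ResolutionTooDeep(Exception):
    """Raised when the toy resolver exceeds its round limit."""


# Hypothetical index: {package: {version: {dependency: allowed_versions}}}
INDEX = {
    "app": {1: {"lib": {1, 2, 3}, "tool": {1}}},
    "lib": {3: {"tool": {2}}, 2: {"tool": {2}}, 1: {"tool": {1}}},
    "tool": {2: {}, 1: {}},
}


def resolve(index, wanted, max_rounds):
    """Return (pins, rounds) or raise ResolutionTooDeep."""
    rounds = 0

    def solve(pending, pinned):
        nonlocal rounds
        if not pending:
            return pinned
        (name, allowed), rest = pending[0], pending[1:]
        if name in pinned:
            # Already pinned: the pin must satisfy this requirement too.
            return solve(rest, pinned) if pinned[name] in allowed else None
        for version in sorted(index[name], reverse=True):  # latest first
            if version not in allowed:
                continue
            rounds += 1
            if rounds > max_rounds:
                raise ResolutionTooDeep(rounds)
            deps = list(index[name][version].items())
            result = solve(rest + deps, {**pinned, name: version})
            if result is not None:
                return result
        return None  # no candidate worked: the caller backtracks

    return solve(list(wanted.items()), {}), rounds
```

In this toy index, app needs tool 1, but the two newest lib versions need tool 2, so the resolver has to try and discard lib 3 and lib 2 before lib 1 works. With only three packages that costs a handful of extra rounds; with a tree of around 100 dependencies the same pattern multiplies.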
INFO: pip is looking at multiple versions of selenium to determine which version is compatible with other requirements. This could take a while.
and ...
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
... are nice ways for pip
to inform the user that it is taking longer than usual, but
in our customers' automation cases there is nobody who could see those messages or
press that "Ctrl + C".
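As a stopgap on the automation side, a wrapper process can play the role of the human who would press "Ctrl + C". The sketch below is an assumption about how such a wrapper might look, not something rcc or pip provides: run_until_backtracking is a hypothetical helper, and the marker text is copied from the pip message quoted above, so it would silently stop matching if pip changed that wording.

```python
import subprocess
import sys

# Phrase copied from the pip message quoted above (an assumption that
# the installed pip version still prints exactly this wording).
BACKTRACKING_MARKER = "This is taking longer than usual"


def run_until_backtracking(command):
    """Run command and kill it at the first backtracking warning.

    Returns (aborted, exit_code).
    """
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # watch both streams in one place
        text=True,
    )
    aborted = False
    for line in process.stdout:
        sys.stdout.write(line)  # keep the normal log visible
        if BACKTRACKING_MARKER in line:
            aborted = True
            process.kill()  # fail fast instead of waiting for hours
            break
    process.stdout.close()
    return aborted, process.wait()


# Example call (not executed here):
# run_until_backtracking(
#     [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]
# )
```

This only reacts after pip has already decided that resolution is slow, and it depends on log wording, which is why a limit inside pip itself would be the more robust fix.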
This could be improved if there were an environment variable like
MAX_PIP_RESOLUTION_ROUNDS
instead of the hard-coded internal limit of 2000000.
Also, adding this as an environment variable (instead of a command line option) gives
better backwards compatibility, since an "extra" environment variable does not
break commands on old pip versions, but an unknown CLI option will.
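A minimal sketch of what reading such a variable could look like is shown here. This is not pip code: pip does not read MAX_PIP_RESOLUTION_ROUNDS today, and the helper name max_resolution_rounds is invented for illustration. The idea is that a missing or invalid value falls back to the current hard-coded limit, so existing setups keep their behavior.

```python
import os

# Current hard-coded limit in pip's resolver (see the link above).
DEFAULT_MAX_ROUNDS = 2_000_000


def max_resolution_rounds(environ=None):
    """Return the round limit, honoring the proposed environment variable.

    Falls back to the default when the variable is unset, not an
    integer, or not positive.
    """
    env = os.environ if environ is None else environ
    raw = env.get("MAX_PIP_RESOLUTION_ROUNDS")
    if raw is None:
        return DEFAULT_MAX_ROUNDS
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_MAX_ROUNDS
    return value if value > 0 else DEFAULT_MAX_ROUNDS
```

An automation environment could then export a much smaller value to fail within minutes, while old pip versions would simply ignore the variable.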
What is needed:
- a Linux machine
- content of repo containing this README.md file
- rcc executable (optional, but useful, and if you have it, you don't have to manually install following two things ...)
- python3, in our case we have tested 3.9.13
- pip, in our case we have tested 22.1.2 (but mostly anything after 20.3 has this feature; this current example uses pip v22.2.2)
You need rcc
to run these examples, or you can do the environment setup manually if you prefer.
You can download rcc binaries from https://downloads.robocorp.com/rcc/releases/index.html or, if you want more information, see https://github.com/robocorp/rcc
To run the success case as a normal user sees it, use this:
rcc run --task pass
And to see debugging output, use this:
rcc run --dev --task pass
To run the failing case as a normal user sees it, use this ... and have the patience to wait:
rcc run --task fail
And to see debugging output, use this ... and have patience to wait:
rcc run --dev --task fail