Pip: New resolver takes a very long time to complete

Created on 30 Nov 2020  ·  79 Comments  ·  Source: pypa/pip

What did you want to do?

One of the CI jobs for Weblate installs minimal versions of dependencies. We use requirements-builder to generate the minimal version requirements from the ranges we normally use.

The pip install -r requirements-min.txt command seems to loop infinitely after some time. This started happening with 20.3; before that it worked just fine.

Output

Requirement already satisfied: google-auth<2.0dev,>=1.21.1 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.23.0)
Requirement already satisfied: pytz>dev in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from celery[redis]==4.4.5->-r requirements-min.txt (line 3)) (2020.4)
Requirement already satisfied: googleapis-common-protos<2.0dev,>=1.6.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.52.0)
Requirement already satisfied: six>=1.9.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from bleach==3.1.1->-r requirements-min.txt (line 1)) (1.15.0)
Requirement already satisfied: protobuf>=3.12.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (3.14.0)
Requirement already satisfied: grpcio<2.0dev,>=1.29.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.33.2)
Requirement already satisfied: google-auth<2.0dev,>=1.21.1 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.23.0)
Requirement already satisfied: pytz>dev in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from celery[redis]==4.4.5->-r requirements-min.txt (line 3)) (2020.4)
Requirement already satisfied: googleapis-common-protos<2.0dev,>=1.6.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.52.0)
Requirement already satisfied: six>=1.9.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from bleach==3.1.1->-r requirements-min.txt (line 1)) (1.15.0)
Requirement already satisfied: protobuf>=3.12.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (3.14.0)

This seems to repeat forever (well, for 3 hours so far; see https://github.com/WeblateOrg/weblate/runs/1474960864?check_suite_focus=true)

Additional information

Requirements file triggering this: requirements-min.txt

It takes quite some time until it gets to the above loop. There is most likely something problematic in the dependency set...

new resolver crash

Most helpful comment

Thanks, @dstufft.

I'll mention here some useful workaround tips from the documentation -- in particular, the first and third points may be helpful to folks who end up here:

  • If pip is taking longer to install packages, read Dependency resolution backtracking for ways to reduce the time pip spends backtracking due to dependency conflicts.

  • If you don’t want pip to actually resolve dependencies, use the --no-deps option. This is useful when you have a set of package versions that work together in reality, even though their metadata says that they conflict. For guidance on a long-term fix, read Fixing conflicting dependencies.

  • If you run into resolution errors and need a workaround while you’re fixing their root causes, you can choose the old resolver behavior using the flag --use-deprecated=legacy-resolver. This will work until we release pip 21.0 (see Deprecation timeline).

All 79 comments

I'm going to use this issue to centralize incoming reports of situations that seemingly run for a long time, instead of having each one end up in its own issue or scattered around.

@jcrist said in https://github.com/pypa/pip/issues/8664#issuecomment-735961391

_Note: I was urged to comment here about our experience from twitter._

We (prefect) are a bit late on testing the new resolver (only getting around to it with the 20.3 release). We're finding that install times are now in the 20+ min range (I've actually never had one finish), previously this was at most a minute or two. The issue here seems to be in the large search space (prefect has loads of optional dependencies, for CI and some docker images we install all of them) coupled with backtracking.

I enabled verbose logs to try to figure out what the offending package(s) were but wasn't able to make much sense of them. I'm seeing a lot of retries for some dependencies with different versions of setuptools, as well as different versions of boto3. For our CI/docker builds we can add constraints to speed things up (as suggested here), but we're reluctant to increase constraints in our setup.py as we don't want to overconstrain downstream users. At the same time, we have plenty of novice users who are used to doing pip install prefect[all_extras] - telling them they need to add additional constraints to make this complete in a reasonable amount of time seems unpleasant. I'm not sure what the best path forward here is.

I've uploaded verbose logs from one run here (killed after several minutes of backtracking). If people want to try this themselves, you can run:

pip install "git+https://github.com/PrefectHQ/prefect.git#egg=prefect[all_extras]"

Any advice here would be helpful - for now we're pinning pip to 20.2.4, but we'd like to upgrade once we've figured out a solution to the above. Happy to provide more logs or try out suggestions as needed.

Thanks for all y'all do on pip and pypa!

These might end up being resolved by https://github.com/pypa/pip/issues/9185

Thanks, @dstufft.

I'll mention here some useful workaround tips from the documentation -- in particular, the first and third points may be helpful to folks who end up here:

  • If pip is taking longer to install packages, read Dependency resolution backtracking for ways to reduce the time pip spends backtracking due to dependency conflicts.

  • If you don’t want pip to actually resolve dependencies, use the --no-deps option. This is useful when you have a set of package versions that work together in reality, even though their metadata says that they conflict. For guidance on a long-term fix, read Fixing conflicting dependencies.

  • If you run into resolution errors and need a workaround while you’re fixing their root causes, you can choose the old resolver behavior using the flag --use-deprecated=legacy-resolver. This will work until we release pip 21.0 (see Deprecation timeline).

For my case, the problematic behavior can be reproduced much faster with pip install 'google-cloud-translate==3.0.0' 'requests==2.20.0' 'setuptools==36.0.1', so it sounds like #9185 might improve it.

The legacy resolver bails out on this quickly with: google-auth 1.23.0 requires setuptools>=40.3.0, but you'll have setuptools 36.0.1 which is incompatible.

One other idea toward this is, stopping after 100 backtracks (or something) with a message saying "hey, pip is backtracking due to conflicts on $package a lot".

I wonder how much time is taken up by downloading and unzipping versus actually spent in the resolver itself?

I wonder how much time is taken up by downloading and unzipping versus actually spent in the resolver itself?

Most of it, last I checked. Unless we're hitting some very bad graph situation, in which case... :shrug: the users are better off giving pip the pins.

I'm having our staff fill out that Google form wherever they can, but I just want to mention that pretty much all of our builds are experiencing issues with this. Things that worked fine and had a build time of about 90 seconds are now timing out our CI builds. In theory we could increase the timeout, but we're paying for these machines by the minute, so having all of our builds take a huge amount of time longer is a painful choice. We've switched over to enforcing the legacy resolver on all of our builds for now.

As a general note to users reaching this page, please read https://pip.pypa.io/en/stable/user_guide/#dependency-resolution-backtracking.

I was asked to add some more details from twitter, so here are some additional thoughts. Right now the four solutions to this problem are:

  1. Just wait for it to finish
  2. Use trial and error methods to reduce versions checked using constraints
  3. Record and reuse those trial error methods in a new "constraints.txt" file
  4. Reduce the number of supported versions "during development"

Waiting it out is literally too expensive to consider

This solution seems to rely on downloading an epic ton of packages. In the era of cloud this means:

  • Larger hard drives are needed to store the additional packages
  • More bandwidth is consumed downloading these packages
  • It takes longer to process everything due to the need to decompress these packages

These all cost money, although the exact balance will depend on the packages (people struggling with a beast like tensorflow might choke on the hard drive and bandwidth, while people with smaller packages just get billed for the build time).

What's even more expensive is the developer time wasted during an operation that used to take (literally) 90s that now takes over 20 minutes (it might take longer but it times out on our CI systems).

We literally can't afford to use this dependency resolution system.

Trial and error constraints are extremely burdensome

This adds a whole new set of processes to everyone's dev cycle where not only do they have to do the normal dev work, but now they need to optimize the black box of this resolver. Even the advice on the page is extremely trial and error, basically saying to start with the first package giving you trouble and continue iterating until your build times are reasonable.

Adding more config files complicates an already overcomplicated ecosystem.

Right now we already have to navigate the differences between setup.py, requirements.txt, setup.cfg, and pyproject.toml; adding in constraints.txt just puts even more burden (and confusion) on maintaining Python packages.

Reducing versions checked during development doesn't scale

Restricting versions during development but releasing the package without those constraints means that the users of that package are going to have to reinvent those constraints themselves during development. If I install a popular package my build times could explode until I duplicate their efforts. There's no way to share those constraints other than copy/paste methods, which adds to the maintenance burden.

What this is ultimately going to result in is people not using constraints at all, instead limiting the dependency versions directly based not on actual compatibility but on a mix of compatibility and build times. This will make it harder to support smaller packages in the long term.

Most of it, last I checked.

Might be a good reason to prioritize https://github.com/pypa/warehouse/issues/8254

Might be a good reason to prioritize pypa/warehouse#8254

Definitely. And a sdist equivalent when PEP 643 is approved and implemented.

This solution seems to rely on downloading an epic ton of packages

It doesn't directly rely on downloading, but it does rely on knowing the metadata for packages, and for various historical reasons, the only way to get that data is currently by downloading (and in the case of source distributions, building) the package.

That is a huge overhead, although pip's download cache helps a lot here (maybe you could persist pip's cache in your CI setup?) On the plus side, it only hits hard in cases where there are a lot of dependency restrictions (where the "obvious" choice of the latest version of a package is blocked by a dependency from another package), and it's only tended to be really significant in cases where there is no valid solution anyway (although this is not always immediately obvious - the old resolver would happily install invalid sets of packages, so the issue looks like "old resolver worked, new one fails" where it's actually "old one silently broke stuff, new one fails to install instead").

This doesn't help you address the issue, I know, but hopefully it gives some background as to why the new resolver is behaving as it is.

@tedivm please look into using pip-tools to perform dependency resolution as a separate step from deployment. It's essentially point 4 -- "local" dependency resolution with the deployment only seeing pinned versions.

Actually, it would be an interesting experiment to see. For these pathological cases that people are experimenting with: if they let the resolver complete once, persist the cache, and then try again, is it faster? If it's still hours long even with a cache, then that suggests pypa/warehouse#8254 isn't going to help much.

I don't know what we're doing now exactly, but I also wonder if it would make sense to stop exhaustively searching the versions after a certain point. This would basically be a trade off of saying that we're going to start making assumptions about how dependencies evolve over time. I assume we're currently basically starting with the latest version, and iterating backwards one version at a time, is that correct? If so, what if we did something like:

  1. Iterate backwards one version at a time until we fail resolution X times.
  2. Start a binary search, cut the remaining candidates in half and try with that.
    2a. If it works, start taking the binary search towards the "newer" side (cut that in half, try again, etc).
    2b. If it fails, start taking the binary search towards the "older" side (cut that in half, try again, etc).

This isn't exactly the correct use of a binary search, because the list of versions isn't really "sorted" in that way, but it would kind of function similarly to git bisect? The biggest problem with it is it will skip over good versions if the latest N versions all fail, and the older half of versions all fail, but the middle half are "OK".
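The steps above could be sketched roughly as follows. This is a hypothetical illustration, not pip's actual resolver; `is_compatible` stands in for a full resolution attempt against one candidate version:

```python
def bisect_candidates(versions, is_compatible, max_linear=5):
    """Try the newest versions first, then fall back to a
    git-bisect-style search over the remaining candidates.

    versions: newest-first list of candidate versions.
    is_compatible: callback that attempts a resolution with the
    given version and returns True on success.
    """
    # Phase 1: linear scan of the newest few versions.
    for v in versions[:max_linear]:
        if is_compatible(v):
            return v
    # Phase 2: bisect the rest, preferring the "newer" side.
    lo, hi = max_linear, len(versions) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_compatible(versions[mid]):
            best = versions[mid]
            hi = mid - 1  # this version worked; look for something newer
        else:
            lo = mid + 1  # failed; fall back to older versions
    return best
```

As noted above, this can miss valid versions when the compatible region is not contiguous, which is exactly the git-bisect caveat.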

Another possible idea is instead of a binary search, do a similar idea but instead of bucketing the version set in halves, try to bucket them into buckets that match their version "cardinality". IOW, if this has a lot of major versions, bucket them by major version, if it has few major versions, but a lot of minor versions, bucket it by that, etc. So that you divide up the problem space, then start iterating backwards trying the first (or the last?) version in each bucket until you find one that works, then constraint the solver to just that bucket (and maybe one bucket newer if if you're testing the last version instead of first?).
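The bucketing idea could look like this (a hypothetical sketch over simple version tuples; a real implementation would pick the bucketing key based on where the versions actually vary, as described above):

```python
from itertools import groupby

def bucket_by_major(versions):
    """Group a newest-first list of (major, minor, patch) tuples
    into buckets sharing the same major version."""
    return [list(group) for _, group in groupby(versions, key=lambda v: v[0])]

def probe_buckets(versions, is_compatible):
    """Try the newest release of each major-version bucket, newest
    bucket first; the first bucket whose representative resolves is
    where a fuller solver would then constrain its search."""
    for bucket in bucket_by_major(versions):
        if is_compatible(bucket[0]):  # newest release in this bucket
            return bucket[0]
    return None
```

This divides the search space by version "cardinality" instead of by position, so a conflict that rules out all of major version 3 costs one probe rather than one probe per release.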

I dunno, it seems like exhaustively searching the space is the "correct" thing to do if you want to always come up with the answer if one exists anywhere, but if we can't make that fast enough, even with changes to warehouse etc, we could probably try to be smart about using heuristics to narrow the search space, under the assumption that version ranges typically don't change that often and when they do, they don't often change every single release.

Maybe if we go into heuristics mode, we emit a warning that we're doing it, suggest people provide more information to the solver, etc. Maybe provide a flag like --please-be-exhaustive-its-ok-ill-wait to disable the heuristics.

Maybe we're already doing this and I'm just dumb :)

We're not doing it, and you're not dumb :-) But it's pretty hard to do stuff like that - most resolution algorithms I've seen are based on the assumption that getting dependency data is cheap (many aren't even usable by pip because they assume all dependency info is available from the start). So we're getting into "designing new algorithms for well-known hard CS problems" territory :-(

Another possible idea is instead of a binary search, do a similar idea but instead of bucketing the version set in halves, try to bucket them into buckets that match their version "cardinality". IOW, if this has a lot of major versions, bucket them by major version, if it has few major versions, but a lot of minor versions, bucket it by that, etc.

Some resolvers I surveyed indeed do this, especially from ecosystems that promote semver heavily (IIRC Cargo?), since major version bumps there imply more semantics, so this is at least somewhat charted territory.

The Python community does not generally adhere to semver that strictly, but we may still be able to do it, since the resolver never promised to return the best solution, only a good enough one (i.e. if both 2.0.1 and 1.9.3 satisfy, the resolver does not have to choose 2.0.1).

The other part is how we handle failure-to-build. With our current processes, we could have to get build deps, do the build (or at best call prepare_metadata_for_build_wheel to get the info).

With binary search-like semantics, we'd have to be lenient about build failures and allow pip to attempt-to-use a different version of the package on failures (compared to outright failing as we do today).

Maybe provide a flag like --please-be-exhaustive-its-ok-ill-wait to disable the heuristics.

I think stopping after we've backtracked 100+ times and saying "hey, this is taking too long. Help me by reducing versions of $packages, or tell me to try harder with --option." is something we can do relatively easily now.

If folks are on board with this, let's pick a number (I've said 100, but I pulled that out of the air) and add this in?
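Such a circuit breaker might be sketched like this (purely illustrative; the class and method names here are hypothetical, not pip or resolvelib internals, and 100 is the arbitrary number mentioned above):

```python
class ResolutionTooDeep(Exception):
    """Raised when the resolver has backtracked past the limit."""

class BacktrackLimiter:
    """Count backtracks per package and bail out with a helpful
    message once a global threshold is exceeded."""

    def __init__(self, limit=100):
        self.limit = limit
        self.counts = {}

    def record_backtrack(self, package_name):
        self.counts[package_name] = self.counts.get(package_name, 0) + 1
        if sum(self.counts.values()) >= self.limit:
            # Name the worst offenders so the user knows what to pin.
            worst = sorted(self.counts, key=self.counts.get, reverse=True)[:3]
            raise ResolutionTooDeep(
                "pip is backtracking a lot due to conflicts on "
                + ", ".join(worst)
                + "; consider constraining these packages, "
                "or pass a flag to keep searching."
            )
```

The resolver would call record_backtrack each time it discards a pinned candidate, turning an hours-long silent search into an early, actionable error.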

Do we have a good sense of whether these cases where it takes a really long time to solve are typically cases where there is no answer and it's taking a long time to exhaustively search the space because our slow time per candidate means it takes hours.. or are these cases where there is a successful answer, but it just takes us a while to get there?

@dstufft in my case, there was no suitable solution (see https://github.com/pypa/pip/issues/9187#issuecomment-736010650). I guessed which might be the problematic dependencies, and with a reduced set of packages it doesn't take that long and produces the expected error. With the full requirements-min.txt it didn't complete in hours.

With nearly 100 pinned dependencies, the space to search is enormous, and pip ends up (maybe) infinitely printing "Requirement already satisfied:" while trying to search for some solution (see https://github.com/WeblateOrg/weblate/runs/1474960864?check_suite_focus=true for the long log; it was killed after some hours). I just realized that the CI process is slightly more complex than what I've described - it first installs packages based on the ranges, then generates a list of minimal versions and tries to adjust the existing virtualenv. That's probably where the "Requirement already satisfied" logs come from.

The problematic dependency chain in my case was:

  • google-cloud-translate==3.0.0 from command line
  • setuptools==36.0.1 from command line
  • google-api-core[grpc] >= 1.22.0, < 2.0.0dev from google-cloud-translate==3.0.0
  • google-auth >= 1.19.1, < 2.0dev from google-api-core
  • setuptools>=40.3.0 from google-auth (any version in the range)
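The conflict at the end of that chain can be checked directly. Here is a minimal sketch using plain version tuples (the parse helper is an illustration, not pip's version handling, which uses the vendored packaging library):

```python
def parse(version_string):
    """Parse a simple 'X.Y.Z' version string into a comparable tuple."""
    return tuple(int(part) for part in version_string.split("."))

# google-auth (any version in its range) requires setuptools>=40.3.0,
# while the command line pins setuptools==36.0.1, so the two
# requirements can never be satisfied at the same time.
minimum_required = parse("40.3.0")
pinned = parse("36.0.1")

satisfiable = pinned >= minimum_required
print(satisfiable)  # False: no solution exists anywhere in the search space
```

Since the constraint set is unsatisfiable from the start, every candidate the resolver backtracks through is wasted work, which is why an early conflict report would help here.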

In the end, I think the problem is that it tries to find a solution in areas where there can't be any. With a full pip cache:

$ time pip install  google-cloud-translate==3.0.0 setuptools==36.0.1
...

real    0m6,206s
user    0m5,136s
sys 0m0,242s
$ time pip install  google-cloud-translate==3.0.0 setuptools==36.0.1 requests==2.20.0
...

real    0m28,724s
user    0m25,162s
sys 0m0,283s

In this case, adding requests==2.20.0 (which can be installed without any problem with either of the dependencies) multiplies the time nearly five times. This is caused by pip looking at different chardet and certifi versions for some reason.

Do we have a good sense of whether these cases where it takes a really long time to solve are typically cases where there is no answer and it's taking a long time to exhaustively search the space because our slow time per candidate means it takes hours.. or are these cases where there is a successful answer, but it just takes us a while to get there?

I'm pretty sure in prefect's case with [all_extras] it's because no solution exists, but I haven't yet been able to determine what the offending package(s) are. At some point I'll sit down and iteratively add dependencies on to the base install until things slow down, just need to find the time.

Tips on interpreting the logs might be useful here - I can see what packages pip is searching through, but it's not clear what constraint is failing leading to this search.


Regarding the few comments above about giving up after a period or using heuristics/assumptions about version schemes - for most things I've worked on, a valid install is usually:

  • All packages use the most recent versions (e.g. most recent A, B, and C)
  • Except if some dependency's most recent release breaks, in which case we usually fix things pretty quick to make it work and use a fairly recent release of the broken one (e.g. latest A and B, C is 1 or 2 releases old).

Rarely will the install I'm looking for be "the most recent versions of A and B, plus a release of C from 3 years ago". The one case where I might want this is if I'm debugging something, or trying to recreate an old environment, but in that case I'd usually specify that I want C=some-old-version directly rather than having the solver do it for me.

@brainwane asked me to post my case here from #9126. TLDR: the new resolver is (only) 3x slower in my case.

Basically, I use
pip list --format freeze | sed 's/==.*//' | xargs --no-run-if-empty pip install --upgrade --upgrade-strategy eager
to convert my manual environment (after adding and removing packages, upgrading, downgrading, trying out things) into something that is as up to date as possible. That failed with the old resolver, but works great with the new one. It upgrades old packages that can be upgraded, and downgrades packages that are too new for some other package.

The only thing I wondered about is how much slower the new resolver was. It's about a factor of 3 (42 vs 13 seconds, using pip 20.3 with and without --use-deprecated legacy-resolver). I thought that maybe network requests would be the main issue, but pip list --outdated takes only about 20s with the exact same number of GET requests (125). I was wondering how pip can spend ~30s just on resolving versions, but again, in the context of this thread, I begin to understand what the problem is.

Feel free to use or ignore this comment as you see fit ;)

> time pip list --outdated
Package           Version Latest Type
----------------- ------- ------ -----
gast              0.3.3   0.4.0  wheel
grpcio            1.32.0  1.33.2 wheel
h5py              2.10.0  3.1.0  wheel
lazy-object-proxy 1.4.3   1.5.2  wheel
protobuf          3.13.0  3.14.0 wheel

real    0m19.373s
user    0m19.718s
sys     0m0.721s


> time pip list --format freeze | sed 's/==.*//' | xargs --no-run-if-empty pip install --upgrade --upgrade-strategy eager

[...]

real    0m41.655s
user    0m38.308s
sys     0m1.786s


> time pip list --format freeze | sed 's/==.*//' | xargs --no-run-if-empty pip install --upgrade --upgrade-strategy eager
> --use-deprecated legacy-resolver

[...]

Successfully installed gast-0.4.0 grpcio-1.33.2 h5py-3.1.0 lazy-object-proxy-1.5.2 protobuf-3.14.0

real    0m13.064s
user    0m10.804s
sys     0m0.391s


> time pip list --format freeze | sed 's/==.*//' | xargs --no-run-if-empty pip install --upgrade --upgrade-strategy eager

[...]

Successfully installed gast-0.3.3 grpcio-1.32.0 h5py-2.10.0 lazy-object-proxy-1.4.3 protobuf-3.13.0

real    0m42.860s
user    0m39.015s
sys     0m2.000s

Do we have a good sense of whether these cases where it takes a really long time to solve are typically cases where there is no answer and it's taking a long time to exhaustively search the space because our slow time per candidate means it takes hours.. or are these cases where there is a successful answer, but it just takes us a while to get there?

Well, it works with the old resolver without error but not with the new one- does that answer the question?

That is a huge overhead, although pip's download cache helps a lot here (maybe you could persist pip's cache in your CI setup?) On the plus side, it only hits hard in cases where there are a lot of dependency restrictions (where the "obvious" choice of the latest version of a package is blocked by a dependency from another package), and it's only tended to be really significant in cases where there is no valid solution anyway (although this is not always immediately obvious - the old resolver would happily install invalid sets of packages, so the issue looks like "old resolver worked, new one fails" where it's actually "old one silently broke stuff, new one fails to install instead").

This doesn't help you address the issue, I know, but hopefully it gives some background as to why the new resolver is behaving as it is.

We do persist the cache, but it literally never finishes.

I do understand how the resolver works, but my point is that understanding it doesn't make the problem go away. This level of overhead is literally orders of magnitude more than the previous version - or than any other package manager out there.

I understand the legacy decisions that had to be supported here, but frankly until the issue of performance is addressed this version of the resolver should not be the default. PyPI should be sending out the already-computed dependency data, not forcing us to build dozens of packages over and over again to regenerate the same data that hundreds of other people are also regenerating. I understand that this is on the roadmap, pending funding, but it's my opinion that this resolver is not ready for production until this issue is addressed.

I have to leave my thoughts here: I agree with @tedivm that this resolver is not ready for production. The UX of having pip run for tens of minutes with no useful output is terrible. Right now pip is producing an ungodly amount of duplicative text (which is probably slowing down the search): Requirement already satisfied: ....

If the resolver fails on the first attempt (I assume pip tries to install the latest versions), I think pip should print out the constraint violations immediately. Or add options to limit the search to N attempts, or to make only M attempts for a given package. And maybe after some number of attempts pip should print the situation with the fewest constraint violations. As it stands, I have to just Ctrl-C pip when it runs for too long (10 minutes is too long) and I get no useful information from having waited.

My Python package installation is taking too long, and then the Jenkins CI/CD pipeline fails after 2 hrs.

Collecting amqp==2.5.2
14:27:30 Downloading amqp-2.5.2-py2.py3-none-any.whl (49 kB)
14:27:30 Collecting boto3==1.16.0
14:27:30 Downloading boto3-1.16.0-py2.py3-none-any.whl (129 kB)
14:27:30 Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /root/.local/lib/python3.6/site-packages (from boto3==1.16.0->gehc-edison-ai-container-support==3.5.0) (0.10.0)
14:27:30 Requirement already satisfied: s3transfer<0.4.0,>=0.3.0 in /root/.local/lib/python3.6/site-packages (from boto3==1.16.0->gehc-edison-ai-container-support==3.5.0) (0.3.3)
14:27:30 Collecting botocore==1.19.26
14:27:30 Downloading botocore-1.19.26-py2.py3-none-any.whl (6.9 MB)
14:27:30 Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /root/.local/lib/python3.6/site-packages (from botocore==1.19.26->gehc-edison-ai-container-support==3.5.0) (2.8.1)
14:27:30 Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /root/.local/lib/python3.6/site-packages (from boto3==1.16.0->gehc-edison-ai-container-support==3.5.0) (0.10.0)
14:27:30 Requirement already satisfied: urllib3<1.27,>=1.25.4 in /root/.local/lib/python3.6/site-packages (from botocore==1.19.26->gehc-edison-ai-container-support==3.5.0) (1.25.11)
14:27:30 Collecting celery==5.0.2
14:27:30 Downloading celery-5.0.2-py3-none-any.whl (392 kB)
14:27:30 INFO: pip is looking at multiple versions of botocore to determine which version is compatible with other requirements. This could take a while.
14:27:31 INFO: pip is looking at multiple versions of boto3 to determine which version is compatible with other requirements. This could take a while.
14:27:31 INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
14:27:31 INFO: pip is looking at multiple versions of amqp to determine which version is compatible with other requirements. This could take a while.

Why is pip looking at multiple versions?

Is there any resolution for this?

@Nishanksingla I see one item in the output you have copied here:

14:27:31 INFO: pip is looking at multiple versions of to determine which version is compatible with other requirements. This could take a while.

Is that literally what pip output, or did you remove the name of a package?

Also, I recommend that you take a look at the tips and guidance in this comment.

I updated the comment - GitHub was not displaying <Python from Requires-Python> (the angle brackets were being swallowed by the markup)

Well, it works with the old resolver without error but not with the new one- does that answer the question?

No. The old resolver would regularly resolve to a set of dependencies that violated the dependency constraints. The new resolver is slower, in part, because it stops doing that, and part of the work to stop doing that makes things slower (partially for reasons that are unique to the history of Python packaging).

PyPi should be sending out the already computed dependency data, not forcing us to build dozens of packages over and over again to generate the same data that hundreds of other people are also regenerating. I understand that this is in the roadmap, pending funding, but it's my opinion that this resolver is not ready for production until this issue is addressed.

This is not actually possible in all cases.

Basically we have wheels, which have statically computed dependency information. This currently requires downloading a wheel from PyPI and extracting the information from it. We have concrete plans to do that in Warehouse.

Wheels are the easy case... the problem then comes down to sdists. Historically sdists can have completely dynamic dependency information, like something like this:

import random
from setuptools import setup

setup(
    install_requires=[random.choice(["urllib3", "requests"])]
)

is a completely valid (although silly) setup.py file, where it isn't possible to pre-compute the set of dependencies. A more serious example would be one that introspects the current running environment and adjusts the set of dependencies based on what it discovers. This could sometimes be as mundane as the OS or the Python version (which in modern times we have static ways to express, but not everyone is using those yet), or things like what C libraries exist on a system, or something like that.

Thus for sdists, we have some cases we end up where some of them could have static dependency information (but currently don't, though we have a plan for it) but some of them cannot and will not, and in those cases backtracking through those choices are basically always going to be slow.

So our hope here is to speed up the common cases by making whatever static sets of dependencies we can compute available as part of the repository API, but there's always a chance that some project exists in a state that triggers this slow behavior even with those improvements (and this can happen with any package manager that uses a system like this; however, Python is in a worse position because of our dynamic dependency capability).

I think it's probably true that more people are hitting the "bad case" than expected, which is one of the reasons I asked whether these slow runs eventually end with a resolved set of dependencies, or with an unresolvable set. If they typically end in errors, then it makes sense to just bomb out sooner with an error message, because our heuristic can be "if we have to backtrack more than N times, we're probably heading towards a failure". If they typically end with success but just take a while to get there, then that suggests it would be better to invest in making our heuristics for picking candidates smarter, to try to arrive at a solution faster.

One thing which is surprising me is that I am not getting this issue on my system when I install my requirements.txt with pip 20.3 and Python 3.6 in a virtual environment.
But for the same requirements.txt I am getting the issue in my Jenkins pipeline.
Any ideas?

Well, it works with the old resolver without error but not with the new one; does that answer the question?
No. The old resolver would regularly resolve to a set of dependencies that violated the dependency constraints. The new resolver is slower, in part, because it stops doing that, and part of the work to stop doing that makes things slower (partially for reasons that are unique to the history of Python packaging).

When I say "without error" I was speaking literally: the violation you're saying could happen did not. Normally when it breaks things it says so; you get a message saying something like "Package wants X but we installed Y instead". I am explicitly saying that we got no such message.

When we run the legacy resolver on the same set of dependencies as the new resolver, the legacy resolver comes back with a valid working set of dependencies in one minute and twenty-nine seconds, while the new resolver fails after timing out our CI systems with 20 minutes of nothing.

Any ideas?

Environmental differences perhaps? It's possible to have dependencies conditional to the environment that pip is running in (python version, OS, platform etc)

Tips on interpreting the logs might be useful here - I can see what packages pip is searching through, but it's not clear what constraint is failing leading to this search.

There is an undocumented + unsupported option that I'd added for my own personal debugging: PIP_RESOLVER_DEBUG. No promises that it'll be in future releases or that there won't be a performance hit, but right now, you can probably use that. Moar output! :)

Normally when it breaks things it says so; you get a message saying something like "Package wants X but we installed Y instead". I am explicitly saying that we got no such message.

Oh interesting! Are you sure you're not suppressing the error message? (there's a CLI option, env var or config file that can do this -- pip config list would help identify the last two)

If not, could you post reproduction instructions in a Github Gist perhaps, and link to that from here?

PS: I've worked on/written each of the components here - the warning, the old resolver and the new one, and AFAIK what you're describing shouldn't be possible unless I've derped real hard and no one else has noticed. ;)

My experience is the following:

  1. The new resolver with backtracking is straightforwardly too slow to use (would be very helpful if there were a flag to just hard fail it as soon as it starts to backtrack), so the obvious workaround is just to snapshot dependencies that we know work from a legacy pip freeze into a constraints.txt file as a stopgap. (God knows how we're going to regenerate that file, but that's a problem for another day).

  2. Uh oh, looks like we still have a conflict even though we know that the versions work, but luckily the project we depend on has fixed its dependencies on master, so let's just depend on the git URL. Ahh, cool, that doesn't work (https://github.com/pypa/pip/issues/8210), those belong in requirements.txt.

  3. A few more issues, including a hard failure on bad metadata for a .post1 version (https://github.com/pypa/pip/pull/9085 apparently didn't fix, or thinks this is a real failure) -- so now I'm manually editing the constraints.txt and adding comments explaining that this file is going to need to be maintained by hand going forwards.

  4. Everything seems resolved, and now I'm in an apparently infinite loop (who knows, I stopped it after 33k lines were printed to stdout) in which the following lines are printed over and over again:

Requirement already satisfied: google-auth<2.0dev,>=0.4.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from google-api-core<1.24,>=1.16.0->dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.23.0)
Requirement already satisfied: google-auth<2.0dev,>=0.4.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from google-api-core<1.24,>=1.16.0->dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.23.0)
Requirement already satisfied: six>=1.14.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.15.0)
Requirement already satisfied: requests<2.24.0,>=2.18.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from dbt-core@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-core&subdirectory=core->-r python_modules/elementl-data/requirements.txt (line 3)) (2.23.0)
Requirement already satisfied: pytz>=2015.7 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from Babel>=2.0->agate<2,>=1.6->dbt-core@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-core&subdirectory=core->-r python_modules/elementl-data/requirements.txt (line 3)) (2020.4)
Requirement already satisfied: googleapis-common-protos<1.53,>=1.6.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.6.0)
Requirement already satisfied: setuptools>=34.0.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from google-api-core<1.24,>=1.16.0->dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (50.3.2)
Requirement already satisfied: six>=1.14.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.15.0)

Baffling. I've attached the requirements.txt and constraints.txt. The setup.py is as follows:

from setuptools import setup

setup(
    install_requires=[
        "boto3",
        "dagster_aws",
        "dagster_dbt",
        "dagster_gcp",
        "dagster_pandas",
        "dagster_slack",
        "dagster",
        "dagstermill",
        "dbt",
        "google-cloud-bigquery",
        "idna",
        "nltk",
        "pandas",
        "pybuildkite",
        "requests",
        "slackclient",
        "snowflake-sqlalchemy",
        "tenacity",
    ],
)

requirements.txt
constraints.txt

@mgasner I imagine you'd benefit from adopting pip-tools, and performing the dependency graph construction and dependency management as a separate step from installation. :)

Oh interesting! Are you sure you're not suppressing the error message? (there's a CLI option, env var or config file that can do this -- pip config list would help identify the last two)

This is in CircleCI so it's not trivial to run the command, but I've seen these messages before in CircleCI with these containers and we're not overriding things so I have no reason to believe we're suppressing anything.

If not, could you post reproduction instructions in a Github Gist perhaps, and link to that from here?

I do appreciate you all looking into it, and will definitely try to help replicate it- but since it involves some private libraries of ours (pulled from github repos) I'll have to put some effort in and can't promise it'll be quick.

@pradyunsg Yes, it's crystal clear that using the dependency resolver to resolve dependencies is a nonstarter, but that is not the issue I encountered here -- that's the starting point.

A note to everyone reporting problems here:

Hi. I'm sorry you're having trouble right now. Thank you for sharing your report with us. We're working on the multiple intertwined problems that people are reporting to us.

(If you don't mind, please also tell us what could have happened differently so you could have tested and caught and reported this during the resolver beta period.)

FYI: PEP-643 (Metadata for Package Source Distributions) has been approved. :rocket:

So earlier I predicted that this would force people to stop supporting valid versions of packages simply because of dependency issues, not because of any actual programmatic problem with them. This is already happening:

Screen Shot 2020-12-02 at 12 05 03 PM

This change is going to push people into being far, far more restrictive in the supported versions and that's going to have ramifications that I really hope people have considered.

This change is going to push people into being far, far more restrictive in the supported versions and that's going to have ramifications that I really hope people have considered.

The hope is that people would provide stricter requirements than the community has traditionally preferred. It is very rare, when users ask for “requests” (for example), that really any version of requests would do; but Python packaging tools traditionally “help” the user out by naively settling on the newest possible version. My hope is that Python package users and maintainers alike would be able to provide more context when a requirement is specified; this would help all users, maintainers, and tool developers to provide a better co-operating environment.

One thing that might be worth considering is whether the reports of long resolution times share any common traits - the most obvious thing being a particular set of "troublesome" packages. I've seen botocore come up a lot in reports and I wonder whether it's got an unusually large number of releases, or has made more incompatible changes than other packages?

Obviously, it's not practical for us (the pip developers) to investigate packages on a case by case basis, but we need something more specific to get traction on the problem.

Maybe we could instrument pip to dump stats ("tried X versions of project A, Y versions of project B, ..., before failing/succeeding"), to a local file somewhere that we ask people to upload? But that's mostly what's in the log anyway, and it's less useful unless people let the command run to completion, so maybe it wouldn't be much additional help.

https://github.com/pypa/pip/issues/9187#issuecomment-736104404

One other idea toward this is stopping after 100 backtracks (or something) with a message saying "hey, pip is backtracking a lot due to conflicts on $packages".

Let's do this -- and pick a number for this. And allow the user to pick a higher number from the CLI?

One thing to consider is how we count toward that number. Say X depends on Y. X==2.0 is pinned, Y was backtracked three times and ultimately all versions failed, so X is backtracked and re-pinned to X==1.0, where Y is backtracked another two times and finally a working version is found. Does Y now have a backtrack count of 3 or 5? I can think of reasons why either may be better than the other.
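A minimal sketch of the "bail out after N backtracks" safeguard being discussed (all names invented; this is not pip's actual internals, and it deliberately uses the simplest counting choice, a single cumulative count):

```python
class BacktrackLimitReached(Exception):
    """Raised when the resolver has backtracked more than its budget."""

class BacktrackBudget:
    def __init__(self, limit=100):
        self.limit = limit
        self.count = 0

    def spend(self, package):
        # Called once per backtrack; the count is cumulative across all
        # packages here, which sidesteps the counting question above.
        self.count += 1
        if self.count > self.limit:
            raise BacktrackLimitReached(
                f"pip is backtracking a lot (>{self.limit} times, most "
                f"recently on {package!r}); consider adding constraints"
            )
```

A CLI flag could then simply feed a user-chosen `limit` into this budget.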

I've seen botocore come up a lot in reports and I wonder whether it's got an unusually large number of releases, or has made more incompatible changes than other packages?

It indeed has an unusually high number of releases, it's being released nearly daily, see https://pypi.org/project/botocore/#history

It indeed has an unusually high number of releases, it's being released nearly daily

And as an example, it depends on python-dateutil>=2.1,<3.0.0. So if you try to install python-dateutil 3.0.0 and botocore, pip will have to backtrack through every release of botocore before it can be sure that there isn't one that works with dateutil 3.0.0.

Fundamentally, that's the scenario causing these long runtimes. We can't assume that a version of botocore from years ago didn't allow any version of python-dateutil (even though 3.0.0 probably didn't even exist back then and in practice won't work with it), so we have to check. And worse still, if an ancient version of botocore does have an unconstrained dependency on python-dateutil, we could end up installing it alongside dateutil 3.0.0 and have a system that, while technically consistent, doesn't actually work.

The best fix is probably for the user to add a constraint telling pip not to consider versions of botocore earlier than some release that the user considers "recent". But pip can't reasonably invent such a constraint.
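To illustrate that user-side workaround (the version number is invented), a constraints file keeps ancient releases out of consideration without changing what gets requested:

```
# constraints.txt -- hypothetical example
botocore>=1.15    # don't let backtracking wander into ancient releases
```

applied with pip's real `-c` option: `pip install -r requirements.txt -c constraints.txt`.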

I've seen botocore come up a lot in reports and I wonder whether it's got an unusually large number of releases, or has made more incompatible changes than other packages?

The AWS packages are indeed released frequently, and probably the fact that they pin so strictly is the cause of the extensive backtracking. So it seems that not only too-loose but also too-strict requirement specifications can cause problems for the resolver. There are some tricks to force conflicts fast (e.g. choosing the next package to solve based on the fewest versions still viable, or choosing packages with the fewest dependencies, ref https://github.com/sdispater/mixology/pull/5).
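One of those tricks (fewest viable versions first) fits in a few lines; the function and data here are invented for illustration:

```python
def next_requirement(candidates_by_req):
    """Pick the requirement with the fewest viable versions left, so that
    conflicts surface as early as possible instead of after deep backtracking."""
    return min(candidates_by_req, key=lambda r: len(candidates_by_req[r]))

# botocore is pinned tightly by both boto3 and awscli, so it has the
# fewest candidates and gets resolved (or conflicts) first.
deps = {
    "botocore": ["1.17.44"],
    "six": ["1.15.0", "1.14.0", "1.13.0"],
}
print(next_requirement(deps))  # prints: botocore
```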

The classic exploding example is combining a preferred boto3 version with any version of awscli, because boto3 is restrictive on botocore, and awscli is restrictive on botocore as well.

Some libraries try to solve this issue by providing extras_require, e.g. aiobotocore[awscli,boto3]==1.1.2 will enforce exact pins (awscli==1.18.121 boto3==1.14.44) that are known to be compatible with each other (botocore being deciding here).

If either is requested without pin however, the resolver will have to consider a huge amount of versions to find one that requests overlapping botocore versions.

$ pipgrip --tree boto3==1.14.44 awscli==1.18.121
boto3==1.14.44 (1.14.44)
├── botocore<1.18.0,>=1.17.44 (1.17.44)
│   ├── docutils<0.16,>=0.10 (0.15.2)
│   ├── jmespath<1.0.0,>=0.7.1 (0.10.0)
│   ├── python-dateutil<3.0.0,>=2.1 (2.8.1)
│   │   └── six>=1.5 (1.15.0)
│   └── urllib3<1.26,>=1.20 (1.25.11)
├── jmespath<1.0.0,>=0.7.1 (0.10.0)
└── s3transfer<0.4.0,>=0.3.0 (0.3.3)
    └── botocore<2.0a.0,>=1.12.36 (1.17.44)
        ├── docutils<0.16,>=0.10 (0.15.2)
        ├── jmespath<1.0.0,>=0.7.1 (0.10.0)
        ├── python-dateutil<3.0.0,>=2.1 (2.8.1)
        │   └── six>=1.5 (1.15.0)
        └── urllib3<1.26,>=1.20 (1.25.11)
awscli==1.18.121 (1.18.121)
├── botocore==1.17.44 (1.17.44)
│   ├── docutils<0.16,>=0.10 (0.15.2)
│   ├── jmespath<1.0.0,>=0.7.1 (0.10.0)
│   ├── python-dateutil<3.0.0,>=2.1 (2.8.1)
│   │   └── six>=1.5 (1.15.0)
│   └── urllib3<1.26,>=1.20 (1.25.11)
├── colorama<0.4.4,>=0.2.5 (0.4.3)
├── docutils<0.16,>=0.10 (0.15.2)
├── pyyaml<5.4,>=3.10 (5.3.1)
├── rsa<=4.5.0,>=3.1.2 (4.5)
│   └── pyasn1>=0.1.3 (0.4.8)
└── s3transfer<0.4.0,>=0.3.0 (0.3.3)
    └── botocore<2.0a.0,>=1.12.36 (1.17.44)
        ├── docutils<0.16,>=0.10 (0.15.2)
        ├── jmespath<1.0.0,>=0.7.1 (0.10.0)
        ├── python-dateutil<3.0.0,>=2.1 (2.8.1)
        │   └── six>=1.5 (1.15.0)
        └── urllib3<1.26,>=1.20 (1.25.11)

@ddelange thanks for the analysis!

There are some tricks to force conflicts fast

We're exploring that option in #9211

The classic exploding example is combining a preferred boto3 version with any version of awscli, because boto3 is restrictive on botocore, and awscli is restrictive on botocore as well.

Unfortunately, I can't think of any way to address this without the package maintainers helping somehow (or users explicitly, and manually, constraining what versions they are willing to let pip consider).

Maybe we need a mechanism to mark versions as "too old to be worth considering by default". But that would need packaging standards to define how that information is exposed, and package maintainers to manage that information, so in practice I doubt it would be practical.

FYI: PEP-643 (Metadata for Package Source Distributions) has been approved. 🚀

Ignoring the more platform-specific/legacy etc packages, would it theoretically become possible for pip to fetch all .whl.METADATA files for every version of a package in one big call to PyPI?

With proper caching both on pypa/warehouse servers side and on the pip-user side, it could be a huge speedup. As you mentioned earlier:

most resolution algorithms I've seen are based on the assumption that getting dependency data is cheap

@ddelange If the major cost is downloading+building packages, then yes. See https://github.com/pypa/warehouse/issues/8254. :)

Edit: @dstufft discussed about this at some length in https://github.com/pypa/pip/issues/9187#issuecomment-736792074.

  2. Start a binary search, cut the remaining candidates in half and try with that.
     2a. If it works, start taking the binary search towards the "newer" side (cut that in half, try again, etc).
     2b. If it fails, start taking the binary search towards the "older" side (cut that in half, try again, etc).

This isn't exactly the correct use of a binary search, because the list of versions isn't really "sorted" in that way, but it would function somewhat similarly to git bisect. The biggest problem with it is that it will skip over good versions if the latest N versions all fail and the older half of versions all fail, but the middle half is "OK".
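For concreteness, a rough sketch of that bisection, assuming a hypothetical `works()` predicate and the monotonicity caveat just mentioned:

```python
def bisect_versions(versions, works):
    """versions is newest-first; returns the newest 'working' version, or None.
    Like git bisect, this assumes failures are contiguous from the newest
    end -- the "middle half is OK" case described above defeats it."""
    lo, hi = 0, len(versions) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if works(versions[mid]):
            best = versions[mid]
            hi = mid - 1   # a working version; look for something newer
        else:
            lo = mid + 1   # failed; move towards older versions
    return best

print(bisect_versions(["5", "4", "3", "2", "1"], lambda v: v <= "3"))  # prints: 3
```

Compared with pip's linear walk, this touches O(log n) candidates instead of O(n), which matters when each candidate costs a download and a build.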

I wonder if a noisy binary search/probabilistic bisection algorithm could make this approach more robust. https://github.com/choderalab/thresholds/blob/master/thresholds/bisect.py

And as an example, it depends on python-dateutil>=2.1,<3.0.0. So if you try to install python-dateutil 3.0.0 and botocore, pip will have to backtrack through every release of botocore before it can be sure that there isn't one that works with dateutil 3.0.0.

The best fix is probably for the user to add a constraint telling pip not to consider versions of botocore earlier than some release that the user considers "recent". But pip can't reasonably invent such a constraint.

Apologies if this has already been thought of and nixed

I wonder if pip could make an assumption here though.

Given packages A and B where A depends on B. If A version 10 supports versions B <= 5, I think pip could assume that versions of A < 10 don't support versions of B > 5. In my experience, packages rarely use upper limits on dependencies. And when they do, they rarely decrease in number (usually maintainers bump versions, not decrement them). It seems unlikely that A==9 would support B<=6 and decrement to B<=5 on the next release. And if it did, the user could still get pip to solve this environment by providing an explicit constraint.

I think this would help the case where pip keeps backtracking on botocore versions: it can check the most recent version, see that it doesn't support python-dateutil 3.0.0, and bail out early (since it assumes older versions of botocore don't support a newer version of python-dateutil).
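A hedged sketch of that assumption (the version table is invented; real caps would come from package metadata):

```python
# Map each release of A to the highest version of B it allows.
a_caps_on_b = {
    "10": (5, 0),
    "9": (5, 0),
    "8": (4, 0),
}

def could_any_a_satisfy(wanted_b, caps):
    """Assumption from above: no older release of A allows a *newer* B than
    A's newest release does, so only the newest cap needs checking."""
    newest = max(caps, key=int)
    return wanted_b <= caps[newest]

print(could_any_a_satisfy((6, 0), a_caps_on_b))  # prints: False
```

A `False` here means the resolver could give up on A immediately instead of walking back through every release.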

Maybe we need a mechanism to mark versions as "too old to be worth considering by default". But that would need packaging standards to define how that information is exposed, and package maintainers to manage that information, so in practice I doubt it would be practical.

@pfmoore I had a number of early failures where the dependency resolver was wandering back over a dozen package versions and crashing out when trying to get package metadata from packages so old that their setup.py wasn't compatible with Python 3...

  1. I'd like to hope the dependency resolver is being smart enough not to download a version that it _knows_ is Python 2 only... not sure exactly the best way how, but just stating my sentiment on the matter.
  2. Giving up before downloading the n+1th (for an arbitrary reasonable value of n) version of a package, with some kind of informative message about version constraints in the event of a failure, is probably smarter than downloading 100+ packages until the build times out.

Not specifically addressed at you @pfmoore, just my general feedback after dealing with (and researching into) the new resolver in the latest published release of pip for several (>6) hours today.

  1. The issue behind most people's issues seems to be less a case of "it couldn't find a valid solution" and more that the new resolver is being _pathologically persistent_ in its effort to resolve things. It _feels_ like the new resolver has no safeguards against obvious infinite loops and pathological worst-case running times. Why would _anyone_ want to let pip run for over an hour _as default behaviour_? There are several suggestions along the lines of a "fail fast" or "give up early" option, which, put alongside the issue of the resolver fetching dependency information for a single package going back 50 versions (see my lxml example in point 2) as it gets slower and slower over the course of an hour on one CI build... doesn't feel like this was ready to be made the default.

  2. The only way I was able to get even close to acceptable performance was with --use-feature=fast-deps, but even this is a partial mitigation, as the dependency resolver will happily walk back so far that it can no longer get the information this way and starts doing things the slow way again. My best example is lxml, which used the fast path from version 4.6.1 back to 3.7.2; then the slower download-the-package-tarball method was in use from 3.7.1 until that CI run timed out after 60 minutes, having walked all the way back to 2.0.8.

  3. CI systems for docker containers frequently contain _no pip cache_ due to the limitations of various combinations of available docker build tools. Regardless of other issues going on, measuring performance of the resolver _must_ include some kind of performance testing for this case: reasonably sized projects of at least 100 total dependencies with a 100% clean build environment, nothing cached.

    • For anyone troubleshooting their own performance problems with the new resolver, I've found one partial mechanism around this: https://stackoverflow.com/questions/58018300/using-a-pip-cache-directory-in-docker-builds. But it relies on experimental new features, so I'd caution anyone against relying on it. And even with this to pull the cache out of the docker environment onto the host, there is the issue of limitations imposed by CI systems that reside outside the docker build itself; in my own case, even if I did pull a pip cache out of the container onto the build host filesystem, the CI service would throw that filesystem content away within 60 minutes of the build box being idle.

Edit- fixed typos

Apologies if this has already been thought of and nixed

It's certainly something that's been implicit in our discussions, but we may never have explicitly covered it. So thanks for bringing it up.

I wonder if pip could make an assumption here though.

The problem is, this is very dangerous ground. There are a huge number of assumptions pip could make which would simplify things, but experience has shown that whenever we do make such assumptions, we find some package that breaks them. And "making an invalid assumption" is pretty much guaranteed to be reported to us as "a bug" 🙂

Your example seems very reasonable, but what if a package released a version supporting python-dateutil <= 3.0, but then added a feature that relied on behaviour that was removed in dateutil 3.0, so they released their next version depending on python-dateutil < 3 as a short term approach while they tried to implement their feature in a way that worked with newer versions of dateutil?

I think if we started using heuristics like this, we'd need to add ways to allow users to control if they are enabled. And that gets complex very fast.

I like the idea in principle (I wish projects wouldn't do weird things that make my job as a pip developer hard 🙂) but the practicalities will quite likely make it infeasible.

@pfmoore Along the lines of what I was getting at in point 3 of my comment https://github.com/pypa/pip/issues/9187#issuecomment-738103308 - a "simple default" heuristic where "default package version backtrack depth = N", with a command line flag to override this with a user-defined value of M, would solve some of the issue with botocore and other packages that walk back far too many versions. For the new resolver, I think simple heuristics like this, with well-documented defaults, helpful error messages, and easy configuration, could go a long way toward mitigating the bug reports that would come from more complicated "smart" approaches.

I think if we started using heuristics like this, we'd need to add ways to allow users to control if they are enabled. And that gets complex very fast.

A fair point. I wholly relate to the "too many knobs" on an OSS project. I understand that making assumptions comes at a cost here, and trust your (and the pip team's) judgement. That said, I think the assumption above is a fair one given a well-designed escape hatch. Perhaps:

  • Users who need an older version could set a constraint for that version? This changes the workaround behavior for the new resolver: instead of forcing everyone to explicitly set lower bounds on requirements to speed things up, only users needing an older version (due to a dependency upper-bound downgrade) would have to set upper bounds.

  • Pip might make the assumption described above, but also backtrack a small amount as needed (with a configurable number of backtracks). This assumes that decreases in a dependency's max version are rare and short-lived (e.g. a patch release downgrades the dep, then the fix lands a version or two later and the cap is removed again) and would solve:

    what if a package released a version supporting python-dateutil <= 3.0, but then added a feature that relied on behaviour that was removed in dateutil 3.0, so they released their next version depending on python-dateutil < 3 as a short term approach while they tried to implement their feature in a way that worked with newer versions of dateutil?

That said, I haven't thought about this nearly as much as y'all have. Just my 2 cents.

@jcrist's proposal makes the same assumption as @dstufft's proposal for binary search: that the minimum required dependencies of a package are usually ordered through the versions of the package.

I suppose the question is whether the cases where this is not true occur frequently enough that the resolver speed-ups are not worth it.

As absurd as it might sound... I wonder if anyone has used all the available package metadata on PyPI to perform an actual analysis of the situation. It should be possible to discover a reasonable lower bound for how often such decreases in a dependency maximum version actually happen.

I'm not able to respond as fast as the suggestions are coming in now, so I'll leave this discussion for now and review later. But can I just say thanks to everyone for the constructive and helpful suggestions. This is a really tricky problem to get right, and new ideas and perspectives are really helpful!

As absurd as it might sound... I wonder if anyone has used all the available package metadata on PyPI to perform an actual analysis of the situation. It should be possible to discover a reasonable lower bound for how often such decreases in a dependency maximum version actually happen.

Not absurd at all. I've an ongoing piece of work trying to collect that data for analysis. The problem is, I've got the stuff available from the PyPI JSON API, and that's been a pretty big job to collect. But getting dependency data means downloading every wheel on PyPI, as well as building all the sdists (and even then, for sdists I only get metadata that applies to my system). I've yet to even work out where to begin with that task. Even "just pick representative packages" isn't practical, as I wouldn't have looked at "botocore" and assumed it was an important package, if I hadn't seen it come up here so often.

(If anyone has a PyPI mirror and could extract the metadata files from all the wheels, and publish just that data somewhere, that would be very useful).

So yes, it's a reasonable suggestion, but the logistics of downloading the whole of PyPI in order to get the data makes it a lot harder than you'd think. (The pip devs don't have any privileged access to PyPI that might make this easier, unfortunately).
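If the data ever does become available, the analysis itself is cheap. A toy sketch of counting upper-bound decreases across releases from per-version requires_dist strings (the history dict is invented; real data would come from the PyPI JSON API or a metadata dump):

```python
import re

# Invented per-version metadata for one package, in requires_dist form.
history = {
    "1.0": ["python-dateutil (<3.0,>=2.1)"],
    "1.1": ["python-dateutil (<3.0,>=2.1)"],
    "1.2": ["python-dateutil (<2.9,>=2.1)"],  # the rare "cap went down" case
}

def upper_bound(req):
    # Extract the '<X.Y' cap as a comparable tuple, if present.
    m = re.search(r"<([\d.]+)", req)
    return tuple(int(p) for p in m.group(1).split(".")) if m else None

def count_decreases(history):
    decreases, prev = 0, None
    for version in sorted(history, key=lambda v: tuple(map(int, v.split(".")))):
        cap = upper_bound(history[version][0])
        if prev is not None and cap is not None and cap < prev:
            decreases += 1
        prev = cap if cap is not None else prev
    return decreases

print(count_decreases(history))  # prints: 1
```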

I'm going to let things accumulate overnight and see how the conversation has moved on. But for now, @pfmoore (and anyone else curious; it's a bit late for me to start running this sort of thing at this hour of night), there are a few starting points I found while looking into whether it had been attempted, from other people who have done PyPI analysis "the hard way" in the past.

... edit: Last thought, for anyone looking into this idea, don't forget to check out how pip does the dependency lookup for --use-feature=fast-deps because it will likely make a big difference in how easily you get the metadata for so many versions.

@techdragon These posts are mostly from a fair few years ago, and PyPI has significantly more packages now than it did before (here's a post from early 2019, showing the growth curve: https://blog.adafruit.com/2019/03/13/growth-in-the-python-ecosystem-python/). It's roughly exponential growth.

Many of these techniques now require _significantly_ more resources than what they used to. I've had some visibility in a non-public effort to do this, and even with the biggest single EC2 machine from AWS, it was non trivial and called off before anything useful came out of it.

On the issue of analyzing (and possibly validating) archival metadata from packages on PyPI, folks may be interested in https://github.com/pypa/warehouse/issues/474#issuecomment-370986838 and https://github.com/pypa/packaging-problems/issues/264 , as listed in the "Audit and update package metadata" item in the list of fundable packaging projects.

(If anyone has a PyPI mirror and could extract the metadata files from all the wheels, and publish just that data somewhere, that would be very useful).

fwiw: I scraped PyPI in September 2019 and August 2020, using distlib to get the dependencies of each package, and dumped the results to a file. The results are messy and incomplete (I don't remember how distlib works out the dependencies, so I don't know exactly _how_ incomplete), and I don't have time right now to clean them up, but that's now available here if it's useful to anyone: https://github.com/sersorrel/pypi-stats

The hope is that people would provide more reasonably strict requirements than the collective community traditionally prefers. It is very rare, when users ask for “requests” (for example), that really any version of requests would do; but Python packaging tools traditionally “help” the user out by naively settling on the newest possible version. My hope is that Python package users and maintainers alike would be able to provide more context when a requirement is specified; this would help all users, maintainers, and tool developers create a better co-operating environment.

This is reasonable at the top: capping the version makes sense, since you don't know whether new versions are going to work (and this is obviously easier for packages that follow semantic versioning).

The problem is in the other direction: more restrictive lower limits ("we only support the latest three versions of library X", for example) narrow the compatibility window between packages. If you've got one package that works with ~v1 but is set to require >=v1.46, and another package that's capped at <=v1.40, you won't be able to find a match at all.
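The dead end described above can be sketched numerically. Treating versions as (major, minor) tuples, with the bounds borrowed from the hypothetical example (library X and its version numbers are made up):

```python
# One package requires X >= 1.46, another caps X at <= 1.40.
# Tuples compare lexicographically, which is enough for this sketch.
def satisfies(version, lo=None, hi=None):
    """True if lo <= version <= hi (either bound may be absent)."""
    if lo is not None and version < lo:
        return False
    if hi is not None and version > hi:
        return False
    return True


candidates = [(1, n) for n in range(60)]  # X 1.0 .. 1.59

needs_recent = [v for v in candidates if satisfies(v, lo=(1, 46))]  # >=1.46
capped = [v for v in candidates if satisfies(v, hi=(1, 40))]        # <=1.40

overlap = [v for v in candidates if v in needs_recent and v in capped]
print(overlap)  # → [] : no version of X satisfies both constraints
```

An empty overlap is exactly the situation where the resolver has nothing left to try, no matter how long it backtracks.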

That's fine if it's for legitimate reasons, but if it's only happening to avoid dealing with excessive issue reports caused by the package manager, then it seems like a problem. This is a simplified example: in the real world, where people depend on more than two libraries, and maintainers (who often work for free) stick to supporting only a minimal set of versions, it will be far worse and will result in a lot of cases where pip can't find a match at all.

Even "just pick representative packages" isn't practical, as I wouldn't have looked at "botocore" and assumed it was an important package.

https://pypistats.org/top

What @tedivm said is true -- by encouraging module developers to more tightly pin their requirements, you are asking for Version Hell where two modules are just impossible to install together, even if they would actually work.

Seriously, pip maintainers: please roll back to the previous resolver and come back with a better one when all those problems get resolved. PLEASE. There is no way you are going to resolve all those issues now. There is no way it can be done.

Just treat it as a learning opportunity. There is no problem with trying, but when you see it does not work, there is no shame in withdrawing.

Hi, @potiuk! Thank you for sharing your advice and opinion. We will take it into consideration.

We ran a beta period for the new pip resolver to solicit bug reports, starting with the release of pip 20.2 in July. Folks who are reporting new issues to us: it would really help us if you could also tell us what could have happened differently so that you could have tested, caught, and reported the problems you're seeing during the pip resolver beta period. No matter whether we rip out the legacy resolver as planned, keep it around for longer, or do something else, your response on that point will help us with future rollouts.

I actually shared the Airflow story here already, with a full explanation of why we tried, but could not test, the new resolver beforehand: https://github.com/pypa/pip/issues/9203#issuecomment-737878689

@potiuk Thank you for doing that! Other participants here - please do follow @potiuk's lead.

And to be very frank, this was not a complaint of any sort. It was just a piece of advice. I DO understand how hard a job you have keeping half of the internet working! It was really a suggestion, and an observation as a user that there is no way you can fix it now, looking at the type and amount of problems we see. It's simply a realistic (I think) assessment of the situation, suggesting a course of action you might take.

I do appreciate all the work you put into it! It's just not going to work this time, and I think it's good to face that reality.

A second to both of @potiuk's points -- I've long bemoaned the lack of a "proper" solver in pip so I can't wait for the kinks to get worked out of this.

And the developer equivalent of HugOps to everyone on the pip team!

Maybe the backtracker should stop at 2 backtracks by default?

Would it make sense to raise by default (and print out the test case) if total elapsed time exceeds a default threshold, as well?

(Note that the "generate an infinite chain of packages and slowloris" worst case here is still unbounded when there's only a backtrack limit)

@westurner no, because not everyone has fast internet connections and/or computers, so time based methods would make pip unusable by default for such users.

Number of backtracks would be a better heuristic.

This is a long thread, with many different opinions and suggestions. I'll throw my own experience in here too.

At work, I'm using the new resolver and it works very well. We have ~70 complex direct dependencies, including large ML frameworks such as pytorch, tensorflow, etc and also boto3 and the usual suspects like requests, Click, Flask. It resolves to a full dependency graph of ~267 dependencies successfully.

For production systems, I am very much of the opinion that dependencies should be locked ahead of time. A developer working on a project needs to understand the impact of updating or adding a dependency. I realize this workflow or opinion is not the only one. I'm not too interested in discussing the pros and cons either. I'm only sharing what works well in my experience and what I believe leads to deterministic, good outcomes.

To lock our dependencies, we have very positive results using pip-tools which was mentioned earlier.

Our workflow is as follows:

  • Developer modifies requirements.in
  • Developer runs a script that ultimately invokes python3.7 -m piptools compile
  • Developer observes the resolution results and commits the locked requirements.txt file to the repo
  • CI only needs to use the locked requirements.txt file which is already guaranteed to resolve in a reasonable time
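Concretely, the input and output of that compile step look something like the following (the package names and pins are illustrative, not our actual dependencies; the `# via` annotations are what pip-tools emits):

```
# requirements.in -- loose ranges, edited by hand
flask>=1.1,<2

# requirements.txt -- generated by `python3.7 -m piptools compile`, committed
flask==1.1.2
click==7.1.2            # via flask
itsdangerous==1.1.0     # via flask
jinja2==2.11.2          # via flask
markupsafe==1.1.1       # via jinja2
werkzeug==1.0.1         # via flask
```

CI then runs plain `pip install -r requirements.txt` against the fully pinned file, so no resolution work happens there at all.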

I recognize this only works for "end-users" and not for other projects, such as Airflow, which need to have "looser" dependencies. However, I do think projects such as awscli, boto, and airflow have some responsibility for setting reasonable constraints on their dependencies. Other language ecosystems, such as Java and JavaScript, manage to do this successfully; why shouldn't Python?

I'd also like to bring attention to some data that may assist with some static analysis of PyPI.
https://github.com/DavHau/pypi-deps-db

The link above is from mach-nix which tries very hard to make Python pleasant to use in Nix. I'm not suggesting the Python ecosystem needs to move towards such extreme mechanisms, but perhaps that database will help if somebody is looking to analyse things.

Reasonable default limits for a tool that should not download an infinite sequence of code to execute with install-level permissions:

  • number of backtracks: 100?
  • total runtime: 1 hour?
  • number of packages: 1000?

These could be overridden with pip.conf and/or PIP_ environment variables.
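If limits like these were adopted, pip's existing configuration machinery already covers the override story, since every long option maps to a `PIP_<OPTION>` environment variable. A hypothetical pip.conf sketch follows; none of these option names exist in pip today:

```
[install]
# Hypothetical knobs, NOT real pip options; shown only to illustrate
# how proposed resolver limits could be raised or disabled per user.
max-backtracks = 100
max-resolve-seconds = 3600
max-packages = 1000
```

The environment-variable equivalents (e.g. a hypothetical PIP_MAX_BACKTRACKS=100) would then follow for free from pip's normal option handling.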

We absolutely should make attempts to bound the resource consumption (bandwidth, cpu time) of the worst cases.

Pinning dependencies is a workaround that can delay the time-to-patch for critical-severity issues; with (e.g. SemVer) range constraints, users needn't be locked to old versions.

(Edit) Though, to be fair, pip doesn't claim to manage the full software distribution lifecycle: pip install -U could break things or simply not work, so we should not expect users to run pip install regularly, or even ever again (until a container is rebuilt with --no-cache, to be honest).
