Pip: Add `pip install --dry-run` or similar, to get resolution result

Created on 15 Mar 2011  ·  58 Comments  ·  Source: pypa/pip

What's the problem this feature will solve?

pip currently does not have any mechanism for users to get the result of pip's dependency resolution. This functionality is useful for users to be able to:

  • generate something like a "lockfile"
  • check whether installing a package would break the existing environment
  • check for dependency conflicts among a set of packages
  • (more?)

All of these can be done today, but they require installing packages into some environment and then introspecting that environment. Since all the relevant information is available to pip install at run time, it would be useful for pip to expose it directly, without requiring an actual installation.

Describe the solution you'd like

#8032 proposes a pip install --dry-run option.

#7819 proposes a pip resolve command.

#1345 has more proposals. :)

There are likely more proposals in the issue tracker that I can't find.

Alternative Solutions

Let other not-pip tooling in the ecosystem provide this functionality to users. This is sub-optimal given that pip's resolver isn't exposed publicly in any way (pip's internals are not to be used like a library).

The most notable example is the pip-tools project, which is currently the best answer for any user seeking this functionality.


Note: This description has been edited by @pradyunsg in April 2020 (see edit history for details), and some really old, outdated comments have been hidden on this issue.

dependency resolution UX feature request

Most helpful comment

Someone needs to figure out how to present the resolution result first. AFAIK the pip maintainers involved in resolver work are all currently putting their effort into improving the resolver itself, so this would need some outside help to move forward.

All 58 comments

Hmm, maybe we need a command to list all package versions available in PyPI.
What do you think?

Like:

$ pip list Django
1.2.4
1.2.3
1.2.2
1.2.1
1.2
1.1.3
1.1.2
1.0.4
$

Original Comment By: Hugo Lopes Tavares

I like that idea.


Original Comment By: CarlFK

I just implemented it in my fork. I want carljm, jezdez and ianb's opinions
before merging.

Check it here: https://bitbucket.org/hltbra/pip-list-command/changeset/e5c135a46204


Original Comment By: Hugo Lopes Tavares

I haven't thought about this deeply, but it seems better to me to improve the
"search" command to also list versions (maybe list all versions with a -v flag
or something), rather than adding a new separate command. This functionality
seems to be logically a part of search.


Original Comment By: Carl Meyer

I would be ok with adding this to search as long as my other ER is implemented
too so that I don't get 3000 lines (really) when I look for django's versions.
(there are over 1000 hits to "pip search django" because of all the django-foo
packages. )


Original Comment By: CarlFK

I think it's unnecessary to show what version would be installed as long as
there is a list of available versions, because it's easy to determine which of
these is the latest. Furthermore there is no reason for using a new flag to
enable the output of this list, because the overhead tends toward zero. It's
just a small change:
https://bitbucket.org/jumpa/pip/changeset/62076643cf33


Original Comment By: jumpa

Hi Carl, I have thought about adding an option to the search command, but my
fear was repeating what happened to the install command: if you want to
"upgrade" a package, you need to use the "install" command - that's weird, and
we know that.

And if most people think adding an option to search is better, I don't see
much of a problem with using search --versions or search -v.


Original Comment By: Hugo Lopes Tavares

I think listing all available versions by default is too verbose. Some
packages have many available versions on pypi - ten or twenty isn't uncommon
at all. Listing the latest version by default and all versions with a flag
seems reasonable.

And I agree with CarlFK that we also need improvements to search to make it
easier to narrow. The trouble there is we depend on what PyPI's API gives us,
which isn't much: we aren't going to download all of PyPI and do a regex
search locally! I'd be in favor of something like an --exact flag to search as
part of this change, so you can search for e.g. "--exact django" and only get
django itself in your result.


Original Comment By: Carl Meyer

Hi jumpa, great idea to show the versions using the ones retrieved over the
xmlrpc connection!

I got the following snippet when searching for pip:

pip                       - pip installs packages.  Python packages.  An
easy_install replacement (versions: 0.2, 0.2.1, 0.3, 0.3.1, 0.4, 0.5, 0.5.1,
0.6, 0.6.1, 0.6.2, 0.6.3, 0.7, 0.7.1, 0.7.2, 0.8, 0.8.1, 0.8.2)

This is going to be such a big list of packages, using lots of versions -
because our search cares about names and summaries. Should we care about it?


Original Comment By: Hugo Lopes Tavares

As an end user, I like the idea of having a separate list command*. As time
goes on, a separate command may be easier to modify and enhance in isolation.
To make a separate list command more useful, you could indicate which version,
if any, is currently installed.

pip list Django

1.2.4
1.2.3 installed
1.2.2
1.2.1
1.2
1.1.3
1.1.2
1.0.4

Note: Many popular package managers such as YUM (RedHat), pkglist (BSD), and
dpkg (Debian) provide a separate list flag or command.


Original Comment By: Kelsey Hightower

Carl, I was thinking another day about this issue, and I have to agree with
Kelsey: another command is not bad.

Think about using one flag to indicate that I want just that package name, and
another flag to indicate that I want to retrieve all available versions.

It is a bit odd.

Let's try to illustrate:

$ pip search -v --exact Django

versus

$ pip list Django

Kelsey's idea of showing what version is installed just improves the list
command.


Original Comment By: Hugo Lopes Tavares

I made the changes to the search command.

Since the search command already shows the installed and latest available versions, it probably makes more sense to augment the search command to list all available versions too.

Here's my changeset: https://github.com/phuihock/pip/commit/46bad23b5bf31850a02803ea1ca18be5deca9db3

What's the status with this? Can one see the latest available version on PyPI by using pip?

Is this implemented already? This has been nagging me for about two years now, and I would love to see this solved.
The limited search combined with a flag for versions looks like a very usable solution.

Just wanted to add here that I just ran across this thread while doing a search for how to show versions... I tried pip search -v package -- somehow that would have made intuitive sense to me: a verbose description of the package to be installed, including version info...

I _just_ realised this functionality is still not implemented after a colleague of mine asked me about it. Are there any plans that it will become available in the coming versions of pip?

I believe this PR might be relevant, that's currently being worked on?

You might also be interested in the Linuxconf 2014 talk "Python Packaging 2.0: Playing Well With Others" on the current situation, as well as the future, of python packaging. The speaker said (if I understood correctly) that some of the limitations on metadata in pip are a consequence of the design of PyPI, which was originally based on CPAN, and that reworking PyPI's backend (whilst remaining compatible with the current one using tests) should improve the situation. He was mostly talking about "system integrators" i.e. downstream packagers, but I guess that would affect things like this issue, making them easier to resolve.

+1,000,000

and how's it now?

Now, is there any way to show which version is latest without installing it?
This issue has been open since 2011 and the patch I've seen above is just one line. :(

This seems like a minor feature to enable, how is there not an option equivalent to say apt-cache madison at this point?

I would really like to see the latest version of the PyPI package when searching as well. Having an exact match works, but I use awk as a workaround.

This frustrated me as well, and seeing that there's little to no hope on this, I decided that creating an alternative might be worth my time. Along with this problem, I implemented some other requested features (like regex search, colorized output, etc.). If you are interested you can check it out here

wget -qO- https://pypi.python.org/pypi/uWSGI | egrep -o "uwsgi-[0-9].[0-9].[0-9][0-9].tar.gz" | sort -u

wget -qO- https://pypi.python.org/pypi/uWSGI/json | grep '"version"'

@andreif This will not always find the correct version (pip ignores alphas, betas, release candidates, etc. unless --pre is provided). This is closer (but no guarantees either):
wget -qO- https://pypi.python.org/pypi/uWSGI/json | grep -E ' {8}"[0-9."]*": \[' | sort -V | tail -n 1 | tr -d ' ":['
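The same idea can be sketched in Python without relying on grep patterns. This is a rough illustration only: it assumes a payload whose "releases" mapping is keyed by version string, the sample data is made up, and the stability check is a simplification (only plain dotted numbers count as stable) rather than a full implementation of Python's version rules.

```python
import json
import re
from typing import Iterable, Optional

# Simplification: only plain dotted numbers (e.g. "2.0.10") count as stable.
# Alphas, betas, release candidates and anything else with letters are skipped.
STABLE = re.compile(r"^\d+(\.\d+)*$")


def latest_stable(versions: Iterable[str]) -> Optional[str]:
    """Return the highest stable version, or None if there is none."""
    stable = [v for v in versions if STABLE.match(v)]
    if not stable:
        return None
    # Compare numerically so that 2.0.10 sorts above 2.0.9.
    return max(stable, key=lambda v: tuple(int(part) for part in v.split(".")))


# Made-up sample payload, shaped like a "releases" mapping keyed by version.
SAMPLE = json.loads(
    '{"releases": {"2.0.9": [], "2.0.10": [], "2.1a1": [], "2.1rc2": []}}'
)
print(latest_stable(SAMPLE["releases"]))
```

Comparing numeric tuples avoids the string-sorting trap where "2.0.9" would rank above "2.0.10".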

Ok, then the JSON response should include something like "pre-version": "1.2.3a4", so one can grep both of them with a simple expression.

I don't really understand this issue... What is this about?

  • Adding a new option on pip install which would make pip run only until resolution of the packages and print the packages selected and exit (skipping the installation)?
  • Showing the latest version next to package names when using pip search?
    • Being able to see the latest version of a package on PyPI?

The latter seems to have been solved to me and the former should probably get a new dedicated issue.

@pradyunsg Well, if I recall correctly, I needed a simple way to check what version is currently available (both release and pre-release). This version will be installed by pip install -U [--pre].

I needed it for a script setting up a virtualenv. When there was a newer version of a package it asked whether to update or keep the current version. So this use-case is covered by the new pip search.

pkg="foo"; pip install --download /dev/null --no-cache-dir --no-binary :all: "$pkg" 2>&1 | egrep -i "$pkg"'-' | head -1 | egrep -io "$pkg"'-[^ ]+' | sed 's/^'"$pkg"'-\(.*\)\.tar\.gz$/\1/g'

Post --download-deprecation version...

pkg="foo"; tmp="$(mktemp -d)"; pip download -d "$tmp" --no-cache-dir --no-binary :all: "$pkg" 2>&1 | egrep -i "$pkg"'-' | head -1 | egrep -io "$pkg"'-[^ ]+' | tr A-Z a-z | sed 's/^'"$pkg"'-\(.*\)\.tar\.gz$/\1/g'

Using something like:

pip install foo==

gives a list of all available versions (for a valid pypi available package, molecule in this case):

Could not find a version that satisfies the requirement molecule== (from versions: 1.20.1, 1.20.3, 1.21.1, 1.22.0, 1.23.0, 1.25.0, 1.25.1, 2.10.0, 2.10.1, 2.11.0, 2.12.0, 2.12.1, 2.13.0, 2.13.1, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.18.1, 2.19.0) No matching distribution found for molecule==

but it would be nice to be able to get the version that would be installed using pip without actually downloading and/or installing it to /dev/null :-)
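For scripts that lean on this trick, the version list can be pulled out of the message with a small parser. The sketch below assumes the message keeps the "(from versions: ...)" shape quoted above; that wording is not a stable interface, so treat this as a fragile workaround rather than a supported approach. The function name is made up for illustration.

```python
import re
from typing import List


def parse_versions(message: str) -> List[str]:
    """Extract the versions listed in a '(from versions: ...)' fragment."""
    match = re.search(r"\(from versions: ([^)]*)\)", message)
    if match is None:
        return []
    return [v.strip() for v in match.group(1).split(",") if v.strip()]


# Shortened copy of the message format quoted in the comment above.
sample = (
    "Could not find a version that satisfies the requirement molecule== "
    "(from versions: 1.20.1, 1.20.3, 2.19.0) "
    "No matching distribution found for molecule=="
)
print(parse_versions(sample))
```

Returning an empty list when the fragment is missing lets a calling script distinguish "no versions reported" from a parse crash.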

Agreed. A package manager without this functionality is kind of a joke. Even your example of getting a list of versions is a complete hack to compensate for yet another feature this package manager doesn't have. I don't say that to be mean, but these are things that should've been available well before eight years ago when this very bug which is utterly ignored was created. I assume pip will just be replaced by something that does in fact have more features, but is also abominably, grotesquely, monstrously large & overcomplicated. Oh well, we can always write our own.

pip-tools is a tool based on pip which I believe can help with some of the things the people on this thread are asking about: https://github.com/jazzband/pip-tools

If you give it a list of abstract dependencies (i.e. with no versions specified), it will tell you the specific versions of each requirement and dependency to install.

pkg="django"; echo "$pkg" | pip-compile - --output-file - | egrep -i '^'"$pkg"'=' | cut -d '=' -f 3- is just about as silly (more silly if you count that it's another thing you have to install [further silly if you need python2 support])

Moreover, the whole point of pip-tools (& pipenv) can be accomplished with plain pip and a constraints file. (pip install -r reqs -c constraints; pip freeze > constraints).

Adding a new option on pip install which would make pip run only until resolution of the packages and print the packages selected and exit (skipping the installation)?

I just performed some significant housekeeping here, and this issue is now for tracking / discussing this use case + if/how pip would change to accommodate this request for new functionality.


I've hidden a whole bunch of comments, ranging from really old comments that no longer have the relevant context attached, to those that included "potential hacks/scripting nuggets" to do 90%-of-the-job for this. For the latter group, please refrain from posting more of the latter here -- there's other user support forums where those suggestions would be more appropriate, not on the forum for discussing how to implement the fix in the tool itself. :)

Apologies to anyone whose actually-still-relevant-and-useful comment got hidden; I really couldn't figure out context for a lot of comments and yours might've just gotten hidden in collateral damage.

As this ticket is blocked by the development of the dependency resolver (#988), I thought I would mention here that the team is looking for help from the community to move forward on that subject.

We need to better understand the circumstances under which the new resolver fails, so are asking for pip users with complex dependencies to:

  1. Try the new resolver (use pip 20.1 and pass --unstable-feature=resolver)
  2. Break it :P
  3. File an issue

You can find more information and more detailed instructions here

One use case I'd like to highlight that arose from experimentation in #7819 which I don't yet see mentioned in this thread is specifically recording URLs for later downloading and installing packages, which is slightly orthogonal functionality to the lockfiles discussed above, and is specifically useful for consuming the results of a pip resolve without having to download any large files.

We have been developing a packaging method for large, typically machine learning applications at Twitter which we are calling "ipex" which can be shipped without containing 3rdparty code until they are first executed (which reduces their size greatly). In the case of pantsbuild/pants#8793, we generate an executable pex archive which calls the pex runtime library to resolve requirements (pex executes pip under the covers). I'm currently working on a prototype that replaces the full pex/pip resolve step at runtime with a replacement that records just the URLs to download dists from (the req.link). This is extremely fast in practice (and it can be cached very granularly), since the download and file copying to create the "hydrated" pex file can be performed entirely in parallel.

That capability (of downloading and installing tons of wheels/non-wheels in parallel) relies on additionally just exposing the URL of any wheel or non-wheel that we would put into a lockfile, which I don't see mentioned here yet. That allows pants to invoke pip exactly once (to resolve download URLs) when a dehydrated ipex file is created, and then the result of that "resolve" step with URLs can be consumed to download requirements when the ipex file is first invoked on a completely different machine without having to invoke pip again.

It required a lot of effort in #7819 to propagate the URLs from the guts of the v1 resolver to the output. It was a lot less effort when I last tried to make it work with the v2 resolver. Right now we're probably planning to ship some experimental version of a --dry-run or resolve command internally which spits out download URLs -- if we successfully do that, that should hopefully help to turn up any remaining issues with --unstable-feature=resolver in the meantime! :D :D

As you mentioned, the lock file format design is orthogonal to the resolver implementation. However, it also means it is out of scope for the current pip project. There have been discussions on the topic (warning: very long thread), but considering the lack of developer time available, this in turn means that serious discussion will likely not happen before the resolver is at least stabilised.

Thank you for the link!!

On Sun, May 24, 2020 at 19:34 Tzu-ping Chung notifications@github.com
wrote:

As you mentioned, the lock file format design is orthogonal to the
resolver implementation. However, it also means it is out of scope for the
current pip project. There have been discussions on the topic
https://discuss.python.org/t/structured-exchangeable-lock-file-format-requirements-txt-2-0/876/1
(warning: very long thread), but considering the lack of developer time
available, this in turn means that serious discussion will likely not
happen before the resolver is at least stabilised.



@cosmicexplorer Have you shipped that experimental version of a --dry-run or resolve command internally yet? If so, how is it going?

You may have noticed that I'm extremely interested in this feature 😄

Using something like:

pip install foo==

gives a list of all available versions (for a valid pypi available package, molecule in this case):

Could not find a version that satisfies the requirement molecule== (from versions: 1.20.1, 1.20.3, 1.21.1, 1.22.0, 1.23.0, 1.25.0, 1.25.1, 2.10.0, 2.10.1, 2.11.0, 2.12.0, 2.12.1, 2.13.0, 2.13.1, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.18.1, 2.19.0) No matching distribution found for molecule==

but it would be nice to be able to get the version that would be installed using pip without actually downloading and/or installing it to /dev/null :-)

Nice trick !! Useful and convenient !! Really impressive !!

potential hacks/scripting nuggets ... please refrain from posting more of the latter here -- there's other user support forums where those suggestions would be more appropriate, not on the forum for discussing how to implement the fix in the tool itself. :)

Not for nothing, truly, but this is the kind of thing that happens when you let a bug linger for very long (7.56 years in the case of my own most recently contributed "nugget", although this still-open bug is now 9.25 years old). People share their workarounds.

I also doubt hiding comments including workarounds will help people realize the workaround they're about to post in a comment is unnecessary (partly because nobody is going to click all those hidden comments to see what has already been said). When someone comes to a decade-old bug and doesn't find some kind of progress or direction from the maintainers, or a workaround, I dare say they will consider sharing a workaround to be necessary, because making other people suffer through the work you've already done itself is unnecessary.

And yes, this very comment is also what happens when what you've done in response to this still-open bug happens.

Don't worry, I for one won't add any more unprovoked comments unless pip breaks my script again and this bug is still open.

Thanks for what you do. :)

@brainwane @ei8fdb I want to flag this issue as important from a UX perspective - related to #8377

High level summary based on my understanding:

  • with the new resolver, pip will be less permissive and refuse to install conflicting dependencies (ResolutionImpossible)
  • dependency conflicts can exist anywhere in the dependency tree
  • existing tools (pipdeptree, pip-conflict-checker) only show packages that are already installed, not those that have been requested but failed
  • There is currently no way for users to work out where the dependency conflict is _before_ a package is installed, or when a ResolutionImpossible error occurs (other than to manually inspect each project's dependencies)

In short, we need a way for users to detect possible dependency conflicts based on their top level requirements (i.e. the packages specified in requirements.txt or entered directly into the command line).

If we decide to do this, the proposed flag name (--dry-run) should be researched/discussed.

@uranusjr @pfmoore - please correct me if I've gotten anything wrong, or missed something based on our discussion. Thx

@nlhkabu I agree with all of your comments above. However, just to be clear, a --dry-run style of command will allow users to check if there's going to be a dependency conflict. But as described, it won't offer any additional help in diagnosing why the conflict exists. So it's basically a "look before you leap" install command, in contrast to the normal "ask forgiveness" approach where we install if we can, but do nothing and report the error if not.

What this doesn't provide, and which is something that IMO would be very useful (either as a pip subcommand, or just as usefully as a 3rd party tool) is a way of listing what the dependency tree that pip is working from looks like. (This doesn't require a resolver or an actual install step, it's "simply" transitively listing the dependency metadata from the package sources).
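As a rough sketch of what "transitively listing the dependency metadata" could look like, the snippet below walks a mapping from project name to declared requirements and prints an indented tree. The metadata here is a made-up in-memory dictionary and the function name is hypothetical; a real tool would have to obtain the metadata from the package sources, which this sketch does not attempt.

```python
from typing import Dict, List, Tuple


def dependency_tree(root: str, metadata: Dict[str, List[str]]) -> List[str]:
    """Return indented lines listing root and its transitive dependencies."""
    lines: List[str] = []

    def walk(name: str, depth: int, path: Tuple[str, ...]) -> None:
        if name in path:
            # Stop on cycles instead of recursing forever.
            lines.append("  " * depth + name + " (cycle)")
            return
        lines.append("  " * depth + name)
        for dep in metadata.get(name, []):
            walk(dep, depth + 1, path + (name,))

    walk(root, 0, ())
    return lines


# Hypothetical metadata: project name -> declared requirements.
METADATA = {"a": ["c", "d"], "c": ["e==1.0"], "d": ["e==2.0"]}
print("\n".join(dependency_tree("a", METADATA)))
```

No resolution happens here: both e==1.0 and e==2.0 appear in the output, which is exactly the kind of view that helps locate a conflict by eye.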

This could also take the form of a pip resolve command.

pip resolve is what most would expect, please call it that 😄 It would also allow for its own flags eventually.

Thanks for the clarification @pfmoore. From a user perspective, I'm not sure how much use --dry-run is without resolve?

IMO, it's not enough to tell users they will get an error - we also need to give them enough information to find where it is and do something about it.

So, imagine that a user runs --dry-run... we could include something like this in the response:

Dependency conflict detected. pip will not be able to install d 1.0 and c 1.0.
The conflict is caused by:
d 1.0 depends on E==2.0
c 1.0 depends on E==1.0
Run pip resolve to inspect dependency tree.

We could also reuse pip resolve in the ResolutionImpossible error message (see #8377), which would be a big win.
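To make the proposed message concrete, here is a small sketch that detects conflicting exact pins and formats a report in the style of the sample above. It is only an illustration: the input structure and function names are invented, it handles "==" pins only, and it says nothing about how pip's resolver actually detects or reports conflicts.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pins = Dict[str, Dict[str, str]]  # parent -> {dependency: pinned version}
Conflicts = Dict[str, List[Tuple[str, str]]]  # dependency -> [(parent, version)]


def find_conflicts(pins: Pins) -> Conflicts:
    """Group pins by dependency and keep those pinned to differing versions."""
    by_dep: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for parent, deps in pins.items():
        for dep, version in deps.items():
            by_dep[dep].append((parent, version))
    return {
        dep: reqs
        for dep, reqs in by_dep.items()
        if len({version for _, version in reqs}) > 1
    }


def format_conflicts(conflicts: Conflicts) -> str:
    """Render conflicts in the style of the sample message above."""
    lines: List[str] = []
    for dep, reqs in conflicts.items():
        parents = " and ".join(parent for parent, _ in reqs)
        lines.append(
            f"Dependency conflict detected. pip will not be able to install {parents}."
        )
        lines.append("The conflict is caused by:")
        for parent, version in reqs:
            lines.append(f"{parent} depends on {dep}=={version}")
    return "\n".join(lines)


# Hypothetical input mirroring the sample message.
PINS = {"d 1.0": {"E": "2.0"}, "c 1.0": {"E": "1.0"}}
print(format_conflicts(find_conflicts(PINS)))
```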

@pradyunsg do we have a separate ticket for pip resolve?

Also, just to be clear, I believe the intended use case for pip resolve is to either (assuming success):

  1. redirect the output to a file (that will usually be committed), or
  2. other tools will use/parse the output

For Twitter, using the "ipex" tool as described in #7819, we are creating executable pex files using a pip resolve command which will output download URLs for all resolved distributions instead of downloading anything (not being used in production yet). This, along with several other optimizations e.g. #8448, allows creating these ipex files within seconds. These ipex files then download all of the distributions output from the pip resolve command the first time they are executed, from within the same datacenter -- this allows the ipex files themselves to be megabytes instead of gigabytes, which improves upload time from many regions.

So we basically embed a json version of the pip resolve output as a file in the pex archive, and we have a bootstrap script read that to download the distributions in parallel.
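The bootstrap step described above might look roughly like the following. Everything specific here is an assumption made for illustration: the JSON layout, the URLs, and the function names are invented and do not reflect the actual ipex or pex format. The download function is injected so the sketch runs offline with a stand-in.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical shape for an embedded resolve result; not a real pip or pex format.
RESOLVE_JSON = """
[
  {"name": "e", "version": "1.0", "url": "https://example.invalid/e-1.0-py3-none-any.whl"},
  {"name": "c", "version": "1.0", "url": "https://example.invalid/c-1.0.tar.gz"}
]
"""


def hydrate(
    resolve_json: str, fetch: Callable[[str], bytes], workers: int = 4
) -> Dict[str, bytes]:
    """Fetch every resolved URL in parallel and return url -> payload."""
    urls = [entry["url"] for entry in json.loads(resolve_json)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so payloads line up with urls.
        payloads = list(pool.map(fetch, urls))
    return dict(zip(urls, payloads))


def fake_fetch(url: str) -> bytes:
    # Stand-in for a real HTTP download so the sketch runs without a network.
    return url.rsplit("/", 1)[-1].encode()


downloaded = hydrate(RESOLVE_JSON, fake_fetch)
print(sorted(downloaded))
```

Because the URLs are known up front, no resolver has to run at this stage, which is the property the comment above relies on.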

Any update on this?

Someone needs to figure out how to present the resolution result first. AFAIK the pip maintainers involved in resolver work are all currently putting their effort into improving the resolver itself, so this would need some outside help to move forward.

Please correct me if I'm wrong, but the following seems to be true:

  • Installing a Python package involves executing its setup.py.
  • Without a --dry-run option, there's no easy and reliable way to know which packages pip's resolver will choose to install.

Hence, it seems to me that running pip install means consenting to running code from a rather arbitrary selection of PyPI packages on one's machine without an easy and reliable way to audit it. That selection depends recursively on the dependency choices and security practices of individual package authors.

It depends on whether the project and version to be installed only has a source distribution (sdist, contains setup.py) or also has wheels (built distribution, contains a metadata text file, is installed by file copies without arbitrary code being run).

Even with --dry-run, it's likely that pip will need to run build backends for the packages (which for setuptools, involves running the setup.py) which do not have wheels.
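The distinction drawn in the last two comments can be approximated from file names alone. The sketch below assumes the conventional extensions (.whl for wheels; .tar.gz, .tar.bz2 or .zip for sdists) and uses made-up function names. It ignores whether a given wheel is compatible with the current platform, so it is only a first-pass check.

```python
from typing import Iterable


def distribution_kind(filename: str) -> str:
    """Classify a distribution file by its conventional extension."""
    name = filename.lower()
    if name.endswith(".whl"):
        return "wheel"
    if name.endswith((".tar.gz", ".tar.bz2", ".zip")):
        return "sdist"
    return "unknown"


def may_run_build_code(filenames: Iterable[str]) -> bool:
    """True if no wheel is listed, so a build backend would have to run."""
    return not any(distribution_kind(f) == "wheel" for f in filenames)


print(may_run_build_code(["uwsgi-2.0.18.tar.gz"]))
print(may_run_build_code(["molecule-2.19.0.tar.gz", "molecule-2.19.0-py3-none-any.whl"]))
```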
