Moby: Reset properties inherited from parent image

Created on 6 Jan 2014  ·  153 Comments  ·  Source: moby/moby

When building an image I may want to reset some of its properties instead of inheriting them from the parent image. It makes sense to inherit all properties by default, but there should be a way to explicitly and selectively reset them when it makes sense.

This is a more generic solution to #2210 which only addresses expose.

area/builder kind/enhancement status/needs-attention

All 153 comments

Suggestions welcome for syntax.

The best I can come up with is corresponding commands like UNVOLUME or more generically -VOLUME (but that would add more confusion, and potentially even create the misconception that +VOLUME should work, and should work differently from just VOLUME).

I very definitely want such a thing (most especially for VOLUMEs). It's also a little disorienting that things like VOLUME apply to following RUN lines, but things like ENTRYPOINT don't. Sometimes that's very useful, and sometimes it's not, but a generic "disable previous X instruction" could solve the issues around that quite nicely.

Is there a workaround for this in the meantime? I'm extending an image with an ENTRYPOINT (https://github.com/jagregory/pandoc-docker/blob/master/Dockerfile) and I need to unset the entrypoint. I tried using the following in my Dockerfile:

FROM jagregory/pandoc
# this basically gets ignored (bug?)
ENTRYPOINT []

FROM jagregory/pandoc
# this will make docker try to exec '' (the empty string)
ENTRYPOINT [""]

FROM jagregory/pandoc
ENTRYPOINT ["/bin/sh", "-c"]
# this will only work if docker run args are quoted:
#   docker run dergachev/pandoc "echo a b c"

Thanks!

I would definitely like to have some way of removing VOLUME points inherited from parent images.

For example, suppose I had a main image for an application that used an external mountpoint for persistent data, but I also wanted an image based on it that was pre-populated with test data instead. As-is, I can't do that if the parent image used VOLUME because any changes/additions to those directories, even if those changes are during a docker build, are lost on commit.

Just to update from @dergachev's comment: CMD [] and ENTRYPOINT [] were working when I last tested them, and should still be working (anything else would be ripe for bug filing).

You can reset all of the single-option commands via

ENTRYPOINT []
CMD []
USER 0
WORKDIR /

This should leave the remaining, un-resettable metadata as ENV, VOLUME, EXPOSE and maybe ONBUILD.

(This is coming from #8709)

If I exposed sockets 9000-9002 in the parent, but needed to unexpose 9001 in the child, I'd then have to write in the style of "unsetting"

EXPOSE
EXPOSE 9000
EXPOSE 9002

which would work but

UNEXPOSE 9001

looks nicer.

An advantage is that it doesn't affect any EXPOSEs from further up the inheritance chain, which I might want to add later.

+1 @codeon-nat

This has been discussed in #8177, we're closing this for a lack of real world use cases.

Why is this being closed? There were 9 people commenting here. I think this would be a really useful thing to have. The real world use cases are being able to build upon existing images with ease. Sometimes you want to add a property, sometimes you want to remove it. This is normal.

I agree, for example, I'm extending the nginx image for an SSL offloader and I want to UNEXPOSE 80 but leave 443.

Being able to unexpose ports is pretty critical if you want to run multiple instances of, e.g., nginx. Without this feature, each image that inherits from nginx tries to expose 80 and 443, causing the error Bind for 0.0.0.0:80 failed: port is already allocated.

Nevermind, this was just poor configuration on my side.

(April 15: "No real world use cases" I'm surprised you cannot imagine at least one and closed this)

I have a base image that exposes volumes or ports for optional software, then FROM that in another Dockerfile to make an image that should not expose anything it doesn't want to, or even things that it has uninstalled from the ancestor. Why would we NOT want to be able to remove these settings?

I also have a use case for this. I want to be able to create an image containing a database snapshot, but all the mysql packages have VOLUME /var/lib/mysql set. It would be nice to be able to turn off the volume so that the changes to the database made in my Dockerfile stick with the image.

The only other option is to completely re-create a custom mysql image, but that somehow seems wasteful since plenty of other people have already put together better default mysql servers than I could.

Adding an additional use case - I'm inheriting from the official RabbitMQ image, but I only want to expose websocket ports (80 and 443) and not the default AMQP port (5672). Seems like this should be a pretty reasonable thing to want to do?

Adding another use case. I want to build a git test environment using the gogs image but it's tedious to have the data persist since it's all stored in a volume. It'd be great if I'd be able to simply UNVOLUME the volume and build my image after setting up the environment.

+1

I'm inheriting from the official php image and want to use sockets instead of ports, so I need to remove the exposed 9000 port.

Anyone who has used Docker in a non-trivial capacity will have found these limitations with inherited containers.

@shykes @icecrime how is this now closed? Is it too hard to solve with the current syntax and need for backwards compatibility? What's the plan?

+1 - real world use case for EXPOSE override here.

Considering this has been going on for over 3 years (I found an issue dating back to 2013), when will we be able to remove exposed ports?

+1. need to be able to "UNEXPOSE" default nginx ports 80 and 443.

For those people here asking for UNEXPOSE; the EXPOSE statement only gives a _hint_ which ports are exposed by the container, but does not actually _expose_ those ports; you need to _publish_ those ports (-p / -P) to expose them on the host. In other words; omitting the EXPOSE statement from a Dockerfile has _no_ direct effect on the image (you can still, e.g. reach "port 80" of the container).

Additionally, if you want to expose additional ports, just make the service in the container run on those ports, and this will work.

True, but if you use -P (to publish all exposed ports) and your base image exposes a port you no longer want to expose, then you're stuck. You'd need to switch to -p and list all the other ports.
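
As a rough illustration of that workaround, the explicit -p list can be generated instead of typed by hand. This is only a sketch: publish_flags is a hypothetical shell helper, not a Docker command, and it assumes every remaining port should be published on the same host port number.

```shell
#!/bin/sh
# publish_flags is a hypothetical helper, not part of the Docker CLI.
# Usage: publish_flags EXCLUDED_PORT PORT...
# Prints one "-p PORT:PORT" flag per port, skipping EXCLUDED_PORT.
publish_flags() {
    excluded=$1
    shift
    flags=""
    for port in "$@"; do
        if [ "$port" != "$excluded" ]; then
            flags="$flags -p $port:$port"
        fi
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

# Example (assumes Docker is installed and the parent exposes 80 and 443):
# docker run -d $(publish_flags 80 80 443) nginx
```

The list of ports still has to be kept in sync with the parent image by hand, which is the maintenance cost an UNEXPOSE instruction would avoid.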

@thaJeztah that is good to know. Nevertheless, what is the harm in adding UNEXPOSE? It would be useful for me as well, even if that is only to have better documented containers.

@kgx no harm (apart from possible feature bloat), but wanted to explain that not being able to "unexpose" is not a security issue (some people have that impression). UNVOLUME (or UNSET VOLUME) is still on my personal wish list. :-)

I'm amused I've run into this problem in the first 72 hours of trying docker. Every config or language for any other major tool I use that has any sort of inheritance has an "override the parent" type of capability.

Here's a use case: I'm using the default docker image for go-ethereum, and I need to be able to set up a test version that absolutely will never connect to the outside world. I do need to be able to connect to it from the host and other containers. The safest way to do this is to change the ports because the program over eagerly tries to connect to peers. I also need to be able to override CMD and ENTRYPOINT to make a "set up the database" version of the image that I run once to get the proper volume created. All of these are extremely difficult to pull off in the Dockerfile.

You can assign it to another IP address... or bind a different host port. As for the latter point about overriding ENTRYPOINT and CMD: just redefine them.

CMD and ENTRYPOINT can be overridden at runtime; docker run --entrypoint=foo myimage mycmd. Question here is if it's useful to have a different image, with a different entrypoint/cmd during testing, as you won't be _testing_ the actual image that'll run in production. (just a side note)

From these replies it seems the Dockerfile is being deprecated in favor of command line options, at least as far as entrypoint, cmd, expose, and probably a few others are concerned. The command line already does stuff that the Dockerfile can't do, so that appears to be the direction. If that's the intention, then I'll move as much of the Dockerfile info to instantiation time as possible just to reduce the confusion. Is that the intention?

@barkthins no, the Dockerfile isn't deprecated. Using a Dockerfile is still the regular way to produce an image. Also, you can override CMD and ENTRYPOINT in inherited images. My example was to show that in certain cases (e.g. running an alternative command on your image), you can override them at runtime.

From the Dockerfile man page:

     -- EXPOSE <port> [<port>...]
     The EXPOSE instruction informs Docker that the container listens on the
     specified network ports at runtime. Docker uses this information to
     interconnect containers using links and _to set up port
     redirection on the host_ system.
     [...]
     HISTORY
     *May 2014, Compiled by Zac Dover (zdover at redhat dot com) based
     on docker.com Dockerfile documentation. *Feb 2015, updated by Brian Goff
     ([email protected]) for readability *Sept 2015, updated by Sally O'Malley
     ([email protected])

[my italics] seems misleading (or at least ambiguous) if what you say is true.

@thaJeztah My point, and I think this thread's point, is that the inheritance is inconsistent. Yes, you can override ENTRYPOINT and CMD, but EXPOSE adds exposure; you cannot replace the parent's exposure except on the command line. I haven't examined other commands to see if there's a third behavior. It's also not documented which commands extend and which replace a parent's command.

@thaJeztah we need UNEXPOSE

There are several solutions that read the container metadata and provide additional upstream configuration. For example, with HAPROXY I have to set EXCLUDE_PORTS=8080 to stop it from trying to dynamically provide access to that port on my Tomcat apps.

Developers look at the exposed ports and make assumptions about the behaviour of the container. For example I have a base image that extends Tomcat (EXPOSE 8080) but the image uses a different port to the default (EXPOSE 8888). If you add a web server in a compound image (eg. running NGINX and Tomcat) you serve content via HTTP (EXPOSE 80 and EXPOSE 443).

In the latter example you can end up with an image that self documents as exposing 8080, 8888, 80 and 443 where only 80/443 is relevant.
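
To check what an image actually self-documents, the exposed ports can be read from the image metadata. A minimal sketch, assuming a shell where grep supports -o; list_exposed_ports and the image name my-tomcat-nginx are made up for illustration, and the input is assumed to have the JSON shape that docker inspect -f '{{json .Config.ExposedPorts}}' prints.

```shell
#!/bin/sh
# list_exposed_ports is a hypothetical helper. It reads, on stdin, the JSON
# object printed by
#   docker inspect -f '{{json .Config.ExposedPorts}}' IMAGE
# and prints one "port/protocol" entry per line.
list_exposed_ports() {
    grep -o '"[0-9][0-9]*/[a-z][a-z]*"' | tr -d '"'
}

# Example (requires Docker and a local image; the image name is assumed):
# docker inspect -f '{{json .Config.ExposedPorts}}' my-tomcat-nginx | list_exposed_ports
```

For an image built as described above, the listing would include the inherited ports as well as the ones the final image really serves.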

These are real problems as evidenced by the fact I have to keep explaining things to developers in our community despite very specific documentation; who needs documentation when you can just look at the image? <-- everyone when the image is self-documenting the wrong thing.

There are workarounds for this problem, but they are workarounds. Is this a major architectural issue? Why can't Docker consider a more elegant solution to this real problem?

What is the status on this?

@BillBrower status is they closed the issue without providing a rationale for why they believe it's not an issue. Clearly for many of us it continues to be a real world problem that plagues our daily lives ;)

@modius @BillBrower even though it's closed, things can always be reconsidered; basically "no is temporary", but "yes is forever" when merging/implementing features, so if there's concerns about a feature, the correct choice for maintainers is to say "no".

The PR implementing this was closed because there were maintainers not sure about the feature, and looking for more actual examples for its use; https://github.com/docker/docker/pull/8177#issuecomment-93587164

"We're closing this as we mostly disagree, but please feel free to prove us wrong in the comments and we can reconsider"

That was over a year ago, so possibly things have changed; I'll reopen this issue, and will bring this up in the next maintainers session. (Note that due to DockerCon and the pending 1.12 release, this may be slightly longer away than usual)

Thanks, @thaJeztah. That makes a lot of sense. I appreciate you explaining the rationale behind the original decision and outlining what you need to see if we want this to happen.

One use case for the _UNVOLUME_ request

Especially because, currently, when you use a custom volume driver, it applies to ALL the volumes of the given container.

I had this case with an EFS volume driver: it works fine when I specify the volume binding at startup. If I don't set any binding, it fails because it tries to mount an NFS share from an auto-generated UUID. This means I must provide a binding for all my volumes, even the ones I don't care about, created by a parent image for instance.

The only workaround right now is to bind, at startup, all the volumes I don't need to a rubbish empty subfolder of the same EFS share.

Note : I cannot use docker volume command because all of that is started by Marathon and should be usable in a single docker run command.

+1 UNEXPOSE required

+1 for UNEXPOSE

+1 for UNEXPOSE

+1 for UNEXPOSE

+100 for UNEXPOSE

+9000 for UNEXPOSE

+∞
For example, I use the official nginx repository (FROM nginx:stable), whose Dockerfile contains:

EXPOSE 80 443

But I want to remove port 80 in another layer. E.g.:

UNEXPOSE 80

Please!
Add this feature!!!!

@frekele , if i was your mother or father, you wouldn't get. no way.

UNEXPOSE +++
Very necessary feature !

Please, you don't need to spam everyone else with email notifications and you definitely don't need to clutter up the discussion with all these '+1' comments. You can just react with 👍 to the issue description to express your agreement.

@underyx this is offtopic, but I'll bite because I keep seeing people saying this. It is more nuanced than you think. For many larger teams, the number of comments on an issue is the only way to gauge engagement, since reactions are designed as a social feature - not a voting mechanism, and hence are not a reliable way to run reports via the GH API (e.g. not sortable, amongst other things). Also, adding a comment automatically subscribes me to the thread, which is in most cases what I want - otherwise I'd have to click the 👍 _and_ also click "Subscribe".
See https://github.com/isaacs/github/issues/9#issuecomment-195120703 (the whole thread is good, but this is around the time GH added reactions).
Now let's not clutter up the discussion ;)
/offtopic.

We discussed this issue in the maintainers meeting, and generally, we're OK to start working on this again.

@duglin are you perhaps interested in working on this?

Don't know if I have time yet, but just to summarize.... based on the above comments, I believe the requirement is to ensure that people can clear/unset the following:

EXPOSE  (all or specific one)
ENV  (specific - not sure we need to clear all yet)
LABEL  (ditto)
VOLUME  (all or just specific paths? probably both)
CMD  (possible but only using the json format)
ENTRYPOINT  (possible with json format)

Did I miss anything?

+10000 for UNVOLUME

@duglin I think starting with the ones that are currently _not_ possible, and most requested would be best (EXPOSE, VOLUME). I haven't seen many requests for the others (but not against it).

The original PR used UNSET <SOMETHING>, but was later changed to UN<SOMETHING>. I _personally_ liked the first one more (more generic), but @shykes preferred UN<SOMETHING>, not sure if that has changed.

UNVOLUME would be nice.

My use case: I'm using the mysql image and want to commit my database contained in /var/lib/mysql directory to a new image but can't because it's declared a volume in the parent Dockerfile.

@thaJeztah I find UNSET <something> easier to read and not as weird; no need to start inventing words. It's also familiar to scripting people. One could also do

UNSET  EXPOSE VOLUME LABEL

My example,

I'm setting up a dokuwiki installation. The image I choose was exposing all the potential configuration volumes. What I would like to do is customize my installation from this base image. As volumes are exposed I cannot modify the PHP config files at image build time.

I can modify the base image to UNVOLUME those volumes but then I will need to maintain that image forever... the magic about using "latest" is gone :(

+1 for UNEXPOSE :)

+1 for UNVOLUME or better yet UNSET VOLUME.

+1 for UNVOLUME. This could be useful for me right about now. Also could be useful in university scenarios where students spin up without worrying about having to mount volumes.

"without worrying about having to mount volumes."

I don't think it will be needed for that; a VOLUME definition in a Dockerfile automatically creates an "anonymous" volume from the content at that location in the image.

@duglin, are you already working on this? If not, I'd take it and start with the most requested (VOLUME and EXPOSE). Let me know.

@runcom go for it - haven't been able to find the time yet.

As a reminder, would note that an UNENV command would still be useful, to unset environment variables selectively (for example, to match with a volume which is UNVOLUMEd at the same time). An unset variable is not the same as a variable set to blank, particularly when used with set -ue in shell.
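
The difference is easy to see in plain shell, without Docker involved. A small sketch, where DATA_DIR and check_under_nounset are hypothetical names chosen for the example:

```shell
#!/bin/sh
# DATA_DIR and check_under_nounset are hypothetical names used only for this sketch.

# Reports whether "$DATA_DIR" can be expanded while `set -u` is active.
# The check runs in a subshell so the caller's shell options are untouched.
check_under_nounset() {
    if ( set -u; : "$DATA_DIR" ) 2>/dev/null; then
        echo "expands"
    else
        echo "unbound"
    fi
}

DATA_DIR=""            # set, but blank
check_under_nounset    # the expansion succeeds

unset DATA_DIR         # not set at all
check_under_nounset    # the expansion fails under `set -u`
```

Overriding a variable with a blank value in a child Dockerfile would, as far as I can tell, correspond to the first case, which is why it is not a substitute for a real UNENV.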

It would be possible to remove VOLUME and EXPOSE from official images if they are a problem.

+1 for UNVOLUME

+1 for UNVOLUME

My use case for UNVOLUME:

Using the library/wordpress image in an S2I scenario where the website source is copied into /var/www/html

VOLUME squashes that by mounting an empty FS on the resulting image. -> library/wordpress can't be used.

@groulot Actually if you don't mount a volume explicitly with --volume over it the contents of the image will be copied into the volume at container creation.

"The docker run command initializes the newly created volume with any data that exists at the specified location within the base image."

https://docs.docker.com/engine/reference/builder/#/volume

There is a hacky workaround using docker save/load, see http://stackoverflow.com/q/42316614/808723

+1 for UNVOLUME

I want to echo @modius's rationale for being able to UN-EXPOSE. One of the first things developers latch onto when learning docker is typing docker ps to see what is going on with their containers. They'll see the standard port listed as being available. Developers are used to the standard ports for things from the local or test environments they set up in a pre-Docker world, so introducing Docker is difficult because they see the standard port and figure it will work, but in fact they're not connecting to the docker container.

+1 for UNVOLUME, UNEXPOSE, UNENV, ...

Looks like this has been open for a while. Any traction here?
I also want to use the official PHP fpm alpine image and UNIX sockets instead of TCP port 9000.
Can't override the EXPOSE from parent, and would rather not build that image just to get rid of the EXPOSE.

+1

+1

Would love the ability to unset a VOLUME command. The Wordpress official image forcefully dumps its full codebase into a volume -- I would much prefer only having a volume for the wp-content/uploads directory so the rest of the codebase can be baked into the image.

When deploying an image to a Kubernetes cluster that restricts root access, VOLUME directories cannot be accessed. The solution would be to override the volumes defined in the parent image.

+1 from me

Use case for UNEXPOSE

Suppose I have four docker hosts and I want to run 16 maven tomcat containers which all default to internal port 8080.

Now imagine I'm using registrator with rancher CNI - which locks me down to the internal port.
https://github.com/gliderlabs/registrator/issues/541#issuecomment-305012416
This means I can only run one 8080 internal port per host. (Since I have to do 8080:8080 port mappings)

In this situation - docker's internal->external port mapping isn't enough to solve my problem. I actually need to override the internal port mapping, preferably without rebuilding the original container.

Julian, I don't know how you have a 1 to 1 mapping. For me, registrator has nothing to do with routing traffic; it just registers and unregisters running containers. For example, I use it in such a way that Registrator maintains an Etcd instance by putting the Docker-allocated IP and the exposed port in there. Then confd watches that Etcd instance and updates the nginx config in its own container.

Hi Bradley,

Thanks for looking at this. I'm using registrator in combination with ipsecurity that is built into Rancher. As you can see from the link here:
https://github.com/gliderlabs/registrator/issues/541#issuecomment-305012416
The ability to view external ports in registrator in rancher scheduled containers was restricted. This meant you could only use internal ports.

You can see there are users kicking up angst looking for a solution here:
https://forums.rancher.com/t/do-you-kill-registrator/5152

And a proposed solution here:
https://github.com/cabrinoob/rancher-registrator
(Which wasn't feasible for some people).

You may find more if you google search "registrator rancher".

They recommend you run registrator in an 'internal' mode - where you map your internal ports 1:1 to your external ports. This leads to the issue with UNEXPOSE - running out of internal ports quickly.

My point is that ipsecurity used for intra-host docker container networking can lead to a use-case where you are locked into internal ports mapped 1:1 to external ports in docker. For this you need to have an UNEXPOSE command.

Thanks for looking at this.

Cheers
Julian

3.5 years have gone and no progress on this issue?...

+10086 for UNEXPOSE. sometimes the parent image may not be official, it uses unofficial ports, we should have the ability to overwrite the ports.

@pumba-lt
I bet the reason this sees no resolution is, that they don't know how to do it technically.

It also adds complexity to the Dockerfile language when there are clear workarounds. Don't push so much config in the parent Dockerfile and leave it for the inheriting images instead. (aka: stop sourcing random images on docker hub :D)

Since docker 17.05 there is also a new way to do multi-stage builds that removes most of the need for this issue (this is a single Dockerfile):

# First import the original image
FROM nginx AS source-image

# Second step of the build, start with an empty image
FROM scratch
# Copy the data from the original image
COPY --from=source-image / /
# Re-define all the config
EXPOSE 80
STOPSIGNAL SIGTERM
CMD ["nginx", "-g", "daemon off;"]

EDIT: Forgot to say, the second solution squashes all previous layers. I don't think it's a big deal but it's good to know.

@zimbatm - That is awesome!

"I bet the reason this sees no resolution is, that they don't know how to do it technically."

The change itself is not too complicated; an implementation can be found in this PR; https://github.com/moby/moby/pull/8177. There was no consensus at the time, but if you follow my comment from January; https://github.com/moby/moby/issues/3465#issuecomment-247405438, things changed and (unless people changed their minds since), we'd accept a contribution to implement this.

As to why it isn't there yet; simply because nobody has had time to start working on it, but if someone is interested, it most likely will be accepted.

@zimbatm yes, your example would resolve the direct problem; just be aware that it also creates a different layer and flattens all the image layers. Although this may in some cases reduce the image size, it also means those layers are no longer shared with images that use nginx as their parent, so more data may need to be downloaded. For example;

The original nginx image:

$ docker inspect nginx -f '{{json .RootFS.Layers}}' | jq .

[
  "sha256:54522c622682789028c72c5ba0b081d42a962b406cbc1eb35f3175c646ebf4dc",
  "sha256:1c3fae42c5007fd0e70309b5b964eb5d49046562bd425424da734784098894e7",
  "sha256:87823f21b7939eac6e099fa878871a806c1904a7698793edb63bf6e5f5371e1f"
]

And the nginx image you created;

$ docker inspect nginx2 -f '{{json .RootFS.Layers}}' | jq .
[
  "sha256:9a71ba430225d4f24e0d57837a71b6b2b68bf88ca7530c0a89c98783c98531b5"
]
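
To put a number on how much two images still share, the two layer lists can be compared. A small sketch under assumptions: shared_layer_count is a hypothetical helper, and each input file is assumed to hold one layer digest per line (for instance produced by docker inspect with a template that prints .RootFS.Layers line by line).

```shell
#!/bin/sh
# shared_layer_count is a hypothetical helper, not a Docker command.
# Usage: shared_layer_count FILE_A FILE_B
# Each file holds one layer digest per line; prints how many digests are in both.
shared_layer_count() {
    tmp=$(mktemp)
    # comm needs both inputs sorted; "-" makes it read the first list from stdin.
    sort "$2" > "$tmp"
    sort "$1" | comm -12 - "$tmp" | wc -l | tr -d ' '
    rm -f "$tmp"
}

# Example (requires Docker; the image names are the ones used above):
# docker inspect nginx  -f '{{range .RootFS.Layers}}{{println .}}{{end}}' > nginx.layers
# docker inspect nginx2 -f '{{range .RootFS.Layers}}{{println .}}{{end}}' > nginx2.layers
# shared_layer_count nginx.layers nginx2.layers
```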

Thanks for the update @thaJeztah

May I repeat my suggestion to use

UNSET XXXX

instead of inventing new and weird vocabulary words (e.g.: UNVOLUME).

We could also unset multiple properties in one line that way.

UNSET VOLUME EXPOSE LABEL

I'm personally okay with UNSET; doing one or the other probably won't be a big change, so I'll leave that for the review process when a PR arrives.

Hi, even if I use FROM a second time, how do I retain everything from the parent image except for some of the ports exposed in the parent image? Is there any official resolution on this? Let's either accept this and work on it, or reject it.

"lets either accept this and work on or reject"

@rajiff see my comment above https://github.com/moby/moby/issues/3465#issuecomment-313549657 contributions are welcome

"The change itself is not too complicated; an implementation can be found in this PR; #8177. There was no consensus at the time, but if you follow my comment from January; #3465 (comment), things changed and (unless people changed their minds since), we'd accept a contribution to implement this."

So if the consensus might have changed, why not just reopen #8177?

"So if the consensus might have changed, why not just reopen #8177?"

That PR was opened over three years ago; the code no longer applies

Instead of having to specifically use a UNsomething command,
why not improve the FROM command and make it possible to list what we actually want to inherit?

We could use something like :
FROM baseimage (VOLUME, EXPOSE, PORT, ..)
or if you really want it with negation :
FROM baseimage (*, -VOLUME, -EXPOSE)
or have a better syntax ;)

It seems to me that all this should be part of the FROM command in the first place.

"Changing the volume from within the Dockerfile: If any build steps change the data within the volume after it has been declared, those changes will be discarded."

That doesn't seem entirely true. You can still do this:

VOLUME /avolume/subdir
WORKDIR /avolume
COPY ./Dockerfile /avolume/subdir

I don't know if it could be used to undo a volume though.

Overriding parent image / container ENTRYPOINT doesn't work in latest versions.

17.09.1-ce version

$ docker run --name=experiment --entrypoint=/bin/bash ubuntu:16.04
$ docker inspect experiment --format "{{.Config.Entrypoint}}"
[/bin/bash]
$ IMAGE=$(docker commit -c "ENTRYPOINT []" experiment)
$ docker inspect $IMAGE --format "{{.Config.Entrypoint}}"
[]

since 17.10.0-ce version

$ docker run --name=experiment --entrypoint=/bin/bash ubuntu:16.04
$ docker inspect experiment --format "{{.Config.Entrypoint}}"
[/bin/bash]
$ IMAGE=$(docker commit -c "ENTRYPOINT []" experiment)
$ docker inspect $IMAGE --format "{{.Config.Entrypoint}}"
[/bin/bash]

UNSET ENTRYPOINT also doesn't work.
Is it a bug?

@alexey-igrychev can you open a separate issue for that? The issue you're commenting on is a _feature request_ for implementing UNSET xx instructions in the Dockerfile. (The UNSET instruction is not implemented yet, so that's expected.)

As a workaround for that issue, using [""] instead of [] for the entrypoint seems to work;

IMAGE=$(docker commit -c "ENTRYPOINT [\"\"]" experiment)
docker inspect $IMAGE --format "{{.Config.Entrypoint}}"
[]

I also need a way to unset volume, so that I can create a database image preloaded with tables and data.
Unfortunately, the base image is private (oracle) and therefore I can't even copy the base dockerfile since I don't have access to it. I can only extend the image.
This issue has tons of +1's and real world use cases listed, and there have been multiple PR's made for it, and yet the PR's have been closed. So what do we need to do to get this feature?

@veqryn since reopening this issue, nobody has started working on a pull request; the existing pull request no longer applied cleanly to the code base, so a new one has to be opened. If anyone is interested in working on this, then things can get going again.

See my earlier comment(s); https://github.com/moby/moby/issues/3465#issuecomment-247405438

We discussed this issue in the maintainers meeting, and generally, we're OK to start working on this again.

@duglin are you perhaps interested in working on this?

And https://github.com/moby/moby/issues/3465#issuecomment-313549657

As to why it isn't there yet; simply because nobody has had time to start working on it, but if someone is interested, it most likely will be accepted.

My use case comes from docker-compose.yaml: I'd like to have a compose file for development with overrides for production that add a TLS reverse-proxy, Maven repository, takes over PORT 80/443, unexposes port 80 and 5432 that the development compose file exposes. Or a compose file for production with development override.

The add-only nature of layering compose files inherited from Dockerfiles complicates system design. It would be really cool if some parameters could be retracted, or if I simply got a different container built with overrides active. I am not picky how it works under the hood wrt docker-compose.

Thanks @thaJeztah - could you please point us to the line of code that in your view is the place to start reading to look at fixing this with a pull request?

I haven't worked much on the builder code myself, but changes should likely be in the https://github.com/moby/moby/tree/master/builder package.

This issue is 3 years old! Is it so hard to agree on such a basic feature, or am I missing something?

@caruccio yes, you're missing something: scroll up 4 comments https://github.com/moby/moby/issues/3465#issuecomment-356988520

I also have a couple of use cases (one a personal project, and the second a work project) where I wish to overload a VOLUME, EXPOSE and ENTRYPOINT statement from a parent image.

While I have a workaround for ENTRYPOINT by just setting a new empty entry point with ENTRYPOINT [], and can probably learn to live with ignoring EXPOSE, ... I am left scratching my head about how to not inherit VOLUME definitions.

I just came across this issue in my code base where the parent image had a VOLUME, meaning all my changes to this volume in a child image were thrown away. I thought I was going mad for 2 days until I finally found my way to this issue. Please could someone implement this.

There is a WORKAROUND.

You can always docker save image -o image.tar, unpack that archive, edit the metadata and repack for a docker load -i image2.tar. That way one can make an image2 that does not have any of the prior VOLUME declarations in it.
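The "edit the metadata" step of that workaround can be sketched roughly as follows. This is a minimal illustration, not how docker-copyedit is implemented: the function name `strip_inherited_metadata` and the sample dict are made up for the example, and it assumes the image config JSON keeps its settings under a `config` object with `Volumes` and `ExposedPorts` keys. Unpacking the tarball, and making the tarball's manifest point at the edited config file before `docker load`, are not shown.

```python
import copy
import json


def strip_inherited_metadata(image_config, keys=("Volumes", "ExposedPorts")):
    """Return a copy of an image config dict with `keys` removed from "config".

    `image_config` is assumed to be the parsed config JSON of an image;
    the input dict is left untouched.
    """
    edited = copy.deepcopy(image_config)
    section = edited.get("config") or {}
    for key in keys:
        # pop() with a default never raises if the key is absent
        section.pop(key, None)
    return edited


# Hypothetical config fragment, shaped like a database image's metadata.
original = {
    "config": {
        "Entrypoint": ["docker-entrypoint.sh"],
        "ExposedPorts": {"5432/tcp": {}},
        "Volumes": {"/var/lib/postgresql/data": {}},
    }
}

edited = strip_inherited_metadata(original)
print(json.dumps(edited, indent=2))
```

Only the listed keys are dropped; everything else in the config (entrypoint, env, labels) passes through unchanged, which is the point of editing the metadata instead of rebuilding the image.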

Since I have to do those steps quite regularly, I have created a little script to help with the task of cleaning up a third-party image. Have a look at docker-copyedit.

Fantastic work @gdraheim ! A workable solution in <250 lines of python.

@gdraheim wow, this is great! From the README:

The wish to REMOVE ALL VOLUMES came from the fact that I did want to download a tested image for local tests where the data part should be committed to the history as well in order to turn back both program and data to a defined state so that another test run will start off the exact same checkpoint.

This is our use case as well.

I have expanded docker-copyedit to cover all metadata entries of an image, so it can work on all inherited properties, even beyond the problematic cases of the EXPOSE and VOLUME lists. That covers the user, working dir, labels, and env settings, to name the things seen often. Copying ENTRYPOINT to CMD is also a modification that I do quite regularly. No need for an intermediate docker build step anymore, just go for docker-copyedit. ;)

The time the Docker team has spent tracking and repeatedly ignoring this issue would probably have been enough to fix it instead.

Can we now please reopen this, after a ton of users have obviously requested it...
or at least get a reasonable argument against it, instead of all the use cases just being ignored? (I want to unexpose a port too, if only to get a cleaner docker ps output without the (80/80/tcp) entry, before I resort to building my own image, which is harder for non-open-source Dockerfiles.)

The issue is still open; It's open source; contributions are welcome https://github.com/moby/moby/issues/3465#issuecomment-356988520

Am I right thinking that this would be fixed in moby/buildkit now? I see most of the Dockerfile command infrastructure over there.

I'm also a fan of UNSET, like

UNSET EXPOSE 9000

or

UNSET LABEL foo

So, I'm looking at commands that have subcommand-ish forms, like the HEALTHCHECK CMD form and I notice that HEALTHCHECK already has an unset form...

HEALTHCHECK NONE

This is an interesting choice, but HEALTHCHECK also only defines 1 configuration (and overrides w/ the newest), it doesn't allow defining multiple, like LABEL, EXPOSE, and VOLUME do.

I just wonder how these should interact or if there's some other kind of NONE form that might work.
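One way to picture the interaction is to split properties into single-valued ones (the newest wins) and multi-valued ones (entries accumulate), and let an unset either clear the whole property or remove one entry. The sketch below is purely hypothetical: no UNSET instruction exists in Docker, and the `unset` function, the two lookup tables, and the flat config dict are assumptions made for illustration.

```python
# Hypothetical semantics for an UNSET instruction; nothing here exists in Docker.
SINGLE_VALUED = {"ENTRYPOINT": "Entrypoint", "CMD": "Cmd"}
MULTI_VALUED = {"EXPOSE": "ExposedPorts", "VOLUME": "Volumes", "LABEL": "Labels"}


def unset(config, instruction, item=None):
    """Return a copy of `config` with a property, or one entry of it, removed.

    `config` is a flat dict of image settings. For multi-valued properties,
    `item` names a single entry (e.g. "9000/tcp"); None clears the whole set.
    """
    # Copy nested dicts one level deep so the parent config is not mutated.
    new = {k: (dict(v) if isinstance(v, dict) else v) for k, v in config.items()}
    if instruction in SINGLE_VALUED:
        new.pop(SINGLE_VALUED[instruction], None)
    elif instruction in MULTI_VALUED:
        field = MULTI_VALUED[instruction]
        if item is None:
            new.pop(field, None)
        else:
            (new.get(field) or {}).pop(item, None)
    else:
        raise ValueError(f"cannot unset {instruction!r}")
    return new


parent = {
    "Entrypoint": ["nginx"],
    "ExposedPorts": {"80/tcp": {}, "9000/tcp": {}},
    "Labels": {"foo": "bar"},
}

child = unset(parent, "EXPOSE", "9000/tcp")  # like: UNSET EXPOSE 9000
print(sorted(child["ExposedPorts"]))
```

Under this model a NONE-style form would only be needed for single-valued properties; multi-valued ones need a way to name the entry being removed.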

Some way to remove exposed ports is really needed for controlling what is exposed when using host networking.

+1 EXPOSE []

So... for 5 years the Docker team has been unable to implement an UNSET operator. Fantastic.

As @AnthonyMastrean said, should we move this issue to the moby/buildkit project?

There is also a PR, well documented and tested but not merged. Should we move/rebase this PR too?

This feature would be really appreciated and address a problem with nginx based images in Azure.

UNSET, CLEAR, RESET, OVERRIDE, IGNORE would be ok - I'd avoid UNxxx because that'd duplicate the list of reserved keys to support and document.

What to ignore/reset could also be specified when using FROM, e.g.

FROM nginx:1.13 IGNORE EXPOSE, ENTRYPOINT

I suggest one more workaround, using multistage builds.
It will copy all files from original image to new image, but without metadata.

FROM postgres as orig

FROM alpine:3.8 as postgres
COPY --from=orig / /
ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]

I can't believe I hadn't thought of that. That's actually pretty good @kotofos. You lose the layers of the upstream container, but that's not a huge loss.

@kotofos Why not FROM scratch as postgres?

@farcaller good catch. scratch would definitely be better since you're overwriting the whole file system

Last I tested it, COPY --from=xxx ... would not preserve filesystem ownership, so you might want to be careful with that workaround.

@tianon You're correct, though for single-process containers this shouldn't be an issue as you can use the --chown flag to set the user you're executing as in the container

https://docs.docker.com/engine/reference/builder/#copy

This dates back to 2014. It's been 5 years, and it seems like there will be no generic "unset" or "reset" for all properties in the near future. I also favor a generic approach, but there are just too many things to consider, and really: it won't happen anytime soon.

So: can we at least get an "UNEXPOSE" to close all those opened ports, or at the very least get the same behaviour as for CMD and ENTRYPOINT (last one wins)? It's the most requested "unset" property, and a potential security risk for users unaware of the (non-intuitive) behaviour, considering the "last one wins" behaviour of other commands.
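The difference between the two behaviours can be shown with a simplified model. This is not the builder's code: the `inherit` function and the flat config dict are invented for the example, which only mirrors the behaviour described in this thread, where CMD and ENTRYPOINT replace the inherited value while EXPOSE adds to the inherited set.

```python
def inherit(parent, instructions):
    """Apply (name, value) instructions on top of a copy of `parent`.

    Simplified model: CMD and ENTRYPOINT replace the inherited value
    ("last one wins"), while EXPOSE only ever adds to the inherited set.
    """
    config = {
        "Cmd": list(parent.get("Cmd", [])),
        "Entrypoint": list(parent.get("Entrypoint", [])),
        "ExposedPorts": set(parent.get("ExposedPorts", ())),
    }
    for name, value in instructions:
        if name == "CMD":
            config["Cmd"] = list(value)
        elif name == "ENTRYPOINT":
            config["Entrypoint"] = list(value)
        elif name == "EXPOSE":
            config["ExposedPorts"].add(value)
        else:
            raise ValueError(f"unsupported instruction {name!r}")
    return config


parent = {"Cmd": ["nginx"], "ExposedPorts": {"80/tcp"}}
child = inherit(
    parent,
    [("CMD", ["nginx", "-g", "daemon off;"]), ("EXPOSE", "8080/tcp")],
)

print(child["Cmd"])
print(sorted(child["ExposedPorts"]))
```

In this model the child can change its command freely, but no sequence of instructions can make the inherited 80/tcp disappear, which is exactly the gap an "UNEXPOSE" would fill.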

I suggest one more workaround, using multistage builds.
It will copy all files from original image to new image, but without metadata.

FROM postgres as orig

FROM alpine:3.8 as postgres
COPY --from=orig / /
ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]

I wanted to ignore the docker volume pointed to by PGDATA so I could have its contents bundled with the image and not as a volume.
A lighter solution for me was simply changing the value of PGDATA:

FROM postgres:11.2-alpine
ENV PGDATA /var/lib/postgresql/test-data

# stuff that will create all my schemas
COPY create-scripts /docker-entrypoint-initdb.d/

# a weird way to trigger the entrypoint script to run the stuff in docker-entrypoint-initdb.d but not hang after starting postgres
RUN /docker-entrypoint.sh postgres --version

Is there some benefit in having EXPOSE and VOLUME in dockerfiles at all? After all, you can define them easily in docker run (and in docker-compose files), with --expose (or -p) and named volumes or bind mounts. I was bitten by them more than once, and there is no way to reset them (creating a new dockerfile isn't very viable, especially when extending official images or images created with makefiles). I see them as an anti-pattern and I think it would be better to deprecate them.

@lucasbasquerotto EXPOSE is used by tools such as the gitlab-runner to detect if ports declared exposed by the image are actually open or not. I agree that this should be more of a HEALTHCHECK duty to detect if the container is ready or not, but having a list of declared ports can be useful indeed for automation.
VOLUME is needed to provide the user-friendly experience of stopping/starting a container and having the data persisted automatically. Again, I think that this can be worked out in some other way, but having the data inspectable is good for the tooling (and for the humans too in this case).

p.s. not defending the Dockerfile syntax, I'm just pointing out that the anti-patterns are not caused by the keywords themselves but rather because the ecosystem is not moving forward to solve issues that may arise in common use case such as this one, or such as providing an image with a declared volume and some starting data in the volume (e.g. a mysql image with preloaded schema)

@zarelit I don't know exactly how gitlab-runner works, but I think there should be a way to specify the ports to be checked outside the Dockerfile (I found an issue that is probably due to that, because MySQL expose 2 ports but it should check only the port 3306: https://gitlab.com/gitlab-org/gitlab-runner/issues/4143). From what I see, it also only uses the exposed port of the last Dockerfile.

About VOLUME, you can persist stateful data using named volumes or mount it with -v to an existing directory (if you want to preserve data when recreating a container). I also think it would be better to use this approach because the VOLUME inside the dockerfile is kind of obscure for whoever extends that dockerfile, and if you don't know it is there you might think that your container is reproducible in different environments and that you can change versions without side effects, only to be bitten later because it is using persistent data under the hood.

It also doesn't allow you to move files to that directory during the build (assuming you don't want to use it as a volume, but it inherits a Dockerfile that defines the directory as a volume, like wordpress that defines /var/www/html/ as a VOLUME, and I needed to use some hacks to use another directory).

Using -v you declare explicitly that you want the volume and avoid undesirable surprises due to the black magic of VOLUME in the Dockerfile.

Also, it may end up creating a lot of anonymous volumes:

Defining a volume inside the image tells docker to store this data separate from the rest of the container, even if you don’t define the volume when you spin up the container. Docker’s way of storing this data is to create a local volume without a name. The name itself is a long unique id string that contains no reference to the image or container it’s attached to. And unless you explicitly tell docker to remove volumes when you remove the container, these volumes remain, unlikely to ever be used again

Source: https://boxboat.com/2017/01/23/volumes-and-dockerfiles-dont-mix/

@lucasbasquerotto I mostly agree with you; these are all long-standing issues. I think that a deprecation should follow a path where they don't become invalid but become informative, like suggested paths that you could turn into a volume, or suggested ports that a server could be listening on.

I think that the good part of the Docker journey has now been extracted into the OCI standard and thus we should write new tools that don't have all this legacy on the shoulder.

For anyone that finds it useful, feel free to use the tugboat.qa docker images. They are extensions of several official docker images with volumes removed. Minimally documented GitHub repository here where the heavy lifting is done: https://github.com/TugboatQA/images

Is there some benefit in having EXPOSE and VOLUME in dockerfiles at all? After all, you can define them easily in docker run (and in docker-compose files), with --expose (or -p) and named volumes or bind mounts. I was bitten by them more than once, and there is no way to reset them (creating a new dockerfile isn't very viable, especially when extending official images or images created with makefiles). I see them as an anti-pattern and I think it would be better to deprecate them.

Many vendors nowadays provide their apps as a simple container image and provide dockerfiles along with it.
Having EXPOSE and VOLUME in those files enables the use of simple applications with a simple docker run in the app dir. You don't need to know anything about what params the app expects, it works with all the defaults provided in the dockerfile.
So yes: while we have better, bigger guns like compose or k8s for complex applications for simple, local applications dockerfiles are still a great fit. And having defaults that just work makes the use convenient.

@m451 A shorter docker run might be preferable than a longer one (with more options defined), but I don't consider that a good argument to expose ports and volumes in the dockerfile.

Instead of docker run some_image you can easily run docker run --expose 3000 -v my_volume:/container/dir some_image and have a clear understanding of the ports exposed to the host and the volumes that will be persisted even after the container is destroyed.

Also, this is trivial to do, and you expose ports and map volumes only when you need to (maybe you don't need all the ports exposed in the dockerfile, nor all the volumes defined). If it really is important to expose some port or use some volume, it's better to document it in the repository so that people know not only what needs to be exposed or mapped, but also why; after all, this is something that will affect the outside of the container, and may persist after the container is destroyed.

If it's in the dockerfile, it might actually make it harder to know what is happening and cause unexpected surprises in the long run when a volume is persisted and you don't know (especially if a dockerfile is inherited from another, you might not know beforehand that a volume is defined unless you inspect more deeply what it does). So I would consider VOLUME and EXPOSE bad even if this issue is solved.

Furthermore, while this issue isn't solved (and I don't have an idea of how many years it will take to solve it, considering that it's opened for more than 5 and a half years) then I simply have NO WAY to reset them.

Is there some benefit in having EXPOSE and VOLUME in dockerfiles at all? After all, you can define them easily in docker run (and in docker-compose files), with --expose (or -p) and named volumes or bind mounts. I was bitten by them more than once, and there is no way to reset them (creating a new dockerfile isn't very viable, especially when extending official images or images created with makefiles). I see them as an anti-pattern and I think it would be better to deprecate them.

Many vendors nowadays provide their apps as a simple container image and provide dockerfiles along with it.
Having EXPOSE and VOLUME in those files enables the use of simple applications with a simple docker run in the app dir. You don't need to know anything about what params the app expects, it works with all the defaults provided in the dockerfile.
So yes: while we have better, bigger guns like compose or k8s for complex applications for simple, local applications dockerfiles are still a great fit. And having defaults that just work makes the use convenient.

I would argue that vendors who do that, do it out of ignorance. Not understanding the problems this causes for people who want to use their product in a production environment. Sure it makes it easier for someone to stand up a test/demo instance of the product. But now, if I want to run it for real, I have to git clone/sed their Dockerfile or create my own just to get a working image.

@m451 A shorter docker run might be preferable than a longer one (with more options defined), but I don't consider that a good argument to expose ports and volumes in the dockerfile.

Instead of docker run some_image you can easily run docker run --expose 3000 -v my_volume:/container/dir some_image and have a clear understanding of the ports exposed to the host and the volumes that will be persisted even after the container is destroyed.

Also, this is trivial to do, and you expose ports and map volumes only when you need to (maybe you don't need all the ports exposed in the dockerfile, nor all the volumes defined). If it really is important to expose some port or use some volume, it's better to document it in the repository so that people know not only what needs to be exposed or mapped, but also why; after all, this is something that will affect the outside of the container, and may persist after the container is destroyed.

If it's in the dockerfile, it might actually make it harder to know what is happening and cause unexpected surprises in the long run when a volume is persisted and you don't know (especially if a dockerfile is inherited from another, you might not know beforehand that a volume is defined unless you inspect more deeply what it does). So I would consider VOLUME and EXPOSE bad even if this issue is solved.

Furthermore, while this issue isn't solved (and I don't have an idea of how many years it will take to solve it, considering that it's opened for more than 5 and a half years) then I simply have NO WAY to reset them.

Agreed. So let's define a standard way to communicate what params are required.
If there is none, we'll just end up with the same mess we had with legacy apps: vendor-specific documents and documentation formats. Some tell you what ports to open, some don't. Some only tell you half of it, some tell you the wrong ports. Some only tell you the port number but not the protocol, and so on.

Dockerfiles have been a great way to standardize that mess.

@m451 I agree with you in that, but it's good to consider that just exposing a port doesn't convey useful information. What kind of data / connections the exposed port expects? If there are several ports, what does each port do?

If you want to persist data you would want it to mount to a named volume or to some location on the host, VOLUME doesn't help with that. If you want temporary data to be stored with more performance then VOLUME helps (this is the only case that VOLUME might be helpful). Writing in the container is slower because it uses the copy on write strategy. But again, mapping the volume on docker run avoids CoW too (you don't need VOLUME), the only downside is that your instruction is longer and you need to know the path (but this is only if you are having a degraded performance due to CoW).

Using VOLUME and EXPOSE as a type of documentation doesn't justify bad (or missing) documentation. And it also may (and probably will) harm some consumers of the image.

@m451 I agree with you in that, but it's good to consider that just exposing a port doesn't convey useful information. What kind of data / connections the exposed port expects? If there are several ports, what does each port do?

Correct. The whole idea of ports from a security perspective is outdated. Yet here we are, trying to define which ports need to be opened and which can stay closed, and to understand what data is exchanged over every port. HTTPS has become a wrapper for pretty much everything today, and besides the author of the code there is normally nobody who knows exactly what data is transferred over a particular port. And even then it may change with every update.

In RL you don't care about what data / information is transported over which port unless you end up troubleshooting the application. You decide to trust the application. So if the application opens up ports X and Y, then you trust it with that as well.
For standard, daily operations it's good to just start an app and it works out of the box (secure defaults assumed).
Containers have become a form of packaging apps to have them work out of the box.

That said I agree that good docs are important. Yet in RL nobody wants to read docs for hours just to know what ports to open. It doesn't bring any benefit for daily ops tasks.

My piece of the pie.

Using VOLUME in the Dockerfile is worthless. If a user needs persistence, they will be sure to provide a volume mapping when running the specified container. It was very hard to track down that my issue of not being able to set a directory's ownership (/var/lib/influxdb) was due to the VOLUME declaration in InfluxDB's Dockerfile. Without an UNVOLUME type of option, or getting rid of it altogether, I am unable to change _anything_ related to the specified folder. This is less than ideal, especially when you are security-aware and want to specify a certain UID the image should be run as, in order to avoid a random user, with more permissions than necessary, running software on your host.

There should be a way to override these VOLUME directives when extending images for user customization. My only solution at the moment is to completely recreate the image myself, rendering the entire _Dockerfile FROM_ dynamic useless.

FROM nginx:latest
After deploying to Heroku, port 80 was exposed by nginx, but Heroku does not allow that. So what can I do? Copy the whole Dockerfile from nginx and remove the EXPOSE 80?

I won't add anything new, but it would be very nice to be able to override the EXPOSE directive.

Sharing this - https://github.com/gdraheim/docker-copyedit/blob/master/docker-copyedit.py (not my creation, to be clear, thanks @gdraheim!)

This helped me with removing a volume from a postgres container that was set in the dockerfile with the command:

python docker-copyedit.py FROM postgres:11.5-alpine INTO postgres:11.5-alpine remove volume /var/lib/postgresql/data

It seems to have done the job with no damage to the container: it took the original image, adapted it, and created a new image without the volume (according to docker inspect run on a container created from the adjusted image). I have not used it for anything else, but the README allows for a variety of operations, e.g.

 ./docker-copyedit.py FROM image1 INTO image2 -vv \
     REMOVE PORT 4444
 ./docker-copyedit.py FROM image1 INTO image2 -vv \
     remove port ldap and rm port ldaps
 ./docker-copyedit.py FROM image1 INTO image2 -vv \
     remove all ports
 ./docker-copyedit.py FROM image1 INTO image2 -vv \
     add port ldap and add port ldaps

LABEL is another 'property' (not mentioned yet) which could use the ability to 'unset'

As mentioned in this SO post: https://stackoverflow.com/questions/50978051/how-do-i-unset-a-docker-image-label

Another use-case for resetting / removing VOLUME from upstream images:

I am working on extending an Oracle database image, but it has a volume on /opt/oracle/oradata. In the default Oracle image, the database is created on container startup and written to that volume. But that means the container needs 25 minutes for its first start, which is not acceptable. So I am working on an image where the database is created at image build time, but for that I need to remove the VOLUME /opt/oracle/oradata, because my database gets created during the build, but when I start the container, /opt/oracle/oradata is empty again and my database fails to start.

I am working on extending an oracle database image but it has a volume on /opt/oracle/oradata.

Exactly my use case too. I would like to generate a preallocated pluggable database that others can pull from our private docker registry... but the volume is missing. I'll have to dig deeper and decide for the least ugly workaround.

I have solved it as follows:

I have used docker-copyedit to remove the volume, then I have written a bash script to create the database (you can look at the startup scripts from Oracle to see how it is done). Now, with the preconfigured PDB, it takes only 25 seconds to launch a container from the image. But the image becomes really big.

Would a better approach be to extend the existing docker spec to include a concept of sticky env variables / override option, such as;

e.g. my project based on the upstream tomcat image:

--ENV CATALINA_HOME /some/other/path
FROM tomcat:8.5.54-jdk8-openjdk
...

Where an ENV is declared with -- it should be elevated to a sticky status and hold on to the value declared / override any value encountered for this variable later in the chain of processing the Dockerfile.

--ENV = from this point forward in the Dockerfile.

Therefore this could be used at the beginning of a Dockerfile if you wanted it to take precedence over any further encounter of the same variable such as in multistage builds. Would also allow for flexibility depending on where in the Dockerfile it was placed for references higher in the file to be independent.

Wouldn't need an UNVOLUME then as the correct approach would be for VOLUME to be declared with an ENV reference that someone else could override.

e.g.

ENV PROJ_VOL /some/path
VOLUME $PROJ_VOL

Looks like ENTRYPOINT [] and ENTRYPOINT [""] both invalidate the cache on each build when not using BuildKit. Simple Dockerfile to demonstrate:

FROM jrottenberg/ffmpeg:4.3-alpine311 as base

ENTRYPOINT []

RUN echo "HERE!"

Steps 2 and 3 will _never_ use cache. This is my workaround:

FROM jrottenberg/ffmpeg:4.3-alpine311 as base

ENTRYPOINT ["/usr/bin/env"]

RUN echo "HERE!"

I can't reproduce a cache failure on your first pattern: :confused:

$ cat Dockerfile
FROM alpine:3.12
ENTRYPOINT []
RUN echo 'HERE!'

$ docker build .
Sending build context to Docker daemon  17.25MB
Step 1/3 : FROM alpine:3.12
 ---> a24bb4013296
Step 2/3 : ENTRYPOINT []
 ---> Running in d921be2e563d
Removing intermediate container d921be2e563d
 ---> 7801c649d895
Step 3/3 : RUN echo 'HERE!'
 ---> Running in 9e2ca2cf1f9f
HERE!
Removing intermediate container 9e2ca2cf1f9f
 ---> d398fdd442b1
Successfully built d398fdd442b1

$ docker build .
Sending build context to Docker daemon  17.25MB
Step 1/3 : FROM alpine:3.12
 ---> a24bb4013296
Step 2/3 : ENTRYPOINT []
 ---> Using cache
 ---> 7801c649d895
Step 3/3 : RUN echo 'HERE!'
 ---> Using cache
 ---> d398fdd442b1
Successfully built d398fdd442b1

I think you have to use an image that defines ENTRYPOINT. Try using the image I did or mysql.

Oh interesting -- I can reproduce using mysql:8.0. Solid bug! :+1:
