Moby: Unable to remove a stopped container: `device or resource busy`

Created on 22 Apr 2016 · 203 Comments · Source: moby/moby

Apologies if this is a duplicate issue; there seem to be several outstanding issues around a very similar error message but under different conditions. I initially added a comment on #21969 and was told to open a separate ticket, so here it is!


BUG REPORT INFORMATION

Output of docker version:

Client:
 Version:      1.11.0
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   4dc5990
 Built:        Wed Apr 13 18:34:23 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.0
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   4dc5990
 Built:        Wed Apr 13 18:34:23 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 51
Server Version: 1.11.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 81
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 3.13.0-74-generic
Operating System: Ubuntu 14.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 3.676 GiB
Name: ip-10-1-49-110
ID: 5GAP:SPRQ:UZS2:L5FP:Y4EL:RR54:R43L:JSST:ZGKB:6PBH:RQPO:PMQ5
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.):

Running on Ubuntu 14.04.3 LTS HVM in AWS on an m3.medium instance with an EBS root volume.

Steps to reproduce the issue:

  1. $ docker run --restart on-failure --log-driver syslog --log-opt syslog-address=udp://localhost:514 -d -p 80:80 -e SOME_APP_ENV_VAR myimage
  2. Container keeps shutting down and restarting due to a bug in the runtime and exiting with an error
  3. Manually running docker stop container
  4. Container is successfully stopped
  5. Trying to rm container then throws the error: Error response from daemon: Driver aufs failed to remove root filesystem 88189a16be60761a2c04a455206650048e784d750533ce2858bcabe2f528c92e: rename /var/lib/docker/aufs/diff/a48629f102d282572bb5df964eeec7951057b50f21df7abe162f8de386e76dc0 /var/lib/docker/aufs/diff/a48629f102d282572bb5df964eeec7951057b50f21df7abe162f8de386e76dc0-removing: device or resource busy
  6. Restart docker engine: $ sudo service docker restart
  7. $ docker ps -a shows that the container no longer exists.
Labels: area/storage/aufs, version/1.11

Most helpful comment

Suffered from this issue for quite a long time.

All 203 comments

Same here. Exact same OS, also running on AWS (different instance types) with aufs.

After stopping the container, retrying docker rm several times and/or waiting a few seconds usually leads to "container not found" eventually. The issue has existed in our stack at least since Docker 1.10.

Suffered from this issue for quite a long time.

Receiving this as well with Docker 1.10. I would very occasionally get something similar with 1.8 and 1.9 but it would clear up on its own after a short time. With 1.10 it seems to be permanent until I can restart the service or VM. I saw that it _may_ be fixed in 1.11 and am anxiously awaiting the official update so I can find out.

"Device or resource busy" is a generic error message.
Please read your error messages and make sure it's exactly the error message above (i.e., rename /var/lib/docker/aufs/diff/...).

"Me too!" comments do not help.

@danielfoss There are many fixes in 1.11.0 that would resolve some device or resource busy issues on multiple storage drivers when trying to remove the container.
1.11.1 fixes only a specific case (mounting /var/run into a container).

I'm also seeing this problem on some machines, and from taking a look at the code I think the original error is being obscured here: https://github.com/docker/docker/blob/master/daemon/graphdriver/aufs/aufs.go#L275-L278

My guess is that the Rename error is happening due to an unsuccessful call to unmount. However, as the error message in unmount is logged using Debugf, we won't see it unless the daemon is started in debug mode. I'll see if I can spin up some servers with debug mode enabled and catch this error.
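
For anyone trying to catch that hidden unmount error, a minimal sketch of turning on daemon debug logging (assuming a systemd-based host and that no /etc/docker/daemon.json exists yet; merge by hand if it does):

```
# Enable debug logging so messages emitted with Debugf become visible
echo '{ "debug": true }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
# reproduce the failing `docker rm`, then watch the daemon log
sudo journalctl -u docker -f | grep -i 'resource busy'
```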

I tried to set my docker daemon in debug mode and got the following logs when reproducing the error:

Aug 23 10:49:58 vincent dockerd[14083]: time="2016-08-23T10:49:58.191330085+02:00" level=debug msg="Calling DELETE /v1.21/containers/fa781466a8117d690077d85cc06af025da1c9c9b13302b1efed65c21788d5a75?link=False&force=False&v=False"
Aug 23 10:49:58 vincent dockerd[14083]: time="2016-08-23T10:49:58.191478608+02:00" level=error msg="Error removing mounted layer fa781466a8117d690077d85cc06af025da1c9c9b13302b1efed65c21788d5a75: rename /var/lib/docker/aufs/mnt/007c204b5aa1708f628d9518bb83d51176446e0c3743587f72b9f6cde3b9ce24 /var/lib/docker/aufs/mnt/007c204b5aa1708f628d9518bb83d51176446e0c3743587f72b9f6cde3b9ce24-removing: device or resource busy"
Aug 23 10:49:58 vincent dockerd[14083]: time="2016-08-23T10:49:58.191519719+02:00" level=error msg="Handler for DELETE /v1.21/containers/fa781466a8117d690077d85cc06af025da1c9c9b13302b1efed65c21788d5a75 returned error: Driver aufs failed to remove root filesystem fa781466a8117d690077d85cc06af025da1c9c9b13302b1efed65c21788d5a75: rename /var/lib/docker/aufs/mnt/007c204b5aa1708f628d9518bb83d51176446e0c3743587f72b9f6cde3b9ce24 /var/lib/docker/aufs/mnt/007c204b5aa1708f628d9518bb83d51176446e0c3743587f72b9f6cde3b9ce24-removing: device or resource busy"

I could find the message Error removing mounted layer in https://github.com/docker/docker/blob/f6ff9acc63a0e8203a36e2e357059089923c2a49/layer/layer_store.go#L527 but I do not know Docker enough to tell if it is really related.

Version info:

Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:02:53 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:02:53 2016
 OS/Arch:      linux/amd64

I had the same problem using docker-compose rm

Driver aufs failed to remove root filesystem 88189a16be60761a2c04a455206650048e784d750533ce2858bcabe2f528c92e

What I did to fix the problem without restarting docker :

cat /sys/fs/cgroup/devices/docker/88189a16be60761a2c04a455206650048e784d750533ce2858bcabe2f528c92e/tasks

This gives you the PIDs of the processes running in the devices cgroup subsystem (i.e. what is keeping the mount busy), located in the hierarchy under /docker/<containerid>:

I was able to kill them:
kill $(cat /sys/fs/cgroup/devices/docker/88189a16be60761a2c04a455206650048e784d750533ce2858bcabe2f528c92e/tasks)

After their death, the container was gone (successfully removed)
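
A small sketch that generalizes this workaround into a script (the cgroup path is the one from the example above; on other setups the devices hierarchy may live elsewhere):

```
#!/bin/sh
# kill-and-rm.sh <full container id>
# Kill whatever is still listed in the container's devices cgroup, then retry removal.
CONTAINER_ID="$1"
TASKS="/sys/fs/cgroup/devices/docker/${CONTAINER_ID}/tasks"
if [ -f "$TASKS" ]; then
    # every line in the tasks file is a PID still attached to the cgroup
    xargs -r kill < "$TASKS"
fi
docker rm "$CONTAINER_ID"
```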

Version

Client:
Version: 1.12.1
API version: 1.24
Go version: go1.6.3
Git commit: 23cf638
Built: Thu Aug 18 05:02:53 2016
OS/Arch: linux/amd64

Server:
Version: 1.12.1
API version: 1.24
Go version: go1.6.3
Git commit: 23cf638
Built: Thu Aug 18 05:02:53 2016
OS/Arch: linux/amd64

There seem to be two different problems here, as I am unable to fix my issue using @simkim's solution.

# docker rm b1ed3bf7dd6e
Error response from daemon: Driver aufs failed to remove root filesystem b1ed3bf7dd6e5d0298088682516ec8796d93227e4b21b769b36e720a4cfcb353: rename /var/lib/docker/aufs/mnt/acf9b10e85b8ad53e05849d641a32e646739d4cfa49c1752ba93468dee03b0cf /var/lib/docker/aufs/mnt/acf9b10e85b8ad53e05849d641a32e646739d4cfa49c1752ba93468dee03b0cf-removing: device or resource busy
# ls /sys/fs/cgroup/devices/docker/b1ed3bf7dd6e5d0298088682516ec8796d93227e4b21b769b36e720a4cfcb353
ls: cannot access /sys/fs/cgroup/devices/docker/b1ed3bf7dd6e5d0298088682516ec8796d93227e4b21b769b36e720a4cfcb353: No such file or directory
# mount | grep acf9b10e85b8ad53e05849d641a32e646739d4cfa49c1752ba93468dee03b0cf

In my case, the cgroup associated with my container seems to be correctly deleted. The filesystem is also unmounted.

The only solution for me is still to restart the Docker daemon.

Today, same problem as @genezys:

  • docker compose app with 4 containers (rails, worker, redis, postgresql)
  • docker-compose rm leads to a device busy error on the 4 containers, with the cgroup gone
  • fuser -m on one filesystem shows a bunch of processes (see the sketch after this list):

    • the pid of dockerd with the m flag (mmap'ed file or shared library)

    • other pids

  • The other pids are the pids of another docker compose app with 4 containers (django, rqworker, redis, postgresql). How is that possible??
  • docker-compose rm on the second app leads to the same error
  • but now the first fuser -m shows only the dockerd process with the m flag for all 8 containers
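
A sketch of the kind of inspection described above, using a mount path from one of the error messages earlier in the thread (any /var/lib/docker/aufs/mnt/<id> path from a failed removal will do):

```
# Show which PIDs still reference the layer that `docker rm` failed to rename
MNT=/var/lib/docker/aufs/mnt/acf9b10e85b8ad53e05849d641a32e646739d4cfa49c1752ba93468dee03b0cf
sudo fuser -vm "$MNT"        # -v also prints the command name for each PID
# walk each holder back up its process tree to see which container it belongs to
for pid in $(sudo fuser -m "$MNT" 2>/dev/null); do
    pstree -asp "$pid" | head -n 5
done
```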

This appears to have gotten worse in 1.12... I have (some) idea of what may have caused this, but not quite sure of the solution (short of a revert).
One thing I have noticed is in kernel 3.16 and higher, we do not get the busy error from the kernel anymore.

Yes, I upgraded from 1.11 to 1.12 yesterday and have now hit this problem twice in two days; I never had it before on this host.

@genezys and I are on Debian 8, 3.16.7-ckt25-2+deb8u3.

When @genezys and I run "docker-compose stop && docker-compose rm -f --all && docker-compose up -d", since docker 1.12:

  • For a while after restarting docker: 0 failures
  • Every day, the first time in the morning when arriving at work: 100% failure on rm

I tried running all the cron tasks during the day, in case something was done during the night, but that didn't trigger the bug.

Same information with more details below; we can provide more information as requested, since it happens every morning.

Stop and remove

Stopping tasappomatic_worker_1 ... done
Stopping tasappomatic_app_1 ... done
Stopping tasappomatic_redis_1 ... done
Stopping tasappomatic_db_1 ... done
WARNING: --all flag is obsolete. This is now the default behavior of `docker-compose rm`
Going to remove tasappomatic_worker_1, tasappomatic_app_1, tasappomatic_redis_1, tasappomatic_db_1
Removing tasappomatic_worker_1 ... error
Removing tasappomatic_app_1 ... error
Removing tasappomatic_redis_1 ... error
Removing tasappomatic_db_1 ... error

ERROR: for tasappomatic_app_1  Driver aufs failed to remove root filesystem a1aa9d42e425c16718def9e654dc700ff275d180434e32156230f4d1900cc417: rename /var/lib/docker/aufs/mnt/c243cc7329891de9584159b6ba8717850489b4010dfcc8b782c3c09b9f26f665 /var/lib/docker/aufs/mnt/c243cc7329891de9584159b6ba8717850489b4010dfcc8b782c3c09b9f26f665-removing: device or resource busy

ERROR: for tasappomatic_redis_1  Driver aufs failed to remove root filesystem b736349766266140e91780e3dbbcaf75edb9ad35902cbc7a6c8c5dcb2dfefe28: rename /var/lib/docker/aufs/mnt/b474a7c91ad77920dfb00dc3a0ab72bc22964ae3018e971d0d51e6ebe8566aeb /var/lib/docker/aufs/mnt/b474a7c91ad77920dfb00dc3a0ab72bc22964ae3018e971d0d51e6ebe8566aeb-removing: device or resource busy

ERROR: for tasappomatic_db_1  Driver aufs failed to remove root filesystem 1cc473718bd19d6df3239e84c74cd7322306486aa1d2252f30472216820fe96e: rename /var/lib/docker/aufs/mnt/d4162a6ef7a9e9e65bd460d13fcce8adf5f9552475b6366f14a19ebd3650952a /var/lib/docker/aufs/mnt/d4162a6ef7a9e9e65bd460d13fcce8adf5f9552475b6366f14a19ebd3650952a-removing: device or resource busy

ERROR: for tasappomatic_worker_1  Driver aufs failed to remove root filesystem eeadc938d6fb3857a02a990587a2dd791d0f0db62dc7a74e17d2c48c76bc2102: rename /var/lib/docker/aufs/mnt/adecfa9d22618665eba7aa4d92dd3ed1243f4287bd19c89617d297056f00453a /var/lib/docker/aufs/mnt/adecfa9d22618665eba7aa4d92dd3ed1243f4287bd19c89617d297056f00453a-removing: device or resource busy
Starting tasappomatic_db_1
Starting tasappomatic_redis_1

ERROR: for redis  Cannot start service redis: Container is marked for removal and cannot be started.

ERROR: for db  Cannot start service db: Container is marked for removal and cannot be started.
ERROR: Encountered errors while bringing up the project.

Inspecting mount

fuser -m /var/lib/docker/aufs/mnt/c243cc7329891de9584159b6ba8717850489b4010dfcc8b782c3c09b9f26f665
/var/lib/docker/aufs/mnt/c243cc7329891de9584159b6ba8717850489b4010dfcc8b782c3c09b9f26f665:  5620  5624  5658  6425  6434 14602m

Same set of process for the 4 containers

Inspecting process

5620 5624 6434: another postgresql container
5658: worker from another container
6425: django from another container
14602 (m): dockerd

systemd,1
  └─dockerd,14602 -H fd://
      └─docker-containe,14611 -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc
          └─docker-containe,5541 2486fd7f494940619b54fa9b4cedc52c8175988c5ae3bb1dca382f0aaee4f72a /var/run/docker/libcontainerd/2486fd7f494940619b54fa9b4cedc52c8175988c5ae3bb1dca382f0aaee4f72a docker-runc
              └─postgres,5565
                  └─postgres,5620
systemd,1
  └─dockerd,14602 -H fd://
      └─docker-containe,14611 -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc
          └─docker-containe,5541 2486fd7f494940619b54fa9b4cedc52c8175988c5ae3bb1dca382f0aaee4f72a /var/run/docker/libcontainerd/2486fd7f494940619b54fa9b4cedc52c8175988c5ae3bb1dca382f0aaee4f72a docker-runc
              └─postgres,5565
                  └─postgres,5624
systemd,1
  └─dockerd,14602 -H fd://
      └─docker-containe,14611 -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc
          └─docker-containe,5642 0364f4ace6e4d1746f8c3e31f872438a592ac07295dd232d92bf64cf729d7589 /var/run/docker/libcontainerd/0364f4ace6e4d1746f8c3e31f872438a592ac07295dd232d92bf64cf729d7589 docker-runc
              └─pootle,5658 /usr/local/bin/pootle rqworker
systemd,1
  └─dockerd,14602 -H fd://
      └─docker-containe,14611 -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc
          └─docker-containe,5700 bd3fb1c8c36ec408bcf53c8501f95871950683c024919047f5423640e377326d /var/run/docker/libcontainerd/bd3fb1c8c36ec408bcf53c8501f95871950683c024919047f5423640e377326d docker-runc
              └─run-app.sh,5716 /run-app.sh
                  └─pootle,6425 /usr/local/bin/pootle runserver --insecure --noreload 0.0.0.0:8000
systemd,1
  └─dockerd,14602 -H fd://
      └─docker-containe,14611 -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc
          └─docker-containe,5541 2486fd7f494940619b54fa9b4cedc52c8175988c5ae3bb1dca382f0aaee4f72a /var/run/docker/libcontainerd/2486fd7f494940619b54fa9b4cedc52c8175988c5ae3bb1dca382f0aaee4f72a docker-runc
              └─postgres,5565
                  └─postgres,6434

Does anyone have a better solution than restarting the docker service (version 1.12)?

A workaround was proposed in #25718 to set MountFlags=private in the docker.service configuration file of systemd. See https://github.com/docker/docker/issues/25718#issuecomment-250254918 and my following comment.

So far, this has solved the problem for me.
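
For reference, a minimal sketch of applying that workaround as a systemd drop-in instead of editing the packaged unit file (the drop-in file name is arbitrary):

```
sudo mkdir -p /etc/systemd/system/docker.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/docker.service.d/mountflags.conf
[Service]
MountFlags=private
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
```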

@genezys : Note the side effect of this workaround that I've explained in https://github.com/docker/docker/issues/25718#issuecomment-250356570

I was getting something like this:

Error response from daemon: Driver aufs failed to remove root filesystem 6b583188bfa1bf7ecf2137b31478c1301e3ee2d5c98c9970e5811a3dd103016c: rename /var/lib/docker/aufs/mnt/6b583188bfa1bf7ecf2137b31478c1301e3ee2d5c98c9970e5811a3dd103016c /var/lib/docker/aufs/mnt/6b583188bfa1bf7ecf2137b31478c1301e3ee2d5c98c9970e5811a3dd103016c-removing: device or resource busy

I simply searched for "6b583188bfa1bf7ecf2137b31478c1301e3ee2d5c98c9970e5811a3dd103016c" and found it was located in multiple folders under docker/.
I deleted all those files and then attempted deleting the docker container again using sudo docker rm <containerId>.
And it worked.

Hope it helps!
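
If you want to locate those leftovers before deleting anything (blindly removing files under /var/lib/docker is risky), a small sketch using the ID from the error above:

```
# Container/layer ID taken from the error message
ID=6b583188bfa1bf7ecf2137b31478c1301e3ee2d5c98c9970e5811a3dd103016c
# list leftover directories that still reference it, without deleting them
sudo find /var/lib/docker -maxdepth 3 -name "${ID}*" 2>/dev/null
```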

The thing is, I can't remove that file, and lsof doesn't show any user of it. I suspect this kernel bug, so I just did sudo apt-get install linux-image-generic-lts-xenial on my 14.04, hoping it'll help.

I encountered the same problem and googled for a while. It seems the cAdvisor container locks the file.
After removing the cAdvisor container, I can remove the files under [dockerroot]/containers/xxxxxx.

@oopschen yes, that's a known issue; the cAdvisor uses various bind-mounts, including /var/lib/docker, which causes mounts to leak, resulting in this problem.

@thaJeztah Is there any solution or alternative for cadvisor? Thanks.

@oopschen some hints are given in https://github.com/docker/docker.github.io/pull/412, but it depends on what you need cAdvisor for to be able to tell what alternatives there are. Discussing alternatives may be a good topic for forums.docker.com

Just got this error for the first time on OS X Sierra using docker-compose:

ERROR: for pay-local  Driver aufs failed to remove root filesystem
0f7a073e087e0a5458d28fd13d6fc840bfd2ccc28ff6fc2bd6a6bc7a2671a27f: rename
/var/lib/docker/aufs/mnt/a3faba12b32403aaf055a26f123f5002c52f2afde1bca28e9a1c459a18a22835
/var/lib/docker/aufs/mnt/a3faba12b32403aaf055a26f123f5002c52f2afde1bca28e9a1c459a18a22835-removing: 
structure needs cleaning

I had never seen it before the latest update last night.

$ docker-compose version
docker-compose version 1.9.0, build 2585387
docker-py version: 1.10.6
CPython version: 2.7.12
OpenSSL version: OpenSSL 1.0.2j  26 Sep 2016

$ docker version
Client:
 Version:      1.13.0-rc3
 API version:  1.25
 Go version:   go1.7.3
 Git commit:   4d92237
 Built:        Tue Dec  6 01:15:44 2016
 OS/Arch:      darwin/amd64

Server:
 Version:      1.13.0-rc3
 API version:  1.25 (minimum version 1.12)
 Go version:   go1.7.3
 Git commit:   4d92237
 Built:        Tue Dec  6 01:15:44 2016
 OS/Arch:      linux/amd64
 Experimental: true

I tried docker rm -fv a couple of times, but always received the same error.

$ docker ps -a
CONTAINER ID        IMAGE                              COMMAND             CREATED             STATUS              PORTS               NAMES
0f7a073e087e        pay-local                          "node app.js"       2 minutes ago       Dead                                    pay-local

In the amount of time it's taken me to type this out, the offending container is now gone.

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

I don't know if it's fixed itself, or if there's still a problem lurking...

EDIT: Just started and stopped the same set of containers using docker-compose several times with no errors, so... ?

@jeff-kilbride structure needs cleaning is a different message, and may refer to the underlying filesystem; could be specific to Docker for Mac

Just happened to me on every container, and went away after a few seconds (the containers got deleted).

docker version

Client:
Version: 1.13.1
API version: 1.26
Go version: go1.7.5
Git commit: 092cba3
Built: Wed Feb 8 06:36:34 2017
OS/Arch: linux/amd64

Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Go version: go1.7.5
Git commit: 092cba3
Built: Wed Feb 8 06:36:34 2017
OS/Arch: linux/amd64
Experimental: false

docker-compose version
docker-compose version 1.9.0, build 2585387
docker-py version: 1.10.6
CPython version: 2.7.9
OpenSSL version: OpenSSL 1.0.1t 3 May 2016

On debian 8 3.16.0-4-amd64

I've been having this issue on a few of my Docker servers running 1.12.5

Client:
Version: 1.12.5
API version: 1.24
Go version: go1.6.4
Git commit: 7392c3b
Built: Fri Dec 16 02:23:59 2016
OS/Arch: linux/amd64

Server:
Version: 1.12.5
API version: 1.24
Go version: go1.6.4
Git commit: 7392c3b
Built: Fri Dec 16 02:23:59 2016
OS/Arch: linux/amd64

Last night in particular, a developer tried to use docker-compose stop, rm and up -d (via a bash wrapper) and encountered the issue reported above. Prior to using docker-compose, the developer pulled an updated "latest"-tagged image from our local registry. When I started to investigate I could see the container was marked as Dead. I attempted the 'docker rm' command and got the same results.

After 5-10 minutes of researching the issue on the web, I went back to observe the status of the container and could see that it had already been removed. Following this observation I attempted to bring the container up with "docker-compose up -d" and was successful in doing so.

Hello,

** I was getting errors during some of these commands because I had removed docker at one point; I re-installed it, now I can't seem to uninstall it. I'm still getting these errors as well:
```
sudo rm -rf /var/lib/docker
rm: cannot remove '/var/lib/docker/overlay': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/5b04c89cac02bfebc6de9355808c905e149dd7cb2f324952750b49aa93393ef4/merged': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/4a17da45150a3e24ecef6babb933872f9aa403f3a072d5d37aff3b71b9eb936a/merged': Device or resource busy
```

docker -v Docker version 1.12.6, build 78d1802

MAIN ISSUE:
I tried out Rancher over the past week and it doesn't look like it will be a good solution for me. I have a standard Ubuntu 16.04 server on Digital Ocean, and I'm trying to completely remove Rancher and Docker; it took some digging on the internet to figure out how to do this, and I've finally got it whittled down, but now I can't finish removing /var/lib/rancher and /var/lib/docker. Here are the outputs I get:

```
sudo rm -rf rancher
rm: cannot remove 'rancher/volumes': Device or resource busy
```

I read that using this command might help track down the running processes so they can be killed, but no dice:
```
lsof +D ./
lsof: WARNING: can't stat() nsfs file system /run/docker/netns/c24324d8b667
      Output information may be incomplete.
lsof: WARNING: can't stat() nsfs file system /run/docker/netns/default
      Output information may be incomplete.
COMMAND   PID       USER  FD  TYPE DEVICE SIZE/OFF   NODE NAME
bash    16667 blakers757 cwd   DIR  253,1     4096 267276 .
lsof    27938 blakers757 cwd   DIR  253,1     4096 267276 .
lsof    27939 blakers757 cwd   DIR  253,1     4096 267276 .
```

When I try to kill the processes by pid, it fails.

docker ps shows no running containers:
```
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
```

When I try to remove /var/lib/docker, I get the following:

```
sudo rm -rf /var/lib/docker
rm: cannot remove '/var/lib/docker/overlay': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/5b04c89cac02bfebc6de9355808c905e149dd7cb2f324952750b49aa93393ef4/merged': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/4a17da45150a3e24ecef6babb933872f9aa403f3a072d5d37aff3b71b9eb936a/merged': Device or resource busy
```

whatever is running inside this overlay2 folder seems to be to blame.

Just wondering if you all have any ideas, thanks.

Is the docker service stopped when you try to remove? Looks like there's still something running

I'm totally new to Docker, unfortunately - thought this would be a good learning opportunity. I've tried every command I can find to kill and remove any remaining containers/services but still no luck:

```
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ docker ps -a -f status=exited
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ docker stop $(docker ps -a -q)
"docker stop" requires at least 1 argument(s).
See 'docker stop --help'.

Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...]

Stop one or more running containers
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ docker rm $(docker ps -a -q)
"docker rm" requires at least 1 argument(s).
See 'docker rm --help'.

Usage: docker rm [OPTIONS] CONTAINER [CONTAINER...]

Remove one or more containers
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ cd ../../
blakers757@ubuntu-1gb-sfo1-01:/$ cd ~
blakers757@ubuntu-1gb-sfo1-01:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
blakers757@ubuntu-1gb-sfo1-01:~$ docker rm $(docker ps -a -q)
"docker rm" requires at least 1 argument(s).
See 'docker rm --help'.

Usage: docker rm [OPTIONS] CONTAINER [CONTAINER...]

Remove one or more containers
blakers757@ubuntu-1gb-sfo1-01:~$ docker volume ls
DRIVER VOLUME NAME
blakers757@ubuntu-1gb-sfo1-01:~$ sudo rm -rf /var/lib/docker
rm: cannot remove '/var/lib/docker/overlay': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/5b04c89cac02bfebc6de9355808c905e149dd7cb2f324952750b49aa93393ef4/merged': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/4a17da45150a3e24ecef6babb933872f9aa403f3a072d5d37aff3b71b9eb936a/merged': Device or resource busy
blakers757@ubuntu-1gb-sfo1-01:~$
```

Any help would be appreciated, thanks!

try stopping the service (systemctl stop docker), then remove /var/lib/docker

Thank you, unfortunately that's still not working though:

```
systemctl stop docker
==== AUTHENTICATING FOR org.freedesktop.systemd1.manage-units ===
Authentication is required to stop 'docker.service'.
Authenticating as: Blake Schwartz,,, (blakers757)
Password:
==== AUTHENTICATION COMPLETE ===
blakers757@ubuntu-1gb-sfo1-01:~$ sudo rm -rf /var/lib/docker
[sudo] password for blakers757:
rm: cannot remove '/var/lib/docker/overlay2/5b04c89cac02bfebc6de9355808c905e149dd7cb2f324952750b49aa93393ef4/merged': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/4a17da45150a3e24ecef6babb933872f9aa403f3a072d5d37aff3b71b9eb936a/merged': Device or resource busy
```

Maybe:

docker volume prune -f

Do you still have images? What does docker images show? If so, try to remove them:

docker rmi -f [image id]

Finally:

docker rmi $(docker images --quiet --filter "dangling=true")

If none of those work, I can't help you... (reboot the server, if you are able?)

Thanks, there aren't any images but after rebooting my server I was able to remove /var/lib/rancher. Still unable to remove /var/lib/docker though:

```
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ rm -rf docker
rm: cannot remove 'docker': Permission denied
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ sudo rm -rf docker
rm: cannot remove 'docker/overlay': Device or resource busy
blakers757@ubuntu-1gb-sfo1-01:/var/lib$ docker kill overlay
Error response from daemon: Cannot kill container overlay: No such container: overlay
blakers757@ubuntu-1gb-sfo1-01:/var/lib$
```

This output is a little different than before (previously it referred to docker/overlay2/<some long sha>/merged). In any case, it still doesn't seem to want to let me remove docker entirely.

My team has been facing this problem every time we shut down docker containers. We are running a service with more than 100 docker containers plus container advisors (cAdvisor), through a swarm system. The only solution I have found is to shut down forcefully several times until the message indicating the container no longer exists is shown. It happens to around 1 out of 5 containers. It seems that failure rate is really critical for a business service.

OS: Ubuntu Xenial
Docker: v1.13.1
CAdvisor: v0.24.1

We had to restart the docker service or, unfortunately, the Linux servers, because of the combination of a network allocation bug and this container advisor bug.
Luckily, the network allocation bug seems to be fixed in the latest docker binary.

@yunghoy What's the full error message that you see?

I came across this one (I hope it's related):

~
root@docker2:/var/www# docker rm -f spiderweb
Error response from daemon: Unable to remove filesystem for 601d43bca2550c2916d2bf125f04b04b82423633fbed026393b99291d1ef0b08: remove /var/lib/docker/containers/601d43bca2550c2916d2bf125f04b04b82423633fbed026393b99291d1ef0b08/shm: device or resource busy
root@docker2:/var/www# docker rm -f spiderweb
Error response from daemon: No such container: spiderweb
~

The symptoms, however, were a bit different from what I could see. When I ran the container, it started the process, but then it was as if the process was stuck in an infinite loop (as soon as it started doing some I/O actually; I write out some small files with it on a volume mount). The process didn't react to 'docker stop', and I managed to do a pstree -a before killing it with docker rm -f and getting the above message; this was the last branch:

~
─docker run -i --rm --name spiderweb -v /var/www:/var/www -v /src/spiderweb/bin:/usr/src/app -w /usr/src/app node:4 node src/index.js docker2
└─11*[{docker}]
~

I'm not exactly sure how 11 docker children come into play here. Seeing typical container process trees leads me to believe that the application has stopped already, but docker engine didn't catch it somehow.

This is pastebin output for that location, which is still there and can't be removed: full output. I'm going to go with a service restart, to clean this up.

Edit: following @thaJeztah's advice above, even after a service docker restart there were folders in /var/lib/docker/containers that couldn't be deleted. They didn't show up in lsof, and I was running as root, so it's a bit beyond me how this can happen apart from disk failure. A reboot solved this, and the files/folders, each containing only an empty "shm" folder, could then be deleted. Attaching docker version/info for extra information about my system:

~~~
Client:
Version: 17.03.0-ce
API version: 1.26
Go version: go1.7.5
Git commit: 3a232c8
Built: Tue Feb 28 08:02:23 2017
OS/Arch: linux/amd64

Server:
Version: 17.03.0-ce
API version: 1.26 (minimum version 1.12)
Go version: go1.7.5
Git commit: 3a232c8
Built: Tue Feb 28 08:02:23 2017
OS/Arch: linux/amd64
Experimental: false
~~~

~
Containers: 4
Running: 4
Paused: 0
Stopped: 0
Images: 18
Server Version: 17.03.0-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 977c511eda0925a723debdc94d09459af49d082a
runc version: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 4.9.0-2-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 7.698 GiB
Name: docker2
ID: G2Z3:XLWE:P3V3:FTZR:U2Y6:2ABJ:6HTP:PIV2:KRHA:2ATV:ZMPQ:SHMJ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
~

I always get this message if I try to remove the container too quickly. It's really annoying that I have to restart docker to get the port back:

ERROR: for nginx_nginx_1  Unable to remove filesystem for cfd48197bba6ee1ac91d7690b0567b56e61be03420768a5627936601b3ad6378: remove /var/lib/docker/containers/cfd48197bba6ee1ac91d7690b0567b56e61be03420768a5627936601b3ad6378/shm: device or resource busy

Docker version 1.12.5, build 7392c3b

Is there a way to avoid it ?

Same issue on docker 17.03.0-ce. No way to get it back to work unless restarting docker daemon...

Update
Stop cAdvisor, then attempt to remove the dead container and re-create it. Works for me.

You should avoid using the mountpoint /var/lib/docker:/var/lib/docker:ro if it is not necessary. It seems cAdvisor, with permission to access other containers' volumes, can lock them while they are running.

Docker version 1.12.6 with devicemapper on RHEL 7.3 results in the same thing. docker rm -f <container> fails. No mounts, no special settings.

This is happening under high load. That is, I start several scripts that

Loop:
1. run container with `cat`
2. exec some commands
3. rm -f container

Things go well for a while and then removal starts getting buffered and batched. When the error happens, the filesystem of the container remains intact but the container is "forgotten" by the server. The scenario finally fails with the tdata device being totally fragmented and the tmeta device being full.

Hi, I'm having this issue using docker-compose; I've opened an issue here: https://github.com/docker/compose/issues/4781. Could this be related? Restarting the Docker daemon didn't help, and my only solution was to force remove the dead container. Even though it triggered the error, the container is still removed. Looking in /var/lib/docker/aufs/mnt I suspect the dead folders are still there, but it does circumvent the issue... at least until the next time you need to recreate a container with a fresh image.

I am having this issue with RHEL 7.3 and Docker version 17.05.0-ce, build 89658be

We do a "systemctl restart docker" as a workaround, or reboot the RHEL 7.3 docker host virtual machine.

This does appear to be more frequent in later releases of Docker.

df -Th shows /boot drive is ext4 and all the other drives are xfs. (not sure why the boot drive is ext4, going to ask my team).

@bignay2000 Typically you'd get this error if some mount leaked from the container state dir into a container. This can happen quite easily if you do something like -v /var/lib/docker:/var/lib/docker.
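
A quick way to check for that situation is to list each running container's bind mounts (a sketch; mounts of a parent directory such as /, /var or /var/lib deserve the same suspicion):

```
# Find running containers that bind-mount the Docker state directory
docker ps -q | while read -r id; do
    docker inspect --format '{{.Name}}{{range .Mounts}} {{.Source}}{{end}}' "$id"
done | grep '/var/lib/docker'
```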

I got this when I removed some dirs from /var/lib/docker/volumes to free up some space, because docker volume rm doesn't work when you're out of space.

May it be related?

I've also introduced --rmi local -v options to docker-compose down. Will try to remove these options to see whether it was the cause.

P.S. This is happening on jenkins, with multiple parallel docker-compose runs.

I restarted the server, and then removed /var/lib/docker successfully.

I just experienced this multiple times. I added MountFlags=private to the docker service to prevent further mount leaks, but I was sick of restarting the machine, so I went hunting for a way to get rid of the leaked mounts without restarting.

Looking for these leaked mounts, I noticed here that each pid has its own mountinfo at /proc/[pid]/mountinfo. So, to see where the leaks were I did:

$ grep docker /proc/*/mountinfo
/proc/13731/mountinfo:521 460 8:3 /var/lib/docker/overlay /var/lib/docker/overlay rw,relatime shared:309 - xfs /dev/sda3 rw,seclabel,attr2,inode64,noquota
/proc/13731/mountinfo:522 521 0:46 / /var/lib/docker/overlay/2a2dd584da9858fc9e5928d55ee47328712c43e52320b050ef64db87ef4d545a/merged rw,relatime shared:310 - overlay overlay rw,seclabel,lowerdir=/var/lib/docker/overlay/7cbf3db2f8b860ba964c88539402f35c464c36013efcb845bce2ee307348649f/root,upperdir=/var/lib/docker/overlay/2a2dd584da9858fc9e5928d55ee47328712c43e52320b050ef64db87ef4d545a/upper,workdir=/var/lib/docker/overlay/2a2dd584da9858fc9e5928d55ee47328712c43e52320b050ef64db87ef4d545a/work
/proc/13731/mountinfo:523 521 0:47 / /var/lib/docker/overlay/12f139bad50b1837a6eda1fe6ea5833853746825bd55ab0924d70cfefc057b54/merged rw,relatime shared:311 - overlay overlay rw,seclabel,lowerdir=/var/lib/docker/overlay/d607050a3f9cdf004c6d9dc9739a29a88c78356580db90a83c1d49720baa0e5d/root,upperdir=/var/lib/docker/overlay/12f139bad50b1837a6eda1fe6ea5833853746825bd55ab0924d70cfefc057b54/upper,workdir=/var/lib/docker/overlay/12f139bad50b1837a6eda1fe6ea5833853746825bd55ab0924d70cfefc057b54/work
/proc/13731/mountinfo:524 521 0:48 / /var/lib/docker/overlay/33fb78580b0525c97cde8f23c585b31a004c51becb0ceb191276985d6f2ba69f/merged rw,relatime shared:312 - overlay overlay rw,seclabel,lowerdir=/var/lib/docker/overlay/5e8f5833ef21c482df3d80629dd28fd11de187d1cbbfe8d00c0500470c4f4af2/root,upperdir=/var/lib/docker/overlay/33fb78580b0525c97cde8f23c585b31a004c51becb0ceb191276985d6f2ba69f/upper,workdir=/var/lib/docker/overlay/33fb78580b0525c97cde8f23c585b31a004c51becb0ceb191276985d6f2ba69f/work
/proc/13731/mountinfo:525 521 0:49 / /var/lib/docker/overlay/e6306bbab8a29f715a0d9f89f9105605565d26777fe0072f73d5b1eb0d39df26/merged rw,relatime shared:313 - overlay overlay rw,seclabel,lowerdir=/var/lib/docker/overlay/409a9e5c05600faa82d34e8b8e7b6d71bffe78f3e9eff30846200b7a568ecef0/root,upperdir=/var/lib/docker/overlay/e6306bbab8a29f715a0d9f89f9105605565d26777fe0072f73d5b1eb0d39df26/upper,workdir=/var/lib/docker/overlay/e6306bbab8a29f715a0d9f89f9105605565d26777fe0072f73d5b1eb0d39df26/work
/proc/13731/mountinfo:526 521 0:50 / /var/lib/docker/overlay/7b56a0220212d9785bbb3ca32a933647bac5bc8985520d6437a41bde06959740/merged rw,relatime shared:314 - overlay overlay rw,seclabel,lowerdir=/var/lib/docker/overlay/d601cf06e1682c4c30611d90b67db748472d399aec8c84487c96cfb118c060c5/root,upperdir=/var/lib/docker/overlay/7b56a0220212d9785bbb3ca32a933647bac5bc8985520d6437a41bde06959740/upper,workdir=/var/lib/docker/overlay/7b56a0220212d9785bbb3ca32a933647bac5bc8985520d6437a41bde06959740/work

That told me that process 13731 still had references to /var/lib/docker/overlay, so I (as root) entered the mount namespace of that process and removed the mounts:

$ nsenter -m -t 13731 /bin/bash
$ mount
<snipped mount output that verifies that it does see those mount points>
$ umount /var/lib/docker/overlay/*
$ umount /var/lib/docker/overlay
$ exit

At which point I could finally delete /var/lib/docker, restart the docker service (thus recreating everything in /var/lib/docker), and have no more issues.
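
A condensed sketch of that procedure, for anyone who wants to script it (pass the leaked path reported in the error, e.g. /var/lib/docker/overlay/<id>/merged):

```
#!/bin/sh
# unleak.sh <leaked mount path>
# Find every process whose mount namespace still holds the leaked mount,
# enter that namespace, and unmount it there.
LEAKED_DIR="$1"
for pid in $(grep -l "$LEAKED_DIR" /proc/[0-9]*/mountinfo | cut -d/ -f3); do
    echo "unmounting in mount namespace of PID $pid"
    sudo nsenter -m -t "$pid" umount "$LEAKED_DIR"
done
```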

After removing --rmi local -v I didn't have this problem. Probably it tries to remove shared images. I'll try with option -v

I did encounter a similar issue, but still on docker 1.10.3 in this case.

  • docker stop makes the container go to "dead" status
  • docker rm does not remove the container, with the message: "Failed to remove .... Error response from daemon: Driver devicemapper failed to remove root filesystem ...: remove /var/lib/docker-latest/devicemapper/mnt/....: device or resource busy"
  • docker rm -f gives the same error message but does remove the container after all.
  • In some cases the container can be removed normally after a more or less long wait (minutes to an hour)
# docker info
Containers: 15
 Running: 15
 Paused: 0
 Stopped: 0
Images: 12
Server Version: 1.10.3
Storage Driver: devicemapper
 Pool Name: docker-thin-pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 6.501 GB
 Data Space Total: 21.47 GB
 Data Space Available: 14.97 GB
 Metadata Space Used: 1.982 MB
 Metadata Space Total: 4.194 MB
 Metadata Space Available: 2.212 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 5
 Library Version: 1.02.107-RHEL7 (2015-12-01)
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
 Authorization: rhel-push-plugin
Kernel Version: 3.10.0-327.13.1.el7.x86_64
Operating System: Red Hat Enterprise Linux
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 0
CPUs: 4
Total Memory: 15.51 GiB

Is it possible that the container won't stop and goes dead in this condition because some other service is still trying to send data to it? (sounds a bit far-fetched)

@lievendp No.
The container is stopping just fine, it just can't be removed because the mount has leaked into some other namespace, and you are running a super old kernel where this kind of thing just breaks down.

The reason you can remove it after waiting some amount of time is because the thing that was holding the mount has exited.

In this case, docker rm -f is not removing the container, it's only removing the metadata stored in the daemon about the container.

Btw, there are more fixes coming in 17.06 that should hopefully help alleviate (or completely resolve) this situation, especially if you are on a relatively recent kernel... though you don't have to be on a recent kernel for some of the changes.

Is there anything that can be done in the meantime? I'm running into the same issue with the docker-container module from Ansible. That is on Debian Jessie with Docker 17.05.0

@cpuguy83 Yet it's the only kernel "version" you're going to see in enterprise-level Linux like RHEL 7. I guess they backport a lot of fixes into their kernel version.
Is there anything specific in the newer kernels we should be looking for to fix this issue? I could check whether it's in the version I have.

From what I see here there are supported docker versions on RHEL 7, so the kernel shouldn't be the problem then, no?

Anyhow, it appears that fixing one thing leads to another problem in a newer version; it makes me think that maybe docker (the way I use it, which is quite basic) isn't as production-ready (stable) as it could be, as it's still a young technology. That won't keep me from using it, considering the advantages it can bring.

About the "leaking" of mounts into namespaces: is there any method to pinpoint that this is the problem? At the moment it's just a description from my end leading to your conclusion, but is it possible for me to actually test and see the leaking problem? Maybe seeing it would help in solving or working around it.

Docker 17.03.1-ce on a new CentOS 7.3 install running kernel 3.10.0-514.21.1.el7.x86_64

Problem remains.

Commands:

docker stop 95aa09d90aaf
docker rm 95aa09d90aaf

Result in:

Error response from daemon: Driver overlay failed to remove root filesystem 95aa09d90aaf870301163e19bf9bb73eff055e7a2c3e3d22d09604fb41361608: remove /var/lib/docker/overlay/8479d2fd0c0e7ec06c17af0b00bb004baeb0c6fbe92ed1b858b741c9458bb499/merged: device or resource busy

Followed by:

Message from syslogd@ik1-331-25960 at Jun  5 10:24:47 ...
 kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

Problem solved by:

systemctl restart docker

Here is my docker info:

Containers: 2
 Running: 0
 Paused: 0
 Stopped: 2
Images: 1
Server Version: 17.03.1-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.21.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 3
Total Memory: 1.954 GiB
Name: ik1-331-25960.vs.sakura.ne.jp
ID: LFE3:D55E:EXWU:JFGN:GKZ4:QLKI:3CX7:7YG4:U2OQ:LLSI:LNRE:D5UU
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

@lievendp The particular kernel feature is being able to do a detached mount while the mount exists in another namespace.
I believe RH is planning on including this in RHEL 7.4, btw.

Generally speaking, the only time you would really encounter this issue is if you've mounted /var/lib/docker, or one of its parents, into a container.

One potential work-around for this is to set MountFlags=slave in the docker systemd unit file. The reason this isn't in by default is it can cause problems for certain use-cases.

Yeah, I'm using CentOS 7.3 too. Happens once every 20-40 times.

Generally speaking the only time you would really encounter this issue is if you've mounted the /var/lib/docker, or one of it's parents into a container.

Looks similar to what I did. I'm using docker-compose inside ssh (jenkins slave) container with mounted sock file.

Generally speaking the only time you would really encounter this issue is if you've mounted the /var/lib/docker, or one of it's parents into a container.

In my tests I created and deleted containers, about 50/min. I mounted nothing and only started and removed the container running cat. The result was device busy errors on /*/shm paths and leaked meta and data storage. So the above statement may not cover all use cases.

had this issue for the first time. Not sure of root cause. Docker 17.03.1. Ubuntu 14.04.5 LTS. Restarted docker service to resolve.

This happened again on another system. Suspect it was due to a container being taken down in an un-clean way.
Waited a while and eventually the 'Dead' container went away on its own.

@cpuguy83 Regarding the improvements in RHEL 7.4, I'm guessing we're talking about this:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7-Beta/html/7.4_Release_Notes/technology_previews_virtualization.html
Will it impact the way docker can be used?

User namespace
This feature provides additional security to servers running Linux containers by providing better isolation between the host and the containers. Administrators of a container are no longer able to perform administrative operations on the host, which increases security. (BZ#1138782)

@lievendp No, I doubt it would be listed on the release notes.

I don't know anything about /var/lib/docker, but I'm not using any compose tool. It happens as often as 1 in 5 times that I stop and remove a container during development.

I experience the same on Centos 7. It somehow started to appear recently. I guess it is affected by the load.

Containers: 29
Running: 26
Paused: 0
Stopped: 3
Images: 20
Server Version: 1.12.6
Storage Driver: devicemapper
Pool Name: docker-202:3-390377-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file: /dev/vg-docker/data
Metadata file: /dev/vg-docker/metadata
Data Space Used: 7.502 GB
Data Space Total: 273.8 GB
Data Space Available: 266.3 GB
Metadata Space Used: 13.37 MB
Metadata Space Total: 17.05 GB
Metadata Space Available: 17.03 GB
Thin Pool Minimum Free Space: 27.38 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: syslog
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null host bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 3.10.0-514.6.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.26 GiB
Name: production-rh2
ID: 7MS3:2QDM:45GK:7COB:YY7C:WEIK:IQ3S:APQJ:ZAFQ:6JGJ:X6LT:7Q5B
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
127.0.0.0/8

@cpuguy83

The particular kernel feature is being able to do a detached mount while the mount exists in another namespace.
I believe RH is planning on including this in RHEL 7.4, btw.

Do you happen to know which kernel this feature was mainlined in?

Is it related to device mapper or overlay? Or is it completely independent from the storage driver?

Will installing newer kernel from http://elrepo.org/tiki/tiki-index.php help?
The newest one is 4.11: kernel-ml-4.11.6-1.el7.elrepo.x86_64.rpm

@Vanuan It would, though I'd stick with the LTS kernel, which would be 4.9.
The version it was mainlined is 3.15 (IIRC). It's more of a bug fix than a feature. It lets you unmount resources that are mounted in other namespaces.

It's independent of storage driver.
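
A rough check of whether a host's kernel is past that point (a sketch; note that distributions like RHEL/CentOS backport fixes without bumping the version, so uname alone can be misleading there):

```
# Compare the running kernel against 3.15, the approximate mainline version
# where detached unmounts across namespaces started behaving
required="3.15"
current="$(uname -r | cut -d- -f1)"
if [ "$(printf '%s\n' "$required" "$current" | sort -V | head -n1)" = "$required" ]; then
    echo "kernel $current: at or above $required"
else
    echo "kernel $current: older than $required, leaked-mount removal failures are likely"
fi
```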

Is anyone with this sort of error using healthcheck in their docker-compose.yml ? I had no problems with deployment until I added healthcheck to the yml file. I'm on docker 17.03.1-ce

@itsafire That's interesting... I'll take a look at that code path.

@cpuguy83 As soon as I got rid of the healthcheck definition in my compose file, the problem went away. I'll investigate later and try to reproduce the issue.

The production system is a dual-server setup with current Debian jessie. Both behaved identically, leaving dead containers behind. A docker daemon restart resolved the issue.

I have no health check in the docker-compose.

Still have this issue using compose.

It starts with a "mkdir: file exists" error; then, after restarting/killing/etc. (which didn't work), I get this error message.

I'm having the same issue. After upgrading my Docker Mastodon Containers (Update instructions) I got for each container:

for web  driver "overlay" failed to remove root filesystem for xxxx: remove /var/lib/docker/overlay/xxxx/merged: device or resource busy
ERROR: Encountered errors while bringing up the project.

Docker version: 17.06.0-ce

@owhen Which kernel version?

Seeing the same issue on CentOS 7, using Docker-CE 17.06.
kitchen/docker isn't happy.

@xmj As already mentioned, this is a kernel issue. Workaround is to use mainline: https://gist.github.com/pgporada/bee21b339b6ca750f1de

@Vanuan 3.10.0-514.2.2.el7.x86_64
Will this kernel issue be fixed by CentOS? I'm not very happy with this workaround.

Should be fixed in the centos 7.4 kernel... but I haven't tested it.

Ok folks, I've found where these mounts are leaking.
It's the systemd-udevd service (at least in my testing).
If you restart systemd-udevd you should find that the container is removable without issue.

You can also see a related issue in rkt: https://github.com/rkt/rkt/issues/1922

systemctl restart systemd-udevd

rm -r -f 8d2bce5c13b882ea16ce012b639e646b31c95d722486bab056d0a39e974ad746
rm: cannot remove ‘8d2bce5c13b882ea16ce012b639e646b31c95d722486bab056d0a39e974ad746/merged’: Device or resource busy

Like I tried to hint, mileage may vary.
The (almost?) root cause is that the usage of MountFlags in systemd units causes systemd to create a new mount namespace for the service that is being started.
What is set in MountFlags changes how mounts are propagated from the host (or back to the host).

In the case of systemd-udevd, it is using MountFlags=slave, which means that any changes to mounts on the host will propagate to the systemd-udevd mount ns (where the service is running).

What should be happening is when an unmount occurs, that should propagate to systemd-udevd's mount ns... however for some reason either this propagation isn't happening or something in the mount ns is keeping the mount active, preventing removal even if the mount appears to be gone in the host's mount ns.

I'm using systemd-udevd as an example here as I can reproduce it 100% of the time specifically with systemd-udevd, and can mitigate it either by stopping the service or disabling the MountFlags in the unit file for that service (and thus it will live in the host mount ns).

There could be a myriad of things causing the resource to remain busy on your system, including how other containers are running and what they are mounting in.
For instance if you are mounting /var/lib/docker into a container on one of these old kernels it is likely going to cause the same problem.

Blaming this problem on the kernel of the most up-to-date release of a current operating system is rather disingenuous. Kernels are not written and tested to meet the specs of any particular app or service (or shouldn't be). Rather, it might be better to specify which operating systems, kernels, and configurations are necessary to have this working correctly. Such as the environment where docker was developed and tested without error.

Incidentally, I nearly solved this problem by inserting delays in my scripts between docker commands. It's ugly, but I haven't seen the problem in a while.
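
A retry loop is a slightly less ugly variant of the same idea (a sketch; the delay and retry count are arbitrary):

```
# Retry `docker rm` a few times, since the leaked mount often goes away
# on its own once the process holding it exits
rm_with_retry() {
    for _ in 1 2 3 4 5; do
        docker rm "$1" && return 0
        sleep 5
    done
    echo "giving up on $1" >&2
    return 1
}
```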

@wdch-nseeley To be fair, naming CentOS "current" is deceptive. It's designed to be old. Red Hat does not update the kernel version, but instead backports new features to the same kernel version. The last kernel version released with CentOS 7 is 3.10, released in 2013 and has end of life in October 2017.

It's natural that Red Hat messes up sometimes.

Docker is a product that relies on a particular set of kernel features. Not the other way around.

@wdch-nseeley No one is blaming the kernel, other than that we can't take advantage of newer kernel abilities (could be considered a bug fix) on centos/rhel until 7.4 comes out like we do on other distros with newer kernels. There's just a lot of moving parts to sort out, plus one can get this error from different places and for different reasons.

Now...
The real issue here seems to be the usage of MountFlags (in the systemd unit config) on system services to run in their own mount namespace and eat up mounts from Docker.
What is strange is that with MountFlags=slave, the changes to mounts on the root mount namespace (where systemd and docker are running by default) are supposed to propagate to the service's mount namespace... it's getting the new mount, but it's not getting the unmount request. I can actually even nsenter into the service's mount namespace and manually unmount the affected mountpoint with no issue and then call docker rm (or whatever) and it removes successfully... this issue with the unmount not propagating feels like a kernel bug, but I need to do some more tests and see if it's actually even really fixed in newer kernels at all or if we're working around it with our usage of MNT_DETACH on unmount.
I found that not using MountFlags at all seems to clear this issue up.

I really ran into this recently because while testing the new metrics plugins on rhel/centos I have found 100% of the time I would get a device or resource busy error on remove, even with absolutely nothing else running, and bare system services.
The interesting bit about metrics plugins is it creates a unix socket at /run/docker/metrics.sock on the host which is then bind-mounted into the plugin container's rootfs whereas other plugins don't really get any special mounts like this.
The solution for this was to mount --make-private /var/lib/docker/plugins which we merged yesterday.
It's obviously not a perfect solution since we've been doing this for a long time for container filesystem mounts, and yet somehow these mounts are still leaked on occasion... but in any case it fixed the immediate issue for metrics plugins failing 100% of the time on remove on these old kernels.
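
For readers following along, a sketch of inspecting and changing propagation on that path with plain util-linux tools (this only works once the path is itself a mount point, which the fix above arranges):

```
# A "shared:N" tag in the optional-fields column means mounts created under
# this mount point propagate to other namespaces
grep ' /var/lib/docker/plugins ' /proc/self/mountinfo
# mark it private so new mounts under it stop leaking elsewhere
sudo mount --make-private /var/lib/docker/plugins
```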

So, the workaround is to remove MountFlags from udevd?

sudo vim /usr/lib/systemd/system/systemd-udevd.service # remove MountFlags
sudo systemctl daemon-reload
sudo systemctl restart systemd-udevd

docker.service doesn't have MountFlags.

@Vanuan This fixed the issue on my very minimal installation, it may not be the only thing.

@Vanuan This did not fix the issue on my system.

I am on CentOS 7.3 with Docker 17.06.0 CE and faced this issue with a GitLab container.

I tried following:

docker rm $(docker ps -a -q)
Error response from daemon: driver "overlay" failed to remove root filesystem for 3f42a98df9a22c37cf18db35eb353f0ff90e0430aec6d6419706e3dd90a91c2d: remove /opt/docker-data/overlay/92b81e6f8c4dbfbedc1f99d349c1b9c7209be7f9d8a3602a00a5bb30707da638/merged: device or resource busy

So no luck. So I did the stuff suggested by @cognifloyd 👍

grep docker /proc/*/mountinfo
nsenter -m -t ${PROC_ID} /bin/bash
mount 
umount <mountpoint> (use the output from the previous mount command)
exit

solved my issue...

Hi

I had the same issue with one of my containers, and I remembered that I have this in my Splunk container:

volumes:
 - /apps/splunk-data/etc:/opt/splunk/etc
 - /apps/splunk-data/var:/opt/splunk/var
 - /var/lib/docker/containers:/host/containers:ro
 - /var/run/docker.sock:/var/run/docker.sock:ro

As you can see, I mount /var/lib/docker/containers, so I stopped the Splunk container, ran "docker rm -f" on the faulty container with success, and started up the Splunk one again.

We have the same issue with docker 17.06-ce.
Had to umount shm, but I don't know why this appears in the first place.

[trana@integ-storage-002 ~]$ sudo docker ps -a|grep ntp
111ff029e499        proto-pi-cm.ts-l2pf.cloud-omc.org:5000/thales/ntpserver:origin-feature-0.4   "/bin/sh -c /var/n..."    4 weeks ago         Dead                                                                                         integ_ntpclient_8

[trana@integ-storage-002 ~]$ sudo docker rm integ_ntpclient_8
ERROR: for integ_ntpclient_8  Error response from daemon: unable to remove filesystem for ...: remove /var/.../shm: device or resource busy

[trana@integ-storage-002 ~]$ mount |grep 111ff029e499
shm on /var/lib/docker/containers/111ff029e499984233a5c8c43f6c8e21471240be7721a35b8e15d8c6da7a757d/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /var/lib/docker/containers/111ff029e499984233a5c8c43f6c8e21471240be7721a35b8e15d8c6da7a757d/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)

# Unmount twice because it was mounted twice.
sudo umount /var/lib/docker/containers/111ff029e499984233a5c8c43f6c8e21471240be7721a35b8e15d8c6da7a757d/shm
sudo umount /var/lib/docker/containers/111ff029e499984233a5c8c43f6c8e21471240be7721a35b8e15d8c6da7a757d/shm

[trana@integ-storage-002 ~]$ docker rm integ_ntpclient_8

@antoinetran which kernel/os?

CentOS Linux release 7.3.1611 (Core) 
Name        : kernel
Arch        : x86_64
Version     : 3.10.0
Release     : 327.13.1.el7

in my case, I worked around it by trying what @cognifloyd mentioned above:

1. info:

````bash
[root@test_node_02 ~]# docker info
Server Version: 17.06.0-ce
Storage Driver: overlay
Backing Filesystem: extfs
Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
Kernel Version: 3.10.0-514.21.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.797GiB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
````

2. problem:

````
Error response from daemon: driver "overlay" failed to remove root filesystem for xxx: remove /var/lib/docker/overlay/xxx/merged: device or resource busy
````

3. workaround:

````bash
# try to remove "dead" containers
[root@test_node_02 ~]# docker rm -f $(docker ps -a --filter status=dead -q | head -n 1)
Error response from daemon: driver "overlay" failed to remove root filesystem for 808acab2716420275cdb135ab964071cfc33406a34481354127635d3a282fa31: remove /var/lib/docker/overlay/88440438ea95b47e7459049fd765b51282afee4ad974107b0bf96d08d9c7763e/merged: device or resource busy

# find the pid that still references the container in /proc/*/mountinfo
[root@test_node_02 ~]# grep -l --color $(docker ps -a --filter status=dead -q | head -n 1) /proc/*/mountinfo
/proc/7360/mountinfo

# whose pid is it?
[root@test_node_02 ~]# ps -f 7360
UID PID PPID C STIME TTY STAT TIME CMD
root 7360 7344 1 Aug16 ? Ssl 73:57 /usr/bin/cadvisor -logtostderr

# also, we can determine they are in different mount namespaces
[root@test_node_02 ~]# ls -l /proc/$(cat /var/run/docker.pid)/ns/mnt /proc/7360/ns/mnt
lrwxrwxrwx 1 root root 0 Aug 21 15:55 /proc/11460/ns/mnt -> mnt:[4026531840]
lrwxrwxrwx 1 root root 0 Aug 21 15:55 /proc/7360/ns/mnt -> mnt:[4026532279]

# try to restart cadvisor
[root@test_node_01 ~]# docker service ls | grep cadvisor
5f001c9293cf cadvisor global 3/3 google/cadvisor:latest
[root@test_node_01 ~]# docker service update --force cadvisor

# remove again
[root@test_node_02 ~]# docker rm -f $(docker ps -a --filter status=dead -q | head -n 1)
808acab27164
````

conclusion: cadvisor or other containers whose volumes contain '/var/lib/docker' or '/' will cause the problem.
workaround: find the container/service and restart it (a rough detection sketch is below).
how to fix it properly: unknown.
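A rough way to spot such containers (the inspect template and grep pattern are only a sketch, not an exhaustive or official check):

```bash
# list each running container's name and bind-mount sources,
# then flag the ones mounting / or /var/lib/docker (or a subpath of it)
for c in $(docker ps -q); do
  docker inspect --format '{{.Name}} {{range .Mounts}}{{.Source}} {{end}}' "$c"
done | grep -E ' (/|/var/lib/docker(/[^ ]*)?) '
```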

Some information that could be useful: I have two hardware servers with the same CentOS 7 and almost the same kernel versions, and the same storage drivers, but the bug still reproduces on one server and never reproduces on the other:

# reproduce here:
Server Version: 17.06.0-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.26.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.1GiB
Docker Root Dir: /home/_docker
Debug Mode (client): false
Debug Mode (server): false
Username: 6626070
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

but

# does not reproduce here
Server Version: 17.03.1-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.10.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.13 GiB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Just to add to my previous comment, I still have the issue and am always forced to reboot or umount the way I described... my docker instance info:

Containers: 2
 Running: 1
 Paused: 0
 Stopped: 1
Images: 4
Server Version: 17.06.0-ce
Storage Driver: overlay
 Backing Filesystem: xfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.26.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 12
Total Memory: 19.61GiB
Name: lgh-dev01
ID: OI4M:BVZK:YGCD:M7DS:TD7X:WO3Q:WFHQ:UECY:N5A6:NHSX:4THI:HE5T
Docker Root Dir: /opt/docker-data
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 26
 Goroutines: 31
 System Time: 2017-08-21T17:29:34.436543815+02:00
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

@dobryakov If I see properly, the only difference between those instances is the docker version?

@archenroot right, the main difference is in the docker versions. There is a small difference in the minor kernel version numbers (10.2.el7 vs 26.2.el7), but I think it doesn't matter.

And this is why I doubt it is a kernel issue...

@dobryakov if you are using docker rm -f you would not see the error on 17.03, but the error would still occur.

I tried to reinstall the older version of docker (17.03), but it failed on my servers due to dependency problems :( Does anybody know whether a manual kernel update would fix this issue?

@dobryakov read the message from @cpuguy83 - he suggested that the error actually occurs on 17.03 as well, but is not reported...

@cpuguy83 - are you saying that the error cannot be seen via dmesg or /var/log/messages either? Or is it just not propagated via the docker daemon at the service level? Not sure here... if some error occurs but is not visible, that is even worse :dagger:

@cpuguy83 how could I reproduce this issue on 17.03? Please help me find and see the error. My containers on 17.03 are stopping, exiting, restarting, and being removed (even without -f) without any problem.

@dobryakov
I think the issue would manifest as the container not being deleted. So to reproduce it you'd have to run docker ps immediately after docker rm -f and grep for the container you've tried to delete.
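Something like this (mycontainer is just a placeholder name):

```bash
docker rm -f mycontainer
docker ps -a | grep mycontainer   # if it still shows up, the removal silently failed on 17.03
```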

Well, it would be good to have a flag "ignore shm error". Until then we can grep the error and ignore it manually to mimic 17.03 behavior.

@Vanuan So is it a "real" error or just some "indicator" that is processed in the wrong way by docker? I thought it was really related to unmountable points (as it looks like on my system)...

@archenroot It's a real error. The implementation of docker rm -f prior to 17.06 ignored any errors.

I'm also seeing this issue quite often on some CentOS servers when doing:

docker pull someimage
docker stop somecontainer
docker rm somecontainer
docker run -v /some/vol:/data --name=somecontainer someimage

Only on the CentOS servers does that result in the following error. On my other servers running Debian, Ubuntu, CoreOS, etc. it never happened.

Error response from daemon: Driver overlay failed to remove root filesystem d0279a9ba87b127d2ff5926d08e2f5126d0c33aeca68845a36028851a00b053a: remove /var/lib/docker/overlay/ae721444d8a5ab1619f96d32bef83151931bf8635af6ce6e627f42c4ea6484c6/merged: device or resource busy
docker: Error response from daemon: Conflict. The container name "/typo3-dev" is already in use by container d0279a9ba87b127d2ff5926d08e2f5126d0c33aeca68845a36028851a00b053a. You have to remove (or rename) that container to be able to reuse that name..
See 'docker run --help'.
Error response from daemon: Container d0279a9ba87b127d2ff5926d08e2f5126d0c33aeca68845a36028851a00b053a is not running
Error response from daemon: Container d0279a9ba87b127d2ff5926d08e2f5126d0c33aeca68845a36028851a00b053a is not running

My environment:

Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 17.03.1-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.26.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.51 GiB
Name: bc25355
ID: 5X6A:7BKO:7DIA:TILX:MLKD:3J5M:7OHJ:ZDHQ:PW2S:J4J6:5GR4:ACPO
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Has switching the storage driver maybe helped anyone?

I have started seeing this on the Container Linux alpha channel, during the "docker build" process.
It only happens when building multiple containers in parallel.

Best regards / Med venlig hilsen
Tue S. Dissing


The best thing to do, if you are not using live-restore, is to set MountFlags=slave in the systemd unit for docker. If you don't need to bind-mount mounts that exist on the host into a container, you can use MountFlags=private to be a bit more isolated as well.
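For example, a minimal sketch using a systemd drop-in (the drop-in file name is arbitrary):

```bash
# add MountFlags=slave to docker.service via a drop-in, then reload and restart
sudo mkdir -p /etc/systemd/system/docker.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/docker.service.d/mount-flags.conf
[Service]
MountFlags=slave
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
```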

Looks like RHEL/CentOS 7.4 has a "detached mount" option:
https://bugzilla.redhat.com/show_bug.cgi?id=1441737
It is "0" by default. Does it mean we should set it to "1"? Or does a recent docker yum package has this option included?

The RHEL 7.4 kernel has introduced a new sysctl knob to control kernel behavior. This is called /proc/sys/fs/may_detach_mounts. This knob is set to value 0 by default. Container runtimes (docker and others) need the new behavior and want it to be set to 1.

So modify the runc package to drop a file, say /usr/lib/sysctl.d/99-docker.conf. The contents of this file can be, say, the following:

fs.may_detach_mounts=1
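In practice that boils down to something like this (a sketch, assuming a kernel that already exposes the knob, e.g. RHEL/CentOS >= 7.4):

```bash
# create the sysctl drop-in mentioned above and apply it without rebooting
echo 'fs.may_detach_mounts=1' | sudo tee /usr/lib/sysctl.d/99-docker.conf
sudo sysctl -w fs.may_detach_mounts=1   # apply immediately
cat /proc/sys/fs/may_detach_mounts      # should now print 1
```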

Does anyone with CentOS 7.4 have this problem? Is it fixed with CentOS 7.4 by default?

I don't think CentOS 7.4 exists yet.

+1

Did not have this problem before upgrading to CentOS 7.4. Going to try setting fs.may_detach_mounts=1.

fs.may_detach_mounts=1 should resolve this on 7.4

Hello,

The option fs.may_detach_mounts=1 fixed my problem in CentOS 7.4.

Regards

Working on a patch to make Docker set this param on startup.

#34886

So creating a file like /usr/lib/sysctl.d/99-docker.conf with content fs.may_detach_mounts=1 fixed your problem in CentOS 7.4?

If this corrects the issue, shouldn't this be the default configuration, with a comment stating the necessity for CentOS/RedHat 7.4?

@antoinetran I think the main reason the configuration even exists is so that RedHat does not break any existing users (beyond dockerd users), as it is a behavior change.

Upstream kernels support this since 3.15, just without the configuration.

I confirm I have the same experience: with fs.may_detach_mounts=0 the error is easy to reproduce, but I can't reproduce it once the sysctl is set to 1.

In fact, even RHEL sets it to 1 once you install their version of runc (which is not used by docker-ce).

Hi @kolyshkin,

Right, and is it a good solution? Can I trust it?

Regards

@cpuguy83 thanks for working on #34886. A few questions:

  1. What can I do to fix this in (CentOS) 7.4 _today_? (I realize I have to set fs.may_detach_mounts=1, but I'm a noob who doesn't know how to do that.) (Update - Answer: https://github.com/moby/moby/issues/34538#issuecomment-331533940)
  2. If I do the above, would I need to undo anything when this fix hits the repo?
  3. Where's the best place to watch so that I know that this fix has hit the Docker (edge) repo (so I can remove the workaround from my Docker host provisioning scripts)?

Hi, we found a workaround by restarting the ntpd daemon. After that, no containers are in 'removal in progress' or 'dead' state. Strange, but it worked for us.

Still having the same issue under CentOS 7.4
ERROR: for blubb driver "overlay" failed to remove root filesystem for 7ce02bff8873d4ae7c04481579e67b0a1ff4ffddbfd8b3af739bb87920b8ec43: remove /var/lib/docker/overlay/f54408dd5947eb3d3b6b9321f730d8c5ed6ef6a3a7b3308bcab5dbf549444194/merged: device or resource busy
One good thing: I can move /var/lib/docker/overlay/f54408dd5947eb3d3b6b9321f730d8c5ed6ef6a3a7b3308bcab5dbf549444194 out of the way and then I can run the command successfully.

@owhen have you set the kernel option fs.may_detach_mounts=1?

@Vanuan I created the file /usr/lib/sysctl.d/99-docker.conf and wrote in this file: fs.may_detach_mounts=1

Have you restarted your server after this, or at least made sure that the configuration was successfully reloaded?

I executed systemctl daemon-reload. That should reload the configuration right?

Not necessarily. Take a look at the contents of /proc/sys/fs/may_detach_mounts to find out the current value. If it's not correct, sysctl -p should reload the sysctl configuration.

/proc/sys/fs/may_detach_mounts does not exists on my system. If I run sysctl -p the property does not appear.
If I add fs.may_detach_mounts = 1 to /etc/sysctl.conf and run sysctl -p I get an error:
sysctl: cannot stat /proc/sys/fs/may_detach_mounts: No such file or directory

Are you sure you're booted into a CentOS 7.4 kernel?

cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)

Have you restarted your system? Please show us uname -a

What does uname -a say for the kernel version?

uname -a
Linux mail 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

That's a CentOS 7.3 kernel. You'll need to reboot into the 7.4 kernel.

The kernel in 7.4 should be the following:

3.10.0-693.2.2.el7.x86_64

Looks like you need to reboot into the CentOS 7.4 kernel; the contents of /usr/lib/sysctl.d/99-docker.conf should be fs.may_detach_mounts=1 (not fs.may_detach_mounts = 1), then run sysctl -p.

Did a restart. Now the server is running on the new kernel 3.10.0-693.2.2.el7.x86_64
Now, /proc/sys/fs/may_detach_mounts exists with value 1

I second @victorvarza 's comment. I had this issue, and notably had started the ntpd service while the containers were running. I was able to resolve the problem by stopping the ntpd service.

@davestimpert I've created a separate issue for ntpd: docker_ntpd_issue. Stopping the ntpd/chronyd daemon is not a solution, just a workaround.

It seems to be a very special-case workaround... Does not help in my environment.

oh, gosh, why are you doing this to me, docker?!

_tearing out one's hair while adding ExecStartPre=-/usr/bin/systemctl restart ntpd.service in puppet template for start of every single docker container_

[*** ~]$ docker info 
Containers: 6
 Running: 6
 Paused: 0
 Stopped: 0
Images: 11
Server Version: 17.06.2-ce
Storage Driver: devicemapper
 Pool Name: docker-thinpool
 Pool Blocksize: 524.3kB
 Base Device Size: 10.74GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 4.999GB
 Data Space Total: 20.4GB
 Data Space Available: 15.4GB
 Metadata Space Used: 1.585MB
 Metadata Space Total: 213.9MB
 Metadata Space Available: 212.3MB
 Thin Pool Minimum Free Space: 2.039GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.140-RHEL7 (2017-05-03)
Logging Driver: syslog
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-693.2.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.632GiB
Name: ***
ID: ***
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: true

It seems like there could be several different causes for this issue, but in my case, following @cognifloyd's comment revealed that I had (quite a lot of) leftover sleeping nginx processes on the host. In my setup nginx is being used as a proxy for various services running in the docker containers.

Stopping nginx, removing the containers and starting it again was the fastest way to get rid of them for me.

systemctl stop nginx
docker container prune
systemctl start nginx

"resource busy" means some process is using it. Restarting the OS can release the lock. Then you can remove them.

CentOS Linux release 7.4.1708 (Core)
3.10.0-693.5.2.el7.x86_64
17.06.2-ce

LOOKS LIKE IT IS FINALLY FIXED FOR RHEL/CENTOS !!!
https://access.redhat.com/articles/2938171
https://seven.centos.org/2017/08/centos-linux-7-1708-based-on-rhel-7-4-source-code/

In my case the docker volume was unexpectedly blocked by the auditd process.
I had to restart the instance to be able to remove the docker volume.

@fitz123 did you have to set the detach option in 7.4?

Looks like RHEL/CentOS 7.4 has a "detached mount" option:
https://bugzilla.redhat.com/show_bug.cgi?id=1441737
It is "0" by default. Does it mean we should set it to "1"? Or does a recent docker yum package has this option included?

The RHEL 7.4 kernel has introduced a new sysctl knob to control kernel behavior. This is called /proc/sys/fs/may_detach_mounts. This knob is set to value 0 by default. Container runtimes (docker and others) need the new behavior and want it to be set to 1.

So modify runc package to drop a file say /usr/lib/sysctl.d/99-docker.conf. Contents of this file can be say following.

fs.may_detach_mounts=1

@antoinetran Interesting.

Actually, it would be good if there were some sanity check of the system where the docker service is installed, and maybe on startup it could report some kind of OK, WARNING, or EXPECT_ISSUES :-))) type of message.

@archenroot Docker sets fs.may_detach_mounts=1 on startup... I want to say starting with 17.10, but this may have been backported, I can't remember.

Yes, it was backported to 17.09 https://github.com/docker/docker-ce/pull/247

@antoinetran no, I DON'T have to set the may_detach_mounts option, but I do have to remove _--live-restore_.

If I set may_detach_mounts with _--live-restore_, it still fails, though the error is different))

Sequence to reproduce is:

  1. echo 1 | sudo tee /proc/sys/fs/may_detach_mounts
  2. enable --live-restore
  3. restart docker daemon
  4. docker stop, docker rm container
  5. docker run container
  6. got binding error :(
/usr/bin/docker: Error response from daemon: driver failed programming external connectivity on endpoint container (42367*941dc): Bind for 0.0.0.0:8000 failed: port is already allocated.

_p.s. at least for_ 17.06.2-ce
_p.p.s._ but now it's at least possible to be kinda sure your container can be re-created :+1:
This is something :). Please keep moving!

My solution steps:

$ docker -v
Docker version 17.09.0-ce, build afdb6d4
$ docker rm 805c245dad45
Error response from daemon: driver "overlay" failed to remove root filesystem for 805c245dad451542b44bb1b58c60887fa98a64a61f2f0b8de32fa5b13ccc8ce4: remove /var/lib/docker/overlay/8f666b802f418f4a3dc4a6cafbefa79afc81491a5cb23da8084dd14e33afbea0/merged: device or resource busy
$ grep docker /proc/*/mountinfo
/proc/21163/mountinfo:137 107 253:0 /var/lib/docker/overlay /var/lib/docker/overlay rw,relatime shared:91 - xfs /dev/mapper/cl-root rw,attr2,inode64,noquota
/proc/21163/mountinfo:138 137 0:35 / /var/lib/docker/overlay/8f666b802f418f4a3dc4a6cafbefa79afc81491a5cb23da8084dd14e33afbea0/merged rw,relatime shared:92 - overlay overlay rw,lowerdir=/var/lib/docker/overlay/ad94365f2c83432c97dbcb91eba688d4f8158d01c48b8d6135843abd451d4c4c/root,upperdir=/var/lib/docker/overlay/8f666b802f418f4a3dc4a6cafbefa79afc81491a5cb23da8084dd14e33afbea0/upper,workdir=/var/lib/docker/overlay/8f666b802f418f4a3dc4a6cafbefa79afc81491a5cb23da8084dd14e33afbea0/work

$ ps -p 21163 -o comm=
httpd
$ systemctl stop httpd
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
805c245dad45 dperson/samba "samba.sh -u usr;1..." 2 days ago Dead sambaserver
$ docker rm 805c245dad45
805c245dad45
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$

@kakuilan Using nsenter to enter the 21163 mnt namespace and then umounting those merged paths also resolves this problem. There is no need to stop the 21163 process.

I met this issue on RHEL 7.2 with Docker 17.06.
When I run docker rm xxx it says:

Error response from daemon: driver "overlay2" failed to remove root filesystem for 22136564e833e518579ecc856408194614904dfa2b2adb10bd9e95d7fd75bf15: remove /var/lib/docker/overlay2/3cece7f31f5c379174679e185167c867c1ec393fbf80451c2b18f78a2720aa59/merged: device or resource busy

it says 3cece7f31f5c379174679e185167c867c1ec393fbf80451c2b18f78a2720aa59 is busy.
So I ran grep docker /proc/*/mountinfo | grep 3cece, and the result is:

/proc/1687/mountinfo:641 549 0:48 / /var/lib/docker/overlay2/3cece7f31f5c379174679e185167c867c1ec393fbf80451c2b18f78a2720aa59/merged rw,relatime shared:198 - overlay overlay rw,lowerdir=/var/lib/docker/overlay2/l/TDQYOJHG4PQI27TKLUKLUVQFE6:/var/lib/docker/overlay2/l/O23B5NBM2RDWRDWROCHGCAZ4M4:/var/lib/docker/overlay2/l/BQ53Z7BQVPMXTA65L44VEZIILI:/var/lib/docker/overlay2/l/46PJNPP32OGYORW42CTXTRUH2Z:/var/lib/docker/overlay2/l/RYJEGP5BUO5U6MFZPO7LCUMBA7:/var/lib/docker/overlay2/l/53I4CSDHFNLGPE3NGEUDYMWLWA:/var/lib/docker/overlay2/l/WO4EMCT272IRDF4S6B2HLTU3EP:/var/lib/docker/overlay2/l/RPASQ2PDBKFRPOOUVYVW5H5EIQ:/var/lib/docker/overlay2/l/QIYYCMFFNA3365DOPN3K6KSCUN:/var/lib/docker/overlay2/l/VLIQBSYVPJNSSAJXPU2OHKNLGD:/var/lib/docker/overlay2/l/M3FVK2JFV5PGPKMP2TIXJBC2KD,upperdir=/var/lib/docker/overlay2/3cece7f31f5c379174679e185167c867c1ec393fbf80451c2b18f78a2720aa59/diff,workdir=/var/lib/docker/overlay2/3cece7f31f5c379174679e185167c867c1ec393fbf80451c2b18f78a2720aa59/work
/proc/1696/mountinfo:727 692 0:48 / /var/lib/docker/overlay2/3cece7f31f5c379174679e185167c867c1ec393fbf80451c2b18f78a2720aa59/merged rw,relatime shared:256 - overlay overlay rw,lowerdir=/var/lib/docker/overlay2/l/TDQYOJHG4PQI27TKLUKLUVQFE6:/var/lib/docker/overlay2/l/O23B5NBM2RDWRDWROCHGCAZ4M4:/var/lib/docker/overlay2/l/BQ53Z7BQVPMXTA65L44VEZIILI:/var/lib/docker/overlay2/l/46PJNPP32OGYORW42CTXTRUH2Z:/var/lib/docker/overlay2/l/RYJEGP5BUO5U6MFZPO7LCUMBA7:/var/lib/docker/overlay2/l/53I4CSDHFNLGPE3NGEUDYMWLWA:/var/lib/docker/overlay2/l/WO4EMCT272IRDF4S6B2HLTU3EP:/var/lib/docker/overlay2/l/RPASQ2PDBKFRPOOUVYVW5H5EIQ:/var/lib/docker/overlay2/l/QIYYCMFFNA3365DOPN3K6KSCUN:/var/lib/docker/overlay2/l/VLIQBSYVPJNSSAJXPU2OHKNLGD:/var/lib/docker/overlay2/l/M3FVK2JFV5PGPKMP2TIXJBC2KD,upperdir=/var/lib/docker/overlay2/3cece7f31f5c379174679e185167c867c1ec393fbf80451c2b18f78a2720aa59/diff,workdir=/var/lib/docker/overlay2/3cece7f31f5c379174679e185167c867c1ec393fbf80451c2b18f78a2720aa59/work

So I got two processes, 1696 and 1687. I then ran ps -elf and grepped for 1696 and 1687, stopped the two processes, and could then run docker rm xxx successfully; afterwards I started the two processes again.
Hope this will help someone.

@imaemo it won't help as it is established that this is fixed in RHEL 7.4

This answer saved me:
$ grep 656cfd09aee399c8ae8c8d3e735fe48d70be6672773616e15579c8de18e2a3b3 /proc/*/mountinfo
then find the related pid and kill it

which outputs something like this, where the number after /proc/ is the pid:

/proc/10001/mountinfo:179...

https://stackoverflow.com/a/47965269/2803344

This may be related to a kernel bug; here is the CentOS-side bug [0]. There are several commits that fix this issue.

  • In CentOS7, this issue is fixed since kernel 3.10.0-693.2.2.el7.x86_64
  • If you are using an upstream kernel, please try to upgrade to kernel-3.18.0

[0] https://bugs.centos.org/view.php?id=10414&nbn=8

@jeffrey4l Yes, this has been mentioned several times in this thread. It's fixed in CentOS 7.4
It also requires /proc/sys/fs/may_detach_mounts to be set to 1 (done automatically by installing docker).

We have a series of patches coming in that should resolve this for all kernels:

  • [x] #36047
  • [x] #36055
  • [x] #36096
  • [x] #36107
  • [x] #35830

I'm not sure if any of these will land for 18.02, but 18.03 should be do-able, though not all of these are in yet.

@warmchang : please do not add +1 but use the dedicated smiley button to avoid spam.

Hey - I have this issue on Debian. Will the fixes mentioned by @cpuguy83 apply and resolve even though I'm not on CentOS?

@dylanrhysscott which kernel version is it?

@dylanrhysscott Yes, these would apply to all distros/kernels... although some kernels may experience Device or resource busy errors less frequently, or even never... technically the underlying issue still exists on those kernels and results in inodes which cannot be freed, generally until a reboot occurs.

+1

@SalamiArmy : please do not add +1 but use the dedicated smiley button to avoid spam.

I have only experienced this issue on RHEL 7 in AWS. The following steps always work:
1) Stop VM
2) Start VM
3) docker container rm [CONTAINER_ID]

You said RHEL 7, but did you read the part of this thread that says this is fixed in RedHat >= 7.4 and docker-ce >= 17.09?

@antoinetran , sorry, I should have been clearer. My system is as follows:

lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 7.3 (Maipo)
Release:    7.3
Codename:   Maipo
docker --version
Docker version 17.12.0-ce, build c97c6d6

This may have been fixed in RedHat >= 7.4 but it is still an issue in Docker version 17.12.0-ce

The issue is at the kernel level, so this is not exactly a docker issue. Maybe some kernel developers will try to backport it to 7.3. However, I think docker is still working on some kind of workaround so that it works on all kernels.

Please upgrade and try on 17.12.1
This should be resolved there.

I have encountered the problem
Error response from daemon: driver "overlay" failed to remove root filesystem for 35af9765643bc4608ffff5618b928e59dd6641299defdc77e2cb1bfa5d5e8370: remove /var/lib/docker/overlay/cfe5783bd0d02d7f1fe89f4193e63c9c2d5bdf4dc8b184356278e71e1dec95ce/upper/etc/*

So I executed the command:

chattr -i /var/lib/docker/overlay/cfe5783bd0d02d7f1fe89f4193e63c9c2d5bdf4dc8b184356278e71e1dec95ce/upper/etc/group

Removed it successfully
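For anyone hitting the same thing, a quick way to confirm the immutable bit is the culprit before clearing it (this is just a check, not an official fix; the path is from my error output):

```bash
# an 'i' among the flags means the file is immutable and cannot be removed until it is cleared
lsattr /var/lib/docker/overlay/cfe5783bd0d02d7f1fe89f4193e63c9c2d5bdf4dc8b184356278e71e1dec95ce/upper/etc/group
chattr -i /var/lib/docker/overlay/cfe5783bd0d02d7f1fe89f4193e63c9c2d5bdf4dc8b184356278e71e1dec95ce/upper/etc/group
```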

@dwq676 please tell us your environment (OS + kernel). What you did was a workaround. The correct solution is to upgrade to docker-ce >= 17.12.1 according to cpuguy83, and/or upgrade the kernel.

@antoinetran
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 178
Server Version: 17.06.1-ce
Storage Driver: overlay
Backing Filesystem: xfs
Supports d_type: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-514.26.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.623GiB
Name: localhost.novalocal
ID: G4IV:TUZU:UUTL:FUKE:TOFN:3QV3:LZ2L:HY3E:X32F:Q2MS:A2TM:XVJE
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false

WARNING: overlay: the backing xfs filesystem is formatted without d_type support, which leads to incorrect behavior.
Reformat the filesystem with ftype=1 to enable d_type support.
Running without d_type support will not be supported in future releases.
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Ok, so as I told you, you have an old docker version and you didn't answer me about your OS. The solution is the same, just read the last post.

This issue is fixed and the author should close it.

I'm going to close this as resolved in 17.12.1.

Thanks!

Just for the sake of other people reading this and asking the same question:

  1. this is fixed in docker-ce >= 17.09 AND an updated kernel (for the CentOS family, >= 7.4)
  2. this is fixed (to be confirmed) in docker-ce 17.12.1 on all kernels.

This answer saved me
grep docker /proc/*/mountinfo | grep 3cece

Thanks @imaemo
I am going to upgrade my centos and docker

Hi @antoinetran,

I'm running into the same issue with docker 1.13.1 and CentOS 7.4.

My current workaround is,

 # ps -eaf | grep docker
 # kill -9 docker_process
 # reboot vm
 # systemctl stop docker

Thanks in Advance.

Docker 1.13 is an old version and reached end of life in March last year. Current versions of Docker should have fixes for this (but make sure your kernel and distro are up to date as well)

@thaJeztah Can you point me to the url that has the end of life information? Docker 1.13.1 supports overlay2 by default, why would it be end of life?

@bamb00 https://docs.docker.com/install/#time-based-release-schedule

Time-based release schedule

Starting with Docker 17.03, Docker uses a time-based release schedule.

  • Docker CE Edge releases generally happen monthly.
  • Docker CE Stable releases generally happen quarterly, with patch releases as needed.

Updates, and patches

  • A given Docker CE Stable release receives patches and updates for one month after the next Docker CE Stable release.
  • A given Docker CE Edge release does not receive any patches or updates after a subsequent Docker CE Edge or Stable release.

@thaJeztah I thought this was a kernel issue and not a docker issue. I'm running CentOS 7.4 - 3.10.0-862.3.2.el7.x86_64.

It's a combination; on CentOS and RHEL, certain kernel features have been backported, but won't be enabled by default; current versions of docker take advantage of those features (in addition to many other improvements that prevent the issue)

Has anyone been able to verify that this does not occur in EL7.4 or up?

@xmj Haven't seen this issue since upgrade to 7.4 in November 2017. At the time was using Docker 17.06. Use 18.03 to be extra sure.

With docker 1.13.1 and CentOS 7.5, do I have to explicitly set /proc/sys/fs/may_detach_mounts to 1? I'm unable to upgrade docker from 1.13.1 for the time being.

I have docker >18 ... I have the same problem:
rm: cannot remove "/var/lib/docker/containers/b29f1c32d0fe007feb0ed0ff3c6005a4815af4a6359232e706865762cfe1df73/mounts/shm": Device or resource busy
rm: cannot remove "/var/lib/docker/overlay2/ddea08b3871e6d658e3591cc71d40db9bddd4f2ae7d1c9488ac768530ff162d8/merged": Device or resource busy
docker is stopped

@publicocean0 Did you try these commands?

   cat /proc/mounts |grep docker
   sudo umount /path

@publicocean0 Did you try these commands?

cat /proc/mounts |grep docker
sudo umount /path

...then clear your /etc/sysconfig/docker-storage file.

Hi @cpuguy83

I'm also getting the error in docker 1.12.6 running kernel 3.10.0-693.21.1.el7.x86_64

   failed: [172.21.56.145] (item=/var/lib/docker) => {"changed": false, "item": "/var/lib/docker", "msg": "rmtree failed: [Errno 16] Device or resource busy: '/var/lib/docker/devicemapper/mnt/31d464385880ecb0972b36040ce912d3018fc91ba2b4f1f4cbf730baad7fa99c'

Unfortunately, I cannot afford to upgrade out of 1.12.6 for the time being. Is there a workaround like upgrading the kernel and/or using 1.12.6-cs13?

Thanks in advance.

umount the docker partition/volume, remove the fedora version, and install straight from docker.


This needs to be resolved on a production server.

umount the partition/volume that docker_storage_setup created and you should be able to remove that docker folder.


It's racy, but you can set a PostStart command on the systemd unit to make sure to set the graphdriver root (e.g. /var/lib/docker/devicemapper) to "rshared"... mount --make-rshared <driver root>


@cpuguy83

You are referring to setting the 'mount --make-rshared /var/lib/docker/devicemapper' command in /etc/systemd/system/docker.service.d/

I wasn't sure if I need to create any arbitrary .conf file in /etc/systemd/system/docker.service.d to run the Poststart mount command.

I wasn't sure if I need to create any arbitrary .conf file in /etc/systemd/system/docker.service.d to run the Poststart mount command.

Yes, to make changes to a systemd unit, it's always recommended to create a "drop-in" ("override") file, and never modify the original unit-file; modifying the original file will cause it to not be updated if an updated version becomes available (i.e., when updating the version of docker you're running)
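A minimal sketch of such a drop-in, combining it with the mount --make-rshared suggestion above (the file name and the use of ExecStartPost= are illustrative, not an official recommendation):

```bash
# drop-in that remounts the devicemapper graphdriver root rshared after dockerd starts
sudo mkdir -p /etc/systemd/system/docker.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/docker.service.d/make-rshared.conf
[Service]
ExecStartPost=/usr/bin/mount --make-rshared /var/lib/docker/devicemapper
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
```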

@LiverWurst Can you explain what you mean by "remove the fedora version"? I'm running CentOS 7.3.

Thanks.

I had installed from the fedora repository with your basic dnf install docker. I then removed it via dnf remove docker and followed the instructions at docker.com/community-edition.


Restart the VM.. should work!

docker rm -f embedded-java
Error response from daemon: container 8b91b15cb7939c05fb16cd26d13ee67bd33ca04af3a574193cee95f21e27ad2b: driver "aufs" failed to remove root filesystem: could not remove diff path for id 4119de17501b169eb0e4901dae4bc68e388d92a92f371ee53db9b93ec6970b2d: lstat /var/snap/docker/common/var-lib-docker/aufs/diff/4119de17501b169eb0e4901dae4bc68e388d92a92f371ee53db9b93ec6970b2d-removing/home/yocto/build/tmp/work/x86_64-linux/coreutils-native/8.30-r0/build/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3
/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3/confdir3: file name too long

Solution (reboot does not help, only rm via sudo):

sudo su
rm -rf /var/snap/docker/common/var-lib-docker/aufs/diff/4119de17501b169eb0e4901dae4bc68e388d92a92f371ee53db9b93ec6970b2d-removing
docker rm -fv embedded-java

If you get an error like this:

Unable to remove filesystem: /var/lib/docker/container/11667ef16239.../

The solution is as follows (no need to execute service docker restart to restart docker):

# 1. find which process (pid) occupies the filesystem
$ find /proc/*/mounts  |xargs -n1 grep -l -E '^shm.*/docker/.*/11667ef16239' | cut -d"/" -f3
1302   # /proc/1302/mounts

# 2. kill this process
$ sudo kill -9 1302