Kubernetes: Pods stuck on terminating

Created on 2 Sep 2017  ·  181 comments  ·  Source: kubernetes/kubernetes

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
Pods stuck on terminating for a long time

What you expected to happen:
Pods get terminated

How to reproduce it (as minimally and precisely as possible):

  1. Run a deployment
  2. Delete it
  3. Pods are still terminating

Anything else we need to know?:
Kubernetes pods stuck as Terminating for a few hours after getting deleted.
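A minimal reproduction sketch of the steps above (the deployment name and image are placeholders; on a 1.7-era cluster kubectl run creates a Deployment by default, on newer clusters create the Deployment from a manifest instead):

kubectl run repro-nginx --image=nginx:1.12 --replicas=3
kubectl delete deployment repro-nginx
kubectl get pods -w   # affected pods remain in Terminating far beyond the 30s grace period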

Logs:
kubectl describe pod my-pod-3854038851-r1hc3

Name:               my-pod-3854038851-r1hc3
Namespace:          container-4-production
Node:               ip-172-16-30-204.ec2.internal/172.16.30.204
Start Time:         Fri, 01 Sep 2017 11:58:24 -0300
Labels:             pod-template-hash=3854038851
                release=stable
                run=my-pod-3
Annotations:            kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"container-4-production","name":"my-pod-3-3854038851","uid":"5816c...
                prometheus.io/scrape=true
Status:             Terminating (expires Fri, 01 Sep 2017 14:17:53 -0300)
Termination Grace Period:   30s
IP:
Created By:         ReplicaSet/my-pod-3-3854038851
Controlled By:          ReplicaSet/my-pod-3-3854038851
Init Containers:
  ensure-network:
    Container ID:   docker://guid-1
    Image:      XXXXX
    Image ID:       docker-pullable://repo/ensure-network@sha256:guid-0
    Port:       <none>
    State:      Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:      True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xxxxx (ro)
Containers:
  container-1:
    Container ID:   docker://container-id-guid-1
    Image:      XXXXX
    Image ID:       docker-pullable://repo/container-1@sha256:guid-2
    Port:       <none>
    State:      Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:      False
    Restart Count:  0
    Limits:
      cpu:  100m
      memory:   1G
    Requests:
      cpu:  100m
      memory:   1G
    Environment:
      XXXX
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xxxxx (ro)
  container-2:
    Container ID:   docker://container-id-guid-2
    Image:      alpine:3.4
    Image ID:       docker-pullable://alpine@sha256:alpine-container-id-1
    Port:       <none>
    Command:
      X
    State:      Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:      False
    Restart Count:  0
    Limits:
      cpu:  20m
      memory:   40M
    Requests:
      cpu:      10m
      memory:       20M
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xxxxx (ro)
  container-3:
    Container ID:   docker://container-id-guid-3
    Image:      XXXXX
    Image ID:       docker-pullable://repo/container-3@sha256:guid-3
    Port:       <none>
    State:      Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:      False
    Restart Count:  0
    Limits:
      cpu:  100m
      memory:   200M
    Requests:
      cpu:  100m
      memory:   100M
    Readiness:  exec [nc -zv localhost 80] delay=1s timeout=1s period=5s #success=1 #failure=3
    Environment:
      XXXX
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xxxxx (ro)
  container-4:
    Container ID:   docker://container-id-guid-4
    Image:      XXXX
    Image ID:       docker-pullable://repo/container-4@sha256:guid-4
    Port:       9102/TCP
    State:      Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:      False
    Restart Count:  0
    Limits:
      cpu:  600m
      memory:   1500M
    Requests:
      cpu:  600m
      memory:   1500M
    Readiness:  http-get http://:8080/healthy delay=1s timeout=1s period=10s #success=1 #failure=3
    Environment:
      XXXX
    Mounts:
      /app/config/external from volume-2 (ro)
      /data/volume-1 from volume-1 (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xxxxx (ro)
Conditions:
  Type      Status
  Initialized   True
  Ready     False
  PodScheduled  True
Volumes:
  volume-1:
    Type:   Secret (a volume populated by a Secret)
    SecretName: volume-1
    Optional:   false
  volume-2:
    Type:   ConfigMap (a volume populated by a ConfigMap)
    Name:   external
    Optional:   false
  default-token-xxxxx:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-xxxxx
    Optional:   false
QoS Class:  Burstable
Node-Selectors: <none>

sudo journalctl -u kubelet | grep "my-pod"

[...]
Sep 01 17:17:56 ip-172-16-30-204 kubelet[9619]: time="2017-09-01T17:17:56Z" level=info msg="Releasing address using workloadID" Workload=my-pod-3854038851-r1hc3
Sep 01 17:17:56 ip-172-16-30-204 kubelet[9619]: time="2017-09-01T17:17:56Z" level=info msg="Releasing all IPs with handle 'my-pod-3854038851-r1hc3'"
Sep 01 17:17:56 ip-172-16-30-204 kubelet[9619]: time="2017-09-01T17:17:56Z" level=warning msg="Asked to release address but it doesn't exist. Ignoring" Workload=my-pod-3854038851-r1hc3 workloadId=my-pod-3854038851-r1hc3
Sep 01 17:17:56 ip-172-16-30-204 kubelet[9619]: time="2017-09-01T17:17:56Z" level=info msg="Teardown processing complete." Workload=my-pod-3854038851-r1hc3 endpoint=<nil>
Sep 01 17:19:06 ip-172-16-30-204 kubelet[9619]: I0901 17:19:06.591946    9619 kubelet.go:1824] SyncLoop (DELETE, "api"):my-pod-3854038851(b8cf2ecd-8f25-11e7-ba86-0a27a44c875)"

sudo journalctl -u docker | grep "docker-id-for-my-pod"

Sep 01 17:17:55 ip-172-16-30-204 dockerd[9385]: time="2017-09-01T17:17:55.695834447Z" level=error msg="Handler for POST /v1.24/containers/docker-id-for-my-pod/stop returned error: Container docker-id-for-my-pod is already stopped"
Sep 01 17:17:56 ip-172-16-30-204 dockerd[9385]: time="2017-09-01T17:17:56.698913805Z" level=error msg="Handler for POST /v1.24/containers/docker-id-for-my-pod/stop returned error: Container docker-id-for-my-pod is already stopped"

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T15:13:53Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
    Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:21:54Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    AWS

  • OS (e.g. from /etc/os-release):
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:7"
    HOME_URL="https://www.centos.org/"
    BUG_REPORT_URL="https://bugs.centos.org/"

    CENTOS_MANTISBT_PROJECT="CentOS-7"
    CENTOS_MANTISBT_PROJECT_VERSION="7"
    REDHAT_SUPPORT_PRODUCT="centos"
    REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Kernel (e.g. uname -a):
    Linux ip-172-16-30-204 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16 17:03:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:
    Kops

  • Others:
    Docker version 1.12.6, build 78d1802

@kubernetes/sig-aws @kubernetes/sig-scheduling

kind/bug sig/node sig/storage

Most helpful comment

I have the same issue on Kubernetes 1.8.2 on IBM Cloud. After new pods are started the old pods are stuck in terminating.

kubectl version
Server Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.2-1+d150e4525193f1", GitCommit:"d150e4525193f1c79569c04efc14599d7deb5f3e", GitTreeState:"clean", BuildDate:"2017-10-27T08:15:17Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

I have used kubectl delete pod xxx --now as well as kubectl delete pod foo --grace-period=0 --force to no avail.

All 181 comments

@kubernetes/sig-aws @kubernetes/sig-scheduling

Usually volume and network cleanup consume more time in termination. Can you find out in which phase your pod is stuck? Volume cleanup, for example?

Usually volume and network cleanup consume more time in termination.

Correct. They are always suspect.

@igorleao You can try kubectl delete pod xxx --now as well.

Hi @resouer and @dixudx
I'm not sure. Looking at kubelet logs for a different pod with the same problem, I found:

Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: time="2017-09-02T15:31:57Z" level=info msg="Releasing address using workloadID" Workload=my-pod-969733955-rbxhn
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: time="2017-09-02T15:31:57Z" level=info msg="Releasing all IPs with handle 'my-pod-969733955-rbxhn'"
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: time="2017-09-02T15:31:57Z" level=warning msg="Asked to release address but it doesn't exist. Ignoring" Workload=my-pod-969733955-rbxhn workloadId=my-pod-969733955-rbxhn
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: time="2017-09-02T15:31:57Z" level=info msg="Teardown processing complete." Workload=my-pod-969733955-rbxhn endpoint=<nil>
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: I0902 15:31:57.496132    9620 qos_container_manager_linux.go:285] [ContainerManager]: Updated QoS cgroup configuration
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: I0902 15:31:57.968147    9620 reconciler.go:201] UnmountVolume operation started for volume "kubernetes.io/secret/GUID-default-token-wrlv3" (spec.Name: "default-token-wrlv3") from pod "GUID" (UID: "GUID").
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: I0902 15:31:57.968245    9620 reconciler.go:201] UnmountVolume operation started for volume "kubernetes.io/secret/GUID-token-key" (spec.Name: "token-key") from pod "GUID" (UID: "GUID").
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: E0902 15:31:57.968537    9620 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/secret/GUID-token-key\" (\"GUID\")" failed. No retries permitted until 2017-09-02 15:31:59.968508761 +0000 UTC (durationBeforeRetry 2s). Error: UnmountVolume.TearDown failed for volume "kubernetes.io/secret/GUID-token-key" (volume.spec.Name: "token-key") pod "GUID" (UID: "GUID") with: rename /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/token-key /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/wrapped_token-key.deleting~818780979: device or resource busy
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: E0902 15:31:57.968744    9620 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/secret/GUID-default-token-wrlv3\" (\"GUID\")" failed. No retries permitted until 2017-09-02 15:31:59.968719924 +0000 UTC (durationBeforeRetry 2s). Error: UnmountVolume.TearDown failed for volume "kubernetes.io/secret/GUID-default-token-wrlv3" (volume.spec.Name: "default-token-wrlv3") pod "GUID" (UID: "GUID") with: rename /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/default-token-wrlv3 /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/wrapped_default-token-wrlv3.deleting~940140790: device or resource busy
--
Sep 02 15:33:04 ip-172-16-30-208 kubelet[9620]: I0902 15:33:04.778742    9620 reconciler.go:363] Detached volume "kubernetes.io/secret/GUID-wrapped_default-token-wrlv3.deleting~940140790" (spec.Name: "wrapped_default-token-wrlv3.deleting~940140790") devicePath: ""
Sep 02 15:33:04 ip-172-16-30-208 kubelet[9620]: I0902 15:33:04.778753    9620 reconciler.go:363] Detached volume "kubernetes.io/secret/GUID-wrapped_token-key.deleting~850807831" (spec.Name: "wrapped_token-key.deleting~850807831") devicePath: ""
Sep 02 15:33:04 ip-172-16-30-208 kubelet[9620]: I0902 15:33:04.778764    9620 reconciler.go:363] Detached volume "kubernetes.io/secret/GUID-wrapped_token-key.deleting~413655961" (spec.Name: "wrapped_token-key.deleting~413655961") devicePath: ""
Sep 02 15:33:04 ip-172-16-30-208 kubelet[9620]: I0902 15:33:04.778774    9620 reconciler.go:363] Detached volume "kubernetes.io/secret/GUID-wrapped_token-key.deleting~818780979" (spec.Name: "wrapped_token-key.deleting~818780979") devicePath: ""
Sep 02 15:33:04 ip-172-16-30-208 kubelet[9620]: I0902 15:33:04.778784    9620 reconciler.go:363] Detached volume "kubernetes.io/secret/GUID-wrapped_token-key.deleting~348212189" (spec.Name: "wrapped_token-key.deleting~348212189") devicePath: ""
Sep 02 15:33:04 ip-172-16-30-208 kubelet[9620]: I0902 15:33:04.778796    9620 reconciler.go:363] Detached volume "kubernetes.io/secret/GUID-wrapped_token-key.deleting~848395852" (spec.Name: "wrapped_token-key.deleting~848395852") devicePath: ""
Sep 02 15:33:04 ip-172-16-30-208 kubelet[9620]: I0902 15:33:04.778808    9620 reconciler.go:363] Detached volume "kubernetes.io/secret/GUID-wrapped_default-token-wrlv3.deleting~610264100" (spec.Name: "wrapped_default-token-wrlv3.deleting~610264100") devicePath: ""
Sep 02 15:33:04 ip-172-16-30-208 kubelet[9620]: I0902 15:33:04.778820    9620 reconciler.go:363] Detached volume "kubernetes.io/secret/GUID-wrapped_token-key.deleting~960022821" (spec.Name: "wrapped_token-key.deleting~960022821") devicePath: ""
Sep 02 15:33:05 ip-172-16-30-208 kubelet[9620]: I0902 15:33:05.081380    9620 server.go:778] GET /stats/summary/: (37.027756ms) 200 [[Go-http-client/1.1] 10.0.46.202:54644]
Sep 02 15:33:05 ip-172-16-30-208 kubelet[9620]: I0902 15:33:05.185367    9620 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/GUID-calico-token-w8tzx" (spec.Name: "calico-token-w8tzx") pod "GUID" (UID: "GUID").
Sep 02 15:33:07 ip-172-16-30-208 kubelet[9620]: I0902 15:33:07.187953    9620 kubelet.go:1824] SyncLoop (DELETE, "api"): "my-pod-969733955-rbxhn_container-4-production(GUID)"
Sep 02 15:33:13 ip-172-16-30-208 kubelet[9620]: I0902 15:33:13.879940    9620 aws.go:937] Could not determine public DNS from AWS metadata.
Sep 02 15:33:20 ip-172-16-30-208 kubelet[9620]: I0902 15:33:20.736601    9620 server.go:778] GET /metrics: (53.063679ms) 200 [[Prometheus/1.7.1] 10.0.46.198:43576]
Sep 02 15:33:23 ip-172-16-30-208 kubelet[9620]: I0902 15:33:23.898078    9620 aws.go:937] Could not determine public DNS from AWS metadata.

As you can see, this cluster has Calico for CNI.
The following lines caught my attention:

Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: I0902 15:31:57.968245    9620 reconciler.go:201] UnmountVolume operation started for volume "kubernetes.io/secret/GUID-token-key" (spec.Name: "token-key") from pod "GUID" (UID: "GUID").
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: E0902 15:31:57.968537    9620 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/secret/GUID-token-key\" (\"GUID\")" failed. No retries permitted until 2017-09-02 15:31:59.968508761 +0000 UTC (durationBeforeRetry 2s). Error: UnmountVolume.TearDown failed for volume "kubernetes.io/secret/GUID-token-key" (volume.spec.Name: "token-key") pod "GUID" (UID: "GUID") with: rename /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/token-key /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/wrapped_token-key.deleting~818780979: device or resource busy
Sep 02 15:31:57 ip-172-16-30-208 kubelet[9620]: E0902 15:31:57.968744    9620 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/secret/GUID-default-token-wrlv3\" (\"GUID\")" failed. No retries permitted until 2017-09-02 15:31:59.968719924 +0000 UTC (durationBeforeRetry 2s). Error: UnmountVolume.TearDown failed for volume "kubernetes.io/secret/GUID-default-token-wrlv3" (volume.spec.Name: "default-token-wrlv3") pod "GUID" (UID: "GUID") with: rename 

Is there a better way to find out in which phase a pod is stuck?

kubectl delete pod xxx --now seems to work pretty well, but I'd really like to find the root cause and avoid manual intervention.

rename /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/token-key /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/wrapped_token-key.deleting~818780979: device or resource busy

Seems kubelet failed to tear down the volume because of the failed rename (device or resource busy).

@igorleao Is this reliably reproducible, or does it only happen occasionally? I've seen such errors before, just want to make sure.

@dixudx it happens several times a day on a certain cluster. Other clusters created with the same version of kops and kubernetes, in the same week, work just fine.

@igorleao As the log shows, the volume manager failed to remove the secret directory because the device is busy.
Could you please check whether the directory /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/token-key is still mounted or not? Thanks!
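A small sketch of that check (GUID is a placeholder for the pod UID from the log above):

# check whether the secret directory is still a mount point
mountpoint /var/lib/kubelet/pods/GUID/volumes/kubernetes.io~secret/token-key
# or list any remaining mounts under the pod's volume directory
grep /var/lib/kubelet/pods/GUID /proc/mounts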

@igorleao how do you run kubelet? In a container? If so, can you please post your systemd unit or docker config for kubelet?

We see similar behaviour. We run kubelet as a container, and the problem was partially mitigated by mounting /var/lib/kubelet as shared (by default docker mounts volumes as rslave). We still see similar issues, but less frequently. Currently I suspect that some other mounts should be done a different way (e.g. /var/lib/docker or /rootfs).
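A sketch of that mitigation, assuming kubelet runs in a Docker container with /var/lib/kubelet bind-mounted into it; making the host path a shared mount lets mounts created inside the kubelet container propagate back to the host:

# give /var/lib/kubelet its own mount entry, then mark it shared
sudo mount --bind /var/lib/kubelet /var/lib/kubelet
sudo mount --make-shared /var/lib/kubelet
findmnt -o TARGET,PROPAGATION /var/lib/kubelet   # should now report "shared"

The kubelet container then needs the volume mounted with the shared flag, e.g. -v /var/lib/kubelet:/var/lib/kubelet:rw,shared, as shown in the full example further down.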

@stormltf Can you please post your kubelet container configuration?

@stormltf you're running kubelet in a container and don't use the --containerized flag (which does some tricks with mounts). This basically means that all mounts that kubelet makes will be done in the container's mount namespace. The good thing is that they will be propagated back to the host machine's namespace (since you have /var/lib/kubelet as shared), but I'm not sure what happens when that namespace is removed (i.e. when the kubelet container is removed).

Can you please do the following for the stuck pods:

on the node where the pod is running:

  • docker exec -ti /kubelet /bin/bash -c "mount | grep STUCK_POD_UUID"
  • and the same on the node itself: mount | grep STUCK_POD_UUID.

Please also do the same for a freshly created pod. I expect to see some /var/lib/kubelet mounts (e.g. the default secret).

@stormltf did you restart kubelet after first two pods were created?

@stormltf You can try to make /var/lib/docker and /rootfs shared mountpoints (which I don't see in your docker inspect, but do see inside the container).

/sig storage

For some this might help. We are running kubelet in a docker container with the --containerized flag and were able to solve this issue by mounting /rootfs, /var/lib/docker and /var/lib/kubelet as shared mounts. The final mounts look like this:

      -v /:/rootfs:ro,shared \
      -v /sys:/sys:ro \
      -v /dev:/dev:rw \
      -v /var/log:/var/log:rw \
      -v /run/calico/:/run/calico/:rw \
      -v /run/docker/:/run/docker/:rw \
      -v /run/docker.sock:/run/docker.sock:rw \
      -v /usr/lib/os-release:/etc/os-release \
      -v /usr/share/ca-certificates/:/etc/ssl/certs \
      -v /var/lib/docker/:/var/lib/docker:rw,shared \
      -v /var/lib/kubelet/:/var/lib/kubelet:rw,shared \
      -v /etc/kubernetes/ssl/:/etc/kubernetes/ssl/ \
      -v /etc/kubernetes/config/:/etc/kubernetes/config/ \
      -v /etc/cni/net.d/:/etc/cni/net.d/ \
      -v /opt/cni/bin/:/opt/cni/bin/ \

Some more details: this does not properly solve the problem, as for every bind mount you'll get 3 mounts inside the kubelet container (2 of them parasites). But at least shared mounts allow them to be unmounted easily in one shot.

CoreOS does not have this problem, because it uses rkt and not docker for the kubelet container. In our case kubelet runs in Docker, and every mount inside the kubelet container gets propagated into /var/lib/docker/overlay/... and /rootfs; that's why we have two parasite mounts for every bind-mounted volume:

  • one from /rootfs in /rootfs/var/lib/kubelet/<mount>
  • one from /var/lib/docker in /var/lib/docker/overlay/.../rootfs/var/lib/kubelet/<mount>
-v /dev:/dev:rw 
-v /etc/cni:/etc/cni:ro 
-v /opt/cni:/opt/cni:ro 
-v /etc/ssl:/etc/ssl:ro 
-v /etc/resolv.conf:/etc/resolv.conf 
-v /etc/pki/tls:/etc/pki/tls:ro 
-v /etc/pki/ca-trust:/etc/pki/ca-trust:ro
-v /sys:/sys:ro 
-v /var/lib/docker:/var/lib/docker:rw 
-v /var/log:/var/log:rw
-v /var/lib/kubelet:/var/lib/kubelet:shared 
-v /var/lib/cni:/var/lib/cni:shared 
-v /var/run:/var/run:rw 
-v /www:/www:rw 
-v /etc/kubernetes:/etc/kubernetes:ro 
-v /etc/os-release:/etc/os-release:ro 
-v /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime:ro

I have the same issue with Kubernetes 1.8.1 on Azure: after a deployment is changed and new pods have been started, the old pods are stuck at terminating.

I have the same issue on Kubernetes 1.8.2 on IBM Cloud. After new pods are started the old pods are stuck in terminating.

kubectl version
Server Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.2-1+d150e4525193f1", GitCommit:"d150e4525193f1c79569c04efc14599d7deb5f3e", GitTreeState:"clean", BuildDate:"2017-10-27T08:15:17Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

I have used kubectl delete pod xxx --now as well as kubectl delete pod foo --grace-period=0 --force to no avail.

If the root cause is still the same (improperly propagated mounts), then this is a distribution-specific bug IMO.

Please describe how you run kubelet in IBM Cloud. A systemd unit? Does it have the --containerized flag?

It is run with the --containerized flag set to false.

systemctl status kubelet.service
kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2017-11-19 21:48:48 UTC; 4 days ago

--containerized flag: No

OK, I need more info, please see my comment above: https://github.com/kubernetes/kubernetes/issues/51835#issuecomment-333090349

Also, please show the contents of /lib/systemd/system/kubelet.service, and if there is anything about kubelet in /etc/systemd/system please share that too.

In particular, if kubelet runs in docker I want to see all the bind mounts (-v).

Today I encountered an issue that may be the same as the one described, where we had pods on one of our customer systems getting stuck in the terminating state for several days. We were also seeing the errors about "Error: UnmountVolume.TearDown failed for volume" with "device or resource busy" repeated for each of the stuck pods.

In our case, it appears to be an issue with docker on RHEL/Centos 7.4 based systems covered in this moby issue: https://github.com/moby/moby/issues/22260 and this moby PR: https://github.com/moby/moby/pull/34886/files

For us, once we set the sysctl option fs.may_detach_mounts=1, all our Terminating pods cleaned up within a couple of minutes.
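For reference, a sketch of applying that setting on an affected RHEL/CentOS 7.4 node (the file name under /etc/sysctl.d/ is arbitrary):

sudo sysctl -w fs.may_detach_mounts=1
echo 'fs.may_detach_mounts = 1' | sudo tee /etc/sysctl.d/99-may-detach-mounts.conf
sudo sysctl --system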

I'm also facing this problem: Pods got stuck in Terminating state on 1.8.3.

Relevant kubelet logs from the node:

Nov 28 22:48:51 <my-node> kubelet[1010]: I1128 22:48:51.616749    1010 reconciler.go:186] operationExecutor.UnmountVolume started for volume "nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw" (UniqueName: "kubernetes.io/nfs/58dc413c-d4d1-11e7-870d-3c970e298d91-nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw") pod "58dc413c-d4d1-11e7-870d-3c970e298d91" (UID: "58dc413c-d4d1-11e7-870d-3c970e298d91")
Nov 28 22:48:51 <my-node> kubelet[1010]: W1128 22:48:51.616762    1010 util.go:112] Warning: "/var/lib/kubelet/pods/58dc413c-d4d1-11e7-870d-3c970e298d91/volumes/kubernetes.io~nfs/nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw" is not a mountpoint, deleting
Nov 28 22:48:51 <my-node> kubelet[1010]: E1128 22:48:51.616828    1010 nestedpendingoperations.go:264] Operation for "\"kubernetes.io/nfs/58dc413c-d4d1-11e7-870d-3c970e298d91-nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw\" (\"58dc413c-d4d1-11e7-870d-3c970e298d91\")" failed. No retries permitted until 2017-11-28 22:48:52.616806562 -0800 PST (durationBeforeRetry 1s). Error: UnmountVolume.TearDown failed for volume "nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw" (UniqueName: "kubernetes.io/nfs/58dc413c-d4d1-11e7-870d-3c970e298d91-nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw") pod "58dc413c-d4d1-11e7-870d-3c970e298d91" (UID: "58dc413c-d4d1-11e7-870d-3c970e298d91") : remove /var/lib/kubelet/pods/58dc413c-d4d1-11e7-870d-3c970e298d91/volumes/kubernetes.io~nfs/nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw: directory not empty
Nov 28 22:48:51 <my-node> kubelet[1010]: W1128 22:48:51.673774    1010 docker_sandbox.go:343] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "<pod>": CNI failed to retrieve network namespace path: Cannot find network namespace for the terminated container "f58ab11527aef5133bdb320349fe14fd94211aa0d35a1da006aa003a78ce0653"

Kubelet is running as a systemd unit (not in a container) on Ubuntu 16.04.
As you can see, there was a mount to an NFS server, and kubelet tried to delete the mount directory because it considered the directory to be unmounted.

Volumes spec from the pod:

volumes:
  - name: nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw
    nfs:
      path: /<path>
      server: <IP>
  - name: default-token-rzqtt
    secret:
      defaultMode: 420
      secretName: default-token-rzqtt

UPD: I faced this problem before as well on 1.6.6

Experiencing the same on Azure..

NAME                        READY     STATUS        RESTARTS   AGE       IP             NODE
busybox2-7db6d5d795-fl6h9   0/1       Terminating   25         1d        10.200.1.136   worker-1
busybox3-69d4f5b66c-2lcs6   0/1       Terminating   26         1d        <none>         worker-2
busybox7-797cc644bc-n5sv2   0/1       Terminating   26         1d        <none>         worker-2
busybox8-c8f95d979-8lk27    0/1       Terminating   25         1d        10.200.1.137   worker-1
nginx-56ccc998dd-hvpng      0/1       Terminating   0          2h        <none>         worker-1
nginx-56ccc998dd-nnsvj      0/1       Terminating   0          2h        <none>         worker-2
nginx-56ccc998dd-rsrvq      0/1       Terminating   0          2h        <none>         worker-1

kubectl version

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"6e937839ac04a38cac63e6a7a306c5d035fe7b0a", GitTreeState:"clean", BuildDate:"2017-09-28T22:57:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"6e937839ac04a38cac63e6a7a306c5d035fe7b0a", GitTreeState:"clean", BuildDate:"2017-09-28T22:46:41Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

kubectl describe pod nginx-56ccc998dd-nnsvj

Name:                      nginx-56ccc998dd-nnsvj
Namespace:                 default
Node:                      worker-2/10.240.0.22
Start Time:                Wed, 29 Nov 2017 13:33:39 +0400
Labels:                    pod-template-hash=1277755488
                           run=nginx
Annotations:               kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"nginx-56ccc998dd","uid":"614f71db-d4e8-11e7-9c45-000d3a25e3c0","...
Status:                    Terminating (expires Wed, 29 Nov 2017 15:13:44 +0400)
Termination Grace Period:  30s
IP:
Created By:                ReplicaSet/nginx-56ccc998dd
Controlled By:             ReplicaSet/nginx-56ccc998dd
Containers:
  nginx:
    Container ID:   containerd://d00709dfb00ed5ac99dcd092978e44fc018f44cca5229307c37d11c1a4fe3f07
    Image:          nginx:1.12
    Image ID:       docker.io/library/nginx@sha256:5269659b61c4f19a3528a9c22f9fa8f4003e186d6cb528d21e411578d1e16bdb
    Port:           <none>
    State:          Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jm7h5 (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  default-token-jm7h5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jm7h5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type    Reason   Age   From               Message
  ----    ------   ----  ----               -------
  Normal  Killing  41m   kubelet, worker-2  Killing container with id containerd://nginx:Need to kill Pod

sudo journalctl -u kubelet | grep "nginx-56ccc998dd-nnsvj"

Nov 29 09:33:39 worker-2 kubelet[64794]: I1129 09:33:39.124779   64794 kubelet.go:1837] SyncLoop (ADD, "api"): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)"
Nov 29 09:33:39 worker-2 kubelet[64794]: I1129 09:33:39.160444   64794 reconciler.go:212] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-jm7h5" (UniqueName: "kubernetes.io/secret/6171e2a7-d4e8-11e7-9c45-000d3a25e3c0-default-token-jm7h5") pod "nginx-56ccc998dd-nnsvj" (UID: "6171e2a7-d4e8-11e7-9c45-000d3a25e3c0")
Nov 29 09:33:39 worker-2 kubelet[64794]: I1129 09:33:39.261128   64794 reconciler.go:257] operationExecutor.MountVolume started for volume "default-token-jm7h5" (UniqueName: "kubernetes.io/secret/6171e2a7-d4e8-11e7-9c45-000d3a25e3c0-default-token-jm7h5") pod "nginx-56ccc998dd-nnsvj" (UID: "6171e2a7-d4e8-11e7-9c45-000d3a25e3c0")
Nov 29 09:33:39 worker-2 kubelet[64794]: I1129 09:33:39.286574   64794 operation_generator.go:484] MountVolume.SetUp succeeded for volume "default-token-jm7h5" (UniqueName: "kubernetes.io/secret/6171e2a7-d4e8-11e7-9c45-000d3a25e3c0-default-token-jm7h5") pod "nginx-56ccc998dd-nnsvj" (UID: "6171e2a7-d4e8-11e7-9c45-000d3a25e3c0")
Nov 29 09:33:39 worker-2 kubelet[64794]: I1129 09:33:39.431485   64794 kuberuntime_manager.go:370] No sandbox for pod "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)" can be found. Need to start a new one
Nov 29 09:33:42 worker-2 kubelet[64794]: I1129 09:33:42.449592   64794 kubelet.go:1871] SyncLoop (PLEG): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)", event: &pleg.PodLifecycleEvent{ID:"6171e2a7-d4e8-11e7-9c45-000d3a25e3c0", Type:"ContainerStarted", Data:"0f539a84b96814651bb199e91f71157bc90c6e0c26340001c3f1c9f7bd9165af"}
Nov 29 09:33:47 worker-2 kubelet[64794]: I1129 09:33:47.637988   64794 kubelet.go:1871] SyncLoop (PLEG): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)", event: &pleg.PodLifecycleEvent{ID:"6171e2a7-d4e8-11e7-9c45-000d3a25e3c0", Type:"ContainerStarted", Data:"d00709dfb00ed5ac99dcd092978e44fc018f44cca5229307c37d11c1a4fe3f07"}
Nov 29 11:13:14 worker-2 kubelet[64794]: I1129 11:13:14.468137   64794 kubelet.go:1853] SyncLoop (DELETE, "api"): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)"
Nov 29 11:13:14 worker-2 kubelet[64794]: E1129 11:13:14.711891   64794 kuberuntime_manager.go:840] PodSandboxStatus of sandbox "0f539a84b96814651bb199e91f71157bc90c6e0c26340001c3f1c9f7bd9165af" for pod "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)" error: rpc error: code = Unknown desc = failed to get task status for sandbox container "0f539a84b96814651bb199e91f71157bc90c6e0c26340001c3f1c9f7bd9165af": process id 0f539a84b96814651bb199e91f71157bc90c6e0c26340001c3f1c9f7bd9165af not found: not found
Nov 29 11:13:14 worker-2 kubelet[64794]: E1129 11:13:14.711933   64794 generic.go:241] PLEG: Ignoring events for pod nginx-56ccc998dd-nnsvj/default: rpc error: code = Unknown desc = failed to get task status for sandbox container "0f539a84b96814651bb199e91f71157bc90c6e0c26340001c3f1c9f7bd9165af": process id 0f539a84b96814651bb199e91f71157bc90c6e0c26340001c3f1c9f7bd9165af not found: not found
Nov 29 11:13:15 worker-2 kubelet[64794]: I1129 11:13:15.788179   64794 kubelet.go:1871] SyncLoop (PLEG): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)", event: &pleg.PodLifecycleEvent{ID:"6171e2a7-d4e8-11e7-9c45-000d3a25e3c0", Type:"ContainerDied", Data:"d00709dfb00ed5ac99dcd092978e44fc018f44cca5229307c37d11c1a4fe3f07"}
Nov 29 11:13:15 worker-2 kubelet[64794]: I1129 11:13:15.788221   64794 kubelet.go:1871] SyncLoop (PLEG): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)", event: &pleg.PodLifecycleEvent{ID:"6171e2a7-d4e8-11e7-9c45-000d3a25e3c0", Type:"ContainerDied", Data:"0f539a84b96814651bb199e91f71157bc90c6e0c26340001c3f1c9f7bd9165af"}
Nov 29 11:46:45 worker-2 kubelet[42337]: I1129 11:46:45.384411   42337 kubelet.go:1837] SyncLoop (ADD, "api"): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0), kubernetes-dashboard-7486b894c6-2xmd5_kube-system(e55ca22c-d416-11e7-9c45-000d3a25e3c0), busybox3-69d4f5b66c-2lcs6_default(adb05024-d412-11e7-9c45-000d3a25e3c0), kube-dns-7797cb8758-zblzt_kube-system(e925cbec-d40b-11e7-9c45-000d3a25e3c0), busybox7-797cc644bc-n5sv2_default(b7135a8f-d412-11e7-9c45-000d3a25e3c0)"
Nov 29 11:46:45 worker-2 kubelet[42337]: I1129 11:46:45.387169   42337 kubelet.go:1871] SyncLoop (PLEG): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)", event: &pleg.PodLifecycleEvent{ID:"6171e2a7-d4e8-11e7-9c45-000d3a25e3c0", Type:"ContainerDied", Data:"d00709dfb00ed5ac99dcd092978e44fc018f44cca5229307c37d11c1a4fe3f07"}
Nov 29 11:46:45 worker-2 kubelet[42337]: I1129 11:46:45.387245   42337 kubelet.go:1871] SyncLoop (PLEG): "nginx-56ccc998dd-nnsvj_default(6171e2a7-d4e8-11e7-9c45-000d3a25e3c0)", event: &pleg.PodLifecycleEvent{ID:"6171e2a7-d4e8-11e7-9c45-000d3a25e3c0", Type:"ContainerDied", Data:"0f539a84b96814651bb199e91f71157bc90c6e0c26340001c3f1c9f7bd9165af"}

cat /etc/systemd/system/kubelet.service

[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=cri-containerd.service
Requires=cri-containerd.service

[Service]
ExecStart=/usr/local/bin/kubelet \
  --allow-privileged=true \
  --anonymous-auth=false \
  --authorization-mode=Webhook \
  --client-ca-file=/var/lib/kubernetes/ca.pem \
  --cluster-dns=10.32.0.10 \
  --cluster-domain=cluster.local \
  --container-runtime=remote \
  --container-runtime-endpoint=unix:///var/run/cri-containerd.sock \
  --image-pull-progress-deadline=2m \
  --kubeconfig=/var/lib/kubelet/kubeconfig \
  --network-plugin=cni \
  --pod-cidr=10.200.2.0/24 \
  --register-node=true \
  --require-kubeconfig \
  --runtime-request-timeout=15m \
  --tls-cert-file=/var/lib/kubelet/worker-2.pem \
  --tls-private-key-file=/var/lib/kubelet/worker-2-key.pem \
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Looks like there are two different bugs related to this issue. We have both on our 1.8.3 cluster.

  1. https://github.com/moby/moby/issues/31768. It's a docker bug. Reproducible on docker-ce=17.09.0~ce-0~ubuntu.
  2. The second is more interesting and maybe related to some race condition inside kubelet.
    We have a lot of pods that use an NFS persistent volume with a subPath specified in the container mounts; somehow some of them get stuck in a terminating state after deleting the deployments. And there are a lot of messages like this in the syslog:
 Error: UnmountVolume.TearDown failed for volume "nfs-test" (UniqueName: "kubernetes.io/nfs/39dada78-d9cc-11e7-870d-3c970e298d91-nfs-test") pod "39dada78-d9cc-11e7-870d-3c970e298d91" (UID: "39dada78-d9cc-11e7-870d-3c970e298d91") : remove /var/lib/kubelet/pods/39dada78-d9cc-11e7-870d-3c970e298d91/volumes/kubernetes.io~nfs/nfs-test: directory not empty

And it's true, the directory is not empty: it's unmounted but still contains our "subpath" directory!
One possible explanation of this behavior (a quick verification sketch follows the list):

  1. P1: Start creating or syncing the pod.
  2. P1: Signal the volume manager to make mounts/remounts.
  3. P1: Wait for the mount to complete.
  4. P1: Receive the mount-success signal (actually it just checks that all volumes are mounted).
  5. Somehow the volume becomes unmounted. Maybe another deletion process unmounts it, or some OS bug, or some garbage-collector action.
  6. P1: Continue creating the container and create a subdirectory in the mount point (which is already unmounted).
  7. After all the previous steps the pod can't be deleted, because the mount directory isn't empty.
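A quick sketch of verifying that end state on the node (the pod UID and volume name are taken from the log message above and are placeholders for your own):

POD_UID=39dada78-d9cc-11e7-870d-3c970e298d91
VOL=/var/lib/kubelet/pods/$POD_UID/volumes/kubernetes.io~nfs/nfs-test
mountpoint "$VOL" || echo "not a mountpoint (already unmounted)"
ls -la "$VOL"   # a leftover subPath directory here makes kubelet's cleanup fail with "directory not empty"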

More logs:

Dec  5 15:57:08 ASRock kubelet[2941]: I1205 15:57:08.333877    2941 reconciler.go:212] operationExecutor.VerifyControllerAttachedVolume started for volume "nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw" (UniqueName: "kubernetes.io/nfs/005b4bb9-da18-11e7-870d-3c970e298d91-nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw") pod "test-df5d868fc-sclj5" (UID: "005b4bb9-da18-11e7-870d-3c970e298d91")
Dec  5 15:57:08 ASRock systemd[1]: Started Kubernetes transient mount for /var/lib/kubelet/pods/005b4bb9-da18-11e7-870d-3c970e298d91/volumes/kubernetes.io~nfs/nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw.
Dec  5 15:57:12 ASRock kubelet[2941]: I1205 15:57:12.266404    2941 reconciler.go:186] operationExecutor.UnmountVolume started for volume "nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw" (UniqueName: "kubernetes.io/nfs/005b4bb9-da18-11e7-870d-3c970e298d91-nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw") pod "005b4bb9-da18-11e7-870d-3c970e298d91" (UID: "005b4bb9-da18-11e7-870d-3c970e298d91")
Dec  5 15:57:12 ASRock kubelet[2941]: E1205 15:57:12.387179    2941 nestedpendingoperations.go:264] Operation for "\"kubernetes.io/nfs/005b4bb9-da18-11e7-870d-3c970e298d91-nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw\" (\"005b4bb9-da18-11e7-870d-3c970e298d91\")" failed. No retries permitted until 2017-12-05 15:57:12.887062059 -0800 PST (durationBeforeRetry 500ms). Error: UnmountVolume.TearDown failed for volume "nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw" (UniqueName: "kubernetes.io/nfs/005b4bb9-da18-11e7-870d-3c970e298d91-nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw") pod "005b4bb9-da18-11e7-870d-3c970e298d91" (UID: "005b4bb9-da18-11e7-870d-3c970e298d91") : remove /var/lib/kubelet/pods/005b4bb9-da18-11e7-870d-3c970e298d91/volumes/kubernetes.io~nfs/nfs-mtkylje2oc4xlju1ls9rdwjlcmxhyi1ydw: directory not empty

Somehow some cleanup process ((dswp *desiredStateOfWorldPopulator) findAndRemoveDeletedPods()) starts unmounting volumes while the pod is in the initialization state:

Dec  6 14:40:20 ASRock kubelet[15875]: I1206 14:40:20.620655   15875 kubelet_pods.go:886] Pod "test-84cd5ff8dc-kpv7b_4281-kuberlab-test(6e99a8df-dad6-11e7-b35c-3c970e298d91)" is terminated, but some volumes have not been cleaned up
Dec  6 14:40:20 ASRock kubelet[15875]: I1206 14:40:20.686449   15875 kubelet_pods.go:1730] Orphaned pod "6e99a8df-dad6-11e7-b35c-3c970e298d91" found, but volumes not yet removed
Dec  6 14:40:20 ASRock kubelet[15875]: I1206 14:40:20.790719   15875 kuberuntime_container.go:100] Generating ref for container test: &v1.ObjectReference{Kind:"Pod", Namespace:"4281-kuberlab-test", Name:"test-84cd5ff8dc-kpv7b", UID:"6e99a8df-dad6-11e7-b35c-3c970e298d91", APIVersion:"v1", ResourceVersion:"2639758", FieldPath:"spec.containers{test}"}
Dec  6 14:40:20 ASRock kubelet[15875]: I1206 14:40:20.796643   15875 docker_service.go:407] Setting cgroup parent to: "/kubepods/burstable/pod6e99a8df-dad6-11e7-b35c-3c970e298d91"

Pod initialization and deletion execute at the same time.
To reproduce the bug, start and immediately delete/update about 10 deployments (tested on a single minion); it also helps if your mount operation is not very fast.
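A rough reproduction sketch under those assumptions (nfs-deployment-template.yaml is a hypothetical manifest with an NFS volume and a subPath container mount, using NAME as a placeholder):

for i in $(seq 1 10); do
  sed "s/NAME/repro-$i/" nfs-deployment-template.yaml | kubectl apply -f -
done
for i in $(seq 1 10); do
  sed "s/NAME/repro-$i/" nfs-deployment-template.yaml | kubectl delete -f - &
done
wait
kubectl get pods | grep Terminating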

Affected by the same bug on GKE. Are there any known workarounds for this issue? Using --now does not work.

I have a fix for this bug, but I am not sure that it will be merged by the Kubernetes team.

@dreyk Could you please provide more details about what you discovered for this bug and what your fix is, so the storage team can take a look? Thanks!

@gm42 I was able to manually work around this issue on GKE with the following steps (see the consolidated sketch after the list):

  1. SSH into the node the stuck pod was scheduled on
  2. Running docker ps | grep {pod name} to get the Docker Container ID
  3. Running docker rm -f {container id}
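Consolidated, the workaround looks roughly like this (NODE, POD_NAME and CONTAINER_ID are placeholders):

gcloud compute ssh NODE
docker ps | grep POD_NAME      # note the container ID of the stuck pod's container
docker rm -f CONTAINER_ID      # force-remove it; the pod object should then go away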

On GKE, upgrading nodes helped instantly.

Have the same bug on my local cluster set up using kubeadm.

docker ps | grep {pod name} on the node shows nothing, and the pod is stuck in the terminating state. I currently have two pods in this state.

What can I do to forcefully delete the pod? Or maybe change the name of the pod? I cannot spin up another pod under the same name. Thanks!

I have found the reason in my 1.7.2 cluster:
another monitoring program mounts the root path /,
and the root path contains /var/lib/kubelet/pods/ddc66e10-0711-11e8-b905-6c92bf70b164/volumes/kubernetes.io~secret/default-token-bnttf,
so when kubelet deletes the pod it can't release the volume; the message is:
device or resource busy

Steps:
1) sudo journalctl -u kubelet
This helped me find the error message.
2) sudo docker inspect
Find the io.kubernetes.pod.uid": "ddc66e10-0711-11e8-b905-6c92bf70b164"
and
HostConfig-->Binds--> "/var/lib/kubelet/pods/ddc66e10-0711-11e8-b905-6c92bf70b164/volumes/kubernetes.io~secret/default-token-bnttf:/var/run/secrets/kubernetes.io/serviceaccount:ro"

3) grep -l ddc66e10-0711-11e8-b905-6c92bf70b164 /proc/*/mountinfo

/proc/90225/mountinfo
4) ps aux | grep 90225
root 90225 1.3 0.0 2837164 42580 ? Ssl Feb01 72:40 ./monitor_program
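A consolidated sketch of the grep and ps steps above: find every host process that still holds a mount under the stuck pod's volume directory (the UID is a placeholder):

POD_UID=ddc66e10-0711-11e8-b905-6c92bf70b164
for pid in $(grep -l "$POD_UID" /proc/*/mountinfo | cut -d/ -f3); do
  ps -p "$pid" -o pid,cmd
done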

Have the same bug on my 1.7.2

operationExecutor.UnmountVolume started for volume "default-token-bnttf" (UniqueName: "kubernetes.io/secret/ddc66e10-0711-11e8-b905-6c92bf70b164-default-token-bnttf") pod "ddc66e10-0711-11e8-b905-6c92bf70b164" kubelet[94382]: E0205 11:35:50.509169 94382 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/secret/ddc66e10-0711-11e8-b905-6c92bf70b164-default-token-bnttf\" (\"ddc66e10-0711-11e8-b905-6c92bf70b164\")" failed. No retries permitted until 2018-02-05 11:37:52.509148953 +0800 CST (durationBeforeRetry 2m2s). Error: UnmountVolume.TearDown failed for volume "default-token-bnttf" (UniqueName: "kubernetes.io/secret/ddc66e10-0711-11e8-b905-6c92bf70b164-default-token-bnttf") pod "ddc66e10-0711-11e8-b905-6c92bf70b164" (UID: "ddc66e10-0711-11e8-b905-6c92bf70b164") : remove /var/lib/kubelet/pods/ddc66e10-0711-11e8-b905-6c92bf70b164/volumes/kubernetes.io~secret/default-token-bnttf: device or resource busy

Restarting the docker service releases the lock and the pods get removed within a few minutes. This is a bug. Using docker 17.03.

Same issue here on Azure, Kube 1.8.7

Happened to us a few minutes ago on 1.8.9 too - is anybody looking into resolving this? Restarting docker helps, but it's a bit ridiculous.

This has been happening to me a lot on the latest 1.9.4 release on GKE. Been doing this for now:

kubectl delete pod NAME --grace-period=0 --force

Same problem here on GKE 1.9.4-gke.1; it seems to be related to volume mounts.
It happens every time with filebeat set up as described here:
https://github.com/elastic/beats/tree/master/deploy/kubernetes/filebeat

Kubelet log shows this:

Mar 23 19:44:16 gke-testing-c2m4-1-97b57429-40jp kubelet[1361]: I0323 19:44:16.380949    1361 reconciler.go:191] operationExecutor.UnmountVolume started for volume "config" (UniqueName: "kubernetes.io/configmap/9a5f1519-2d39-11e8-bec8-42010a8400f3-config") pod "9a5f1519-2d39-11e8-bec8-42010a8400f3" (UID: "9a5f1519-2d39-11e8-bec8-42010a8400f3")
Mar 23 19:44:16 gke-testing-c2m4-1-97b57429-40jp kubelet[1361]: E0323 19:44:16.382032    1361 nestedpendingoperations.go:263] Operation for "\"kubernetes.io/configmap/9a5f1519-2d39-11e8-bec8-42010a8400f3-config\" (\"9a5f1519-2d39-11e8-bec8-42010a8400f3\")" failed. No retries permitted until 2018-03-23 19:44:32.381982706 +0000 UTC m=+176292.263058344 (durationBeforeRetry 16s). Error: "error cleaning subPath mounts for volume \"config\" (UniqueName: \"kubernetes.io/configmap/9a5f1519-2d39-11e8-bec8-42010a8400f3-config\") pod \"9a5f1519-2d39-11e8-bec8-42010a8400f3\" (UID: \"9a5f1519-2d39-11e8-bec8-42010a8400f3\") : error checking /var/lib/kubelet/pods/9a5f1519-2d39-11e8-bec8-42010a8400f3/volume-subpaths/config/filebeat/0 for mount: lstat /var/lib/kubelet/pods/9a5f1519-2d39-11e8-bec8-42010a8400f3/volume-ubpaths/config/filebeat/0/..: not a directory"

kubectl delete pod NAME --grace-period=0 --force
seems to work.
Also, restarting kubelet works.

Same problem here on GKE 1.9.4-gke.1
Only happens with a specific filebeat daemonset, but recreating all nodes doesn't help either, it just keeps happening.

Also hitting this problem on GKE 1.9.4-gke.1 like @Tapppi - the pods were removed from the docker daemon on the host node but kubernetes had them stuck in TERMINATING.

Events:
  Type    Reason                 Age        From                                                      Message
  ----    ------                 ----       ----                                                      -------
  Normal  SuccessfulMountVolume  43m        kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  MountVolume.SetUp succeeded for volume "data"
  Normal  SuccessfulMountVolume  43m        kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  MountVolume.SetUp succeeded for volume "varlibdockercontainers"
  Normal  SuccessfulMountVolume  43m        kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  MountVolume.SetUp succeeded for volume "prospectors"
  Normal  SuccessfulMountVolume  43m        kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  MountVolume.SetUp succeeded for volume "config"
  Normal  SuccessfulMountVolume  43m        kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  MountVolume.SetUp succeeded for volume "filebeat-token-v74k6"
  Normal  Pulled                 43m        kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  Container image "docker.elastic.co/beats/filebeat:6.1.2" already present on machine
  Normal  Created                43m        kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  Created container
  Normal  Started                43m        kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  Started container
  Normal  Killing                <invalid>  kubelet, gke-delivery-platform-custom-pool-c9b9fe86-fgvh  Killing container with id docker://filebeat:Need to kill Pod

For us something new just happened a moment ago: when I forcibly removed a stuck pod using kubectl delete pod NAME --grace-period=0 --force, the node that this pod was on went unhealthy. We're running docker 17.12 CE, and restarting the docker daemon on that box helped and uncorked the node.

For folks seeing this issue on 1.9.4-gke.1, it is most likely due to https://github.com/kubernetes/kubernetes/issues/61178, which is fixed in 1.9.5 and is being rolled out in GKE this week. The issue is related to the cleanup of subpath mounts of a file (not a directory). @zackify @nodefactory-bk @Tapppi @Stono

IIUC, the original problem in this bug is related to configuration of containerized kubelet, which is different.

BTW, creating a new node pool with version v1.9.3-gke.0 was our workaround for this, since v1.9.5 is still not rolled out on GKE and it's Easter already.

Can somebody confirm that this is fixed in version 1.9.3+ please? We have some serious trouble because of this behaviour, and restarting docker each time this happens is soo st00pid.

Fixed on 1.9.6 for me


Okay, thanks @Stono. One more thing to confirm here. Here's our kubespray template for containerized kubelet:

#!/bin/bash
/usr/bin/docker run \
  --net=host \
  --pid=host \
  --privileged \
  --name=kubelet \
  --restart=on-failure:5 \
  --memory={{ kubelet_memory_limit|regex_replace('Mi', 'M') }} \
  --cpu-shares={{ kubelet_cpu_limit|regex_replace('m', '') }} \
  -v /dev:/dev:rw \
  -v /etc/cni:/etc/cni:ro \
  -v /opt/cni:/opt/cni:ro \
  -v /etc/ssl:/etc/ssl:ro \
  -v /etc/resolv.conf:/etc/resolv.conf \
  {% for dir in ssl_ca_dirs -%}
  -v {{ dir }}:{{ dir }}:ro \
  {% endfor -%}
  -v /:/rootfs:ro,shared \
  -v /sys:/sys:ro \
  -v /var/lib/docker:/var/lib/docker:rw,shared \
  -v /var/log:/var/log:rw,shared \
  -v /var/lib/kubelet:/var/lib/kubelet:rw,shared \
  -v /var/lib/cni:/var/lib/cni:rw,shared \
  -v /var/run:/var/run:rw,shared \
  -v /etc/kubernetes:/etc/kubernetes:ro \
  -v /etc/os-release:/etc/os-release:ro \
  {{ hyperkube_image_repo }}:{{ hyperkube_image_tag}} \
  ./hyperkube kubelet --containerized \
  "$@"

Does that look OK? Is anybody else using something similar?

I spoke too soon.

  Type    Reason   Age   From                                                      Message
  ----    ------   ----  ----                                                      -------
  Normal  Killing  4m    kubelet, gke-delivery-platform-custom-pool-560b2b96-gcmb  Killing container with id docker://filebeat:Need to kill Pod

Had to destroy it in the brutal fashion.

❯ kks delete pod filebeat-x56v8 --force --grace-period 0
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "filebeat-x56v8" deleted

@Stono which docker version are you using? For us, with docker 17.12 CE, doing pod deletion with --force --grace-period 0 is quite drastic - it almost always ends up with the node being unavailable due to a docker hang.

I'm still having this problem on 1.9.6 on Azure AKS managed cluster.

Using this workaround at the moment to select all stuck pods and delete them (as I end up having swathes of Terminating pods in my dev/scratch cluster):

kubectl get pods | awk '$3=="Terminating" {print "kubectl delete pod " $1 " --grace-period=0 --force"}' | xargs -0 bash -c
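A slightly more explicit variant of the same workaround (a sketch; it force-deletes every pod whose status column reads Terminating, across all namespaces):

kubectl get pods --all-namespaces --no-headers \
  | awk '$4=="Terminating" {print $1, $2}' \
  | while read ns pod; do
      kubectl delete pod "$pod" -n "$ns" --grace-period=0 --force
    done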

Ran into this on both my Azure and AWS clusters - a workaround was provided by Mike Elliot:

https://jira.onap.org/browse/OOM-946

ubuntu@ip-10-0-0-22:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system heapster-76b8cd7b5-4r88h 1/1 Running 0 25d
kube-system kube-dns-5d7b4487c9-s4rsg 3/3 Running 0 25d
kube-system kubernetes-dashboard-f9577fffd-298r6 1/1 Running 0 25d
kube-system monitoring-grafana-997796fcf-wtz7n 1/1 Running 0 25d
kube-system monitoring-influxdb-56fdcd96b-2phd2 1/1 Running 0 25d
kube-system tiller-deploy-cc96d4f6b-jzqmz 1/1 Running 0 25d
onap dev-sms-857f6dbd87-pds58 0/1 Terminating 0 3h
onap dev-vfc-zte-sdnc-driver-5b6c7cbd6b-5vdvp 0/1 Terminating 0 3h
ubuntu@ip-10-0-0-22:~$ kubectl delete pod dev-vfc-zte-sdnc-driver-5b6c7cbd6b-5vdvp -n onap --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "dev-vfc-zte-sdnc-driver-5b6c7cbd6b-5vdvp" deleted
ubuntu@ip-10-0-0-22:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system heapster-76b8cd7b5-4r88h 1/1 Running 0 25d
kube-system kube-dns-5d7b4487c9-s4rsg 3/3 Running 0 25d
kube-system kubernetes-dashboard-f9577fffd-298r6 1/1 Running 0 25d
kube-system monitoring-grafana-997796fcf-wtz7n 1/1 Running 0 25d
kube-system monitoring-influxdb-56fdcd96b-2phd2 1/1 Running 0 25d
kube-system tiller-deploy-cc96d4f6b-jzqmz 1/1 Running 0 25d
onap dev-sms-857f6dbd87-pds58 0/1 Terminating 0 3h
ubuntu@ip-10-0-0-22:~$ kubectl delete pod dev-sms-857f6dbd87-pds58 -n onap --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "dev-sms-857f6dbd87-pds58" deleted
ubuntu@ip-10-0-0-22:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system heapster-76b8cd7b5-4r88h 1/1 Running 0 25d
kube-system kube-dns-5d7b4487c9-s4rsg 3/3 Running 0 25d
kube-system kubernetes-dashboard-f9577fffd-298r6 1/1 Running 0 25d
kube-system monitoring-grafana-997796fcf-wtz7n 1/1 Running 0 25d
kube-system monitoring-influxdb-56fdcd96b-2phd2 1/1 Running 0 25d
kube-system tiller-deploy-cc96d4f6b-jzqmz 1/1 Running 0 25d

I'm not sure if this is the same issue, but we have started noticing this behaviour _since_ upgrading from 1.9.3 to 1.10.1. It never happened before that. We're using glusterfs volumes, with SubPath. Kubelet continuously logs things like:

Apr 23 08:21:11 int-kube-01 kubelet[13018]: I0423 08:21:11.106779   13018 reconciler.go:181] operationExecutor.UnmountVolume started for volume "dev-static" (UniqueName: "kubernetes.io/glusterfs/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f-dev-static") pod "ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f" (UID: "ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f")
Apr 23 08:21:11 int-kube-01 kubelet[13018]: E0423 08:21:11.122027   13018 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/glusterfs/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f-dev-static\" (\"ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f\")" failed. No retries permitted until 2018-04-23 08:23:13.121821027 +1000 AEST m=+408681.605939042 (durationBeforeRetry 2m2s). Error: "UnmountVolume.TearDown failed for volume \"dev-static\" (UniqueName: \"kubernetes.io/glusterfs/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f-dev-static\") pod \"ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f\" (UID: \"ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f\") : Unmount failed: exit status 32\nUnmounting arguments: /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static\nOutput: umount: /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static: target is busy.\n        (In some cases useful info about processes that use\n         the device is found by lsof(8) or fuser(1))\n\n"

and lsof indeed shows that the directory under the glusterfs volume is still in use:

glusterfs  71570                     root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterti  71570  71571              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glustersi  71570  71572              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterme  71570  71573              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glustersp  71570  71574              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glustersp  71570  71575              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterep  71570  71579              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterio  71570  71580              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterep  71570  71581              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterep  71570  71582              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterep  71570  71583              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterep  71570  71584              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterep  71570  71585              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterep  71570  71586              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterep  71570  71587              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterfu  71570  71592              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere
glusterfu  71570  71593              root   10u      DIR              0,264      4096  9380607748984626555 /var/lib/kubelet/pods/ad8fabbe-4449-11e8-b21a-a2bfb3c62d0f/volumes/kubernetes.io~glusterfs/dev-static/subpathhere

This was all fine on 1.9.3, so it's as if the fix for this issue has broken our use case :(

@ross-w this signature looks different from the others. Could you open a new issue and also include your pod spec?

Any updates on these issues?
In our case (Kubernetes 1.9.7, Docker 17.03), pods are stuck in the Terminating state after a node goes out of memory and the pods are rescheduled. Eventually there are a lot of ghost pods in the Kubernetes dashboard, and in the deployments tab we can see deployments with 4/1 pods.
Restarting kubelet or killing all pods in the namespace helps, but it's a very poor solution.

@Adiqq For me it was a problem with Docker itself.

Have a look at journalctl -u kubelet -f on one of your nodes. I had a message like 'Cannot kill container: gRPC error' (I don't have the exact message any more since I've fixed this).

To fix it I restarted Docker on each node. While coming back up, Docker cleaned up the containers in a broken state and removed all these stale pods.

I had this yesterday on 1.9.7, with a pod stuck in the Terminating state; the logs just said "needs to kill pod", and I had to use --force --grace-period=0 to get rid of it.

Just got this as well with 1.9.7-gke.0.
I didn't have problems with 1.9.6-gke.1,
but did have it with 1.9.4 and 1.9.5.

The pod getting stuck has a PV attached.

Redeploying or deleting a pod has the same effect.
Restarting kubelet on the offending node didn't work. kubelet didn't start again and I had to restart the entire node.

During this the pod couldn't be scheduled on any other node since it said the PV was already mounted elsewhere.

@Stono @nodefactory-bk could you guys take a look at your kubelet logs on the offending nodes to see if there are any detailed logs that could point to the issue?

cc @dashpole

Just had one app get stuck in terminating.
This is on 1.9.7-gke.1
Here is kubectl describe pod with secrets redacted:

Name:                      sharespine-cloud-6b78cbfb8d-xcbh5
Namespace:                 shsp-cloud-dev
Node:                      gke-testing-std4-1-0f83e7c0-qrxg/10.132.0.4
Start Time:                Tue, 22 May 2018 11:14:22 +0200
Labels:                    app=sharespine-cloud
                           pod-template-hash=2634769648
Annotations:               <none>
Status:                    Terminating (expires Wed, 23 May 2018 10:02:01 +0200)
Termination Grace Period:  60s
IP:                        10.40.7.29
Controlled By:             ReplicaSet/sharespine-cloud-6b78cbfb8d
Containers:
  sharespine-cloud:
    Container ID:   docker://4cf402b5dc3ea728fcbff87b57e0ec504093ea3cf7277f6ca83fde726a4bba48
    Image:          ...
    Image ID:       ...
    Ports:          9000/TCP, 9500/TCP
    State:          Running
      Started:      Tue, 22 May 2018 11:16:36 +0200
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  1500M
    Requests:
      cpu:      500m
      memory:   1024M
    Liveness:   http-get http://:9000/ delay=240s timeout=1s period=30s #success=1 #failure=3
    Readiness:  http-get http://:9000/ delay=30s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      sharespine-cloud-secrets  Secret  Optional: false
    Environment:
      APP_NAME:  sharespine-cloud
      APP_ENV:   shsp-cloud-dev (v1:metadata.namespace)
      JAVA_XMS:  128M
      JAVA_XMX:  1024M
    Mounts:
      /home/app/sharespine-cloud-home/ from sharespine-cloud-home (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-x7vzr (ro)
  sharespine-cloud-elker:
    Container ID:   docker://88a5a2bfd6804b5f40534ecdb6953771ac3181cf12df407baa81a34a7215d142
    Image:          ...
    Image ID:       ...
    Port:           <none>
    State:          Running
      Started:      Tue, 22 May 2018 11:16:36 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  200Mi
    Requests:
      cpu:     10m
      memory:  100Mi
    Environment Variables from:
      sharespine-cloud-secrets  Secret  Optional: false
    Environment:
      APP_NAME:                     sharespine-cloud
      APP_ENV:                      shsp-cloud-dev (v1:metadata.namespace)
      ELASTICSEARCH_LOGBACK_PATH:   /home/app/sharespine-cloud-home/logs/stash/stash.json
      ELASTICSEARCH_LOGBACK_INDEX:  cloud-dev
    Mounts:
      /home/app/sharespine-cloud-home/ from sharespine-cloud-home (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-x7vzr (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  sharespine-cloud-home:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  sharespine-cloud-home
    ReadOnly:   false
  default-token-x7vzr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-x7vzr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason         Age                From                                       Message
  ----     ------         ----               ----                                       -------
  Normal   Killing        20m                kubelet, gke-testing-std4-1-0f83e7c0-qrxg  Killing container with id docker://sharespine-cloud-elker:Need to kill Pod
  Normal   Killing        20m                kubelet, gke-testing-std4-1-0f83e7c0-qrxg  Killing container with id docker://sharespine-cloud:Need to kill Pod
  Warning  FailedKillPod  18m                kubelet, gke-testing-std4-1-0f83e7c0-qrxg  error killing pod: failed to "KillPodSandbox" for "83d05e96-5da0-11e8-ba51-42010a840176" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
  Warning  FailedSync     1m (x53 over 16m)  kubelet, gke-testing-std4-1-0f83e7c0-qrxg  error determining status: rpc error: code = DeadlineExceeded desc = context deadline exceeded

Not sure where to find kubelet.log on GKE with Google's images. Found something that I'm attaching.
kube.log

kubectl -n shsp-cloud-dev delete pod sharespine-cloud-6b78cbfb8d-xcbh5 --force --grace-period 0
killed it and removed it.
It started fine afterwards, but took a little longer than usual.

Mind you, this doesn't happen EVERY time for that app.
I'd say roughly 1/4 times probably.

Hitting this with k8s 1.9.6: when kubelet is unable to unmount a CephFS mount, all pods on the node stay in Terminating forever. I had to restart the node to recover; restarting kubelet or Docker did not help.

@tuminoid the ceph issue sounds different. Could you open a new issue and also provide Pod events and kubelet logs for that pod?

FYI, updating my clusters (to k8s v1.10.2) seems to have eliminated this issue for us.

The attached manifest reproduces this for me on GKE.

kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.2-gke.1", GitCommit:"75d2af854b1df023c7ce10a8795b85d3dd1f8d37", GitTreeState:"clean", BuildDate:"2018-05-10T17:23:18Z", GoVersion:"go1.9.3b4", Compiler:"gc", Platform:"linux/amd64"}

k8s-nfs-test.yaml.txt

Run it, then delete it. You will get the 'nfs-client' pod stuck in Terminating. The reason is the hard NFS mount on the node, combined with the 'server' being deleted first.

@donbowman for the nfs unmount issue when you delete the nfs server first, you can set the "soft" mount option in the StorageClass or PV.

I don't see how. I can set it on a PersistentVolumeClaim, but that doesn't apply here.
I don't think StorageClass applies here either (that would be the underlying disk, underneath the nfs server).

The issue is on the nfs-client.
Am I missing something?

For your nfs PV, you can set the mountOptions field starting in 1.8 to specify a soft mount. If you dynamically provision nfs volumes, you can also set it in your StorageClass.mountOptions

Yes, but it's not the PV that is being mounted with NFS.
It's mounted from my NFS-server container.
There is no dynamic provisioning.

This is using Google GCP+GKE. The PVC selects a PV which is block IO, mounted as an ext4 into a container which re-exports it with NFS.

The 2nd set of containers, which mount from the nfs-server (itself a pod), don't see it as a PV. They see it as a volume like the one below.

I don't see a way to make this nfs-client see a 'pvc' for the mount, so I cannot set the mount options. Nor can I see it as a StorageClass.

Am I missing something?

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: nfs-client
  labels:
    app: nfs-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client
    spec:
      containers:
        - name: nfs-client
          image: busybox:latest
          imagePullPolicy: IfNotPresent
          command: ["sleep", "3600"]
          volumeMounts:
            - name: nfs
              mountPath: /registry
      volumes:
        - name: nfs
          nfs:
            server: nfs-server.default.svc.cluster.local
            path: /

@donbowman for your 2nd set of containers, which uses the nfs mount, you can manually create a PV for that nfs volume with the mountOptions set, and share the PVC for that nfs PV in all your pods. No dynamic provisioning involved.

Something like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  storageClassName: ""
  capacity:
    # Capacity doesn't actually matter for nfs
    storage: 500G 
  accessModes:
    - ReadWriteMany
  mountOptions:
    - soft
  nfs:
    server: nfs-server.default.svc.cluster.local
    path: /
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-claim
spec:
  # It's necessary to specify "" as the storageClassName
  # so that the default storage class won't be used
  storageClassName: ""
  volumeName: nfs-pv
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 500G

Thanks! So that worked (in the sense that it's now a soft mount) but doesn't fix the issue:

The mount (as observed on the Node) is now soft:

nfs-server.default.svc.cluster.local:/ on /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/cbeda204-638d-11e8-9758-42010aa200b4/volumes/kubernetes.io~nfs/nfs-pv type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.162.0.2,local_lock=none,addr=10.19.241.155)

But when I delete the whole thing, I still get the nfs-client stuck forever in Terminating state.

k8s-nfs-test.yaml.txt

Attached is the yaml I used. I did a 'create', waited for it to come up, observed that the client had the mount and could read/write files in it, then did a 'delete' on it.

The nfs-server pod is deleted, but the nfs-client is not.

Looking on the node, the mount has remained:

# umount -f /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/cbeda204-638d-11e8-9758-42010aa200b4/volumes/kubernetes.io~nfs/nfs-pv
umount: /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/cbeda204-638d-11e8-9758-42010aa200b4/volumes/kubernetes.io~nfs/nfs-pv: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)

@donbowman ah so sorry, I was wrong about the soft option. The soft option only prevents filesystem calls from hanging when the server is inaccessible, but doesn't actually help unmount the nfs volume. A force unmount would need to be done for that, which we currently don't have a way to pass through. For now, you'll have to manually cleanup those mounts and ensure that you delete your pods in the correct order (nfs clients first, then nfs server).

I tried adding timeo=30 and intr, but same issue.
This locks it up; I must log in to the node, do a umount -f -l on the underlying mount, and then I can do a kubectl delete --force --grace-period 0 on the pod.

It seems like, since this was mounted on behalf of the pod, it could possibly be unmounted (or force-unmounted after some timeout) on delete automatically.

I had a bunch of pods like that, so I had to come up with a command that would clean up all the terminating pods:

kubectl get pods -o json | jq -c '.items[] | select(.metadata.deletionTimestamp) | .metadata.name' | xargs -I '{}' kubectl delete pod --force --grace-period 0 '{}'
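
A variant of the same idea that also sweeps the other namespaces (a sketch; it assumes you really do want to force-delete everything that carries a deletionTimestamp):

kubectl get pods --all-namespaces -o json \
  | jq -r '.items[] | select(.metadata.deletionTimestamp) | "\(.metadata.namespace) \(.metadata.name)"' \
  | while read -r ns name; do
      kubectl delete pod -n "$ns" "$name" --force --grace-period=0
    done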

I think with Google's new Filestore we'll have the same issue: no unmount.

@donbowman iirc, your issue is because you were terminating the nfs server pod before the nfs client pod. If you use Filestore, you no longer need a pod to host your nfs server, so you should not have this issue as long as you don't delete the entire Filestore instance.

Won't I have the same issue if I'm orchestrating the Filestore, though? E.g. if I'm bringing it up for a specific Kubernetes deployment, and then down at the end, the order is not guaranteed.

But I also think the issue is not just the order: the nfs client pod deletion does not unmount at all, it just leaves the mount dangling on the node. So regardless of whether the filestore/server is present or not, there is a dangling mount.

When a pod is terminated, we do unmount the volume (assuming that the server is still there). If you are seeing dangling mounts even when the server exists, then that is a bug.

If you use dynamic provisioning with PVCs and PVs, then we don't allow the PVC (and underlying storage) to be deleted until all Pods referencing it are done using it. If you want to orchestrate the provisioning yourself, then you need to ensure you don't delete the server until all pods are done using it.

Maybe this is a possible workaround: #65936

Forcing the delete with kubectl delete po $pod --grace-period=0 --force worked. The --now flag wasn't working. I am not sure about #65936, but I would like to not have to kill the node when Unknown states happen.

Having the same problem (pods remain in Terminating because a file within the pod can't be unmounted, since the device is 'busy') on 1.10.5. For me, using --grace-period=0 --force results in the mountpoint continuing to exist. Eventually I ended up with over 90000 mountpoints, which severely slowed down the cluster. The workaround here is to do a find in the pod's folder, recursively unmount those files, then recursively delete the pod folder.
In my case, I'm mounting a configmap using subPath into an existing folder with existing files, overwriting one of the existing files. This used to work fine for me on 1.8.6.
The original poster mentions pods staying in 'terminating' for a few hours; in my case it's days. I have not seen them get cleaned up eventually, except when I do the manual workaround.
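
A rough sketch of that manual workaround, to be run on the affected node (it assumes the default kubelet root dir /var/lib/kubelet and takes the pod UID as its argument):

#!/usr/bin/env bash
set -euo pipefail

POD_DIR="/var/lib/kubelet/pods/$1"

# Unmount everything still mounted under the pod directory, deepest paths first,
# so nested subpath mounts are released before their parents.
mount | awk -v dir="$POD_DIR" 'index($3, dir) == 1 {print $3}' | sort -r | while read -r m; do
  umount "$m"
done

# Once nothing is mounted any more, the leftover pod directory can be removed;
# kubelet usually finishes cleaning up the Terminating pod on its own after that.
rm -rf "$POD_DIR"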

Having the same problem, caused by a log aggregator (similar to fluentd): it mounts the /var/lib/docker/containers folder, and the pod has lots of mounts:

shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/6691cb9460df75579915fd881342931b98b4bfb7a6fbb0733cc6132d7c17710c/shm
shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/4cbbdf53ee5122565c6e118a049c93543dcc93bfd586a3456ff4ca98d59810a3/shm
shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/b2968b63a7a1f673577e5ada5f2cda50e1203934467b7c6573e21b341d80810a/shm
shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/4d54a4eabed68b136b0aa3d385093e4a32424d18a08c7f39f5179440166de95f/shm
shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/0e5487465abc2857446940902d9b9754b3447e587eefc2436b2bb78fd4d5ce4d/shm
shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/c73ed0942d77bf43f9ba016728834c47339793f9f1f31c4e566d73be492cf859/shm
shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/f9ab13f7f145b44beccc40c158287c4cfcc9dc465850f30d691961a2cabcfc14/shm
shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/aa449af555702d04f95fed04d09a3f1d5ae38d677484fc6cc9fc6d4b42182820/shm
shm                      64.0M         0     64.0M   0% /var/lib/docker/containers/f6608e507348b43ade3faa05d0a11b674c29f2038308f138174e8b7b8233633f/shm

In my case some pods are removed fine by Kubernetes, but some get stuck in "Terminating" status.

It could be related to https://github.com/kubernetes/kubernetes/issues/45688 (I'm also using Docker 17).

I just had the problem that the pods were not terminating because a secret was missing. After I created that secret in that namespace everything was back to normal.

I removed my stuck pods like this:

user@laptop:~$ kubectl -n storage get pod
NAME                     READY     STATUS        RESTARTS   AGE
minio-65b869c776-47hql   0/1       Terminating   5          1d
minio-65b869c776-bppl6   0/1       Terminating   33         1d
minio-778f4665cd-btnf5   1/1       Running       0          1h
sftp-775b578d9b-pqk5x    1/1       Running       0          28m
user@laptop:~$ kubectl -n storage delete pod minio-65b869c776-47hql --grace-period 0 --force
pod "minio-65b869c776-47hql" deleted
user@laptop:~$ kubectl -n storage delete pod minio-65b869c776-bppl6 --grace-period 0 --force
pod "minio-65b869c776-bppl6" deleted
user@laptop:~$ kubectl -n storage get pod
NAME                     READY     STATUS    RESTARTS   AGE
minio-778f4665cd-btnf5   1/1       Running   0          2h
sftp-775b578d9b-pqk5x    1/1       Running   0          30m
user@laptop:~$

Got a similar problem running on Azure ACS.

10:12 $ kubectl describe pod -n xxx triggerpipeline-3737304981-nx85k 
Name:                      triggerpipeline-3737304981-nx85k
Namespace:                 xxx
Node:                      k8s-agent-d7584a3a-2/10.240.0.6
Start Time:                Wed, 27 Jun 2018 15:33:48 +0200
Labels:                    app=triggerpipeline
                           pod-template-hash=3737304981
Annotations:               kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"xxx","name":"triggerpipeline-3737304981","uid":"b91320ff-7a0e-11e8-9e7...
Status:                    Terminating (expires Fri, 27 Jul 2018 09:00:35 +0200)
Termination Grace Period:  0s
IP:                        
Controlled By:             ReplicaSet/triggerpipeline-3737304981
Containers:
  alpine:
    Container ID:  docker://8443c7478dfe1a57a891b455366ca007fe00415178191a54b0199d246ccbd566
    Image:         alpine
    Image ID:      docker-pullable://alpine@sha256:e1871801d30885a610511c867de0d6baca7ed4e6a2573d506bbec7fd3b03873f
    Port:          <none>
    Command:
      sh
    Args:
      -c
      apk add --no-cache curl && echo "0 */4 * * * curl -v --trace-time http://myapi:80/api/v1/pipeline/start " | crontab - && crond -f
    State:          Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-p9qtw (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  default-token-p9qtw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-p9qtw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:          <none>

I have tried using --now or setting the grace period. For instance

09:00 $  kubectl delete pod -n xxx triggerpipeline-3737304981-nx85k --force --grace-period=0
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "triggerpipeline-3737304981-nx85k" deleted

The pod is still hanging, and that causes the corresponding deployment to be stuck as well.

I am also haunted by these “Need to kill Pod” messages in the pod events. What does this mean by the way? That _Kubernetes_ feels the need to kill the pod, or that _I_ should kill the pod?

This happened to me a couple of days ago; I gave up deleting and left the pod as it was. Then today it had disappeared, so it seems to have been deleted eventually.

Happened to me just now. The --force --now solution didn't work for me. I found the following line in the kubelet logs suspicious:

Aug 6 15:25:37 kube-minion-1 kubelet[2778]: W0806 15:25:37.986549 2778 docker_sandbox.go:263] NetworkPlugin cni failed on the status hook for pod "backend-foos-227474871-gzhw0_default": Unexpected command output nsenter: cannot open : No such file or directory

Which led me to finding the following issue:
https://github.com/openshift/origin/issues/15802

I'm not on OpenShift but on OpenStack, so I thought it could be related. I gave the advice to restart Docker a shot.
Restarting Docker made the pods stuck in "Terminating" go away.

I know this is only a work-around, but I'm not waking up sometimes at 3am to fix this anymore.
Not saying you should use this, but it might help some people.

The sleep matches what my pods' terminationGracePeriodSeconds is set to (30 seconds). If a pod is still alive longer than that, this CronJob will --force --grace-period=0 it and kill it completely.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: stuckpod-restart
spec:
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 5
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: stuckpod-restart
              image: devth/helm:v2.9.1
              args:
                - /bin/sh
                - -c
                - echo "$(date) Job stuckpod-restart Starting"; kubectl get pods --all-namespaces=true | awk '$3=="Terminating" {print "sleep 30; echo "$(date) Killing pod $1"; kubectl delete pod " $1 " --grace-period=0 --force"}'; echo "$(date) Job stuckpod-restart Complete";
          restartPolicy: OnFailure

I am seeing the same error with Kubernetes v1.10.2. Pods get stuck in terminating indefinitely and the kubelet on the node in question repeatedly logs:

Aug 21 13:25:55 node-09 kubelet[164855]: E0821 13:25:55.149132  
164855 nestedpendingoperations.go:267] 
Operation for "\"kubernetes.io/configmap/b838409a-a49e-11e8-bdf7-000f533063c0-configmap\" 
(\"b838409a-a49e-11e8-bdf7-000f533063c0\")" failed. No retries permitted until 2018-08-21 
13:27:57.149071465 +0000 UTC m=+1276998.311766147 (durationBeforeRetry 2m2s). Error: "error 
cleaning subPath mounts for volume \"configmap\" (UniqueName: 
\"kubernetes.io/configmap/b838409a-a49e-11e8-bdf7-000f533063c0-configmap\") pod 
\"b838409a-a49e-11e8-bdf7-000f533063c0\" (UID: \"b838409a-a49e-11e8-bdf7-000f533063c0\") 
: error deleting /var/lib/kubelet/pods/b838409a-a49e-11e8-bdf7-000f533063c0/volume-
subpaths/configmap/pod-master/2: remove /var/lib/kubelet/pods/b838409a-a49e-11e8-bdf7-
000f533063c0/volume-subpaths/configmap/pod-master/2: device or resource busy"

I can manually unmount the subpath volume in question without complaint (Linux does not tell me it is busy). This stops the kubelet from logging the error message. However, this does not inspire Kubernetes to continue cleanup, as the pod is still shown in terminating state. Routinely restarting Docker to clean this up is not really an acceptable solution because of the disruption it causes to running containers.

Also of note: the container itself is gone from docker ps -a with no evidence that it ever existed, so I'm not sure this is actually a Docker issue. We are using Docker version 17.03.2-ce.

An update: we had configured our nodes to redirect the kubelet root directory to a non-OS volume with a symlink (/var/lib/kubelet was a symlink pointing to another directory on a different volume). When I reconfigured things to pass --root-dir to the kubelet so that it went to the desired directory directly, rather than through a symlink, and restarted the kubelet, it cleaned up the volume mounts and cleared out the pods that were stuck terminating without requiring a Docker restart.

I experienced this issue today for the first time while running some pods locally on minikube.

I had a bunch of pods stuck in Terminating due to a configmap/secret mounted as a volume which was missing. None of the suggestions/workarounds/solutions posted above worked except this one.

One thing that I think is worth noting, though:

  • When I ran kubectl get pods, I got the list of pods with the Terminating status.
  • When I ran docker ps | grep -i {{pod_name}} though, none of the pods in Terminating status as seen by kubectl get pods were running in the minikube VM.

I was expecting docker ps to return the list of pods stuck in the Terminating state, but in reality none of them were running, yet kubectl get pods was still returning data about them. Would anyone be able to explain why that is?

I experienced this issue with 4 deployments. Then I switched from “local volume” to “host path” for all mounts, and it is gone for me.

I just had the problem that the pods were not terminating because a secret was missing. After I created that secret in that namespace everything was back to normal.

How do you create a secret in the namespace if the namespace is in "Terminating" state?

kubectl delete --all pods --namespace=xxxxx --force --grace-period=0

works for me.

Do not forget about "--grace-period=0". It matters

kubectl warned me "warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely." when I used --force --grace-period=0.
Can anyone tell me whether that will really happen?

In fact, when we delete a pod, the deletion may be delayed for various reasons,
and if we execute "kubectl delete" with the flags "--force --grace-period=0",
the resource object is deleted at once.

Can you help confirm whether the pod will be deleted immediately?
Does that mean the warning message is actually inaccurate?

@windoze, if you use the --force --grace-period=0 option, it means that the pod API object will be deleted from the API server immediately. The node's kubelet is responsible for cleaning up volume mounts and killing containers. If kubelet is not running or has issues while cleaning up the pod, the container might still be running. But kubelet should keep trying to clean up the pod whenever possible.

So that still means the deletion could take forever because kubelet could be malfunctioning?
Is there any way to make sure the pod is deleted?
I'm asking because I have some huge pods running in the cluster and there is not enough memory on each node to run 2 instances of them.
If the deletion fails, the node becomes unusable, and if this issue happens multiple times the service will be completely down, because eventually no node will be able to run this pod.

In a plain old Docker environment I could force-kill a container with kill -9 or something like it, but it seems k8s doesn't have such a function.

@windoze do you know why your pod deletion often failed? Was it because kubelet was not running, or was kubelet trying to kill the container but failing with some errors?

Such a situation happened several times on my cluster a few months ago: kubelet was running, but the Docker daemon seemed to have some trouble and got stuck with no error log.
My solution was to log in to the node, force-kill the container process, and restart the Docker daemon.
After some upgrades the issue was gone and I never had it again.

kubectl delete pods <podname> --force --grace-period=0 worked for me!

@shinebayar-g, the problem with --force is that it can mean your container will keep running. It just tells Kubernetes to forget about this pod's containers. A better solution is to SSH into the VM running the pod and investigate what's going on with Docker. Try to manually kill the containers with docker kill and, if successful, attempt to delete the pod normally again.
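
Something along these lines, on the node hosting the pod (a sketch; it assumes the Docker runtime, whose container names embed the pod name as k8s_<container>_<pod>_<namespace>_...):

POD=some-stuck-pod            # hypothetical pod name, substitute your own

# list anything Docker still has for that pod
docker ps -a --filter "name=$POD"

# kill whatever is still running for it, then retry a normal delete from kubectl
docker ps -q --filter "name=$POD" | xargs -r docker kill
kubectl delete pod "$POD"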

@agolomoodysaada Ah, that makes sense. Thanks for the explanation. So I wouldn't really know whether the actual container was really deleted or not, right?

So, it's the end of 2018, kube 1.12 is out and... you all still have problems with stuck pods?

I have the same issue; neither --force --grace-period=0 nor --force --now works. The following is the log:

root@r15-c70-b03-master01:~# kubectl -n infra-lmat get pod node-exporter-zbfpx
NAME READY STATUS RESTARTS AGE
node-exporter-zbfpx 0/1 Terminating 0 4d

root@r15-c70-b03-master01:~# kubectl -n infra-lmat delete pod node-exporter-zbfpx --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "node-exporter-zbfpx" deleted

root@r15-c70-b03-master01:~# kubectl -n infra-lmat get pod node-exporter-zbfpx
NAME READY STATUS RESTARTS AGE
node-exporter-zbfpx 0/1 Terminating 0 4d

root@r15-c70-b03-master01:~# kubectl -n infra-lmat delete pod node-exporter-zbfpx --now --force
pod "node-exporter-zbfpx" deleted

root@r15-c70-b03-master01:~# kubectl -n infra-lmat get pod node-exporter-zbfpx
NAME READY STATUS RESTARTS AGE
node-exporter-zbfpx 0/1 Terminating 0 4d

root@r15-c70-b03-master01:~#

I tried to edit the pod and delete the finalizers section in metadata, but it also failed.

I'm still seeing this in a 100% reproducible fashion (same resource definitions) with kubectl 1.13 alpha and Docker for Desktop on macOS. By reproducible I mean that the only way to fix it seems to be to factory reset Docker for Mac, and when I set up my cluster again using the same resources (deployment script), the same clean-up script fails.

I'm not sure why it would be relevant but my clean-up script looks like:

#!/usr/bin/env bash
set -e

function usage() {
    echo "Usage: $0 <containers|envs|volumes|all>"
}

if [ "$1" = "--help" ] || [ "$1" = "-h" ] || [ "$1" = "help" ]; then
    echo "$(usage)"
    exit 0
fi

if [ $# -lt 1 ] || [ $# -gt 1 ]; then
    >&2 echo "$(usage)"
    exit 1
fi

MODE=$1

function join_with {
    local IFS="$1"
    shift
    echo "$*"
}

resources=()

if [ "$MODE" = "containers" ] || [ "$MODE" = "all" ]; then
    resources+=(daemonsets replicasets statefulsets services deployments pods rc)
fi

if [ "$MODE" = "envs" ] || [ "$MODE" = "all" ]; then
    resources+=(configmaps secrets)
fi

if [ "$MODE" = "volumes" ] || [ "$MODE" = "all" ]; then
    resources+=(persistentvolumeclaims persistentvolumes)
fi

kubectl delete $(join_with , "${resources[@]}") --all

Because the cluster runs locally, I can verify that there are no containers running in Docker; it's just kubectl that's getting hung up on terminating pods. When I describe the pods, the status is listed as Status: Terminating (lasts <invalid>).

Just happened to me once again. I was trying to install Percona pmm-server with an NFS share; the software didn't even come up, so I removed it and this happened. (The persistent claim wasn't working for this software.) Guess I'm calling good old kubectl delete pods <podname> --force --grace-period=0 once again. But the question is: how do I find out which node this pod is living on?

@shinebayar-g , SSH into the VM it was on and run docker ps.

Well, it wasn't there... I have a few VMs, so I asked how to find out which one is the right one. :)

@shinebayar-g this may work:
kubectl describe pod/some-pod-name | grep '^Node:'
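
If you prefer not to grep the describe output, the node name is also available directly; either of these should work:

kubectl get pod some-pod-name -o wide
kubectl get pod some-pod-name -o jsonpath='{.spec.nodeName}{"\n"}'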

Same issue.

docker ps shows that the container is in "Dead" status, not Exited (0) as expected.

Manually deleting the container led to the following Docker log entry:

level=warning msg="container kill failed because of 'container not found' or 'no such process': Cannot kill container 

Unfortunately the line is cut off, but I think I remember that the problem was that the process was not there anymore.
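
A sketch for finding and force-removing such "Dead" containers on the node (this only clears the Docker side; kubelet still has to notice afterwards):

docker ps -a --filter "status=dead"
docker ps -a -q --filter "status=dead" | xargs -r docker rm -f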

I am still getting stuck with this issue with k8s v1.11.0. Here is a check-list of what I do to clean up my pods:

  • Make sure that all resources that are attached to the pod have been reclaimed. Not all of them are visible in kubectl get; some of them are only known to the Kubelet the pod is running on, so you will have to follow its log stream locally
  • When all else fails, kubectl edit the failed pod and remove the finalizers entry (mine was - foregroundDeletion); a patch equivalent is sketched below the tips

Two more tips:

  • In steady state, a non-confused kubelet should log no periodic messages whatsoever. Any kind of repeated failure to release something is the symptom of a stuck pod.
  • you can keep a kubectl delete command blocked in another window to monitor your progress (even on a pod you already "deleted" many times). kubectl delete will terminate as soon as the last stuck resource gets released.
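
If editing the pod by hand is awkward, the finalizer removal mentioned above can also be done with a patch (a sketch; note this only clears the API object's finalizers and does not clean anything up on the node):

kubectl patch pod <pod-name> -p '{"metadata":{"finalizers":null}}'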

Faced this today.
What was done:

  1. ssh to the node and remove the container manually
  2. After that, kubectl get pods shows my stuck container as 0/1 Terminating (it was 1/1 Terminating)
  3. Remove the finalizers section from the pod, mine was foregroundDeletion ( $ kubectl edit pod/name ) --> container removed from the pods list
  4. Delete the deployment --> all deployment-related stuff removed.
kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:05:37Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

We are facing the same issue since we started mounting secrets (shared with many pods). The pod goes into the Terminating state and stays there forever. Our version is v1.10.0. The attached Docker container is gone, but the reference in the API server remains unless I forcefully delete the pod with the --grace-period=0 --force option.

Looking for a permanent solution.

Well, recently I tested the runc exploit CVE-2019-5736 on my staging cluster. As you already know, the exploit rewrites the runc binary on the host machine; it's a destructive exploit. After that I saw strange behavior on the cluster: all pods were stuck in the Terminating state. The workaround was draining the affected node, purging Docker and re-installing it. After that, all pods and the k8s cluster functioned normally as before. Maybe it's a Docker issue and re-installing it will solve your problem too! Thanks

Fresh v1.13.3 install here. This happens to me too. It has happened since I mounted the same NFS volumes across a few pods, which seems to have something to do with it.

I see this issue when creating a deployment that tries to create a volume using a secret that doesn't exist; deleting that deployment/service leaves a Terminating pod around.

Facing the same issue with v1.12.3; --grace-period=0 --force and --now both have no effect, and deleting the StatefulSet the pod belongs to doesn't help either.

Same issue with an SMB (I think?) mount (Azure Files share as per https://docs.microsoft.com/en-us/azure/aks/azure-files-volume).

Same issue with 1.13.3.

I have the same issue: a pod has been in the "Terminating" state for almost 2 days.
I am using Minikube on a Linux machine (Debian).

Kubectl version:
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:00:57Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Minikube Version:
minikube version: v0.34.1

@ardalanrazavi why is it terminating for two days? Just force delete if it doesn't delete after 5 minutes

@nmors

why is it terminating for two days?

That's a good question. We all would like to know that.

Just force delete if it doesn't delete after 5 minutes

Deleting it forcefully leaves the cluster in an inconsistent state. (With minikube, which is not your real cluster, it's admittedly much less of a concern.)

@AndrewSav

I don't see any other solutions here to be frank.

Sure, the cluster will be left in an "inconsistent state". I'd like to understand what you mean exactly by this. Force closing is bad. I also don't like it, but in my case, I am comfortable destroying and redeploying any resources as required.

In my case, it seems to only get stuck terminating on the pods which have an NFS mount. And it only happens when the NFS server goes down before the client does.

I fixed the issue: I was able to isolate that all pods stuck terminating were on one node; the node was restarted and the problem is gone.

@nmors @AndrewSav I have done force delete as well.

Deleting your nfs server before you delete your pods is known to cause unmount to hang forever. It's best to order your deletions in that case so that your nfs server is always deleted last.

@msau42 My NFS server is not part of the k8s cluster; it's a separate appliance and machine altogether.

It doesn't matter if it's part of the k8s cluster or not. If the nfs server is inaccessible, then unmount will hang until it becomes accessible again.

@msau42 that's strange, because I'm pretty sure that even when it came back online, the pods were still stuck terminating. New pods start up and mount fine.

I run an NFS server on Kubernetes following this example, and unfortunately this happens quite often...

@shinebayar-g I also followed that guide, but now I have gotten rid of the PVs and PVCs and defined my volume directly in the deployment, like so:

        volumeMounts:
        - mountPath: /my-pod-mountpoint
          name: my-vol
      volumes:
        - name: my-vol
          nfs:
            server: "10.x.x.x"
            path: "/path/on/server"
            readOnly: false

I haven't had any issues since. I changed this only about a week ago, hoping that the simpler config would be more reliable... let's see. Maybe this will fix the issue?

As a workaround I wrote a script which grabs the last few lines from /var/log/syslog and searches for errors like "Operation for...remove /var/lib/kubelet/pods ... directory not empty", "nfs...device is busy...unmount.nfs" or "stale NFS file handle".
It then extracts either the pod_id or the pod's full directory and checks what mounts it has (like mount | grep $pod_id), then unmounts them all and removes the corresponding directories. Eventually kubelet does the rest, gracefully shuts down and deletes the pods. No more pods in the Terminating state.

I put that script in cron to run every minute. As a result, no issue for now, even 3-4 months later.
Note: I know this approach is unreliable and it requires a check on every cluster upgrade, but it works!
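
For reference, a rough sketch of that kind of cron job (assumptions: it runs on each node, the kubelet root dir is /var/lib/kubelet, and the relevant errors land in /var/log/syslog):

#!/usr/bin/env bash
set -u

# pick pod directories mentioned in recent unmount/cleanup errors
tail -n 200 /var/log/syslog \
  | grep -Eo '/var/lib/kubelet/pods/[0-9a-f-]+' \
  | sort -u \
  | while read -r pod_dir; do
      # unmount anything still mounted under that pod directory, deepest first;
      # kubelet normally finishes the Terminating pod once the mounts are gone
      mount | awk -v d="$pod_dir" 'index($3, d) == 1 {print $3}' | sort -r \
        | while read -r m; do umount "$m"; done
    done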

I am using version 1.10 and I experienced this issue today. I think my problem is related to the mounting of a secret volume, which might have left some task pending and left the pod in Terminating status forever.

I had to use the --grace-period=0 --force option to terminate the pods.

root@ip-10-31-16-222:/var/log# journalctl -u kubelet | grep dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds Mar 20 15:50:31 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: I0320 15:50:31.179901 528 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "config-volume" (UniqueName: "kubernetes.io/configmap/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-config-volume") pod "dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds" (UID: "e3d7c57a-4b27-11e9-9aaa-0203c98ff31e") Mar 20 15:50:31 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: I0320 15:50:31.179935 528 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-xjlgc" (UniqueName: "kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-default-token-xjlgc") pod "dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds" (UID: "e3d7c57a-4b27-11e9-9aaa-0203c98ff31e") Mar 20 15:50:31 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: I0320 15:50:31.179953 528 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "secret-volume" (UniqueName: "kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume") pod "dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds" (UID: "e3d7c57a-4b27-11e9-9aaa-0203c98ff31e") Mar 20 15:50:31 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:50:31.310200 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:50:31.810156118 +0000 UTC m=+966792.065305175 (durationBeforeRetry 500ms). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxx-com\" not found" Mar 20 15:50:31 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:50:31.885807 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:50:32.885784622 +0000 UTC m=+966793.140933656 (durationBeforeRetry 1s). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxxxx-com\" not found" Mar 20 15:50:32 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:50:32.987385 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:50:34.987362044 +0000 UTC m=+966795.242511077 (durationBeforeRetry 2s). 
Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxx-com\" not found" Mar 20 15:50:35 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:50:35.090836 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:50:39.090813114 +0000 UTC m=+966799.345962147 (durationBeforeRetry 4s). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxx-com\" not found" Mar 20 15:50:39 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:50:39.096621 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:50:47.096593013 +0000 UTC m=+966807.351742557 (durationBeforeRetry 8s). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxx-com\" not found" Mar 20 15:50:47 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:50:47.108644 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:51:03.10862005 +0000 UTC m=+966823.363769094 (durationBeforeRetry 16s). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxx-com\" not found" Mar 20 15:51:03 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:51:03.133029 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:51:35.133006645 +0000 UTC m=+966855.388155677 (durationBeforeRetry 32s). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxxx-com\" not found" Mar 20 15:51:35 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:51:35.184310 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:52:39.184281161 +0000 UTC m=+966919.439430217 (durationBeforeRetry 1m4s). 
Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxx-com\" not found" Mar 20 15:52:34 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:52:34.005027 528 kubelet.go:1640] Unable to mount volumes for pod "dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds_default(e3d7c57a-4b27-11e9-9aaa-0203c98ff31e)": timeout expired waiting for volumes to attach or mount for pod "default"/"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds". list of unmounted volumes=[secret-volume]. list of unattached volumes=[secret-volume config-volume default-token-xjlgc]; skipping pod Mar 20 15:52:34 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:52:34.005085 528 pod_workers.go:186] Error syncing pod e3d7c57a-4b27-11e9-9aaa-0203c98ff31e ("dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds_default(e3d7c57a-4b27-11e9-9aaa-0203c98ff31e)"), skipping: timeout expired waiting for volumes to attach or mount for pod "default"/"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds". list of unmounted volumes=[secret-volume]. list of unattached volumes=[secret-volume config-volume default-token-xjlgc] Mar 20 15:52:39 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:52:39.196332 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:54:41.196308703 +0000 UTC m=+967041.451457738 (durationBeforeRetry 2m2s). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxxx-com\" not found" Mar 20 15:54:41 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:54:41.296252 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:56:43.296229192 +0000 UTC m=+967163.551378231 (durationBeforeRetry 2m2s). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxxx-com\" not found" Mar 20 15:54:48 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:54:48.118620 528 kubelet.go:1640] Unable to mount volumes for pod "dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds_default(e3d7c57a-4b27-11e9-9aaa-0203c98ff31e)": timeout expired waiting for volumes to attach or mount for pod "default"/"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds". list of unmounted volumes=[secret-volume]. 
list of unattached volumes=[secret-volume config-volume default-token-xjlgc]; skipping pod Mar 20 15:54:48 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:54:48.118681 528 pod_workers.go:186] Error syncing pod e3d7c57a-4b27-11e9-9aaa-0203c98ff31e ("dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds_default(e3d7c57a-4b27-11e9-9aaa-0203c98ff31e)"), skipping: timeout expired waiting for volumes to attach or mount for pod "default"/"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds". list of unmounted volumes=[secret-volume]. list of unattached volumes=[secret-volume config-volume default-token-xjlgc] Mar 20 15:56:43 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:56:43.398396 528 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\" (\"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\")" failed. No retries permitted until 2019-03-20 15:58:45.398368668 +0000 UTC m=+967285.653517703 (durationBeforeRetry 2m2s). Error: "MountVolume.SetUp failed for volume \"secret-volume\" (UniqueName: \"kubernetes.io/secret/e3d7c57a-4b27-11e9-9aaa-0203c98ff31e-secret-volume\") pod \"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds\" (UID: \"e3d7c57a-4b27-11e9-9aaa-0203c98ff31e\") : secrets \"data-platform.xxxx-com\" not found" Mar 20 15:57:05 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:57:05.118566 528 kubelet.go:1640] Unable to mount volumes for pod "dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds_default(e3d7c57a-4b27-11e9-9aaa-0203c98ff31e)": timeout expired waiting for volumes to attach or mount for pod "default"/"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds". list of unmounted volumes=[secret-volume]. list of unattached volumes=[secret-volume config-volume default-token-xjlgc]; skipping pod Mar 20 15:57:05 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:57:05.118937 528 pod_workers.go:186] Error syncing pod e3d7c57a-4b27-11e9-9aaa-0203c98ff31e ("dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds_default(e3d7c57a-4b27-11e9-9aaa-0203c98ff31e)"), skipping: timeout expired waiting for volumes to attach or mount for pod "default"/"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds". list of unmounted volumes=[secret-volume]. list of unattached volumes=[secret-volume config-volume default-token-xjlgc] Mar 20 15:59:22 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:59:22.118593 528 kubelet.go:1640] Unable to mount volumes for pod "dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds_default(e3d7c57a-4b27-11e9-9aaa-0203c98ff31e)": timeout expired waiting for volumes to attach or mount for pod "default"/"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds". list of unmounted volumes=[secret-volume config-volume default-token-xjlgc]. list of unattached volumes=[secret-volume config-volume default-token-xjlgc]; skipping pod Mar 20 15:59:22 ip-10-31-16-222.eu-west-2.compute.internal kubelet[528]: E0320 15:59:22.118624 528 pod_workers.go:186] Error syncing pod e3d7c57a-4b27-11e9-9aaa-0203c98ff31e ("dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds_default(e3d7c57a-4b27-11e9-9aaa-0203c98ff31e)"), skipping: timeout expired waiting for volumes to attach or mount for pod "default"/"dp-tag-change-ingestion-com-depl-5bd59f74c4-589ds". list of unmounted volumes=[secret-volume config-volume default-token-xjlgc]. list of unattached volumes=[secret-volume config-volume default-token-xjlgc]

I've found that if you use --force --grace-period=0 all it does is remove the reference... if you ssh into the node, you'll still see the docker containers running.

In my case, the node was out of memory,
and the kernel killed the cilium-agent, which seems to have disturbed pod termination.
I just restarted the node and it cleared.

In my experience, sudo systemctl restart docker on the node helps (but there is obviously downtime).

And this is still happening periodically on random nodes that are either A) close to memory limits or B) CPU starved (either because of some kswapd0 issue, which might still be memory related, or actual load).

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.


This is still very much an active issue: k8s 1.15.4 and RHEL Docker 1.13.1. All the time, pods stay in Terminating while the container is already gone, and k8s cannot figure it out itself, requiring human intervention. That makes test scripting a real PITA.

/reopen
/remove-lifecycle rotten

@tuminoid: You can't reopen an issue/PR unless you authored it or you are a collaborator.


/reopen
/remove-lifecycle rotten

@mikesplain: Reopened this issue.


Same here: a pod has been stuck in the Terminating phase for more than 19 minutes. The container was successfully terminated, but Kubernetes still believes it needs to wait for something...

Name:                      worker-anton-nginx-695d8bd9c6-7q4l9
Namespace:                 anton
Priority:                  0
Status:                    Terminating (lasts 19m)
Termination Grace Period:  30s
IP:                        10.220.3.36
IPs:                       <none>
Controlled By:             ReplicaSet/worker-anton-nginx-695d8bd9c6
Containers:
  worker:
    Container ID:   docker://12c169c8ed915bc290c14c854a6ab678fcacea9bb7b1aab5512b533df4683dd6
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  0
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Events:          <none>

No events, no logs...

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-17T17:16:09Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.8-gke.2", GitCommit:"188432a69210ca32cafded81b4dd1c063720cac0", GitTreeState:"clean", BuildDate:"2019-10-21T20:01:24Z", GoVersion:"go1.12.11b4", Compiler:"gc", Platform:"linux/amd64"}

Can you check your kubelet logs and see if there are any messages about volume unmounts failing, or orphaned pods?
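For reference, one way to scan the kubelet logs for those messages (assuming the kubelet runs as a systemd unit named kubelet; adjust for your distro):

```
# Search the last hour of kubelet logs for orphaned-pod and volume unmount errors.
journalctl -u kubelet --since "1 hour ago" | grep -Ei "orphaned pod|unmount|volume"
```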

I've seen this as well
E1206 03:05:40.247161 25653 kubelet_volumes.go:154] Orphaned pod "0406c4bf-17e3-4613-a526-34e8a6cee208" found, but volume paths are still present on disk : There were a total of 8 errors similar to this. Turn up verbosity to see them.

I've seen it too. I can't check the logs because kubectl complains it can't connect to the docker container, and I can't create a new pod while the terminating pod still exists. Rather annoying.

Experiencing it too, and it's rather annoying to have to check whether Kubernetes properly cleaned up the old pods or not.
Hopefully, this gets fixed soon.

Any update on this issue? Was it resolved? I have the same problem, but it does not start happening immediately; it begins some time after the node starts. If I restart the node, everything is fine again for a while.

Could you check if there are Finalizers on the pod keeping it from being deleted?
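For anyone else checking, a quick way to inspect a stuck pod's finalizers and deletion timestamp (pod and namespace are placeholders):

```
# Print any finalizers plus the deletionTimestamp on the stuck pod.
kubectl get pod <pod> -n <namespace> \
  -o jsonpath='{.metadata.finalizers}{"\n"}{.metadata.deletionTimestamp}{"\n"}'
```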

There are no Finalizers on the affected pod.

FYI I resolved this with a force delete using:

kubectl delete pods <pod> --grace-period=0 --force

And I believe this successfully managed to terminate the pod. Since then I have not experienced the issue again. I have possibly updated since then, so could be a version issue, but not 100% since it's been so long since I've seen the issue.

This happens to me when a pod is running out of memory. It doesn't terminate until the memory usage goes down again.

FYI I resolved this with a force delete using:

kubectl delete pods <pod> --grace-period=0 --force

And I believe this successfully managed to terminate the pod. Since then I have not experienced the issue again. I have possibly updated since then, so could be a version issue, but not 100% since it's been so long since I've seen the issue.

That worked for me

kubectl delete pods <pod> --grace-period=0 --force is a temporary fix; I don't want to run a manual fix every time there is a failover for one of the affected pods. My zookeeper pods aren't terminating in minikube or on Azure AKS.

Update March 9th 2020
I used a preStop lifecycle hook to manually terminate my pods. My zookeeper pods were stuck in Terminating status and wouldn't respond to a TERM signal from within the container. I had basically the same manifest running elsewhere and everything terminated correctly; no clue what the root cause is.
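Until there is a real fix, here is a rough sketch of how one could automate that manual workaround, force-deleting everything currently shown as Terminating. Note it only hides the symptom (see the discussion about --force further down), and the status-column position assumes kubectl's default output format:

```
# Force-delete every pod currently reported as Terminating, in all namespaces.
# This is a workaround, not a fix: the underlying container may keep running.
kubectl get pods --all-namespaces --no-headers \
  | awk '$4 == "Terminating" {print $1, $2}' \
  | while read -r ns pod; do
      kubectl delete pod "$pod" -n "$ns" --grace-period=0 --force
    done
```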

same issue, super annoying

same issue :( pods stuck in Terminating for 3 days now

FYI I resolved this with a force delete using:

kubectl delete pods <pod> --grace-period=0 --force

And I believe this successfully managed to terminate the pod. Since then I have not experienced the issue again. I have possibly updated since then, so could be a version issue, but not 100% since it's been so long since I've seen the issue.

Also, the --force flag doesn't necessarily mean the pod is removed; it just doesn't wait for confirmation (and drops the reference, to my understanding). As stated by the warning: The resource may continue to run on the cluster indefinitely.

Edit: I was ill-informed. See elrok123's comment below for further motivation.

FYI I resolved this with a force delete using:

kubectl delete pods <pod> --grace-period=0 --force

And I believe this successfully managed to terminate the pod. Since then I have not experienced the issue again. I have possibly updated since then, so could be a version issue, but not 100% since it's been so long since I've seen the issue.

Also, the --force flag doesn't necessarily mean the pod is removed; it just doesn't wait for confirmation (and drops the reference, to my understanding). As stated by the warning: The resource may continue to run on the cluster indefinitely.

Correct, but the point is that --grace-period=0 forces the delete to happen :) not sure why your comment is relevant :/

I feel that his comment is relevant because the underlying container (docker or whatever) may still be running and not fully deleted. The illusion of it being "removed" is a little misleading at times.

That is indeed my point; using the --force method risks leaving underlying load weighing down your nodes, and doesn't necessarily fix the original problem. In the worst case it's an "if I can't see it, it doesn't exist" fix, which can end up being even harder to detect.

Or are you saying that --grace-period=0 is guaranteed to force the removal of the underlying container, @elrok123?
If that is the case, my comment is based on faulty knowledge and is irrelevant, but if the risk of leaving running containers remains when using --grace-period=0, so does my point.

@oscarlofwenhamn As far as I'm aware, this effectively runs SIGKILL on all processes in that pod, ensuring deletion of zombie processes (source: point 6 under 'Termination of Pods' - https://kubernetes.io/docs/concepts/workloads/pods/pod/#:~:text=When%20the%20grace%20period%20expires,period%200%20(immediate%20deletion).), and successfully removes the pod (it may not happen immediately, but it will happen).

The guide mentions that it removes the reference but does not delete the pod itself (source: 'Force Deletion' - https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/); however, grace-period=0 should effectively SIGKILL your pod, albeit not immediately.

I'm just reading the docs and the recommended ways to handle the scenario I encountered. The issue I encountered was not a recurring one; it happened once. I do believe the REAL fix for this is fixing your deployment, but until you get there, this method should help.

@elrok123 Brilliant - I was indeed ill-informed. I've updated my response above, referencing this explanation. Thanks for the detailed response, and a further motivated method for dealing with troublesome pods. Cheers!

Currently have pods stuck in the Terminating state for 2+ days.

For me the namespace is stuck in Terminating. No pods are listed. No services... nothing. The namespace is empty. Still... stuck in Terminating.

@JoseFMP use kubectl to request the yaml for the namespace; it might have finalizers that are holding up the process.
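Something like this is usually enough to spot them (namespace name is a placeholder):

```
# Dump the namespace object and look at spec.finalizers / metadata.finalizers.
kubectl get namespace <namespace> -o yaml
# Or print just the finalizer list:
kubectl get namespace <namespace> -o jsonpath='{.spec.finalizers}{"\n"}'
```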

@JordyBottelier Thank you.

No finalizers. Still stuck Terminating

@JoseFMP here is a script to kill it off entirely (effectively nuke it). Simply save it and run it with the namespace name as its argument:
```
#!/bin/bash

set -eo pipefail

die() { echo "$*" 1>&2 ; exit 1; }

need() {
  which "$1" &>/dev/null || die "Binary '$1' is missing but required"
}

# checking pre-reqs
need "jq"
need "curl"
need "kubectl"

PROJECT="$1"
shift

test -n "$PROJECT" || die "Missing arguments: kill-ns <namespace>"

# run kubectl proxy in the background so we can PUT to the finalize endpoint
kubectl proxy &>/dev/null &
PROXY_PID=$!
killproxy () {
  kill $PROXY_PID
}
trap killproxy EXIT

sleep 1 # give the proxy a second

# strip the finalizers from the namespace spec and PUT the result back via the proxy
kubectl get namespace "$PROJECT" -o json | jq 'del(.spec.finalizers[] | select("kubernetes"))' | curl -s -k -H "Content-Type: application/json" -X PUT -o /dev/null --data-binary @- http://localhost:8001/api/v1/namespaces/$PROJECT/finalize && echo "Killed namespace: $PROJECT"
```
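A minimal usage example, assuming you saved the script as kill-ns.sh (the namespace name is just a placeholder):

```
chmod +x kill-ns.sh
./kill-ns.sh <namespace>
```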

I've also seemingly run into this, with multiple pods stuck in Terminating, including one pod which is no longer visible anywhere in my infrastructure but is still running as a ghost (it serves requests, and I can see requests being served even with a deployment scale of zero).

I have no visibility into, or control over, this pod, so how am I supposed to troubleshoot a situation like this without forcefully shutting down all the nodes?

you'll have to access docker on the node.
You can use my dink (https://github.com/Agilicus/dink) which will bring up a pod w/ a shell w/ docker access, or ssh to the pod.
docker ps -a
docker stop ####

good luck.
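If it helps, the kubelet labels docker-managed containers with the pod name, so you can narrow the search down instead of scanning all of docker ps (pod name is a placeholder):

```
# List the containers that belonged to the stuck pod, then remove them on the node.
docker ps -a --filter "label=io.kubernetes.pod.name=<pod>"
docker rm -f $(docker ps -aq --filter "label=io.kubernetes.pod.name=<pod>")
```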

Thanks for the direction.

I was eventually able to solve this, but I'm still a bit puzzled as to how it could happen (for me the pod was completely invisible). As it was in production, things were a bit hectic and I wasn't able to perform diagnostics, but if it happens again, hopefully I can file a better bug report.

Seeing a similar symptom: pods stuck in Terminating (interestingly, they all have an exec-type probe for readiness/liveness). Looking at the logs I can see: kubelet[1445]: I1022 10:26:32.203865 1445 prober.go:124] Readiness probe for "test-service-74c4664d8d-58c96_default(822c3c3d-082a-4dc9-943c-19f04544713e):test-service" failed (failure): OCI runtime exec failed: exec failed: cannot exec a container that has stopped: unknown. This message repeats itself forever, and changing the exec probe to tcpSocket seems to allow the pod to terminate (based on one test; I will follow up on it). The pod seems to have one of its containers "Running" but not "Ready", and the logs for the "Running" container show the service as having stopped.

This happens on containerd 1.4.0 when node load is high and vm.max_map_count is set to a higher value than the default: the containerd-shim doesn't drain the stdout fifo and blocks waiting for it to be drained, while dockerd fails to get the event/acknowledgement from containerd that the processes are gone.
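For anyone checking whether their nodes match that description, a quick sketch (65530 is the usual kernel default for vm.max_map_count):

```
# Compare the node's setting against the stock kernel default (65530).
sysctl vm.max_map_count
# Confirm the containerd version, since the behaviour above was reported against 1.4.0.
containerd --version
```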

@discanto thanks for sharing this information. Is the problem being fixed or tracked?

@Random-Liu

This bug has been open for more than 3 years. Pods stuck on Terminating can be caused by a variety of reasons. When reporting your case, it would be very helpful to post some of the kubelet logs so we can see why the pods are stuck.
