Kubernetes: deleting namespace stuck at "Terminating" state

Created on 5 Mar 2018  ·  180 Comments  ·  Source: kubernetes/kubernetes

I am using v1.8.4 and I am having the problem that a deleted namespace stays in the "Terminating" state forever. I already ran "kubectl delete namespace XXXX".

kind/bug priority/important-soon sig/api-machinery

Most helpful comment

@ManifoldFR , I had the same issue as yours and I managed to make it work by making an API call with a json file.
kubectl get namespace annoying-namespace-to-delete -o json > tmp.json
then edit tmp.json and remove "kubernetes" from the finalizers

curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json https://kubernetes-cluster-ip/api/v1/namespaces/annoying-namespace-to-delete/finalize

and it should delete your namespace.

All 180 comments

/sig api-machinery

@shean-guangchang Do you have some way to reproduce this?

And out of curiosity, are you using any CRDs? We faced this problem with TPRs previously.

/kind bug

I seem to be experiencing this issue with a rook deployment:

➜  tmp git:(master) ✗ kubectl delete namespace rook
Error from server (Conflict): Operation cannot be fulfilled on namespaces "rook": The system is ensuring all content is removed from this namespace.  Upon completion, this namespace will automatically be purged by the system.
➜  tmp git:(master) ✗ 

I think it does have something to do with their CRD, I see this in the API server logs:

E0314 07:28:18.284942       1 crd_finalizer.go:275] clusters.rook.io failed with: timed out waiting for the condition
E0314 07:28:18.287629       1 crd_finalizer.go:275] clusters.rook.io failed with: Operation cannot be fulfilled on customresourcedefinitions.apiextensions.k8s.io "clusters.rook.io": the object has been modified; please apply your changes to the latest version and try again

I've deployed rook to a different namespace now, but I'm not able to create the cluster CRD:

➜  tmp git:(master) ✗ cat rook/cluster.yaml 
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook
  namespace: rook-cluster
spec:
  dataDirHostPath: /var/lib/rook-cluster-store
➜  tmp git:(master) ✗ kubectl create -f rook/
Error from server (MethodNotAllowed): error when creating "rook/cluster.yaml": the server does not allow this method on the requested resource (post clusters.rook.io)

Seems like the CRD was never cleaned up:

➜  tmp git:(master) ✗ kubectl get customresourcedefinitions clusters.rook.io -o yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  creationTimestamp: 2018-02-28T06:27:45Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2018-03-14T07:36:10Z
  finalizers:
  - customresourcecleanup.apiextensions.k8s.io
  generation: 1
  name: clusters.rook.io
  resourceVersion: "9581429"
  selfLink: /apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/clusters.rook.io
  uid: 7cd16376-1c50-11e8-b33e-aeba0276a0ce
spec:
  group: rook.io
  names:
    kind: Cluster
    listKind: ClusterList
    plural: clusters
    singular: cluster
  scope: Namespaced
  version: v1alpha1
status:
  acceptedNames:
    kind: Cluster
    listKind: ClusterList
    plural: clusters
    singular: cluster
  conditions:
  - lastTransitionTime: 2018-02-28T06:27:45Z
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: 2018-02-28T06:27:45Z
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  - lastTransitionTime: 2018-03-14T07:18:18Z
    message: CustomResource deletion is in progress
    reason: InstanceDeletionInProgress
    status: "True"
    type: Terminating
➜  tmp git:(master) ✗ 

I have a fission namespace in a similar state:

➜  tmp git:(master) ✗ kubectl delete namespace fission
Error from server (Conflict): Operation cannot be fulfilled on namespaces "fission": The system is ensuring all content is removed from this namespace.  Upon completion, this namespace will automatically be purged by the system.
➜  tmp git:(master) ✗ kubectl get pods -n fission     
NAME                          READY     STATUS        RESTARTS   AGE
storagesvc-7c5f67d6bd-72jcf   0/1       Terminating   0          8d
➜  tmp git:(master) ✗ kubectl delete pod/storagesvc-7c5f67d6bd-72jcf --force --grace-period=0
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
Error from server (NotFound): pods "storagesvc-7c5f67d6bd-72jcf" not found
➜  tmp git:(master) ✗ kubectl describe pod -n fission storagesvc-7c5f67d6bd-72jcf
Name:                      storagesvc-7c5f67d6bd-72jcf
Namespace:                 fission
Node:                      10.13.37.5/10.13.37.5
Start Time:                Tue, 06 Mar 2018 07:03:06 +0000
Labels:                    pod-template-hash=3719238268
                           svc=storagesvc
Annotations:               <none>
Status:                    Terminating (expires Wed, 14 Mar 2018 06:41:32 +0000)
Termination Grace Period:  30s
IP:                        10.244.2.240
Controlled By:             ReplicaSet/storagesvc-7c5f67d6bd
Containers:
  storagesvc:
    Container ID:  docker://3a1350f6e4871b1ced5c0e890e37087fc72ed2bc8410d60f9e9c26d06a40c457
    Image:         fission/fission-bundle:0.4.1
    Image ID:      docker-pullable://fission/fission-bundle@sha256:235cbcf2a98627cac9b0d0aae6e4ea4aac7b6e6a59d3d77aaaf812eacf9ef253
    Port:          <none>
    Command:
      /fission-bundle
    Args:
      --storageServicePort
      8000
      --filePath
      /fission
    State:          Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /fission from fission-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from fission-svc-token-zmsxx (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  fission-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  fission-storage-pvc
    ReadOnly:   false
  fission-svc-token-zmsxx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fission-svc-token-zmsxx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
➜  tmp git:(master) ✗ 

Fission also uses CRDs, however, they appear to be cleaned up.

@shean-guangchang - I had the same issue. I've deleted everything under the namespaces manually, deleted and purged everything from "helm" and restarted the master nodes one by one and it fixed the issue.

I imagine what I've encountered has something to do with "ark", "tiller" and Kubernetes all working together (I bootstrapped using helm and backed up using ark), so this may not be a Kubernetes issue per se. On the other hand, it was pretty much impossible to troubleshoot because there are no relevant logs.

if it is the rook one, take a look at this: https://github.com/rook/rook/issues/1488#issuecomment-365058080
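If the rook cluster CRD is what's stuck (as in the output above), one workaround that is often suggested (and matches the rook teardown doc quoted further down in this thread) is clearing the finalizer on the stuck object once the operator that would normally handle it is gone. A minimal sketch; verify the CRD name in your own cluster first:

kubectl patch crd clusters.rook.io -p '{"metadata":{"finalizers":[]}}' --type=merge

Only do this for objects whose controller is already gone; otherwise let the operator clean up its own resources.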

I guess that makes sense, but it seems buggy that it's possible to get a namespace into an undeletable state.

I have a similar environment (Ark & Helm) with @barakAtSoluto and have the same issue. Purging and restarting the masters didn't fix it for me though. Still stuck at terminating.

I had that too when trying to recreate the problem. I eventually had to create a new cluster....
Exclude default, kube-system, kube-public and all ark-related namespaces from backup and restore to prevent this from happening...

I'm also seeing this, on a cluster upgraded from 1.8.4 to 1.9.6. I don't even know which logs to look at.

The same issue on 1.10.1 :(

Same issue on 1.9.6

Edit: The namespace couldn't be deleted because of some pods hanging. I did a delete with --grace-period=0 --force on them all and after a couple of minutes the namespace was deleted as well.

Hey,

I've run into this over and over again, and most of the time it's some trouble with finalizers.

If a namespace is stuck, try kubectl get namespace XXX -o yaml and check if there is a finalizer on it. If so, edit the namespace and remove the finalizer (by passing an empty array), and then the namespace gets deleted.
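For example, to inspect the finalizers before editing (a sketch; XXX is a placeholder namespace name):

kubectl get namespace XXX -o jsonpath='{.spec.finalizers}'
kubectl edit namespace XXX   # then clear the spec.finalizers entry so it is an empty array

Note that, as reported further down in the thread, some setups only accept this change through the namespace /finalize subresource rather than a plain edit.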

@xetys Is it safe? In my case there is only one finalizer, named "kubernetes".

That's strange, I've never seen such a finalizer. I can only speak from my own experience: I did that several times in a production cluster and it's still alive.

Same issue on 1.10.5. I tried all advice in this issue without result. I was able to get rid of the pods, but the namespace is still hanging.

Actually, the namespace got deleted too after a while.

It would be good to understand what causes this behavior; the only finalizer I had is kubernetes. I also have dynamic webhooks; can these be related?

@xetys Well, finally I used your trick on the replicas inside that namespace. They had some custom finalizer that probably no longer existed, so I couldn't delete them. When I removed the references to that finalizer, they disappeared and so did the namespace. Thanks! :)

Same issue on an EKS 1.10.3 cluster:

Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-28T20:13:43Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Having the same problem on a bare metal cluster:

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:43:26Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

My namespace looks like so:

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"name":"creneaux-app","namespace":""}}
  name: creneaux-app
spec:
  finalizers:
  - kubernetes

It's actually the second namespace I've had with this problem.

Try this to get the actual list of all things in your namespace: https://github.com/kubernetes/kubectl/issues/151#issuecomment-402003022

Then for each object do kubectl delete or kubectl edit to remove finalizers.
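The linked comment amounts to roughly this one-liner (a sketch): it enumerates every namespaced resource type and then lists whatever is left of each type in the namespace.

kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n <namespace>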

removing the initializer did the trick for me...

When I do kubectl edit namespace annoying-namespace-to-delete and remove the finalizers, they get re-added when I check with a kubectl get -o yaml.

Also, when trying what you suggested @adampl I get no output (removing --ignore-not-found confirms no resources are found in the namespace, of any type).

@ManifoldFR , I had the same issue as yours and I managed to make it work by making an API call with a json file.
kubectl get namespace annoying-namespace-to-delete -o json > tmp.json
then edit tmp.json and remove "kubernetes" from the finalizers

curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json https://kubernetes-cluster-ip/api/v1/namespaces/annoying-namespace-to-delete/finalize

and it should delete your namespace.
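For reference, a minimal sketch of those same steps using kubectl proxy and jq instead of hand-editing the file (the namespace name is a placeholder):

kubectl proxy &
kubectl get namespace annoying-namespace-to-delete -o json | jq '.spec.finalizers = []' > tmp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json http://127.0.0.1:8001/api/v1/namespaces/annoying-namespace-to-delete/finalize

As later comments point out, this bypasses the namespace cleanup and can strand resources, so prefer fixing the broken apiservice or finalizer first.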

@slassh It worked! Should've thought about making an API call: thanks a lot! We shall sing your praises forever

Issue exists in v1.11.1. I had a stuck rancher/helm deployment of dokuwiki. I first had to force delete the pods as suggested by @siXor and then followed @slassh's advice. All good now.

@slassh how do I find the kubernetes-cluster-ip? I used the IP of one of the nodes the cluster is deployed on instead, and it reports a 404.

hi @jiuchongxiao, by kubernetes-cluster-ip I meant one of your master nodes' IPs.
Sorry if it's confusing!

If you start 'kubectl proxy' first you can direct the curl to http://127.0.0.1:8001/api/v1/namespaces/annoying-namespace-to-delete/finalize. I couldn't get authentication to work until I did it that way.

Good tip @2stacks. You just need to replace https with http.

I still see this issue in 1.11.2.

To give more context for reproducing, I saw this only with CRDs. By deleting the CRD object, I got into a weird state where the objects owned by it were not deleted. I didn't notice, so I issued a delete for the namespace. Then I deleted all the objects in the namespace with kubectl delete all --all -n my-namespace. At that point the namespace deletion got stuck. I hope this helps in some way.

By looking at the logs I just found out that this particular problem was related to the controller manager being unhealthy. In my case it was most likely not a bug. When the controller manager came up again everything was cleaned up correctly.

@slassh Perfect solution! thank you very much!!!!

I also see this issue with 1.10.x. I find @slassh's comment a workaround that only hides the real issue. Why are the namespaces stuck at Terminating?

We discovered the reason why deleting a namespace got stuck in our case (@palmerabollo)

When a namespace keeps the finalizer kubernetes, it means there is an internal problem with the API Server.

Run kubectl api-resources; if it returns an error like the following, it means that a custom API isn't reachable.

error: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request

Run kubectl get apiservices v1beta1.metrics.k8s.io -o yaml to check its status conditions.

status:
  conditions:
  - lastTransitionTime: 2018-10-15T08:14:15Z
    message: endpoints for service/metrics-server in "kube-system" have no addresses
    reason: MissingEndpoints
    status: "False"
    type: Available

The above error is probably caused by a CrashLoopBackOff affecting metrics-server. It would be similar for other custom APIs registered in Kubernetes.

Check your services health in kube-system for restoring cluster runtime operations like deleting namespaces.
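A quick way to spot any such broken aggregated API at a glance (a sketch):

kubectl get apiservice | grep False   # anything unavailable here can block namespace cleanup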

I'm facing this issue on v1.11.3. As for the finalizers, only kubernetes is present on the problematic namespace.

spec:
  finalizers:
  - kubernetes

@slassh Thanks a million, your solution works well!
I have the same problem in my cluster with ark, tiller and kubed. I suspect the issue might be the kubed API giving an error, although I'm not sure why it impacts the deletion of another namespace.

@javierprovecho I was merely playing around with the metrics server, and since it wasn't working I tried to delete the service and whatnot, but my namespace still won't delete; the error is

status:
  conditions:
  - lastTransitionTime: 2018-08-24T08:59:36Z
    message: service/metrics-server in "kube-system" is not present
    reason: ServiceNotFound
    status: "False"
    type: Available

Do you know how to recover from this state?

Edit: I found out... I had to delete _everything_ even remotely related to metrics/HPA and then restart the entire control plane (had to take down all my replicas of it before booting them back up). This included deleting the apiservice v1beta1.metrics.k8s.io itself.

@2rs2ts

$ kubectl delete apiservice v1beta1.metrics.k8s.io

By getting rid of the non-functioning metrics API service the controller-manager will be able to delete the stale namespace(s).

Restarting the control plane is not necessary.

@antoineco no, it was necessary; I deleted the apiservice and waited quite a while but the namespace would not be deleted until I restarted the control plane.

first, take a small coffee and relax, now go to your k8s master nodes

  1. kubectl cluster-info
     Kubernetes master is running at https://localhost:6443
     KubeDNS is running at https://localhost:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

     To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

  2. now run the kube-proxy
     kubectl proxy &
     Starting to serve on 127.0.0.1:8001

     save the PID so you can kill it later on :)

  3. find your namespace that decided not to be deleted :) for us it will be cattle-system
     kubectl get ns
     cattle-system Terminating 1d

  4. put it in a file
     kubectl get namespace cattle-system -o json > tmp.json

  5. edit the file and remove the finalizers
     },
     "spec": {
       "finalizers": [
         "kubernetes"
       ]
     },
     after editing it should look like this 👍
     },
     "spec": {
       "finalizers": [
       ]
     },

we are almost there 👍

curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json http://127.0.0.1:8001/api/v1/namespaces/${NAMESPACE}/finalize

and it's gone 👍

Hey, the finalizer kubernetes is there for a reason. For us it was a wrongly configured metrics API service name. Maybe for you it is something else, which you can discover by looking at your control plane logs. Without confirmation of a bug, removing the finalizer may produce undesired consequences, like leaving behind objects that can no longer be accessed for deletion.
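On a kubeadm-style cluster, one way to look at those control plane logs is something like this (a sketch; labels and pod names vary by distribution):

kubectl -n kube-system logs -l component=kube-controller-manager --tail=100
kubectl -n kube-system logs -l component=kube-apiserver --tail=100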

As this issue is still open:
within my minikube cluster running with the "none" driver, this happened after the host woke up from hibernation.

My assumption:
in my case hibernation triggered the same problems that an enabled swap would.

Which yields the question:
might swap be enabled in the affected clusters?

But this is just conjecture. The important thing for me, and anyone landing on this bug with my local setup: hibernation is bad for Kubernetes.

first, take a small coffee and relax, now go to your k8s master nodes [...]

Great!!
Works

I run into this issue periodically if we change our gcloud instances (e.g. upgrading nodes). This replaces the old node from gcloud instances list with a new one but leaves the pods in the k8s namespace hanging:

Reason:                    NodeLost
Message:                   Node <old node> which was running pod <pod> is unresponsive

This then leaves the pods in an unknown state:

$ kubectl get po
NAME                               READY     STATUS    RESTARTS   AGE
<pod>                              2/2       Unknown   0          39d

Due to this, the namespace will never finish terminating. Not sure if this means we should change our finalizers or if there's an actual bug related to terminating that should be handling pods in an UNKNOWN state (or if there should be a way of force terminating a namespace for cases like this).

I run into this issue periodically if we change our gcloud instances (e.g. upgrading nodes). [...]

Cool, it's not the same issue.
You need to put the nodes in maintenance mode first; once a node is in maintenance mode all pods will be evacuated, and then you can delete/upgrade it.
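In kubectl terms, "maintenance mode" is roughly cordon plus drain (a sketch; exact flags differ between versions):

kubectl cordon <node>
kubectl drain <node> --ignore-daemonsets --delete-local-data
# upgrade or replace the instance, then
kubectl uncordon <node>   # or kubectl delete node <node> if the instance was replaced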

Look at https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/:
edit the resource and delete metadata.finalizers, and delete any unused CRDs; then you can force-delete it.

But what does the kubernetes finalizer do exactly? Is there any risk that resources are not being correctly cleaned up with this hack?

curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json https://kubernetes-cluster-ip/api/v1/namespaces/annoying-namespace-to-delete/finalize

Error from server (NotFound): namespaces "annoying-namespace-to-delete" not found

first, take a small coffee and relax, now go to your k8s master nodes [...]

Invalid value: "The edited file failed validation": ValidationError(Namespace.spec): invalid type for io.k8s.api.core.v1.NamespaceSpec: got "string", expected "map"

If you have many namespaces stuck in Terminating, you can automate this:

kubectl get ns | grep Terminating | awk '{print $1}' | gxargs  -n1 -- bash -c 'kubectl get ns "$0" -o json | jq "del(.spec.finalizers[0])" > "$0.json"; curl -k -H "Content-Type: application/json" -X PUT --data-binary @"$0.json" "http://127.0.0.1:8001/api/v1/namespaces/$0/finalize" '

make sure that all namespaces from which you want the finalizer removed are indeed Terminating.

You need the kubectl proxy running and jq for the above to work.

In our case, the metrics API service was down and I could see this error with verbose logging:

kubectl delete ns <namespace-name> -v=7
.......
I0115 11:03:25.548299   12445 round_trippers.go:383] GET https://<api-server-url>/apis/metrics.k8s.io/v1beta1?timeout=32s
I0115 11:03:25.548317   12445 round_trippers.go:390] Request Headers:
I0115 11:03:25.548323   12445 round_trippers.go:393]     Accept: application/json, */*
I0115 11:03:25.548329   12445 round_trippers.go:393]     User-Agent: kubectl/v1.11.3 (darwin/amd64) kubernetes/a452946
I0115 11:03:25.580116   12445 round_trippers.go:408] Response Status: 503 Service Unavailable in 31 milliseconds

After fixing the metrics apiservice, the terminating namespaces completed.
I'm not really sure why deletion depends on the metrics apiservice; I'm also interested to know how it works if the metrics apiservice is not installed on the cluster.

Not really sure why deletion depends on metrics apiservice,

@manojbadam because metrics is registered with the api server, when performing a namespace deletion it must query that external api for (namespaced) resources to be deleted (if any exist) associated with that namespace. If the extension server isn't available, Kubernetes can't guarantee that all objects have been removed, and it doesn't have a persistent mechanism (in memory or on disk) to reconcile later, because the root object would have been removed. That happens with any registered api extension service.
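A quick way to check whether such an extension API is reachable at all (a sketch, using the metrics API as the example):

kubectl get --raw /apis/metrics.k8s.io/v1beta1          # an error or 503 here means the aggregated API is down
kubectl -n kube-system get endpoints metrics-server     # should list at least one address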

As I was constantly running into this, I automated this with a small shell script:

https://github.com/ctron/kill-kube-ns/blob/master/kill-kube-ns

It fetches the project, fixes the JSON, starts and properly stops "kubectl proxy", …

Thanks to everyone pointing me into the right direction!

As I was constantly running into this, I automated this with a small shell script: https://github.com/ctron/kill-kube-ns/blob/master/kill-kube-ns [...]

my hero! <3

I ran into this problem too. I'm on Google Kubernetes Engine and using Terraform to spin up Kubernetes clusters and to create namespaces and pods inside the cluster. The problem started a while after running a terraform destroy.

In my case, this turns out to be an issue with order in which Terraform executes the destroy. Terraform deletes the node pool first, and then deletes the namespaces and pods. But due to deleting the (only) node pool, the Kubernetes cluster broke, and that's what caused the namespace deletion to get stuck at "terminating" forever.

@FooBarWidget same problem for me :(

As I was constantly running into this, I automated this with a small shell script: https://github.com/ctron/kill-kube-ns/blob/master/kill-kube-ns [...]

[root@k8s-master ~]# curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json https://172.*****:6443/api/v1/namespaces/rook-ceph/finalize
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {

},
"status": "Failure",
"message": "namespaces "rook-ceph" is forbidden: User "system:anonymous" cannot update namespaces/finalize in the namespace "rook-ceph"",
"reason": "Forbidden",
"details": {
"name": "rook-ceph",
"kind": "namespaces"
},
"code": 403

I got a return code 403, what should I do :(


Thank god, the terminating namespace is finally gone. The following method did the trick for me.

NAMESPACE=rook-ceph
kubectl proxy &
kubectl get namespace $NAMESPACE -o json |jq '.spec = {"finalizers":[]}' >temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json 127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize

I have the same issue but I don't see any metrics service.

I'm playing around with k8s from DigitalOcean and GitLab Auto DevOps. My assumption is it's some DigitalOcean blob storage, but I'm lost on how to analyse or fix it.

@mingxingshi Thanks. Doing an edit namespace didn't do the trick; your script did it.

Wow, finally got rid of it. Thanks for the commands @mingxingshi !

The solution for me was:

kubectl delete apiservice v1beta1.metrics.k8s.io

Just figured I should leave my experience of this here:

i was doing terraform apply with the following resource:

resource "helm_release" "servicer" {
  name      = "servicer-api"
  // n.b.: this is a local chart just for this terraform project
  chart     = "charts/servicer-api"
  namespace = "servicer"
  ...
}

but I am a helm newb and had a chart with a template in it that created a namespace called servicer. This caused terraform and k8s to get into a bad state where terraform would fail, and then k8s would leave the servicer namespace permanently in the Terminating state. Doing what @mingxingshi suggests above made the namespace terminate, as it had no resources attached to it.

This issue stopped happening for me when I removed the template that made the namespace and left it to helm to create it.

The problem is completely repeatable for me. First, clone the prometheus-operator. Then:

cd prometheus-operator/contrib/kube-prometheus
kubectl create -f manifests/ --validate=false
 ... wait ...
kubectl delete namespace monitoring

Hangs. If, however, I use kubectl delete -f manifests/, then cleanup is successful.

Yeah, had the same hang with prometheus-operator. I needed kubectl delete -f manifests/ to get unstuck.
I think there are some finalizers in the prometheus CRDs that are misbehaving; in this particular scenario it's hardly Kubernetes' fault. However, Kubernetes should make it easier to find the culprit, because the length of this thread demonstrates that there can be many causes and it's not easy to get to the bottom of it in each particular scenario.
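One way to spot CRDs whose own deletion is hanging on a finalizer (a sketch, requires jq):

kubectl get crd -o json | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | .metadata.name'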

I'm a kubernetes noob so I can't offer much info here but I also have 2 namespaces stuck in terminating status. My kubernetes setup is using IBM Cloud Private 3.1.2 Community Edition

kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.4+icp", GitCommit:"3f5277fa129f05fea532de48284b8b01e3d1ab4e", GitTreeState:"clean", BuildDate:"2019-01-17T13:41:02Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.4+icp", GitCommit:"3f5277fa129f05fea532de48284b8b01e3d1ab4e", GitTreeState:"clean", BuildDate:"2019-01-17T13:41:02Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

kubectl cluster-info
Kubernetes master is running at https://ip
catalog-ui is running at https://ip/api/v1/namespaces/kube-system/services/catalog-ui:catalog-ui/proxy
Heapster is running at https://ip/api/v1/namespaces/kube-system/services/heapster/proxy
image-manager is running at https://ip/api/v1/namespaces/kube-system/services/image-manager:image-manager/proxy
CoreDNS is running at https://ip/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
metrics-server is running at https://ip/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
platform-ui is running at https://ip/api/v1/namespaces/kube-system/services/platform-ui:platform-ui/proxy

kubectl get nodes
NAME          STATUS                     ROLES                          AGE   VERSION
ip1    Ready,SchedulingDisabled   etcd,management,master,proxy   23h   v1.12.4+icp
ip2   Ready                      worker                         23h   v1.12.4+icp
ip3   Ready                      worker                         23h   v1.12.4+icp

I have two namespaces stuck in the terminating state

kubectl get ns
NAME           STATUS        AGE
my-apps       Terminating   21h
cert-manager   Active        23h
default        Active        23h
istio-system   Active        23h
kube-public    Active        23h
kube-system    Active        23h
platform       Active        22h
psp-example    Terminating   18h
services       Active        22h

When I check the finalizers as described in this comment I only see kubernetes.

kubectl get ns my-apps -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"name":"my-apps"}}
  creationTimestamp: 2019-04-10T18:23:55Z
  deletionTimestamp: 2019-04-11T15:24:24Z
  name: my-apps
  resourceVersion: "134914"
  selfLink: /api/v1/namespaces/my-apps
  uid: ccb0398d-5bbd-11e9-a62f-005056ad5350
spec:
  finalizers:
  - kubernetes
status:
  phase: Terminating

Regardless, I tried removing kubernetes from the finalizers and it didn't work. I also tried using the json/api approach described in this comment. That also didn't work. I tried restarting all the nodes and that didn't work either.

I also tried doing the force delete, and that doesn't work either:

kubectl delete namespace my-apps --force --grace-period 0
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
Error from server (Conflict): Operation cannot be fulfilled on namespaces "my-apps": The system is ensuring all content is removed from this namespace.  Upon completion, this namespace will automatically be purged by the system.

In my case the namespace is rook-ceph, kubectl -n rook-ceph patch cephclusters.ceph.rook.io rook-ceph -p '{"metadata":{"finalizers": []}}' --type=merge works for me. For other cases it should work too.

From: https://github.com/rook/rook/blob/master/Documentation/ceph-teardown.md

@ManifoldFR , I had the same issue as yours and I managed to make it work by making an API call with a json file.
kubectl get namespace annoying-namespace-to-delete -o json > tmp.json
then edit tmp.json and remove "kubernetes" from the finalizers

curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json https://kubernetes-cluster-ip/api/v1/namespaces/annoying-namespace-to-delete/finalize

and it should delete your namespace.

I have some problems using your approach; what should I do next to troubleshoot?

~ curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json https://39.96.4.11:6443/api/v1/namespaces/istio-system/finalize
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "namespaces \"istio-system\" is forbidden: User \"system:anonymous\" cannot update resource \"namespaces/finalize\" in API group \"\" in the namespace \"istio-system\"",
  "reason": "Forbidden",
  "details": {
    "name": "istio-system",
    "kind": "namespaces"
  },
  "code": 403


My problem was solved by this script: https://github.com/ctron/kill-kube-ns/blob/master/kill-kube-ns.

yup https://github.com/ctron/kill-kube-ns/blob/master/kill-kube-ns does the trick

set -eo pipefail

die() { echo "$*" 1>&2 ; exit 1; }

need() {
    which "$1" &>/dev/null || die "Binary '$1' is missing but required"
}

# checking pre-reqs

need "jq"
need "curl"
need "kubectl"

PROJECT="$1"
shift

test -n "$PROJECT" || die "Missing arguments: kill-ns <namespace>"

kubectl proxy &>/dev/null &
PROXY_PID=$!
killproxy () {
    kill $PROXY_PID
}
trap killproxy EXIT

sleep 1 # give the proxy a second

kubectl get namespace "$PROJECT" -o json | jq 'del(.spec.finalizers[] | select("kubernetes"))' | curl -s -k -H "Content-Type: application/json" -X PUT -o /dev/null --data-binary @- http://localhost:8001/api/v1/namespaces/$PROJECT/finalize && echo "Killed namespace: $PROJECT"

It seems that the namespaces are actually not deleted.
In my case, kubectl get ns does not show the deleted namespace, but kubectl get all -n <namespace> shows all its resources safe and sound.
I checked on the nodes and the docker containers were still running...

@glouis that's because you bypassed the finalizers using the method above, so Kubernetes didn't have time to execute all those essential deletion tasks.

It's really sad to see so many people blindly advocating for this method without understanding its consequences. It's extremely ugly and can potentially leave tons of leftovers in the cluster. @javierprovecho already mentioned it above, and @liggitt also mentioned it in another GitHub issue.

You'd be better off fixing the broken v1beta1.metrics.k8s.io API service, or deleting it if you don't need it.

See also #73405

I second @antoineco's message. I tested this out on one of our sandbox environments because we were constantly getting stuck namespaces. After about a month all the docker daemons were freezing for no reason. It turns out we had created huge memory leaks by leaving resources behind.

After a lot of trial and error, and reading through these comments, it turned out to be a custom resource definition from the CoreOS Grafana stack for those namespaces. Listing the CRDs showed specific resources for the namespace. I was very lucky that the name of the CRD had the stuck namespace in it.

It also turned out that having one stuck namespace stops any further namespaces from being deleted. So even if you have a namespace A that has no CRDs getting it stuck, and there is a namespace B with a stuck CRD, all the resources in A will stick around until B is gone. I think I must have done the fix described above on namespace A, leaving a ton of resources around every time.

The thing that is still killing me is that I cannot for the life of me find any log mentioning a namespace cleanup failing on deleting a CRD, or even what it is currently doing. I had to spend an hour just figuring out which CRD it was stuck on. If anyone has an idea on how to get more info so I don't have to spend a huge amount of time figuring out the stuck resource, that would be awesome.

@jecafarelli good hint for production clusters. But unfortunately for me, I was just not able to kill it otherwise. I also knew I would recreate the whole cluster later on.

I tried analysing the issue, but nothing in this thread helped me to solve it by other means.

This official solution helped me: https://success.docker.com/article/kubernetes-namespace-stuck-in-terminating
This is not the same as kubectl edit namespace rook-ceph. I was unable to solve this problem until I sent a PUT request with the _"finalizers"_ removed.

OK, so I ran into this again with coreos, and I dug a bit deeper. This is most definitely because of a cluster-wide resource definition that is namespaced, and furthermore maybe it couldn't delete it because it can't query info on coreos. I did find errors in the apiserver logs that showed failures when trying to get information about an api group. I used the referenced issue above to come up with a quick script that lists the resources that got the ns stuck for me.

I'll probably just use this in the future if I run into it again, and keep adding any other namespaced resources I run into.

for ns in `kubectl get ns --field-selector status.phase=Terminating -o name | cut -d/ -f2`; 
do
  echo "apiservice under namespace $ns"
  kubectl get apiservice -o json |jq --arg ns "$ns" '.items[] |select(.spec.service.namespace != null) | select(.spec.service.namespace == $ns) | .metadata.name ' --raw-output
  echo "api resources under namespace $ns"
  for resource in `kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get -o name -n $ns`; 
  do 
    echo $resource
  done;
done

Thanks a lot @jecafarelli, you helped me solve my issue the right way ;-)

I had installed cert-manager on an OpenShift cluster inside the cert-manager namespace and when I tried to delete this namespace, it got stuck in terminating state. Executing oc delete apiservices v1beta1.admission.certmanager.k8s.io seems to have solved the problem, the namespace is gone.


Same here, running kubectl delete -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml helped

Just chiming in to say I've also hit this error on version 1.13.6 with GKE. It happened after I disabled GKE's Istio addon with the goal of manually installing it for full control.
This is the longest issue thread I've ever taken the time to read through, and I'm blown away that there is no real consensus or set of reproduction steps for the root of this issue. It seems it can get tripped in so many different ways :(

The JSON and curl/proxy method mentioned numerous times above and documented at https://success.docker.com/article/kubernetes-namespace-stuck-in-terminating is what saved me.

The advice at https://success.docker.com/article/kubernetes-namespace-stuck-in-terminating is actively harmful, and can result in orphaned resources not getting cleaned up and resurfacing if a namespace with an identical name is later recreated.

There is work in progress to surface the specific cause of the hung delete, but the fundamental issue is that there are API types that cannot be verified to have been cleaned up, so the namespace deletion blocks until they are verified.

We also hit this with Knative which installs this namespaced apiservice.

---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  labels:
    autoscaling.knative.dev/metric-provider: custom-metrics
    serving.knative.dev/release: v0.7.1
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: autoscaler
    namespace: knative-serving
  version: v1beta1
  versionPriority: 100
---

After deleting it both the knative-serving ns and a bunch of other stuck namespaces cleaned up. Thanks to @jecafarelli for the above bash script.
Here's a terrible powershell version.

$api = kubectl get apiservice -o json  | convertfrom-json
#list out the namespaced api items; you can ignore kube-system
$api.items | % { $_.spec.service.namespace }
#replace knative-serving with whatever namespace you found
$api.items | ? { $_.spec.service.namespace -eq 'knative-serving'  } | ConvertTo-Json
#replace v1beta1.custom.metrics.k8s.io with whatever you found. 
k delete apiservice v1beta1.custom.metrics.k8s.io

I had the same problem today and this script worked for me.

@kubernetes/sig-api-machinery-misc

This bug has existed for > a year and is still a problem... What is your plan to address inbound issues such as this?

This could help with at least understanding what's going on: https://github.com/kubernetes/kubernetes/pull/80962

I am hitting the same issue

k get ns cdnamz-k8s-builder-system  -o yaml 
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"labels":{"control-plane":"controller-manager"},"name":"cdnamz-k8s-builder-system"}}
  creationTimestamp: "2019-08-05T18:38:21Z"
  deletionTimestamp: "2019-08-05T20:37:37Z"
  labels:
    control-plane: controller-manager
  name: cdnamz-k8s-builder-system
  resourceVersion: "5980028"
  selfLink: /api/v1/namespaces/cdnamz-k8s-builder-system
  uid: 3xxxxxxx
spec:
  finalizers:
  - kubernetes
status:
  phase: Terminating
 k get ns 
NAME                        STATUS        AGE
cdnamz-k8s-builder-system   Terminating   4h20m

The namespace controller should report conditions on the namespace status, and clients should surface them. Needs a KEP, but should be pretty straightforward if someone can take and validate it.
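Once such conditions are surfaced on the namespace status, they could be read with something like this (a sketch):

kubectl get namespace <stuck-namespace> -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.message}{"\n"}{end}'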

@timothysc there is (or was) a PR in flight (somewhere) doing exactly what @smarterclayton says.

I am pretty sure there is another github issue about this, too?

Here's a resource that helped me: https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.1/troubleshoot/ns_terminating.html

It's similar to the solution proposed by @slassh, but it uses kubectl proxy to create a local proxy and make the target IP of the curl command predictable.

--

Edit: as stated several times below this answer, this solution is a dirty hack and will possibly leave some dependent resources in the cluster. Use at your own risk, and possibly only use it as a quick way out in a development cluster (don't use it in a production cluster).

Removing the finalizer directly as described in the doc above can have consequences: the resources that were pending deletion will still be defined in the cluster even after the namespace has been released. This is the purpose of the finalizer: to ensure that all dependents are removed before allowing the deletion of the namespace.

Found workaround in similar questions:

NAMESPACE=<namespace-name>
kubectl proxy &
kubectl get namespace $NAMESPACE -o json |jq '.spec = {"finalizers":[]}' >temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json 127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize


Thank you!
It works well.
I created a simple app using this workaround: https://github.com/jenoOvchi/k8sdelns
I use it for fast deletion and hope it will be helpful for someone.

On Kubernetes 1.12.2 namespaces are stuck in the Terminating state. Sometimes the finalizers can be removed by modifying the namespace yaml, but the namespace cannot be deleted through the api. Can it be deleted? What is going on? Has this been specifically tracked (prerequisite: there are no resources left in this ns)? I hope I can get some pointers, thank you!

Again, please do not remove the finalizer, it is there for a reason. Try to instead find out which resources in the NS are pending deletion by:

  • Checking if any apiservice is unavailable and hence doesn't serve its resources: kubectl get apiservice|grep False
  • Finding all resources that still exist via kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get -n $your-ns-to-delete (Kudos to Jordan for that one)

The solution to this problem is not to short-circuit the cleanup mechanism, it's to find out what prevents cleanup from succeeding.


How right you are.
In my case the pod backing an Operator Framework apiservice had been deleted, which blocked the terminating process.
Removing the unused apiservice (kubectl delete apiservice ) solved the problem.

Hi all, code freeze is coming up in just a few days (Thursday, end of day, PST), so we need to make sure that this issue will be solved for v1.16 or moved to v1.17. Can you comment on its status?

Will this be backported into a current GKE release? I have a cluster that has a handful of namespaces that are still "Terminating".

@squarelover even after doing this? https://github.com/kubernetes/kubernetes/issues/60807#issuecomment-524772920

@josiahbjorgaard I just approved the PR, which is all we will be doing on this for 1.16.

It's merged. I think there may be more we can do, but please take future comments to #70916.


In many of these cases you might have metrics-server installed. When pods you deploy in a specific namespace are hooked into metric gathering, they hang on the metrics-server. So even after you delete all the resources in that namespace, metrics-server is somehow linked to that namespace, which will prevent you from deleting the namespace.
This post helps you identify the reason why you cannot delete the namespace, the right way.

Try this to get the actual list of all things in your namespace: kubernetes/kubectl#151 (comment)

Then for each object do kubectl delete or kubectl edit to remove finalizers.

This solution was useful for me, thanks.

Hi guys,

I made a script to make it easier to delete namespaces stuck in Terminating status: https://github.com/thyarles/knsk.

Thanks.

We ran into the same issue: when deleting a namespace, it gets stuck in the 'Terminating' state. I followed the steps above to remove 'kubernetes' from the finalizers in the yaml file. It works.

However, we don't know why we need to do these extra steps. kubectl delete ns foonamespace should just delete it. Can anyone give me a reason? Thank you!

Hello @xzhang007,

If you discover why the namespace deletion gets stuck in the Terminating state, please let me know. I searched for a good answer for a while, but found nothing. So I made a script to make my life easier until I discover and fix the cause.

Thank you.

@thyarles it seems I have not found an answer up to now.

In our case we discovered that one of the webhooks and finalizers we were
using was reaching out to a pod which lived in the namespace that got
deleted.
Once the pod got deleted, the termination was stuck.


@xzhang007 have you looked at the answer @alvaroaleman provided? For us that was enough to find out what the cause was.

Again, please do not remove the finalizer, it is there for a reason. [...]


Also, when this issue was closed, a new ticket was referenced to discuss how to make it clear why a namespace is stuck in Terminating. I suggest you take the conversation there instead of this closed issue.

It's merged. I think there may be more we can do, but please take future comments to #70916.

@jeff-knurek That should be the right way. Thank you.

In our case it was a botched upgrade of cert-manager which broke the finalizer. https://github.com/jetstack/cert-manager/issues/1582

$ kube get apiservice

NAME                                   SERVICE                                                     AVAILABLE                  AGE
v1.                                    Local                                                       True                       43d
v1.apps                                Local                                                       True                       43d
v1.authentication.k8s.io               Local                                                       True                       43d
v1.authorization.k8s.io                Local                                                       True                       43d
v1.autoscaling                         Local                                                       True                       43d
v1.batch                               Local                                                       True                       43d
v1.coordination.k8s.io                 Local                                                       True                       43d
v1.networking.k8s.io                   Local                                                       True                       43d
v1.rbac.authorization.k8s.io           Local                                                       True                       43d
v1.scheduling.k8s.io                   Local                                                       True                       43d
v1.storage.k8s.io                      Local                                                       True                       43d
v1alpha1.certmanager.k8s.io            Local                                                       True                       3d22h
v1alpha1.crd.k8s.amazonaws.com         Local                                                       True                       43d
v1beta1.admission.certmanager.k8s.io   cert-manager/cointainers-cointainers-cert-manager-webhook   False (MissingEndpoints)   60m
v1beta1.admissionregistration.k8s.io   Local                                                       True                       43d
v1beta1.apiextensions.k8s.io           Local                                                       True                       43d
v1beta1.apps                           Local                                                       True                       43d
v1beta1.authentication.k8s.io          Local                                                       True                       43d
v1beta1.authorization.k8s.io           Local                                                       True                       43d
v1beta1.batch                          Local                                                       True                       43d
v1beta1.certificates.k8s.io            Local                                                       True                       43d
v1beta1.coordination.k8s.io            Local                                                       True                       43d
v1beta1.events.k8s.io                  Local                                                       True                       43d
v1beta1.extensions                     Local                                                       True                       43d
v1beta1.networking.k8s.io              Local                                                       True                       43d
v1beta1.node.k8s.io                    Local                                                       True                       43d
v1beta1.policy                         Local                                                       True                       43d
v1beta1.rbac.authorization.k8s.io      Local                                                       True                       43d
v1beta1.scheduling.k8s.io              Local                                                       True                       43d
v1beta1.storage.k8s.io                 Local                                                       True                       43d
v1beta1.webhook.cert-manager.io        cert-manager/cointainers-cointainers-cert-manager-webhook   False (MissingEndpoints)   3d22h
v1beta2.apps                           Local                                                       True                       43d
v2beta1.autoscaling                    Local                                                       True                       43d
v2beta2.autoscaling                    Local                                                       True                       43d
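In a case like this, the two apiservices reporting False (MissingEndpoints) are what block the cleanup. As suggested earlier in the thread, deleting them (if they are no longer needed) unblocks it, e.g.:

kubectl delete apiservice v1beta1.admission.certmanager.k8s.io v1beta1.webhook.cert-manager.io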

Hi.
In my case the namespace gets stuck in Terminating as in https://github.com/rancher/rancher/issues/21546#issuecomment-553635629

Maybe it will help.

https://medium.com/@newtondev/how-to-fix-kubernetes-namespace-deleting-stuck-in-terminating-state-5ed75792647e

This worked like a champ for me

I also faced the same issue; now it is working fine for me. Please refer to the following document to solve your issue.

@zerkms well, sometimes it's legitimate advice, isn't it? Often the finalizers being waited on used to be served by objects that were deleted as part of the namespace deletion. In that case, since there is no point in waiting any longer - there is nothing left that could do the finalization - patching the objects the way the article describes _is the only option_.

Note that the article is applicable only if the issue _was not resolved_ by applying the steps listed in the Known Issues page linked at the top of the article, which is basically the advice that the comment you linked repeats.

@zerkms well, sometimes it's legitimate advice, isn't it? Often the finalizers being waited on used to be served by objects that were deleted as part of the namespace deletion

I've never seen that be true for a spec.finalizer on a namespace. Every instance I've seen has involved the namespace cleanup controller, and has either been caused by a persistent object in the namespace (which that advice would strand in etcd), or an unresponsive aggregated API (which removing the namespace spec.finalizer would skip waiting for, also stranding any persisted resources from that API)

The article does not warn that bypassing namespace finalization risks leaving namespaced resources stranded in storage; doing so is not recommended.

I've never seen that be true for a spec.finalizer on a namespace

Yep, that's right, this is because this finalizer is implemented by Kubernetes itself, but there can be other finalizers on objects inside that namespace, which may be implemented by components in that same namespace. One example that I encountered recently was https://appscode.com/products/stash/.

It puts finalizers on some of its CRDs which are to be serviced by the stash-operator deployment. But with stash-operator already deleted, there is nothing that can remove the finalizer mark from those CRDs and the namespace deletion gets stuck. In this case patching out those finalizers (not on the namespace itself, but on those objects) is the only sensible thing to do.
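For that situation, a minimal sketch of what "patching out those finalizers" can look like; `backups.example.com`, `my-backup` and `my-stuck-ns` are placeholders for whatever CRD object and namespace are actually stuck:

```
# Clear the finalizers on a stranded object inside the namespace (NOT on the namespace itself)
kubectl patch backups.example.com my-backup -n my-stuck-ns \
  --type merge -p '{"metadata":{"finalizers":[]}}'
```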

Hope it makes sense.

In this case patching out those finalizers (not on the namespace itself, but on those objects) is the only sensible thing to do.

Correct. I would not object to that in a "delete all resources" cleanup scenario, but that is not what the linked article walks through... it describes how to remove a spec.finalizer from the namespace.

first, take a small coffee and relax, now go to your k8s master nodes

kubectl cluster-info
Kubernetes master is running at https://localhost:6443
KubeDNS is running at https://localhost:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

now run kubectl proxy
kubectl proxy &
Starting to serve on 127.0.0.1:8001
save the PID so you can kill it later on :)

  1. find your namespace that decided not to be deleted :) for us it will be cattle-system
    kubectl get ns
    cattle-system Terminating 1d

put it in a file

  1. kubectl get namespace cattle-system -o json > tmp.json

edit the file and remove the finalizers
},
"spec": {
"finalizers": [
"kubernetes"
]
},
after editing it should look like this 👍
},
"spec": {
"finalizers": [
]
},
we're almost there 👍
curl -k -H "Content-Type: application/json" -X PUT --data-binary @tmp.json http://127.0.0.1:8001/api/v1/namespaces/${NAMESPACE}/finalize

and it's gone 👍

Again, please do not remove the finalizer, it is there for a reason. Instead, try to find out which resources in the NS are pending deletion by:

  • Checking if any apiservice is unavailable and hence doesn't serve its resources: kubectl get apiservice|grep False
  • Finding all resources that still exist via kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get -n $your-ns-to-delete (Kudos to Jordan for that one)

The solution to this problem is not to short-circuit the cleanup mechanism, it's to find out what prevents cleanup from succeeding.
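Put together, the two checks above as a copy-pasteable sketch (`my-stuck-ns` is a placeholder for the namespace you are trying to delete):

```
# 1. Is any aggregated API unavailable?
kubectl get apiservice | grep False

# 2. Which namespaced resources still exist in the namespace?
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 kubectl get --show-kind --ignore-not-found -n my-stuck-ns
```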

Hey, guys! I followed the tips provided by @alvaroaleman and made a script that inspects and tries a clean deletion before doing a hard deletion of the stuck namespace.

What the script https://github.com/thyarles/knsk do:

  1. Check for unavailable apiresources and ask to delete them
  2. Check for pending resources in the namespace and ask to delete them
  3. Wait about 5 minutes to see if Kubernetes does a clean deletion after the script deletes anything
  4. Force deletion of the stuck namespace

Hope it helps.

@thyarles Thank you so much. I used your way to solve the problem.

Use $ kubectl get apiservices to check which services are unavailable, delete the ones whose AVAILABLE column is False with $ kubectl delete apiservice [service-name], and after that there are no more issues with deleting a namespace.

For our team, there were 3 unavailable apiservices: v1beta1.admission.certmanager.k8s.io, v1beta1.metrics.k8s.io, and v1beta1.webhook.certmanager.k8s.io.
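For reference, the two steps described above as commands (the apiservice name is one of the ones from this comment; substitute your own):

```
# Show only apiservices whose AVAILABLE column is False
kubectl get apiservices | grep False

# Delete an unavailable apiservice by name
kubectl delete apiservice v1beta1.admission.certmanager.k8s.io
```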

Note that your cluster is somewhat broken if the metrics apiserver isn't running, just removing the APIService doesn't actually fix the root cause.

@lavalamp the metrics apiservice is unavailable.

Yes, which means the metrics apiserver is not running, which means HPA doesn't work on your cluster, and probably other things, too.

Yes. HPA doesn't work now. I should not have deleted the metrics apiservice; I need to find a way to fix it instead.

@thyarles Thank you so much. I used your way to solve the problem.

Use $ kubectl get apiservices to check which services are unavailable, delete the ones whose AVAILABLE column is False with $ kubectl delete apiservice [service-name], and after that there are no more issues with deleting a namespace.

For our team, there were 3 unavailable apiservices: v1beta1.admission.certmanager.k8s.io, v1beta1.metrics.k8s.io, and v1beta1.webhook.certmanager.k8s.io.

@xzhang007 glad to hear! Now you must check why your v1beta1.metrics.k8s.io became broken. Check what it should look like:

```
$ kubectl -n kube-system get all | grep metrics

pod/metrics-server-64f74f8d47-r5vcq 2/2 Running 9 119d
service/metrics-server ClusterIP xx.xx.xx.xx 443/TCP 201d
deployment.apps/metrics-server 1/1 1 1 201d
replicaset.apps/metrics-server-55c7f68d94 0 0 0 165d
replicaset.apps/metrics-server-5c696bb6d7 0 0 0 201d
replicaset.apps/metrics-server-5cdb8bb4fb 0 0 0 201d
replicaset.apps/metrics-server-64f74f8d47 1 1 1 119d
replicaset.apps/metrics-server-686789bb4b 0 0 0 145d
```

$ kubectl -n kube-system get all | grep metrics
pod/metrics-server-5dcfd4dd9f-m2v9k 1/1 Running 0 2d20h
service/metrics-server ClusterIP xx.xx.xx.xx 443/TCP 27d
deployment.apps/metrics-server 1/1 1 1 27d
replicaset.apps/metrics-server-5dcfd4dd9f 1 1 1 27d
replicaset.apps/metrics-server-7fcf9cc98b 0 0 0 27d

Yes. HPA doesn't work now. I should not have deleted the metrics apiservice; I need to find a way to fix it instead.

@xzhang007 in fact it wasn't working before you noticed the problem... you just noticed because it left your deleted namespaces stuck. Just use a helm package manager to re-deploy your metrics-server, or call the command below to fix it (check the deployment file before applying):

$ curl https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/metrics-server-deployment.yaml | kubectl apply -f -

@slassh solution worked for me on Kubernetes 1.15

Delete v1beta1.metrics.k8s.io APIService

kubectl get ns ns-to-delete -o yaml
...
status:
  conditions:
  - lastTransitionTime: "2020-01-08T05:36:52Z"
    message: 'Discovery failed for some groups, 1 failing: unable to retrieve the
      complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently
      unable to handle the request'
...
kubectl get APIService
...
v1beta1.metrics.k8s.io                 kube-system/metrics-server   False (ServiceNotFound)
kubectl delete apiservice v1beta1.metrics.k8s.io

The cert-manager apiservice was unavailable, maybe because it was set up incorrectly, for example using the wrong annotation syntax for the ingress. For our system, it was

"certmanager.k8s.io/cluster-issuer": "letsencrypt-prod"

and it was changed to

"cert-manager.io/cluster-issuer": "letsencrypt-prod"

which made it available again.
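If the Ingress object already exists, the annotation can also be swapped in place; a sketch, where `my-ingress` and `my-namespace` are placeholder names:

```
# Add the correctly spelled annotation
kubectl annotate ingress my-ingress -n my-namespace \
  "cert-manager.io/cluster-issuer=letsencrypt-prod" --overwrite

# Remove the old, misspelled one (a trailing dash removes an annotation)
kubectl annotate ingress my-ingress -n my-namespace \
  "certmanager.k8s.io/cluster-issuer-"
```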

As mentioned before in this issue, there is another way to terminate a namespace using an API that is not exposed by kubectl, by using a modern version of kubectl where kubectl replace --raw is available (not sure from which version). This way you do not have to spawn a kubectl proxy process, and you avoid a dependency on curl (which is not available in some environments, like busybox). In the hope that this will help someone else, I left this here:

kubectl get namespace "stucked-namespace" -o json \
            | tr -d "\n" | sed "s/\"finalizers\": \[[^]]\+\]/\"finalizers\": []/" \
            | kubectl replace --raw /api/v1/namespaces/stucked-namespace/finalize -f -

Has it been established whether this is a fixable issue? There seem to be a lot of hacky solutions here but nothing addressing the underlying issue, which is that none of us can delete our namespaces....
I have this on an EKS v1.14 cluster

Has it been established whether this is a fixable issue? There seem to be a lot of hacky solutions here but nothing addressing the underlying issue, which is that none of us can delete our namespaces

The fundamental issue is that an aggregated API group in your cluster is unavailable. It is intentional that the namespace cleanup controller blocks until all APIs are available, so that it can verify all resources from all API groups are cleaned up for that namespace.
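One way to see which API group is blocking a particular namespace is to read the conditions the namespace controller writes onto the namespace itself; a sketch, with `my-stuck-ns` as a placeholder:

```
# A NamespaceDeletionDiscoveryFailure / DiscoveryFailed condition names the API group that is unavailable
kubectl get namespace my-stuck-ns \
  -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
```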

for ppl trying to curl the API:

# Check all possible clusters, as your .KUBECONFIG may have multiple contexts:
kubectl config view -o jsonpath='{"Cluster name\tServer\n"}{range .clusters[*]}{.name}{"\t"}{.cluster.server}{"\n"}{end}'

# Select name of cluster you want to interact with from above output:
export CLUSTER_NAME="some_server_name"

# Point to the API server referring the cluster name
APISERVER=$(kubectl config view -o jsonpath="{.clusters[?(@.name==\"$CLUSTER_NAME\")].cluster.server}")

# Gets the token value
TOKEN=$(kubectl get secrets -o jsonpath="{.items[?(@.metadata.annotations['kubernetes\.io/service-account\.name']=='default')].data.token}"|base64 --decode)

# Explore the API with TOKEN
curl -X GET $APISERVER/api --header "Authorization: Bearer $TOKEN" --insecure

https://kubernetes.io/docs/tasks/administer-cluster/access-cluster-api/#without-kubectl-proxy

Here's a script to do this automatically. Needs jq:


#!/bin/bash

if [ -z "${1}" ] ; then
  echo -e "\nUsage: ${0} <name_of_the_namespace_to_remove_the_finalizer_from>\n"
  echo "Valid cluster names, based on your kube config:"
  kubectl config view -o jsonpath='{"Cluster name\tServer\n"}{range .clusters[*]}{.name}{"\t"}{.cluster.server}{"\n"}{end}'
  exit 1
fi

kubectl proxy --port=8901 &
PID=$!
sleep 1

echo -n 'Current context : '
kubectl config current-context 
read -p "Are you sure you want to remove the finalizer from namespace ${1}? Press Ctrl+C to abort."

kubectl get namespace "${1}" -o json \
            | jq '.spec.finalizers = [ ]' \
            | curl -k \
            -H "Content-Type: application/json" \
            -X PUT --data-binary @- "http://localhost:8901/api/v1/namespaces/${1}/finalize"

kill -15 $PID

Everyone: scripts to automate the finalizer removal do more harm than good. They may leave time-bombs in the aggregated apiserver(s) that aren't available; if someone recreates the namespace, suddenly a bunch of old objects may re-appear.

The real solution is:

$ kubectl get api-services

# something in the list is unavailable. Figure out what it is and fix it.

# ... the namespace lifecycle controller will finish deleting the namespace.

Everyone: scripts to automate the finalizer removal do more harm than good. They may leave time-bombs in the aggregated apiserver(s) that aren't available; if someone recreates the namespace, suddenly a bunch of old objects may re-appear.

The real solution is:

$ kubectl get api-services

# something in the list is unavailable. Figure out what it is and fix it.

# ... the namespace lifecycle controller will finish deleting the namespace.

https://github.com/thyarles/knsk

This script does all the checks and tries to do a clean deletion, including looking for orphaned resources. If the user wants to take a risk, the script offers a --force option to perform a non-recommended way of deletion.

typo, should be apiservices

This command shows the APIs that are not available:

kubectl get apiservices --template='{{range $api := .items}}{{range $api.status.conditions}}{{if eq .type "Available"}}{{$api.metadata.name}} {{.status}}{{"\n"}}{{end}}{{end}}{{end}}' | grep -i false

This article will surely be useful to you:

https://access.redhat.com/solutions/5038511

Actually what exists is a conflict in the apiservices; you can validate the health status of the APIs in OpenShift with:

oc get apiservices -o=custom-columns="name:.metadata.name,status:.status.conditions[0].status"

The API that fails will need to be restarted: restart the pod or the deployment that belongs to that API, and after that try to delete the namespace.

$ oc delete namespace

and done, problem fixed !!

Pretty disrespectful to use your own language in a place where everyone agrees to speak English. 👎

Where does everyone agree to speak English?




Done, excuse me, that was due to my haste; it is fixed now.

We have a multilingual user base; it's bad enough that none of our tools are internationalized, we can at least be nice here on GitHub, please.

@teoincontatto

As mentioned before in this issue, there is another way to terminate a namespace using an API that is not exposed by kubectl, by using a modern version of kubectl where kubectl replace --raw is available (not sure from which version). This way you do not have to spawn a kubectl proxy process, and you avoid a dependency on curl (which is not available in some environments, like busybox). In the hope that this will help someone else, I left this here:

kubectl get namespace "stucked-namespace" -o json \
            | tr -d "\n" | sed "s/\"finalizers\": \[[^]]\+\]/\"finalizers\": []/" \
            | kubectl replace --raw /api/v1/namespaces/stucked-namespace/finalize -f -

This worked perfectly!

We have a multilingual user base; it's bad enough that none of our tools are internationalized, we can at least be nice here on GitHub, please.

Still trying to understand. Forgive me. I may have clicked thumbs down by mistake.
Yes, indeed, the tools haven't been made to perfection.
Giving a thumbs down without an explanation doesn't make sense.

Almost every time I experience this issue, it's due to CRDs. Delete the CRDs if they are used only in that namespace, and then you can proceed with deleting the finalizer and the namespace.
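A hedged sketch of that check, where `clusters.example.com` stands in for whichever CRD is involved:

```
# Does the CRD still have instances outside the namespace being deleted?
kubectl get clusters.example.com --all-namespaces

# If it is only used by the namespace you are deleting, removing the CRD
# also removes all of its instances (and their finalizers) with it
kubectl delete crd clusters.example.com
```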

As mentioned before in this issue there is another way to terminate a namespace using API not exposed by kubectl by using a modern version of kubectl where kubectl replace --raw is available (not sure from which version). This way you will not have to spawn a kubectl proxy process and avoid dependency with curl (that in some environment like busybox is not available). In the hope that this will help someone else I left this here:

kubectl get namespace "stucked-namespace" -o json \
            | tr -d "\n" | sed "s/\"finalizers\": \[[^]]\+\]/\"finalizers\": []/" \
            | kubectl replace --raw /api/v1/namespaces/stucked-namespace/finalize -f -

@teoincontatto Thank you! This finally worked!

Sometimes just editing the resource manifest in place (I mean removing the finalizers field and saving) does not work very well.
So I got a new way from others.

kubectl get namespace linkerd -o json > linkerd.json

# Where:/api/v1/namespaces/<your_namespace_here>/finalize
kubectl replace --raw "/api/v1/namespaces/linkerd/finalize" -f ./linkerd.json

After running that command, the namespace should now be absent from your namespaces list. And it works for me.

This works not only for namespaces but also for other resources.

I fixed the problem by removing the finalizers lines using: kubectl edit ns annoying-ns

Hmm ... I have this problem right now :)
Today I did an update of my eks cluster from 1.15 to 1.16.
Everything looks fine so far.
But my development ns "configcluster" was kind of "damaged".
So I decided to clean it up.

k delete ns configcluster
....
now this hangs (3h +) :/

$ kubectl get namespace configcluster -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"name":"configcluster"}}
  creationTimestamp: "2020-06-19T06:40:15Z"
  deletionTimestamp: "2020-06-19T09:19:16Z"
  name: configcluster
  resourceVersion: "22598109"
  selfLink: /api/v1/namespaces/configcluster
  uid: e50f0b53-b21e-4e6e-8946-c0a0803f031b
spec:
  finalizers:
  - kubernetes
status:
  conditions:
  - lastTransitionTime: "2020-06-19T09:19:21Z"
    message: 'Discovery failed for some groups, 1 failing: unable to retrieve the
      complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently
      unable to handle the request'
    reason: DiscoveryFailed
    status: "True"
    type: NamespaceDeletionDiscoveryFailure
  - lastTransitionTime: "2020-06-19T09:19:22Z"
    message: All legacy kube types successfully parsed
    reason: ParsedGroupVersions
    status: "False"
    type: NamespaceDeletionGroupVersionParsingFailure
  - lastTransitionTime: "2020-06-19T09:19:22Z"
    message: All content successfully deleted
    reason: ContentDeleted
    status: "False"
    type: NamespaceDeletionContentFailure
  phase: Terminating

How do we get more exposure to this thorn in the foot issue?


@bobhenkel well this issue is closed, so effectively this means that there is no issue (as far as any actionable items are concerned). If you need practical help with dealing with a similar situation, please read the thread above, there are some good pieces of advice there (and also some bad ones).

In my case, I had to manually delete my ingress load balancer from the GCP Network Service console. I had manually created the load balancer frontend directly in the console. Once I deleted the load balancer the namespace was automatically deleted.

I suspect that Kubernetes didn't want to delete it since the state of the load balancer was different from the state in the manifest.

I will try to automate the ingress frontend creation using annotations next to see if I can resolve this issue.
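A quick way to spot such leftover cloud-backed objects before deleting the namespace (a sketch; `my-stuck-ns` is a placeholder):

```
# Ingresses and Services of type LoadBalancer are the usual suspects for cloud resources
kubectl get ingress,services -n my-stuck-ns
```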

Sometimes just editing the resource manifest in place (I mean removing the finalizers field and saving) does not work very well.
So I got a new way from others.

kubectl get namespace linkerd -o json > linkerd.json

# Where:/api/v1/namespaces/<your_namespace_here>/finalize
kubectl replace --raw "/api/v1/namespaces/linkerd/finalize" -f ./linkerd.json

After running that command, the namespace should now be absent from your namespaces list. And it works for me.

This works not only for namespaces but also for other resources.

You are a star, it worked!

Sometimes just editing the resource manifest in place (I mean removing the finalizers field and saving) does not work very well.
So I got a new way from others.

kubectl get namespace linkerd -o json > linkerd.json

# Where:/api/v1/namespaces/<your_namespace_here>/finalize
kubectl replace --raw "/api/v1/namespaces/linkerd/finalize" -f ./linkerd.json

After running that command, the namespace should now be absent from your namespaces list. And it works for me.

This works not only for namespaces but also for other resources.

Tried a lot of solutions but this is the one that worked for me. Thank you!

This should really be the "accepted" answer - it completely resolved the root of this issue!

Taken from the link above:

This is not the right way, especially in a production environment.

Today I ran into the same problem. By removing the finalizer you’ll end up with leftovers in various states. You should actually find what is keeping the deletion from completing.

See https://github.com/kubernetes/kubernetes/issues/60807#issuecomment-524772920

(also, unfortunately, ‘kubectl get all’ does not report all things, you need to use commands similar to those in the link)

My case: deleting the ‘cert-manager’ namespace. In the output of ‘kubectl get apiservice -o yaml’ I found the APIService ‘v1beta1.admission.certmanager.k8s.io’ with status=False. This apiservice was part of cert-manager, which I had just deleted. So, within 10 seconds of my running ‘kubectl delete apiservice v1beta1.admission.certmanager.k8s.io’, the namespace disappeared.

Hope that helps.


With that being said, I wrote a little microservice to run as a CronJob every hour that automatically deletes Terminating namespaces.

You can find it here: https://github.com/oze4/service.remove-terminating-namespaces

Yet another oneliner:

for ns in $(kubectl get ns --field-selector status.phase=Terminating -o jsonpath='{.items[*].metadata.name}'); do  kubectl get ns $ns -ojson | jq '.spec.finalizers = []' | kubectl replace --raw "/api/v1/namespaces/$ns/finalize" -f -; done

But deleting stuck namespaces is not a good solution. The right way is to find out why it is stuck. A very common reason is an unavailable API service, which prevents the cluster from finalizing namespaces.
For example, here I hadn't deleted Knative properly:

$ kubectl get apiservice|grep False
NAME                                   SERVICE                             AVAILABLE   AGE
v1beta1.custom.metrics.k8s.io          knative-serving/autoscaler          False (ServiceNotFound)   278d

Deleting it solved the problem

k delete apiservice v1beta1.custom.metrics.k8s.io
apiservice.apiregistration.k8s.io "v1beta1.custom.metrics.k8s.io" deleted
$  k create ns test2
namespace/test2 created
$ k delete ns test2
namespace "test2" deleted
$ kgns test2
Error from server (NotFound): namespaces "test2" not found  

I wrote a little microservice to run as a CronJob every hour that automatically deletes Terminating namespaces.

You can find it here: https://github.com/oze4/service.remove-terminating-namespaces

good job.

I had a similar issue on 1.18 in a lab k8s cluster and am adding a note that may help others. I had been working with the metrics API, and with custom metrics in particular. After deleting those k8s objects in order to recreate them, deleting the namespace stalled with an error that the metrics API endpoint could not be found. After putting that back in, in another namespace, everything cleared up immediately.

This was in the namespace under status.conditions.message:

Discovery failed for some groups, 4 failing: unable to retrieve the
complete list of server APIs: custom.metrics.k8s.io/v1beta1: the server is currently
unable to handle the request, custom.metrics.k8s.io/v1beta2: the server is currently
unable to handle the request, external.metrics.k8s.io/v1beta1: the server is
currently unable to handle the request, metrics.k8s.io/v1beta1: the server is

Yet another oneliner:

for ns in $(kubectl get ns --field-selector status.phase=Terminating -o jsonpath='{.items[*].metadata.name}'); do  kubectl get ns $ns -ojson | jq '.spec.finalizers = []' | kubectl replace --raw "/api/v1/namespaces/$ns/finalize" -f -; done

But deleting stuck namespaces is not a good solution. The right way is to find out why it is stuck. A very common reason is an unavailable API service, which prevents the cluster from finalizing namespaces.
For example, here I hadn't deleted Knative properly:

$ kubectl get apiservice|grep False
NAME                                   SERVICE                             AVAILABLE   AGE
v1beta1.custom.metrics.k8s.io          knative-serving/autoscaler          False (ServiceNotFound)   278d

Deleting it solved the problem

k delete apiservice v1beta1.custom.metrics.k8s.io
apiservice.apiregistration.k8s.io "v1beta1.custom.metrics.k8s.io" deleted
$  k create ns test2
namespace/test2 created
$ k delete ns test2
namespace "test2" deleted
$ kgns test2
Error from server (NotFound): namespaces "test2" not found  

Definitely the cleanest one liner! It's important to note that none of these "solutions" actually solve the root issue.

See here for the correct solution

That is the message we should be spreading :smile: not "yet another one liner".

Definitely the cleanest one liner! It's important to note that none of these "solutions" actually solve the root issue.

This solution addresses only one of all the possible causes. To look for all possible root causes and fix them, I use this script: https://github.com/thyarles/knsk

@thyarles very nice!

Please do not just modify finalize to delete a namespace. That will cause an error:

image

Please find out the cause of the namespace terminating. Currently known troubleshooting directions (a sketch follows the list):

  • a pod stuck terminating
  • the cert-manager webhook blocking a secret
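A sketch for the first of those directions, with `my-stuck-ns` and `my-stuck-pod` as placeholder names:

```
# Are any pods in the namespace stuck terminating?
kubectl get pods -n my-stuck-ns

# Last resort for a pod that never finishes terminating (use with care)
kubectl delete pod my-stuck-pod -n my-stuck-ns --grace-period=0 --force
```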

I encountered the same problem:

# sudo kubectl get ns
NAME                   STATUS        AGE
cattle-global-data     Terminating   8d
cattle-global-nt       Terminating   8d
cattle-system          Terminating   8d
cert-manager           Active        8d
default                Active        10d
ingress-nginx          Terminating   9d
kube-node-lease        Active        10d
kube-public            Active        10d
kube-system            Active        10d
kubernetes-dashboard   Terminating   4d6h
local                  Active        8d
p-2sfgk                Active        8d
p-5kdx9                Active        8d
# sudo kubectl get all -n kubernetes-dashboard
No resources found in kubernetes-dashboard namespace.
# sudo kubectl get namespace kubernetes-dashboard  -o json 
{
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {
        "annotations": {
            "cattle.io/status": "{\"Conditions\":[{\"Type\":\"ResourceQuotaInit\",\"Status\":\"True\",\"Message\":\"\",\"LastUpdateTime\":\"2020-09-29T01:15:46Z\"},{\"Type\":\"InitialRolesPopulated\",\"Status\":\"True\",\"Message\":\"\",\"LastUpdateTime\":\"2020-09-29T01:15:46Z\"}]}",
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Namespace\",\"metadata\":{\"annotations\":{},\"name\":\"kubernetes-dashboard\"}}\n",
            "lifecycle.cattle.io/create.namespace-auth": "true"
        },
        "creationTimestamp": "2020-09-29T01:15:45Z",
        "deletionGracePeriodSeconds": 0,
        "deletionTimestamp": "2020-10-02T07:59:52Z",
        "finalizers": [
            "controller.cattle.io/namespace-auth"
        ],
        "managedFields": [
            {
                "apiVersion": "v1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:metadata": {
                        "f:annotations": {
                            "f:cattle.io/status": {},
                            "f:lifecycle.cattle.io/create.namespace-auth": {}
                        },
                        "f:finalizers": {
                            ".": {},
                            "v:\"controller.cattle.io/namespace-auth\"": {}
                        }
                    }
                },
                "manager": "Go-http-client",
                "operation": "Update",
                "time": "2020-09-29T01:15:45Z"
            },
            {
                "apiVersion": "v1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:metadata": {
                        "f:annotations": {
                            ".": {},
                            "f:kubectl.kubernetes.io/last-applied-configuration": {}
                        }
                    }
                },
                "manager": "kubectl-client-side-apply",
                "operation": "Update",
                "time": "2020-09-29T01:15:45Z"
            },
            {
                "apiVersion": "v1",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:status": {
                        "f:phase": {}
                    }
                },
                "manager": "kube-controller-manager",
                "operation": "Update",
                "time": "2020-10-02T08:13:49Z"
            }
        ],
        "name": "kubernetes-dashboard",
        "resourceVersion": "3662184",
        "selfLink": "/api/v1/namespaces/kubernetes-dashboard",
        "uid": "f1944b81-038b-48c2-869d-5cae30864eaa"
    },
    "spec": {},
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "2020-10-02T08:13:49Z",
                "message": "All resources successfully discovered",
                "reason": "ResourcesDiscovered",
                "status": "False",
                "type": "NamespaceDeletionDiscoveryFailure"
            },
            {
                "lastTransitionTime": "2020-10-02T08:11:49Z",
                "message": "All legacy kube types successfully parsed",
                "reason": "ParsedGroupVersions",
                "status": "False",
                "type": "NamespaceDeletionGroupVersionParsingFailure"
            },
            {
                "lastTransitionTime": "2020-10-02T08:11:49Z",
                "message": "All content successfully deleted, may be waiting on finalization",
                "reason": "ContentDeleted",
                "status": "False",
                "type": "NamespaceDeletionContentFailure"
            },
            {
                "lastTransitionTime": "2020-10-02T08:11:49Z",
                "message": "All content successfully removed",
                "reason": "ContentRemoved",
                "status": "False",
                "type": "NamespaceContentRemaining"
            },
            {
                "lastTransitionTime": "2020-10-02T08:11:49Z",
                "message": "All content-preserving finalizers finished",
                "reason": "ContentHasNoFinalizers",
                "status": "False",
                "type": "NamespaceFinalizersRemaining"
            }
        ],
        "phase": "Terminating"
    }
}

#  sudo kubectl version

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:41:02Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:58Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}

You can use etcdctl to find undeleted resources

ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/peer.crt \
--key=/etc/kubernetes/pki/etcd/peer.key \
get /registry --prefix | grep <namespace>

Just copy and paste this into your terminal:

for NS in $(kubectl get ns 2>/dev/null | grep Terminating | cut -f1 -d ' '); do
  kubectl get ns $NS -o json > /tmp/$NS.json
  sed -i '' "s/\"kubernetes\"//g" /tmp/$NS.json
  kubectl replace --raw "/api/v1/namespaces/$NS/finalize" -f /tmp/$NS.json
done

this worked for me, and I ran after verifying there were no dangling k8s objects in the ns. Thanks!

I used this to remove a namespace stuck at Terminating

example :

kubectl get namespace openebs -o json | jq -j '.spec.finalizers=null' > tmp.json 
kubectl replace --raw "/api/v1/namespaces/openebs/finalize" -f ./tmp.json

For all the googlers who bumped into namespaces stuck at Terminating in Rancher-specific namespaces (e.g. cattle-system), the following modified command (grebois's original) worked for me:

for NS in $(kubectl get ns 2>/dev/null | grep Terminating | cut -f1 -d ' '); do
  kubectl get ns $NS -o json > /tmp/$NS.json
  sed -i "s/\"controller.cattle.io\/namespace-auth\"//g" /tmp/$NS.json
  kubectl replace --raw "/api/v1/namespaces/$NS/finalize" -f /tmp/$NS.json
done

Folks, just FYI, when the video for this kubecon talk is out I plan to link to it and some of the helpful comments above, and lock this issue.

I recorded a 10 minute explanation of what is going on and presented it at this SIG Deep Dive session.

Here's a correct comment with 65 upvotes

Mentioned several times above, this medium post is an example of doing things the right way. Find and fix the broken api service.

All the one liners that just remove the finalizers on the namespace do not address the root cause and leave your cluster subtly broken, which will bite you later. So please don't do that. The root cause fix is usually easier anyway. It seems that people like to post variations on this theme even though there's numerous correct answers in the thread already, so I'm going to lock the issue now, to ensure that this comment stays at the bottom.
