Kubeadm: how to renew the certificate when apiserver cert expired?

Created on 30 Nov 2017 · 38 Comments · Source: kubernetes/kubeadm

Is this a request for help?

If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.

If no, delete this section and continue on.

What keywords did you search in kubeadm issues before filing this one?

If you have found any duplicates, you should instead reply there and close this page.

If you have not found any duplicates, delete this section and continue on.

Is this a BUG REPORT or FEATURE REQUEST?

Choose one: BUG REPORT or FEATURE REQUEST

Versions

kubeadm version (use kubeadm version): 1.7.5

Environment:

  • Kubernetes version (use kubectl version): 1.7.5
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Others:

What happened?

What you expected to happen?

How to reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

Most helpful comment

(This is @danroliver's step-by-step answer; it appears in full in the comment thread below.)

All 38 comments

@zalmanzhao did you manage to solve this issue?

I created a kubeadm v1.9.3 cluster just over a year ago and it was working fine all this time. I went to update one deployment today and realised I was locked out of the API because the cert had expired. I can't even run kubeadm alpha phase certs apiserver, because I get failure loading apiserver certificate: the certificate has expired (kubeadm version is currently 1.10.6, since I want to upgrade).

Adding insecure-skip-tls-verify: true to clusters[0].cluster in ~/.kube/config does not help either – I see You must be logged in to the server (Unauthorized) when trying to kubectl get pods (https://github.com/kubernetes/kubernetes/issues/39767).

The cluster is working, but it lives its own life until it self-destructs or until things get fixed 😅 Unfortunately, I could not find a solution for my situation in #206 and am wondering how to get out of it. The only relevant material I could dig out was a blog post called _‘How to change expired certificates in kubernetes cluster’_, which looked promising at first glance. However, it did not fit in the end because my master machine did not have an /etc/kubernetes/ssl/ folder (only /etc/kubernetes/pki/) – either I have a different k8s version or I simply deleted that folder without noticing.

@errordeveloper could you please recommend something? I'd love to fix things without kubeadm reset and payload recreation.

@kachkaev Did you have any luck on renewing the certs without resetting the kubeadm?
If so, please share – I'm having the same issue here with k8s 1.7.4. And I can't seem to upgrade ($ kubeadm upgrade plan) because the error pops up again telling me that the certificate has expired and that it cannot list the masters in my cluster:

[ERROR APIServerHealth]: the API Server is unhealthy; /healthz didn't return "ok"
[ERROR MasterNodesReady]: couldn't list masters in cluster: Get https://172.31.18.88:6443/api/v1/nodes?labelSelector=node-role.kubernetes.io%2Fmaster%3D: x509: certificate has expired or is not yet valid

Unfortunately, I gave up in the end. The solution was to create a new cluster, restore all the payload on it, switch DNS records and finally delete the original cluster 😭 At least there was no downtime because I was lucky enough to have healthy pods on the old k8s during the transition.

Thanks @kachkaev for responding. I will nonetheless give it another try.
If I find something I will make sure to post it here...

If you are using a version of kubeadm prior to 1.8 (where, as I understand it, certificate rotation #206 was put into place as a beta feature), or your certs have already expired, then you will need to manually update your certs (or recreate your cluster, which it appears some people (not just @kachkaev) end up resorting to).

You will need to SSH into your master node. If you are using kubeadm >= 1.8, skip to step 2.

  1. Update Kubeadm, if needed. I was on 1.7 previously.
$ sudo curl -sSL https://dl.k8s.io/release/v1.8.15/bin/linux/amd64/kubeadm > ./kubeadm.1.8.15
$ chmod a+rx kubeadm.1.8.15
$ sudo mv /usr/bin/kubeadm /usr/bin/kubeadm.1.7
$ sudo mv kubeadm.1.8.15 /usr/bin/kubeadm
  2. Backup old apiserver, apiserver-kubelet-client, and front-proxy-client certs and keys.
$ sudo mv /etc/kubernetes/pki/apiserver.key /etc/kubernetes/pki/apiserver.key.old
$ sudo mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.crt.old
$ sudo mv /etc/kubernetes/pki/apiserver-kubelet-client.crt /etc/kubernetes/pki/apiserver-kubelet-client.crt.old
$ sudo mv /etc/kubernetes/pki/apiserver-kubelet-client.key /etc/kubernetes/pki/apiserver-kubelet-client.key.old
$ sudo mv /etc/kubernetes/pki/front-proxy-client.crt /etc/kubernetes/pki/front-proxy-client.crt.old
$ sudo mv /etc/kubernetes/pki/front-proxy-client.key /etc/kubernetes/pki/front-proxy-client.key.old
  3. Generate new apiserver, apiserver-kubelet-client, and front-proxy-client certs and keys.
$ sudo kubeadm alpha phase certs apiserver --apiserver-advertise-address <IP address of your master server>
$ sudo kubeadm alpha phase certs apiserver-kubelet-client
$ sudo kubeadm alpha phase certs front-proxy-client
  4. Backup old configuration files.
$ sudo mv /etc/kubernetes/admin.conf /etc/kubernetes/admin.conf.old
$ sudo mv /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.old
$ sudo mv /etc/kubernetes/controller-manager.conf /etc/kubernetes/controller-manager.conf.old
$ sudo mv /etc/kubernetes/scheduler.conf /etc/kubernetes/scheduler.conf.old
  5. Generate new configuration files.

There is an important note here. If you are on AWS, you will need to explicitly pass the --node-name parameter in this request. Otherwise you will get an error like Unable to register node "ip-10-0-8-141.ec2.internal" with API server: nodes "ip-10-0-8-141.ec2.internal" is forbidden: node ip-10-0-8-141 cannot modify node ip-10-0-8-141.ec2.internal in your kubelet logs (sudo journalctl -u kubelet --all | tail), and the Master Node will report that it is Not Ready when you run kubectl get nodes.

Please be certain to replace the values passed in --apiserver-advertise-address and --node-name with the correct values for your environment.

$ sudo kubeadm alpha phase kubeconfig all --apiserver-advertise-address 10.0.8.141 --node-name ip-10-0-8-141.ec2.internal
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"

  6. Ensure that your kubectl is looking in the right place for your config files.
$ mv .kube/config .kube/config.old
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ sudo chmod 777 $HOME/.kube/config
$ export KUBECONFIG=$HOME/.kube/config
  7. Reboot your master node
$ sudo /sbin/shutdown -r now
  8. Reconnect to your master node, grab your token, and verify that your Master Node is "Ready". Copy the token to your clipboard. You will need it in the next step.
$ kubectl get nodes
$ kubeadm token list

If you do not have a valid token, you can create one with:

$ kubeadm token create

The token should look something like 6dihyb.d09sbgae8ph2atjw

  9. SSH into each of the slave nodes and reconnect them to the master
$ sudo curl -sSL https://dl.k8s.io/release/v1.8.15/bin/linux/amd64/kubeadm > ./kubeadm.1.8.15
$ chmod a+rx kubeadm.1.8.15
$ sudo mv /usr/bin/kubeadm /usr/bin/kubeadm.1.7
$ sudo mv kubeadm.1.8.15 /usr/bin/kubeadm
$ sudo kubeadm join --token=<token from step 8> <ip of master node>:<port, 6443 is the default> --node-name <same node name as in step 5>

  10. Repeat step 9 for each connecting node. From the master node, you can verify that all slave nodes have connected and are ready with:
$ kubectl get nodes
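
As an optional sanity check, the dates on the regenerated certificates can be confirmed with openssl, for example:

$ sudo openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -enddate
$ sudo openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -enddate
$ sudo openssl x509 -in /etc/kubernetes/pki/front-proxy-client.crt -noout -enddate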

Hopefully this gets you where you need to be @davidcomeyne.

Thanks a bunch @danroliver !
I will definitely try that and post my findings here.

@danroliver Thanks! Just tried it on an old single-node cluster, so did steps up to 7. It worked.

@danroliver Worked for me. Thank you.

Did not work for me, had to set up a new cluster. But glad it helped others!

Thank you @danroliver, it works for me. My kubeadm version is 1.8.5.

Thanks @danroliver for putting together the steps. I had to make small additions to them. My cluster is running v1.9.3 and it is in a private datacenter, disconnected from the Internet.

On the Master

  1. Prepare a kubeadm config.yml.
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
api:
  advertiseAddress: <master-ip>
kubernetesVersion: 1.9.3
  2. Backup certs and conf files
# run this block from /etc/kubernetes
mkdir ~/conf-archive/
for f in `ls *.conf`;do mv $f ~/conf-archive/$f.old;done

# run this block from /etc/kubernetes/pki
mkdir ~/pki-archive
for f in `ls apiserver* front-*client*`;do mv $f ~/pki-archive/$f.old;done
  3. The kubeadm commands on master had --config config.yml like this:
kubeadm alpha phase certs apiserver --config ./config.yml 
kubeadm alpha phase certs apiserver-kubelet-client --config ./config.yml 
kubeadm alpha phase certs front-proxy-client --config ./config.yml
kubeadm alpha phase kubeconfig all --config ./config.yml --node-name <master-node>
reboot
  4. Create token (for example, as shown below)
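
The token-creation step presumably uses the same commands as in @danroliver's step 8, for example:

kubeadm token create
kubeadm token list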

On the minions

I had to:

mv /etc/kubernetes/pki/ca.crt ~/archive/
mv /etc/kubernetes/kubelet.conf ~/archive/
systemctl stop kubelet
kubeadm join --token=eeefff.55550009999b3333 --discovery-token-unsafe-skip-ca-verification <master-ip>:6443

Thanks @danroliver! On my single-node cluster it was enough to follow steps 1-6 (no reboot) and then send a SIGHUP to kube-apiserver. I just found the container id with docker ps and sent the signal with docker kill -s HUP <container id>.
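
For reference, a one-liner along those lines might look like this (assuming the Docker container name contains k8s_kube-apiserver, the usual naming convention for kubeadm-managed static pods):

$ docker kill -s HUP $(docker ps --filter name=k8s_kube-apiserver --format '{{.ID}}')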

Thanks a lot @danroliver! On our single-master/multi-worker cluster, doing steps 1 to 7 was enough; we did not have to reconnect every worker node to the master (which was the most painful part).

Thanks for this great step-by-step, @danroliver! I'm wondering how this process might be applied to a multi-master cluster (bare metal, currently running 1.11.1), and preferably without downtime. My certs are not yet expired, but I am trying to learn how to regenerate/renew them before that happens.

@kcronin
please take a look at this new document:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/
hope that helps.

@danroliver: Thank you very much, it's working.

It's not necessary to reboot the servers.
It's enough to recreate the kube-system pods (apiserver, scheduler, ...) with these two commands:

systemctl restart kubelet
for i in $(docker ps | egrep 'admin|controller|scheduler|api|fron|proxy' | rev | awk '{print $1}' | rev);
do docker stop $i; done

I had to deal with this also on a 1.13 cluster; in my case the certificates were about to expire, so the situation was slightly different.
I was also dealing with a single master/control-plane instance on premise, so I did not have to worry about an HA setup or AWS specifics.
I have not included the backup steps, as the other guys have covered them above.

Since the certs had not expired, the cluster already had workloads which I wanted to keep working.
I did not have to deal with etcd certs at this time either, so I have omitted them.

So at a high level I had to

  • On the master

    • Generate new certificates on the master

    • Generate new kubeconfigs with embedded certificates

    • Generate new kubelet certificates - client and server

    • Generate a new token for the worker node kubelets

  • For each worker

    • Drain the worker first on the master

    • ssh to the worker, stop the kubelet, remove files and restart the kubelet

    • Uncordon the worker on the master

  • On master at the end

    • Delete token

# On master - See https://kubernetes.io/docs/setup/certificates/#all-certificates

# Generate the new certificates - you may have to deal with AWS - see above re extra certificate SANs
sudo kubeadm alpha certs renew apiserver
sudo kubeadm alpha certs renew apiserver-etcd-client
sudo kubeadm alpha certs renew apiserver-kubelet-client
sudo kubeadm alpha certs renew front-proxy-client

# Generate new kube-configs with embedded certificates - Again you may need extra AWS specific content - see above
sudo kubeadm alpha kubeconfig user --org system:masters --client-name kubernetes-admin  > admin.conf
sudo kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > controller-manager.conf
sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
sudo kubeadm alpha kubeconfig user --client-name system:kube-scheduler > scheduler.conf

# chown and chmod so they match existing files
sudo chown root:root {admin,controller-manager,kubelet,scheduler}.conf
sudo chmod 600 {admin,controller-manager,kubelet,scheduler}.conf

# Move to replace existing kubeconfigs
sudo mv admin.conf /etc/kubernetes/
sudo mv controller-manager.conf /etc/kubernetes/
sudo mv kubelet.conf /etc/kubernetes/
sudo mv scheduler.conf /etc/kubernetes/

# Restart the master components
sudo kill -s SIGHUP $(pidof kube-apiserver)
sudo kill -s SIGHUP $(pidof kube-controller-manager)
sudo kill -s SIGHUP $(pidof kube-scheduler)

# Verify master component certificates - should all be 1 year in the future
# Cert from api-server
echo -n | openssl s_client -connect localhost:6443 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
# Cert from controller manager
echo -n | openssl s_client -connect localhost:10257 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
# Cert from scheduler
echo -n | openssl s_client -connect localhost:10259 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not

# Generate kubelet.conf
sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
sudo chown root:root kubelet.conf
sudo chmod 600 kubelet.conf

# Drain
kubectl drain --ignore-daemonsets $(hostname)
# Stop kubelet
sudo systemctl stop kubelet
# Delete files
sudo rm /var/lib/kubelet/pki/*
# Copy file
sudo mv kubelet.conf /etc/kubernetes/
# Restart
sudo systemctl start kubelet
# Uncordon
kubectl uncordon $(hostname)

# Check kubelet
echo -n | openssl s_client -connect localhost:10250 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not

Let's create a new token for nodes re-joining the cluster (after the kubelet restart)

# On master
sudo kubeadm token create

Now for each worker - one at a time

kubectl drain --ignore-daemonsets --delete-local-data WORKER-NODE-NAME

ssh to worker node

# Stop kubelet
sudo systemctl stop kubelet

# Delete files
sudo rm /etc/kubernetes/kubelet.conf
sudo rm /var/lib/kubelet/pki/*

# Alter the bootstrap token
new_token=TOKEN-FROM-CREATION-ON-MASTER
sudo sed -i "s/token: .*/token: $new_token/" /etc/kubernetes/bootstrap-kubelet.conf

# Start kubelet
sudo systemctl start kubelet

# Check kubelet certificate
echo -n | openssl s_client -connect localhost:10250 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -text -noout | grep Not
sudo openssl x509 -in /var/lib/kubelet/pki/kubelet.crt -text -noout | grep Not

Back to master and uncordon the worker

kubectl uncordon WORKER-NODE-NAME

After all workers have been updated - remove the token - it will expire in 24h, but let's get rid of it

On master
sudo kubeadm token delete TOKEN-FROM-CREATION-ON-MASTER

@pmcgrath Thanks for posting those steps. I managed to follow them and renew my certificates, and get a working cluster.

(Quotes @danroliver's step-by-step comment from above in full.)

This is what I need, only for 1.14.2... any hints on how to do the same there?

(Quotes @pmcgrath's 1.13 renewal walkthrough from above in full.)

I know this issue is closed, but I have the same problem on 1.14.2; the guide gives no errors, but I cannot connect to the cluster or reissue the token (I get auth failed).

A k8s cluster created using kubeadm v1.9.x experienced the same issue (apiserver-kubelet-client.crt expired on 2 July) at the age of v1.14.1 lol

I had to refer to 4 different sources to renew the certificates, regenerate the configuration files and bring the simple 3 node cluster back.

@danroliver gave very good and structured instructions, very close to the below guide from IBM.
[Renewing Kubernetes cluster certificates](https://www.ibm.com/support/knowledgecenter/en/SSCKRH_1.1.0/platform/t_certificate_renewal.html) from IBM. Wow!

NOTE: IBM Financial Crimes Insight with Watson private is powered by k8s, never knew that.

Problems with step 3 and step 5

Step 3 should NOT have phase in the command:

$ sudo kubeadm alpha certs renew apiserver
$ sudo kubeadm alpha certs renew apiserver-kubelet-client
$ sudo kubeadm alpha certs renew front-proxy-client

Step 5 should use the command below; kubeadm alpha does not have kubeconfig all, that is a kubeadm init phase instead:

# kubeadm init phase kubeconfig all
I0705 12:42:24.056152   32618 version.go:240] remote version is much newer: v1.15.0; falling back to: stable-1.14
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file

in 1.15 we have added better documentation for certificate renewal:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/

also, after 1.15 kubeadm upgrade will automatically renew the certificates for you!
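
For reference, the expiration check mentioned in those docs can be run roughly like this on 1.15 (in later releases the command was promoted out of alpha to kubeadm certs check-expiration):

# prints each kubeadm-managed certificate with its expiry date and remaining time
sudo kubeadm alpha certs check-expiration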

A k8s cluster created using kubeadm v1.9.x experienced the same issue (apiserver-kubelet-client.crt expired on 2 July) at the age of v1.14.1 lol

versions older than 1.13 are already unsupported.
we strongly encourage users to keep up with this fast-moving project.

currently there are discussions in the LongTermSupport Working Group about supporting versions of Kubernetes for longer periods of time, but establishing the process might take a while.

Thanks @pmorie .
Works for kube version 1.13.6

Just a comment and feature request: this cert expiration hit us in production on our Kubernetes 1.11.x cluster this morning. We tried everything above (and in the links), but hit numerous errors and gave up after a few hours, completely stuck with a large hosed cluster. Fortunately, we were about 2 weeks away from upgrading to Kubernetes 1.15 (and building a new cluster), so we ended up just creating a new 1.15 cluster from scratch and copying over all our user data.

I very much wish there had been some warning before this happened. We just went from "incredibly stable cluster" to "completely broken hellish nightmare" without any warning, and had probably our worst downtime ever. Fortunately, it was a west coast Friday afternoon, so relatively minimally impactful.

Of everything discussed above and in all the linked tickets, the one thing that would have made a massive difference for us isn't mentioned: start displaying a warning when certs are going to expire soon. (E.g., if you use kubectl and the cert is going to expire within a few weeks, please tell me!)

Sorry for your troubles. Normally it is the responsibility of the operator to monitor the certs on disk for expiration. But I do agree that the lack of easy monitoring can cause trouble. That is one of the reasons we added a command to check cert expiration in kubeadm. See
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/

Also please note that after 1.15 kubeadm will auto-renew certificates on upgrade, which encourages users to upgrade more often too.

@neolit123 Thanks; we will add something to our own monitoring infrastructure to periodically check for upcoming cert issues, as explained in your comment.
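
A minimal sketch of such a periodic check, assuming the certificates live in the default /etc/kubernetes/pki location (openssl's -checkend exits non-zero when a certificate expires within the given number of seconds):

#!/bin/sh
# warn if any control-plane certificate expires within the next 30 days (2592000 seconds)
for crt in /etc/kubernetes/pki/*.crt /etc/kubernetes/pki/etcd/*.crt; do
  if ! openssl x509 -checkend 2592000 -noout -in "$crt" >/dev/null; then
    echo "WARNING: $crt expires soon: $(openssl x509 -enddate -noout -in "$crt")"
  fi
done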

@danroliver Thanks a lot for your reply. It saved lots of time for me.
One point worth mentioning is the etcd-related certificates, which should be renewed in the same way (see the sketch below). There is no need to reload any configuration, since the manifest YAML files only reference the certificate files by path.
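
A rough sketch of what that looks like on a kubeadm 1.13/1.14 control plane (the etcd certs sit under /etc/kubernetes/pki/etcd/ and are picked up by path, so restarting the etcd container is enough afterwards):

sudo kubeadm alpha certs renew etcd-server
sudo kubeadm alpha certs renew etcd-peer
sudo kubeadm alpha certs renew etcd-healthcheck-client
sudo kubeadm alpha certs renew apiserver-etcd-client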

For Kubernetes v1.14 I find this procedure proposed by @desdic the most helpful:

  • backup and re-generate all certificates and keys:
$ cd /etc/kubernetes/pki/
$ mv {apiserver.crt,apiserver-etcd-client.key,apiserver-kubelet-client.crt,front-proxy-ca.crt,front-proxy-client.crt,front-proxy-client.key,front-proxy-ca.key,apiserver-kubelet-client.key,apiserver.key,apiserver-etcd-client.crt} ~/
$ kubeadm init phase certs all --apiserver-advertise-address <IP>
  • backup and re-generate all kubeconfig files:
$ cd /etc/kubernetes/
$ mv {admin.conf,controller-manager.conf,kubelet.conf,scheduler.conf} ~/
$ kubeadm init phase kubeconfig all
$ reboot
  • copy new admin.conf:
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

For Kubernetes v1.14 I find this procedure the most helpful:

* https://stackoverflow.com/a/56334732/1147487

I created the fix once I had my own cluster fixed .. hoped that someone else could use it

@danroliver gave very good and structured instructions, very close to the below guide from IBM.
[Renewing Kubernetes cluster certificates](https://www.ibm.com/support/knowledgecenter/en/SSCKRH_1.1.0/platform/t_certificate_renewal.html) from IBM. Wow!

Nice! I wonder when this was published. I certainly would have found this helpful when I was going through this.

Note about tokens in K8s 1.13.x (possibly other K8s versions)
If you've ended up re-generating your CA certificate (/etc/kubernetes/pki/ca.crt), your tokens (see kubectl -n kube-system get secret | grep token) might still carry the old CA and will have to be regenerated. Troubled tokens included kube-proxy-token and coredns-token in my case (among others), which caused cluster-critical services to be unable to authenticate with the K8s API.
To regenerate tokens, delete old ones, and they will be recreated.
Same goes for any services talking to K8s API, such as PV Provisioner, Ingress Controllers, cert-manager, etc..
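
A rough sketch of that cleanup, assuming the affected service account token secrets live in kube-system (names and labels will differ per cluster):

# delete the stale token secrets; the token controller recreates them with the new CA
kubectl -n kube-system get secret | grep token | awk '{print $1}' | xargs -r kubectl -n kube-system delete secret
# restart the pods that consume them so they pick up the fresh tokens (labels may differ)
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
kubectl -n kube-system delete pod -l k8s-app=kube-dns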

Thanks for this great step-by-step, @danroliver! I'm wondering how this process might be applied to a multi-master cluster (bare metal, currently running 1.11.1), and preferably without downtime. My certs are not yet expired, but I am trying to learn how to regenerate/renew them before that happens.

Hi @kcronin, how did you solve this with a multi-master config? I don't know how to proceed with --apiserver-advertise-address, as I have 3 IPs and not only one.

Thanks

@pmcgrath In case I have 3 masters, should I repeat the steps on each master? Or what is the correct approach in that case?

@SuleimanWA, you can copy admin.conf over, as well as the CA file if the CA was regenerated (a rough sketch follows below).
For everything else, you should repeat the steps to regenerate certs (for etcd, kubelet, scheduler, etc.) on every master.
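
A rough sketch of that copy, assuming the kubeadm default paths and a second master reachable as master-2 (adjust host names for your environment):

# on the master where the certs were regenerated
scp /etc/kubernetes/admin.conf /etc/kubernetes/pki/ca.crt root@master-2:/tmp/
# on master-2 (repeat for every additional master)
sudo mv /tmp/admin.conf /etc/kubernetes/admin.conf
sudo mv /tmp/ca.crt /etc/kubernetes/pki/ca.crt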

@anapsix
I'm running a 1.13.x cluster, and apiserver is reporting Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid] after I renewed the certs by running kubeadm alpha certs renew all.

To regenerate tokens, delete old ones, and they will be recreated.

Which token are you referring to in this case? Is it the one generated by kubeadm, and how can I delete the token?

-----UPDATE-----
I figured out it's the secret itself. In my case the kube-controller was not up so the secret was not auto-generated.

On newer versions, use:

kubeadm alpha certs renew all

When the first master node's kubelet is down (systemctl stop kubelet), the other master nodes can't contact the CA on the first master node. This results in the following message until the kubelet on the original master node is brought back online:

kubectl get nodes
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get nodes)

Is there a way to have the CA role transfer to the other master nodes while the kubelet on the original CA node is down?

(Quotes the earlier exchange about regenerating token secrets after renewing certs on a 1.13.x cluster.)

Hi, I have done this task, but not on version 1.13. May I ask a few things, since you have done this already?
So basically I will be doing:
kubeadm alpha certs renew all (which updates the control plane certs under the pki/ folder on the masters).
kubeadm init phase kubeconfig to update the kubeconfig files (on master and worker).
Restart the kubelet on all nodes.

Do I still need to create a token and run join on the worker nodes? If possible, can you share the steps you performed?
