Kubeadm: Using kubeadm to init Kubernetes 1.12.0 failed: node "xxx" not found

Created on 3 Oct 2018  ·  45 comments  ·  Source: kubernetes/kubeadm

My environment:

CentOS7 linux

/etc/hosts:

192.168.0.106 master01

192.168.0.107 node02

192.168.0.108 node01

On master01 machine:

/etc/hostname:

master01

On the master01 machine I executed the following commands:

1) yum install docker-ce kubelet kubeadm kubectl

2) systemctl start docker.service

3) vim /etc/sysconfig/kubelet

Edit the file:

KUBELET_EXTRA_ARGS="--fail-swap-on=false"

4) systemctl enable docker kubelet

5) kubeadm init --kubernetes-version=v1.12.0 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --ignore-preflight-errors=all

THEN

E1002 23:32:36.072441 49157 kubelet.go:2236] node "master01" not found
E1002 23:32:36.172630 49157 kubelet.go:2236] node "master01" not found
E1002 23:32:36.273892 49157 kubelet.go:2236] node "master01" not found
time="2018-10-02T23:32:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/52fbcdb7864cdf8039ded99b501447f13ba81a3897579fb64581c855653f369a/shim.sock" debug=false pid=49212
E1002 23:32:36.359984 49157 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: Get https://192.168.0.106:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster01&limit=500&resourceVersion=0: dial tcp 192.168.0.106:6443: connect: connection refused
I1002 23:32:36.377368 49157 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
E1002 23:32:36.380290 49157 kubelet.go:2236] node "master01" not found
E1002 23:32:36.380369 49157 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.0.106:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster01&limit=500&resourceVersion=0: dial tcp 192.168.0.106:6443: connect: connection refused
E1002 23:32:36.380409 49157 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: Get https://192.168.0.106:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.0.106:6443: connect: connection refused
time="2018-10-02T23:32:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/f621eca36ce85e815172c37195ae7ac929112c84f3e37d16dd39c7e44ab13b0c/shim.sock" debug=false pid=49243
I1002 23:32:36.414930 49157 kubelet_node_status.go:70] Attempting to register node master01
E1002 23:32:36.416627 49157 kubelet_node_status.go:92] Unable to register node "master01" with API server: Post https://192.168.0.106:6443/api/v1/nodes: dial tcp 192.168.0.106:6443: connect: connection refused
time="2018-10-02T23:32:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/db3f5acb415581d85aef199bea3f85432437c7ef00d357dca1b5684ed95b5591/shim.sock" debug=false pid=49259
E1002 23:32:36.488013 49157 kubelet.go:2236] node "master01" not found
time="2018-10-02T23:32:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/505110c39ed4cd5b3fd4fb863012017a71fa782671ead943491afbf38310ffe0/shim.sock" debug=false pid=49275
E1002 23:32:36.588919 49157 kubelet.go:2236] node "master01" not found
E1002 23:32:36.691338 49157 kubelet.go:2236] node "master01" not found

I have tried this a lot of times!
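
For comparison, a minimal prep sketch for this kind of environment -- the swap handling below is an assumption about why --fail-swap-on=false and --ignore-preflight-errors=all were needed, not something stated in the report:

# disable swap so the kubelet can start without --fail-swap-on=false
swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab
systemctl enable docker kubelet
systemctl start docker
# note the flag name: --service-cidr
kubeadm init --kubernetes-version=v1.12.0 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12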

All 45 comments

The first error message: unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory

hi, here are some questions:
1) does kubeadm init finish and does it print a bootstrap token?
2) container runtime version?
3) are the kubelet and kubeadm version 1.12?

/priority needs-more-evidence

You need to execute systemctl start kubelet before kubeadm init.
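
On a systemd host that boils down to something like the following (a sketch; note that kubeadm's kubelet-start phase also activates the service itself, as the init output later in this thread shows):

systemctl enable kubelet
systemctl start kubelet
kubeadm init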

I am experiencing the same problem, because the number of CPU cores is less than 2.

same issue

@javacppc how did you solve this? When I ran systemctl start kubelet I got an error code.

same issue with kubernetes 1.12.2.
@Javacppc how did you solve this?

same issue

same issue

Hello guys,

I'm facing the same issue here. When I start the cluster I get the bootstrap token message, but I can't install the Weave net add-on:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
The connection to the server 192.168.56.104:6443 was refused - did you specify the right host or port?

When I go to logs I got the message about the node name:

Dec 02 22:27:55 kubemaster5 kubelet[2838]: E1202 22:27:55.128645 2838 kubelet.go:2236] node "kubemaster5" not found

Can anybody please send me some light?

Thank you!

My problem is solved, and it's actually not a bug: the apiserver failed to start for some reason.

"apiserver failed to start for some reason"? Can you give some details??

I solved my problem several days ago, updating from 1.11.4 -> 1.12.3. I had:

  1. api-server - running on a specific virtual interface with its own network (bare-metal).
    After kubeadm init/join with the flag apiserver-advertise-address it started on the specific interface, but the packets for settings/health checks went out through the standard route of the routing table (default interface). What helped was the bind-address parameter in /etc/kubernetes/manifests/kube-apiserver.yaml, bound to the IP of the virtual interface.
  2. flannel - the same network situation after creating the controller and scheduler pods. The DNS deployment failed because of connection refused to the clusterIP of the api server, 10.96.0.1:443 (default routing table). I specified the node IP of the cluster node with the flag --node-ip in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, using the IP of the virtual interface.

After this I got a Ready node with version 1.12.3. The most helpful information was in docker logs + kubectl logs.
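
For a multi-homed host like that, a rough sketch of the two changes described above (192.168.100.2 here is only a placeholder for the virtual interface's IP, not a value from the original comment):

# kubelet: pin the node IP, e.g. via an extra Environment line in
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Environment="KUBELET_EXTRA_ARGS=--node-ip=192.168.100.2"

# kube-apiserver: bind and advertise on the same interface in
# /etc/kubernetes/manifests/kube-apiserver.yaml
- --bind-address=192.168.100.2
- --advertise-address=192.168.100.2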

same issue here with v1.13.0

Same issue here with Kubernetes v1.13.0
CentOS 7
docker-ce 18.06 (latest validated version)
dockerd: active, running
kubelet: active, running
selinux: disabled
firewalld: disabled

ERROR is:
kubelet[98023]: E1212 21:10:01.708004 98023 kubelet.go:2266] node "node1" not found
/etc/hosts contains the node, it's pingable, it's reachable -- actually doing a single-master single-worker setup (i.e. tainted node).

Where does K8S look for this value? In etcd db? in /etc/hosts?
I can troubleshoot and provide more evidence if needed.

--> does kubeadm init finish and does it print a bootstrap token?
It finishes with a long error:

[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [node1 localhost] and IPs [10.10.128.186 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [node1 localhost] and IPs [10.10.128.186 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

NOTE: None of the commands suggested after the timeout reported anything worth mentioning here.

kubelet and kubeadm version?
---> 1.13.0
kubeadm version: &version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.0", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"clean", BuildDate:"2018-12-03T21:02:01Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}

Also, shouldn't there be a better error message than "node not found", with a bit more clarity/verbosity in the kube logs?

Thanks

Same issue...

$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Fri 2018-12-14 19:05:47 UTC; 2min 2s ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 9114 (kubelet)
    Tasks: 23 (limit: 4915)
   CGroup: /system.slice/kubelet.service
           └─9114 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-d

Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.862262    9114 kuberuntime_manager.go:657] createPodSandbox for pod "kube-scheduler-pineview_kube-system(7f99b6875de942b000954351c4a
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.862381    9114 pod_workers.go:186] Error syncing pod 7f99b6875de942b000954351c4ac09b5 ("kube-scheduler-pineview_kube-system(7f99b687
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.906855    9114 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to start san
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.906944    9114 kuberuntime_sandbox.go:65] CreatePodSandbox for pod "etcd-pineview_kube-system(b7841e48f3e7b81c3cda6872104ba3de)" fai
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.906981    9114 kuberuntime_manager.go:657] createPodSandbox for pod "etcd-pineview_kube-system(b7841e48f3e7b81c3cda6872104ba3de)" fa
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.907100    9114 pod_workers.go:186] Error syncing pod b7841e48f3e7b81c3cda6872104ba3de ("etcd-pineview_kube-system(b7841e48f3e7b81c3c
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.933627    9114 kubelet.go:2236] node "pineview" not found
Dec 14 19:07:50 pineview kubelet[9114]: E1214 19:07:50.033880    9114 kubelet.go:2236] node "pineview" not found
Dec 14 19:07:50 pineview kubelet[9114]: E1214 19:07:50.134064    9114 kubelet.go:2236] node "pineview" not found
Dec 14 19:07:50 pineview kubelet[9114]: E1214 19:07:50.184943    9114 event.go:212] Unable to write event: 'Post https://192.168.1.235:6443/api/v1/namespaces/default/events: dial tcp 192.

Same issue:

Ubuntu 18.04.1 LTS
Kubernetes v1.13.1 (using cri-o 1.11)

Followed the installation instructions on kubernetes.io:
https://kubernetes.io/docs/setup/independent/install-kubeadm/
https://kubernetes.io/docs/setup/cri/#cri-o

systemctl enable kubelet.service
kubeadm init --pod-network-cidr=192.168.0.0/16 --cri-socket=/var/run/crio/crio.sock

/etc/hosts

127.0.0.1       localhost
::1             localhost ip6-localhost ip6-loopback
ff02::1         ip6-allnodes
ff02::2         ip6-allrouters

127.0.1.1       master01.mydomain.tld master01
::1             master01.mydomain.tld master01

/etc/hostname


systemctl status kubelet

● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Tue 2018-12-18 16:19:54 CET; 20min ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 10148 (kubelet)
    Tasks: 21 (limit: 2173)
   CGroup: /system.slice/kubelet.service
           └─10148 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock --resolv-conf=/run/systemd/resolve/resolv.conf

Dec 18 16:40:52 master01 kubelet[10148]: E1218 16:40:52.795313   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:52 master01 kubelet[10148]: E1218 16:40:52.896277   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:52 master01 kubelet[10148]: E1218 16:40:52.997864   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.098927   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.200355   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.281586   10148 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.178.27:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster01&limit=500&resourceVersion=0: dial tcp 192.168.178.27:6443: connect: connection refused
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.282143   10148 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://192.168.178.27:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.178.27:6443: connect: connection refused
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.283945   10148 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://192.168.178.27:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster01&limit=500&resourceVersion=0: dial tcp 192.168.178.27:6443: connect: connection refused
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.301468   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.402256   10148 kubelet.go:2266] node "master01" not found

@fhemberger I figured out my issue. It was using snap to install Docker. If I uninstalled that and reinstalled using apt, then kubeadm worked fine.

@cjbottaro I don't use Docker at all but cri-o.

same issue here with v1.13.1

If you're using systemd and cri-o, make sure to set it as cgroup driver in /var/lib/kubelet/config.yaml (or pass the snippet below as part of kubeadm init --config=config.yaml).

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd

If you notice this in your kubelet logs:

remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = cri-o configured with systemd cgroup manager, but did not receive slice as parent: /kubepods/besteffort/…
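
For reference, a sketch of a complete kubeadm init --config=config.yaml file for a cri-o host that bundles the snippet above (the kubeadm API group shown assumes a 1.13-era release; adjust for your version):

apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
nodeRegistration:
  criSocket: /var/run/crio/crio.sock
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd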

I met the same issue today.

I fixed it by running rm -rf /var/lib/kubelet/ and re-installing.

@JishanXing thank you! This also resolved my issue running on Raspbian Stretch Lite.

I fixed it by removing /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf

It would be better to use the kubeadm reset command.
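
In other words, something along these lines -- the CNI and iptables cleanup steps are the ones kubeadm reset itself prints as follow-up advice, so treat this as a sketch rather than a complete teardown:

kubeadm reset
rm -rf /etc/cni/net.d $HOME/.kube/config
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# then re-run kubeadm init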

@fhemberger how did you solve it? Same question here, thanks.

I met the same issue when I upgraded k8s from 1.13.3 to 1.13.4 ...
I solved it after editing /etc/kubernetes/manifests/kube-scheduler.yaml and modifying the image version:
image: k8s.gcr.io/kube-scheduler:v1.13.3 ==> image: k8s.gcr.io/kube-scheduler:v1.13.4
The same goes for kube-controller-manager.yaml and kube-apiserver.yaml.
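
For reference, the same edit as a one-liner across all three manifests -- this only expresses that workaround with sed; for a regular minor upgrade, kubeadm upgrade apply is the supported path:

sed -i 's/v1.13.3/v1.13.4/g' /etc/kubernetes/manifests/kube-apiserver.yaml /etc/kubernetes/manifests/kube-controller-manager.yaml /etc/kubernetes/manifests/kube-scheduler.yaml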

The latest way is to add the option --image-repository registry.aliyuncs.com/google_containers. My k8s version is 1.14.0, Docker version 18.09.2:
kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.14.0 --pod-network-cidr=192.168.0.0/16
[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [jin-virtual-machine kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.232.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [jin-virtual-machine localhost] and IPs [192.168.232.130 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [jin-virtual-machine localhost] and IPs [192.168.232.130 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.004356 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node jin-virtual-machine as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node jin-virtual-machine as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: xucir0.o4kzo3qqjyjnzphl
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.232.130:6443 --token xucir0.o4kzo3qqjyjnzphl \
    --discovery-token-ca-cert-hash sha256:022048b22926a2cb2f8295ce2e3f1f6fa7ffe1098bc116f7d304a26bcfb78656

Ran into same issue with kubernetes v1.14.1 and cri-o v1.14.0 on a GCP Ubuntu 18.04 VM. Things worked fine when using docker though. referencing: https://github.com/cri-o/cri-o/issues/2357

My problem was with different cgroup drivers: CRI-O uses systemd by default, while the kubelet uses cgroupfs by default.

cat /etc/crio/crio.conf | grep cgroup
# cgroup_manager is the cgroup management implementation to be used
cgroup_manager = "systemd"

If that is your case, see https://kubernetes.io/docs/setup/independent/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-master-node

Just write the file

echo "KUBELET_EXTRA_ARGS=--cgroup-driver=systemd" > /etc/default/kubelet

and run kubeadm init after that. Or change cgroup_manager to cgroupfs
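
If kubeadm init has already been run once, the kubelet has to pick the new flag up; the alternative route goes through /etc/crio/crio.conf instead. A rough sketch of both paths:

systemctl restart kubelet
# or, going the other way: switch cri-o to cgroupfs and restart it
sed -i 's/cgroup_manager = "systemd"/cgroup_manager = "cgroupfs"/' /etc/crio/crio.conf
systemctl restart crio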

Unlike docker, cri-o and containerd are slightly more tricky to manage in terms of cgroup driver detection, but there are some plans to support that from kubeadm.

docker is handled already.

So apparently there is no solution but resetting the cluster $(yes | kubeadm reset), which is not a solution in my opinion!

Changing image repo worked for me, but this is not a great fix.
--image-repository registry.aliyuncs.com/google_containers

In my case it worked with this:

sed -i 's/cgroup-driver=systemd/cgroup-driver=cgroupfs/g' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

I have the same issue. I used kubeadm init --config=init-config.yaml and it failed; this file was generated by kubeadm. There is a field advertiseAddress that defaults to 1.2.3.4 in the file, which makes the etcd container fail to start. When I changed it to 127.0.0.1, the etcd container started successfully and kubeadm init succeeded.

To troubleshoot this kind of problem, use docker ps -a to list all containers and check whether some of them have exited. If so, use docker logs CONTAINER_ID to see what happened. Hope it helps.
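
The field in question sits under localAPIEndpoint in recent kubeadm config versions; a sketch of the relevant part of init-config.yaml, with 192.168.0.106 as a placeholder for the node's real IP:

apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.0.106   # replace the generated default 1.2.3.4
  bindPort: 6443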

Hey everyone, does anyone have a solution? Same problem here, but using k3s.

@MateusMac you should probably open a bug report against k3s as well.

I worked for a week on getting kubeadm working on:
Ubuntu 18.04
docker 18.06-2-ce
k8s 1.15.1
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

Fails with:

[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

The kubelet logs show it just can't find the node to get past first base:

warproot@warp02:~$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sun 2019-08-04 18:22:26 AEST; 5min ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 12569 (kubelet)
    Tasks: 27 (limit: 9830)
   CGroup: /system.slice/kubelet.service
           └─12569 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-dri

Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.322762   12569 kuberuntime_sandbox.go:68] CreatePodSandbox for pod "kube-scheduler-warp02_kube-system(ecae9d12d3610192347be3d1aa5aa552)"
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.322806   12569 kuberuntime_manager.go:692] createPodSandbox for pod "kube-scheduler-warp02_kube-system(ecae9d12d3610192347be3d1aa5aa552)
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.322872   12569 pod_workers.go:190] Error syncing pod ecae9d12d3610192347be3d1aa5aa552 ("kube-scheduler-warp02_kube-system(ecae9d12d36101
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.373094   12569 kubelet.go:2248] node "warp02" not found
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.375587   12569 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Get https://10.1.1.4:6443
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.473295   12569 kubelet.go:2248] node "warp02" not found
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.573567   12569 kubelet.go:2248] node "warp02" not found
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.575495   12569 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Get https://10.1.1.4:6
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.590886   12569 event.go:249] Unable to write event: 'Post https://10.1.1.4:6443/api/v1/namespaces/default/events: dial tcp 10.1.1.4:6443
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.673767   12569 kubelet.go:2248] node "warp02" not found




I should note I have multiple NICs on these bare-metal machines:

warproot@warp02:~$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:feff:fe65:37f  prefixlen 64  scopeid 0x20<link>
        ether 02:42:fe:65:03:7f  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6  bytes 516 (516.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp35s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.2  netmask 255.255.255.0  broadcast 10.0.0.255
        inet6 fe80::32b5:c2ff:fe02:410b  prefixlen 64  scopeid 0x20<link>
        ether 30:b5:c2:02:41:0b  txqueuelen 1000  (Ethernet)
        RX packets 46  bytes 5821 (5.8 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 70  bytes 7946 (7.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp6s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.1.1.4  netmask 255.255.255.0  broadcast 10.1.1.255
        inet6 fd42:59ff:1166:0:25a7:3617:fee6:424e  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::1a03:73ff:fe44:5694  prefixlen 64  scopeid 0x20<link>
        inet6 fd9e:fdd6:9e01:0:1a03:73ff:fe44:5694  prefixlen 64  scopeid 0x0<global>
        ether 18:03:73:44:56:94  txqueuelen 1000  (Ethernet)
        RX packets 911294  bytes 1361047672 (1.3 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 428759  bytes 29198065 (29.1 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 17  

ib0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 4092
        unspec A0-00-02-10-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ib1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 4092
        unspec A0-00-02-20-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 25473  bytes 1334779 (1.3 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 25473  bytes 1334779 (1.3 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


I don't know if that is a problem, but I set up my /etc/hosts file as

warproot@warp02:~$ cat /etc/hosts
127.0.0.1       localhost.localdomain   localhost
::1             localhost6.localdomain6 localhost6
# add our host name
10.1.1.4 warp02 warp02.ad.xxx.com
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
# add our ipv6 host name
fd42:59ff:1166:0:25a7:3617:fee6:424e warp02 warp02.ad.xxx.com

warproot@warp02:~$ 

So it is set up (I think) to use the NIC at 10.1.1.4 as the "network" for k8s.

nslookup against the node-name seems to be working fine:

warproot@warp02:~$ nslookup warp02
Server:         127.0.0.53
Address:        127.0.0.53#53

Non-authoritative answer:
Name:   warp02.ad.xxx.com
Address: 10.1.1.4
Name:   warp02.ad.xxx.com
Address: fd42:59ff:1166:0:25a7:3617:fee6:424e

warproot@warp02:~$ 

I have been through the kubeadm install documentation several times.

Weird. It just can't find the network.

Stumped.
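
One thing worth trying on a multi-homed box like this, judging from the earlier bind-address/--node-ip comments: tell kubeadm and the kubelet explicitly which interface to use. A sketch only, with 10.1.1.4 taken from the ifconfig output above:

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.1.1.4
# and pin the kubelet to the same address, e.g. in /etc/default/kubelet:
# KUBELET_EXTRA_ARGS=--node-ip=10.1.1.4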

For version 1.15.3 I was able to fix this on Ubuntu 18.04 by adding

kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cgroup-driver: "systemd"

to my kubeadm config and then running kubeadm init
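
For completeness, a sketch of the whole file for 1.15 (the kubeadm config group for that release should be kubeadm.k8s.io/v1beta2; the file name below is arbitrary):

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cgroup-driver: "systemd"

kubeadm init --config=kubeadm-config.yaml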

I have the same issue here, with version 1.15.3, on Ubuntu 18.04
@kris-nova I would really appreciate if you could specify where this config file is located :-)

UPDATE: I can't really tell why, but it works now, without changing any configuration!
(note: I don't know if it's related, but I updated docker from v.19.03.1 to v.19.03.2, before retrying kubeadm init)

I was getting the below error while running kubeadm init, i.e. node "xxx" not found:

[root@node01 ~]# journalctl -xeu kubelet
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.682095 2968 kubelet.go:2267] node "node01" not found
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.782554 2968 kubelet.go:2267] node "node01" not found
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.829142 2968 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.CSID
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.884058 2968 kubelet.go:2267] node "node01" not found
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.984510 2968 kubelet.go:2267] node "node01" not found
Nov 07 10:34:03 node01 kubelet[2968]: E1107 10:34:03.030884 2968 reflector.go:123]

Solved via:

setenforce 0

sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux

same issue

In my case that was caused by a time drift in the master node, _which happened after a power cut-off_.
I had that fixed by running

# Correcting the time as mentioned here https://askubuntu.com/a/254846/861548
sudo service ntp stop
sudo ntpdate -s time.nist.gov
sudo service ntp start
# Then restarting the kubelet
sudo systemctl restart kubelet.service
# I also had to run daemon-reload as I got the following warning
# Warning: The unit file, source configuration file or drop-ins of kubelet.service changed on disk. Run 'systemctl daemon-reload' to reload units.
sudo systemctl daemon-reload
# I also made another restart, which I don't know whether needed or not
sudo systemctl restart kubelet.service

I fixed the same "node xxxx not found" problem. Try looking at the container log with docker logs container_id; there I saw the apiserver trying to connect to 127.0.0.1:2379. I edited the file /etc/kubernetes/manifests/etcd.yaml and restarted, and the problem was fixed.
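
The lines to check in /etc/kubernetes/manifests/etcd.yaml are the etcd client URL flags; in a kubeadm-generated manifest they normally look roughly like this, with 192.168.0.106 standing in for the node's own IP:

- --advertise-client-urls=https://192.168.0.106:2379
- --listen-client-urls=https://127.0.0.1:2379,https://192.168.0.106:2379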
