Kubeadm: initializing kubernetes 1.12.0 with kubeadm failed: node "xxx" not found

Created on 2018-10-03  ·  45 comments  ·  Source: kubernetes/kubeadm

My environment:

CentOS7 linux

/etc/hosts:

192.168.0.106 master01

192.168.0.107 node02

192.168.0.108 node01

On the master01 machine:

/etc/hostname:

master01

On the master01 machine, I ran the following commands:

1) yum install docker-ce kubelet kubeadm kubectl

2) systemctl start docker.service

3) vim /etc/sysconfig/kubelet

Edit the file:

KUBELET_EXTRA_ARGS="--fail-swap-on=false"

4) systemctl enable docker kubelet

5) kubeadm init --kubernetes-version=v1.12.0 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --ignore-preflight-errors=all

Then:

E1002 23:32:36.072441 49157 kubelet.go:2236] node "master01" not found
E1002 23:32:36.172630 49157 kubelet.go:2236] node "master01" not found
E1002 23:32:36.273892 49157 kubelet.go:2236] node "master01" not found
time="2018-10-02T23:32:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/52fbcdb7864cdf8039ded99b501447f1..." pid=49212
E1002 23:32:36.359984 49157 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: Get https://192.168.0.106:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster01&limit=500&resourceVersion=0: dial tcp 192.168.0.106:6443: connect: connection refused
I1002 23:32:36.377368 49157 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
E1002 23:32:36.380290 49157 kubelet.go:2236] node "master01" not found
E1002 23:32:36.380369 49157 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.0.106:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster01&limit=500&resourceVersion=0: dial tcp 192.168.0.106:6443: connect: connection refused
E1002 23:32:36.380409 49157 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: Get https://192.168.0.106:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.0.106:6443: connect: connection refused
time="2018-10-02T23:32:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/f621eca36ce85e815172c37195ae7ac9..." pid=49243
I1002 23:32:36.414930 49157 kubelet_node_status.go:70] Attempting to register node master01
E1002 23:32:36.416627 49157 kubelet_node_status.go:92] Unable to register node "master01" with API server: Post https://192.168.0.106:6443/api/v1/nodes: dial tcp 192.168.0.106:6443: connect: connection refused
time="2018-10-02T23:32:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/db3f5acb415581d85aef199bea3f8543..." pid=49259
E1002 23:32:36.488013 49157 kubelet.go:2236] node "master01" not found
time="2018-10-02T23:32:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/505110c39ed4cd5b3fd4fb86301201..." pid=49275
E1002 23:32:36.588919 49157 kubelet.go:2236] node "master01" not found
E1002 23:32:36.691338 49157 kubelet.go:2236] node "master01" not found

I have tried this many times!
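For reference, a minimal troubleshooting sequence for this state (a sketch, assuming Docker is the container runtime as in the steps above) that usually shows why the API server never comes up:

journalctl -xeu kubelet | tail -n 50          # kubelet errors around the failed startup
docker ps -a | grep kube | grep -v pause      # find exited control-plane containers
docker logs CONTAINERID                       # inspect the failing container, e.g. kube-apiserver or etcd

The repeated node "master01" not found lines only mean the kubelet cannot reach the API server yet; the actual cause is usually in the kube-apiserver or etcd container logs.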

Most helpful comment

Same problem with Kubernetes v1.13.0
CentOS 7
docker-ce 18.06 (latest validated version)
dockerd: active, running
kubelet: active, running
selinux: disabled
firewalld: disabled

The error is:
kubelet[98023]: E1212 21:10:01.708004 98023 kubelet.go:2266] node "node1" not found
/etc/hosts contains the node, it resolves and it is reachable -- this is actually a single-master, single-worker setup (i.e. a tainted master node).

Where does K8S look for this value? In /etc/hosts?
I can troubleshoot and provide more evidence if needed.

--> Does kubeadm init finish and print the bootstrap token?
It ends with a long error:

[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [node1 localhost] and IPs [10.10.128.186 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [node1 localhost] and IPs [10.10.128.186 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

Note: none of the commands suggested after the timeout reported anything worth mentioning here.

kubelet and kubeadm versions?
---> 1.13.0
kubeadm version: &version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.0", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"clean", BuildDate:"120203 01Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}

Also, shouldn't a better error message than "node not found" be logged here, and made clearer/more verbose?

Thanks

All 45 comments

First error message: unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory

First error message: unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory

Hi, a few questions here:
1) Does kubeadm init finish and print the bootstrap token?
2) Container runtime version?
3) Are the kubelet and kubeadm versions 1.12?

/priority needs-more-evidence

You need to run systemctl start kubelet before kubeadm init.
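For example (a sketch reusing the flags from the original report; on recent kubeadm versions the kubelet-start phase activates the service itself, so enabling it beforehand mainly ensures it survives reboots):

systemctl enable kubelet
systemctl start kubelet
kubeadm init --kubernetes-version=v1.12.0 --pod-network-cidr=10.244.0.0/16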

I ran into the same problem because the CPU had fewer than 2 cores.

Same problem

@javacppc how did you solve this? When I run systemctl start kubelet I get an error code.

Same problem with kubernetes 1.12.2.
@Javacppc how did you solve this?

Same problem

Same problem

Hi everyone,

I'm facing the same issue here. When I bring the cluster up I do get the bootstrap token message, but I cannot install Weave Net:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')" The connection to the server 192.168.56.104:6443 was refused - did you specify the right host or port?

When I go to the logs I get this message about the node name:

Dec 02 22:27:55 kubemaster5 kubelet[2838]: E1202 22:27:55.128645 2838 kubelet.go:2236] node "kubemaster5" not found

Can someone shed some light on this?

Thanks!

My problem is solved. Actually it wasn't a bug; it was because the apiserver failed to start for some reason.

"The apiserver failed to start for some reason"? Could you be more specific??

I solved my problem a few days ago, while updating from 1.11.4 -> 1.12.3. My setup:

  1. api-server - running on a specific virtual interface with its own network (bare metal).
    kubeadm init/join with the apiserver-advertise-address flag starts it on that specific interface, but packets/health checks go out over the standard route of the routing table (the default interface). What helped was the bind-address parameter in /etc/kubernetes/manifests/kube-apiserver.yaml, set to the IP bound to the virtual interface.
  2. flannel - same networking situation after the controller/scheduler pods are created. The DNS deployment failed with connection refused to the apiserver clusterIP 10.96.0.1:443 (default routing table). I specified the node IP of the cluster node via the --node-ip flag in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, using the IP of the virtual interface.

After that, I had a ready node on version 1.12.3. The most useful information was in docker logs + kubectl logs. A sketch of both changes follows below.
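A rough sketch of those two changes, assuming a CentOS-style /etc/sysconfig/kubelet drop-in and a placeholder interface IP of 10.10.128.1 (substitute your virtual interface's real address):

# 1) Pin the apiserver to the virtual interface's IP (placeholder 10.10.128.1):
#    in /etc/kubernetes/manifests/kube-apiserver.yaml add/adjust under "command:":
#        - --advertise-address=10.10.128.1
#        - --bind-address=10.10.128.1
#    The kubelet notices the edited static pod manifest and restarts the apiserver.

# 2) Make the kubelet register the node with that same IP:
echo 'KUBELET_EXTRA_ARGS=--node-ip=10.10.128.1' | sudo tee /etc/sysconfig/kubelet
sudo systemctl daemon-reload && sudo systemctl restart kubelet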

Same problem with v1.13.0

Same problem...

$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Fri 2018-12-14 19:05:47 UTC; 2min 2s ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 9114 (kubelet)
    Tasks: 23 (limit: 4915)
   CGroup: /system.slice/kubelet.service
           └─9114 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-d

Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.862262    9114 kuberuntime_manager.go:657] createPodSandbox for pod "kube-scheduler-pineview_kube-system(7f99b6875de942b000954351c4a
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.862381    9114 pod_workers.go:186] Error syncing pod 7f99b6875de942b000954351c4ac09b5 ("kube-scheduler-pineview_kube-system(7f99b687
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.906855    9114 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to start san
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.906944    9114 kuberuntime_sandbox.go:65] CreatePodSandbox for pod "etcd-pineview_kube-system(b7841e48f3e7b81c3cda6872104ba3de)" fai
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.906981    9114 kuberuntime_manager.go:657] createPodSandbox for pod "etcd-pineview_kube-system(b7841e48f3e7b81c3cda6872104ba3de)" fa
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.907100    9114 pod_workers.go:186] Error syncing pod b7841e48f3e7b81c3cda6872104ba3de ("etcd-pineview_kube-system(b7841e48f3e7b81c3c
Dec 14 19:07:49 pineview kubelet[9114]: E1214 19:07:49.933627    9114 kubelet.go:2236] node "pineview" not found
Dec 14 19:07:50 pineview kubelet[9114]: E1214 19:07:50.033880    9114 kubelet.go:2236] node "pineview" not found
Dec 14 19:07:50 pineview kubelet[9114]: E1214 19:07:50.134064    9114 kubelet.go:2236] node "pineview" not found
Dec 14 19:07:50 pineview kubelet[9114]: E1214 19:07:50.184943    9114 event.go:212] Unable to write event: 'Post https://192.168.1.235:6443/api/v1/namespaces/default/events: dial tcp 192.

Same problem:

Ubuntu 18.04.1 LTS
Kubernetes v1.13.1 (with cri-o 1.11)

Followed the install instructions on kubernetes.io:
https://kubernetes.io/docs/setup/independent/install-kubeadm/
https://kubernetes.io/docs/setup/cri/#cri-o

systemctl enable kubelet.service
kubeadm init --pod-network-cidr=192.168.0.0/16 --cri-socket=/var/run/crio/crio.sock

/etc/hosts

127.0.0.1       localhost
::1             localhost ip6-localhost ip6-loopback
ff02::1         ip6-allnodes
ff02::2         ip6-allrouters

127.0.1.1       master01.mydomain.tld master01
::1             master01.mydomain.tld master01

/etc/hostname


systemctl status kubelet

● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Tue 2018-12-18 16:19:54 CET; 20min ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 10148 (kubelet)
    Tasks: 21 (limit: 2173)
   CGroup: /system.slice/kubelet.service
           └─10148 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock --resolv-conf=/run/systemd/resolve/resolv.conf

Dec 18 16:40:52 master01 kubelet[10148]: E1218 16:40:52.795313   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:52 master01 kubelet[10148]: E1218 16:40:52.896277   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:52 master01 kubelet[10148]: E1218 16:40:52.997864   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.098927   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.200355   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.281586   10148 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.178.27:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster01limit=500&resourceVersion=0: dial tcp 192.168.178.27:6443: connect: connection refused
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.282143   10148 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://192.168.178.27:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.178.27:6443: connect: connection refused
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.283945   10148 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://192.168.178.27:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster01limit=500&resourceVersion=0: dial tcp 192.168.178.27:6443: connect: connection refused
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.301468   10148 kubelet.go:2266] node "master01" not found
Dec 18 16:40:53 master01 kubelet[10148]: E1218 16:40:53.402256   10148 kubelet.go:2266] node "master01" not found

@fhemberger I figured out my problem. Docker had been installed with snap. After I uninstalled it and reinstalled it with apt, kubeadm works fine.

@cjbottaro I don't use Docker at all, only cri-o.

Same problem with v1.13.1

If you use systemd and cri-o, make sure it is set as the cgroup driver in /var/lib/kubelet/config.yaml (or pass the snippet below as part of kubeadm init --config=config.yaml).

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd

This is your case if you notice the following in the kubelet logs:

remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = cri-o configured with systemd cgroup manager, but did not receive slice as parent: /kubepods/besteffort/…
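For example, a minimal kubeadm config along those lines (a sketch; the file name config.yaml and the bare ClusterConfiguration stanza are illustrative, the cgroupDriver line is the point):

cat <<'EOF' > config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF
kubeadm init --config=config.yaml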

I ran into the same problem today.

I fixed it by removing /var/lib/kubelet/ (rm -rf /var/lib/kubelet/) and reinstalling.

@JishanXing thank you! That also solved my problem running on Raspbian Stretch Lite.

I fixed it by deleting /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf

Better to use the kubeadm reset command.
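A minimal reset-and-retry sequence along those lines (a sketch, assuming you want kubeadm to wipe its previous state rather than deleting /var/lib/kubelet by hand):

sudo kubeadm reset -f                 # cleans up /etc/kubernetes and the kubelet state
sudo systemctl restart kubelet
sudo kubeadm init --pod-network-cidr=10.244.0.0/16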

@fhemberger how did you solve it? Same problem here, thanks.

I ran into the same problem when I upgraded k8s from 1.13.3 to 1.13.4...
I solved it after editing /etc/kubernetes/manifests/kube-scheduler.yaml and changing the image version:
image: k8s.gcr.io/kube-scheduler:v1.13.3 ==> image: k8s.gcr.io/kube-scheduler:v1.13.4
Same for kube-controller-manager.yaml and kube-apiserver.yaml.
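A hedged one-liner for that kind of manifest edit (the versions are the ones mentioned above; the kubelet re-creates the static pods when the files change):

sudo sed -i 's/v1\.13\.3/v1.13.4/' /etc/kubernetes/manifests/kube-apiserver.yaml /etc/kubernetes/manifests/kube-controller-manager.yaml /etc/kubernetes/manifests/kube-scheduler.yaml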

The latest way is to add the option --image-repository registry.aliyuncs.com/google_containers. My k8s version is 1.14.0, docker version: 18.09.2.

kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.14.0 --pod-network-cidr=192.168.0.0/16
[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [jin-virtual-machine kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.232.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [jin-virtual-machine localhost] and IPs [192.168.232.130 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [jin-virtual-machine localhost] and IPs [192.168.232.130 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.004356 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node jin-virtual-machine as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node jin-virtual-machine as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: xucir0.o4kzo3qqjyjnzphl
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.232.130:6443 --token xucir0.o4kzo3qqjyjnzphl
--discovery-token-ca-cert-hash sha256:022048b22926a2cb2f8295ce2e3f1f6fa7ffe1098bc116f7d304a26bcfb78656

Ran into the same problem with kubernetes v1.14.1 and cri-o v1.14.0 on a GCP Ubuntu 18.04 VM, while everything works fine when using docker. Reference: https:

My problem was mismatched cgroup drivers. CRI-O uses systemd by default, while the kubelet defaults to cgroupfs.

cat /etc/crio/crio.conf | grep cgroup
# cgroup_manager is the cgroup management implementation to be used
cgroup_manager = "systemd"

If that's your case, see https://kubernetes.io/docs/setup/independent/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-master-node

Just write the file:

echo "KUBELET_EXTRA_ARGS=--cgroup-driver=systemd" > /etc/default/kubelet

Then run kubeadm init. Or change cgroup_manager to cgroupfs.
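Put together, the sequence looks roughly like this (a sketch; the init flags mirror the cri-o example earlier in this thread):

echo "KUBELET_EXTRA_ARGS=--cgroup-driver=systemd" | sudo tee /etc/default/kubelet
sudo systemctl daemon-reload && sudo systemctl restart kubelet
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --cri-socket=/var/run/crio/crio.sock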

Unlike docker, cri-o and containerd are a bit trickier to manage when it comes to cgroup driver detection, but kubeadm has some plans to support that.

For docker this is already handled.

So apparently there is no solution other than resetting the cluster $(yes | kubeadm reset), which in my opinion is not a solution!

Changing the image repository worked for me, but it's not a great workaround.
--image-repository registry.aliyuncs.com/google_containers

My case was related to this:

sed -i 's/cgroup-driver=systemd/cgroup-driver=cgroupfs/g' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

I had the same problem. I used kubeadm init --config=init-config.yaml and it failed; the file was generated by kubeadm. It contains an advertiseAddress field that defaults to 1.2.3.4, which makes the etcd container fail to start. When I changed it to 127.0.0.1, the etcd container started successfully and kubeadm init succeeded.

To troubleshoot this, use docker ps -a to list all containers and check whether any of them have exited; if so, use docker logs CONTAINER_ID to see what happened. Hope this helps.
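For reference, the field in question looks roughly like this in a kubeadm-generated config (a sketch; 192.168.0.106 stands in for the node's real IP, and on a single-node test 127.0.0.1 as used above also works):

kubeadm config print init-defaults > init-config.yaml
grep -A1 localAPIEndpoint init-config.yaml
#   localAPIEndpoint:
#     advertiseAddress: 1.2.3.4    <-- placeholder that must be replaced
sed -i 's/advertiseAddress: 1.2.3.4/advertiseAddress: 192.168.0.106/' init-config.yaml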

Hi everyone, is there any solution for this? Same problem, but with k3s.

@MateusMac you should open a bug report against k3s as well.

Been working for a week trying to get kubeadm to work
Ubuntu 18.04
Docker 18.06-2-ce
k8s 1.15.1
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

Fails with:

[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

The kubelet log shows that it can't find the node, so nothing gets to first base:

warproot@warp02:~$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sun 2019-08-04 18:22:26 AEST; 5min ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 12569 (kubelet)
    Tasks: 27 (limit: 9830)
   CGroup: /system.slice/kubelet.service
           └─12569 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-dri

Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.322762   12569 kuberuntime_sandbox.go:68] CreatePodSandbox for pod "kube-scheduler-warp02_kube-system(ecae9d12d3610192347be3d1aa5aa552)"
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.322806   12569 kuberuntime_manager.go:692] createPodSandbox for pod "kube-scheduler-warp02_kube-system(ecae9d12d3610192347be3d1aa5aa552)
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.322872   12569 pod_workers.go:190] Error syncing pod ecae9d12d3610192347be3d1aa5aa552 ("kube-scheduler-warp02_kube-system(ecae9d12d36101
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.373094   12569 kubelet.go:2248] node "warp02" not found
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.375587   12569 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Get https://10.1.1.4:6443
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.473295   12569 kubelet.go:2248] node "warp02" not found
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.573567   12569 kubelet.go:2248] node "warp02" not found
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.575495   12569 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Get https://10.1.1.4:6
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.590886   12569 event.go:249] Unable to write event: 'Post https://10.1.1.4:6443/api/v1/namespaces/default/events: dial tcp 10.1.1.4:6443
Aug 04 18:28:03 warp02 kubelet[12569]: E0804 18:28:03.673767   12569 kubelet.go:2248] node "warp02" not found




I should note that I have multiple NICs on these bare-metal machines:

warproot@warp02:~$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:feff:fe65:37f  prefixlen 64  scopeid 0x20<link>
        ether 02:42:fe:65:03:7f  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6  bytes 516 (516.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp35s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.2  netmask 255.255.255.0  broadcast 10.0.0.255
        inet6 fe80::32b5:c2ff:fe02:410b  prefixlen 64  scopeid 0x20<link>
        ether 30:b5:c2:02:41:0b  txqueuelen 1000  (Ethernet)
        RX packets 46  bytes 5821 (5.8 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 70  bytes 7946 (7.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp6s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.1.1.4  netmask 255.255.255.0  broadcast 10.1.1.255
        inet6 fd42:59ff:1166:0:25a7:3617:fee6:424e  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::1a03:73ff:fe44:5694  prefixlen 64  scopeid 0x20<link>
        inet6 fd9e:fdd6:9e01:0:1a03:73ff:fe44:5694  prefixlen 64  scopeid 0x0<global>
        ether 18:03:73:44:56:94  txqueuelen 1000  (Ethernet)
        RX packets 911294  bytes 1361047672 (1.3 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 428759  bytes 29198065 (29.1 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 17  

ib0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 4092
        unspec A0-00-02-10-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ib1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 4092
        unspec A0-00-02-20-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 25473  bytes 1334779 (1.3 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 25473  bytes 1334779 (1.3 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


I don't know if this is a problem, but I have the /etc/hosts file set up as:

warproot@warp02:~$ cat /etc/hosts
127.0.0.1       localhost.localdomain   localhost
::1             localhost6.localdomain6 localhost6
# add our host name
10.1.1.4 warp02 warp02.ad.xxx.com
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
# add our ipv6 host name
fd42:59ff:1166:0:25a7:3617:fee6:424e warp02 warp02.ad.xxx.com

warproot@warp02:~$

So the setup (I think) treats the NIC at 10.1.1.4 as the "network" for k8s.

An nslookup against the node name seems to work fine:

warproot@warp02:~$ nslookup warp02
Server:         127.0.0.53
Address:        127.0.0.53#53

Non-authoritative answer:
Name:   warp02.ad.xxx.com
Address: 10.1.1.4
Name:   warp02.ad.xxx.com
Address: fd42:59ff:1166:0:25a7:3617:fee6:424e

warproot@warp02:~$

I have read the kubeadm install docs many times.

Strange. It just can't find the network.

Stumped.
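On a multi-homed host like this, one thing worth trying (a sketch, not a confirmed fix; 10.1.1.4 is the NIC intended for k8s here) is to pin both kubeadm and the kubelet to that interface, as the earlier multi-NIC comment in this thread did:

echo 'KUBELET_EXTRA_ARGS=--node-ip=10.1.1.4' | sudo tee /etc/default/kubelet
sudo systemctl daemon-reload && sudo systemctl restart kubelet
sudo kubeadm init --apiserver-advertise-address=10.1.1.4 --pod-network-cidr=10.244.0.0/16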

For 1.15.3 I was able to fix this on Ubuntu 18.04 by adding

kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cgroup-driver: "systemd"

to my kubeadm config and then running kubeadm init.
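A minimal end-to-end sketch of that (the file name kubeadm-config.yaml is just an example, and apiVersion v1beta2 is the kubeadm config version assumed for 1.15):

cat <<'EOF' > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cgroup-driver: "systemd"
EOF
sudo kubeadm init --config=kubeadm-config.yaml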

I ran into the same problem on Ubuntu 18.04 with 1.15.3.
@kris-nova I would appreciate it if you could specify where this config file lives :-)

Update: I really don't know why, but it works now, without changing any configuration!
(Note: I don't know if it's related, but before retrying kubeadm init I updated docker from v19.03.1 to v19.03.2.)

I get the following error when running kubeadm init, i.e. node xx not found...

[root@node01 ~]# journalctl -xeu kubelet
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.682095 2968 kubelet.go:2267] node "node01" not found
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.782554 2968 kubelet.go:2267] node "node01" not found
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.829142 2968 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.CSID
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.884058 2968 kubelet.go:2267] node "node01" not found
Nov 07 10:34:02 node01 kubelet[2968]: E1107 10:34:02.984510 2968 kubelet.go:2267] node "node01" not found
Nov 07 10:34:03 node01 kubelet[2968]: E1107 10:34:03.030884 2968 reflector.go:123]

Solved it by:

setenforce 0

sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux

Same problem

In my case it was caused by time drift on the master node, _which happened after a power outage_.
I fixed it by running:

# Correcting the time as mentioned here https://askubuntu.com/a/254846/861548
sudo service ntp stop
sudo ntpdate -s time.nist.gov
sudo service ntp start
# Then restarting the kubelet
sudo systemctl restart kubelet.service
# I also had to run daemon-reload as I got the following warning
# Warning: The unit file, source configuration file or drop-ins of kubelet.service changed on disk. Run 'systemctl daemon-reload' to reload units.
sudo systemctl daemon-reload
# I also made another restart, which I don't know whether needed or not
sudo systemctl restart kubelet.service

I solved the same node "xxxx" not found problem. Try looking at the container logs with docker logs container_id; in my case I saw the apiserver trying to connect to 127.0.0.1:2379. Edit the file /etc/kubernetes/manifests/etcd.yaml, restart, and the problem is solved.
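The etcd settings to check in that manifest are roughly these (a sketch; 192.168.0.106 stands in for the node's real IP, and the flag names are the standard etcd ones):

grep -E 'listen-client-urls|advertise-client-urls' /etc/kubernetes/manifests/etcd.yaml
#   --listen-client-urls=https://127.0.0.1:2379,https://192.168.0.106:2379
#   --advertise-client-urls=https://192.168.0.106:2379
# After editing the manifest, the kubelet restarts the etcd static pod on its own.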
