BUG REPORT
kubeadm version (use kubeadm version):
{
  "clientVersion": {
    "major": "1",
    "minor": "11",
    "gitVersion": "v1.11.2",
    "gitCommit": "bb9ffb1654d4a729bb4cec18ff088eacc153c239",
    "gitTreeState": "clean",
    "buildDate": "2018-08-07T23:14:39Z",
    "goVersion": "go1.10.3",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}
Environment:

kubectl version:
{
  "clientVersion": {
    "major": "1",
    "minor": "11",
    "gitVersion": "v1.11.2",
    "gitCommit": "bb9ffb1654d4a729bb4cec18ff088eacc153c239",
    "gitTreeState": "clean",
    "buildDate": "2018-08-07T23:17:28Z",
    "goVersion": "go1.10.3",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "serverVersion": {
    "major": "1",
    "minor": "11",
    "gitVersion": "v1.11.2",
    "gitCommit": "bb9ffb1654d4a729bb4cec18ff088eacc153c239",
    "gitTreeState": "clean",
    "buildDate": "2018-08-07T23:08:19Z",
    "goVersion": "go1.10.3",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}
uname -a:

$ kubectl get all --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-78fcdf6894-bvtcg 1/1 Running 2 3h
kube-system pod/coredns-78fcdf6894-lq7st 1/1 Running 2 3h
kube-system pod/etcd-k8s-master 1/1 Running 1 3h
kube-system pod/kube-apiserver-k8s-master 1/1 Running 1 3h
kube-system pod/kube-controller-manager-k8s-master 1/1 Running 1 3h
kube-system pod/kube-flannel-ds-6tgqf 1/1 Running 2 3h
kube-system pod/kube-flannel-ds-cn4ql 1/1 Running 1 3h
kube-system pod/kube-proxy-cjlvz 1/1 Running 1 3h
kube-system pod/kube-proxy-w7ts7 1/1 Running 1 3h
kube-system pod/kube-scheduler-k8s-master 1/1 Running 1 3h
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 3h
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/kube-flannel-ds 2 2 2 2 2 beta.kubernetes.io/arch=amd64 3h
kube-system daemonset.apps/kube-proxy 2 2 2 2 2 beta.kubernetes.io/arch=amd64 3h
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/coredns 2 2 2 2 3h
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/coredns-78fcdf6894 2 2 2 3h
I've created a service so a pod can curl another pod, but the name is never resolved.
Exec-ing into the pod:
# cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
In an older installation where kube-dns was the default, I remember a service named "kube-dns" with IP 10.96.0.10. This installation doesn't have such a service.
curl my-service
curl: (6) Could not resolve host: my-service
curl my-service.default.svc.cluster.local
curl: (6) Could not resolve host: my-service.default.svc.cluster.local
curl www.google.com
curl: (6) Could not resolve host: www.google.com
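To separate "the DNS service is missing" from "the DNS pods are broken", it helps to query the expected cluster DNS IP directly from inside the pod. A minimal sketch, assuming nslookup is available in the pod image:

# Query 10.96.0.10 (the nameserver from /etc/resolv.conf) directly;
# a timeout here means nothing is answering on that IP at all.
nslookup kubernetes.default.svc.cluster.local 10.96.0.10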
The DNS lookup should resolve.
Fresh install with kubeadm and flannel on CentOS 7, with one node and the master also acting as a node.
Create a pod and a service, then try to curl the service from inside another pod.
The IP address I see inside /etc/resolv.conf (10.96.0.10) is the same one I had with kube-dns, but this time nothing answers at 10.96.0.10.
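To confirm whether anything is actually behind that IP, list the services in kube-system. On a healthy kubeadm install there should be a kube-dns service with that cluster IP, roughly like this:

$ kubectl get svc -n kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP   3h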
$ kubectl logs -f --namespace=kube-system coredns-78fcdf6894-bvtcg
.:53
CoreDNS-1.1.3
linux/amd64, go1.10.1, b0fd575c
2018/08/14 15:34:06 [INFO] CoreDNS-1.1.3
2018/08/14 15:34:06 [INFO] linux/amd64, go1.10.1, b0fd575c
2018/08/14 15:34:06 [INFO] plugin/reload: Running configuration MD5 = 2a066f12ec80aeb2b92740dd74c17138
^C
$ kubectl logs -f --namespace=kube-system coredns-78fcdf6894-lq7st
.:53
2018/08/14 15:34:06 [INFO] CoreDNS-1.1.3
2018/08/14 15:34:06 [INFO] linux/amd64, go1.10.1, b0fd575c
2018/08/14 15:34:06 [INFO] plugin/reload: Running configuration MD5 = 2a066f12ec80aeb2b92740dd74c17138
CoreDNS-1.1.3
linux/amd64, go1.10.1, b0fd575c
For whatever reason, there is no kube-dns service on your cluster. You'll first need to re-create that by hand to fix things. Then we can try to figure out how it disappeared.
You can use this YAML to create the service with kubectl apply -f:
...
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.96.0.10
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
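For example, saving the manifest to a file (kube-dns-svc.yaml is just a hypothetical name), applying it, and then checking that the service picked up the CoreDNS pod IPs as endpoints:

kubectl apply -f kube-dns-svc.yaml
kubectl get endpoints kube-dns -n kube-system
# If ENDPOINTS is empty, the selector isn't matching the CoreDNS pods.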
Note: it's counter-intuitive that the CoreDNS service is still named "kube-dns", but it does select the CoreDNS pods (which carry the label k8s-app: kube-dns).
I'm having the same issue as the OP, and the description and use case are just about the same: kubeadm on CentOS 7.5 with one master that also operates as the worker node. In my case, though, the service DOES exist:
λ k get all --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default pod/busybox 0/1 Error 0 28m
default pod/gitlab-gitlab-fd8b9fb85-26mkz 0/1 CrashLoopBackOff 6 50m
default pod/gitlab-minio-7fb7886d94-2zsff 1/1 Running 0 50m
default pod/gitlab-postgresql-8684bb6656-ltxjm 1/1 Running 0 50m
default pod/gitlab-redis-785447c586-84x4c 1/1 Running 0 50m
default pod/ldap-79bb8c66b9-68v9f 1/1 Running 0 2d
default pod/local-volume-provisioner-dkxm9 1/1 Running 0 2d
kube-system pod/coredns-78fcdf6894-2t8tv 1/1 Running 0 2d
kube-system pod/coredns-78fcdf6894-wvq26 1/1 Running 0 2d
kube-system pod/etcd-server1.stitches.tech 1/1 Running 0 2d
kube-system pod/kube-apiserver-server1.domain 1/1 Running 0 2d
kube-system pod/kube-controller-manager-server1.domain 1/1 Running 0 2d
kube-system pod/kube-flannel-ds-m9cz5 1/1 Running 0 2d
kube-system pod/kube-proxy-qhr8p 1/1 Running 0 2d
kube-system pod/kube-scheduler-server1.domain 1/1 Running 0 2d
kube-system pod/kubernetes-dashboard-6948bdb78-qnp4b 1/1 Running 0 2d
kube-system pod/tiller-deploy-56c4cf647b-64w8v 1/1 Running 0 2d
metallb-system pod/controller-9c57dbd4-fqhzb 1/1 Running 0 2d
metallb-system pod/speaker-tngv7 1/1 Running 0 2d
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/gitlab-gitlab LoadBalancer 10.102.204.34 192.168.1.201 22:32208/TCP,80:32194/TCP,443:31370/TCP 50m
default service/gitlab-minio ClusterIP None <none> 9000/TCP 50m
default service/gitlab-postgresql ClusterIP 10.108.66.88 <none> 5432/TCP 50m
default service/gitlab-redis ClusterIP 10.97.59.57 <none> 6379/TCP 50m
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d
default service/ldap-service LoadBalancer 10.101.250.10 192.168.1.200 389:32231/TCP 2d
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 2d
kube-system service/kubernetes-dashboard NodePort 10.104.132.52 <none> 443:30924/TCP 2d
kube-system service/tiller-deploy ClusterIP 10.96.67.163 <none> 44134/TCP 2d
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
default daemonset.apps/local-volume-provisioner 1 1 1 1 1 <none> 2d
kube-system daemonset.apps/kube-flannel-ds 1 1 1 1 1 beta.kubernetes.io/arch=amd64 2d
kube-system daemonset.apps/kube-proxy 1 1 1 1 1 beta.kubernetes.io/arch=amd64 2d
metallb-system daemonset.apps/speaker 1 1 1 1 1 <none> 2d
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
default deployment.apps/gitlab-gitlab 1 1 1 0 50m
default deployment.apps/gitlab-minio 1 1 1 1 50m
default deployment.apps/gitlab-postgresql 1 1 1 1 50m
default deployment.apps/gitlab-redis 1 1 1 1 50m
default deployment.apps/ldap 1 1 1 1 2d
kube-system deployment.apps/coredns 2 2 2 2 2d
kube-system deployment.apps/kubernetes-dashboard 1 1 1 1 2d
kube-system deployment.apps/tiller-deploy 1 1 1 1 2d
metallb-system deployment.apps/controller 1 1 1 1 2d
NAMESPACE NAME DESIRED CURRENT READY AGE
default replicaset.apps/gitlab-gitlab-fd8b9fb85 1 1 0 50m
default replicaset.apps/gitlab-minio-7fb7886d94 1 1 1 50m
default replicaset.apps/gitlab-postgresql-8684bb6656 1 1 1 50m
default replicaset.apps/gitlab-redis-785447c586 1 1 1 50m
default replicaset.apps/ldap-79bb8c66b9 1 1 1 2d
kube-system replicaset.apps/coredns-78fcdf6894 2 2 2 2d
kube-system replicaset.apps/kubernetes-dashboard-6948bdb78 1 1 1 2d
kube-system replicaset.apps/tiller-deploy-56c4cf647b 1 1 1 2d
kube-system replicaset.apps/tiller-deploy-64c9d747bd 0 0 0 2d
metallb-system replicaset.apps/controller-9c57dbd4 1 1 1 2d
From the CoreDNS pods, I can't seem to do lookups out to the outside world, which seems strange:
root on server1 at 11:45:48 AM in /internal/gitlab
λ k exec -it coredns-78fcdf6894-2t8tv /bin/sh -n kube-system
/ # cat /etc/resolv.conf
nameserver 192.168.1.254
nameserver 2600:1700:c540:64c0::1
search attlocal.net domain
/ # host gitlab
;; connection timed out; no servers could be reached
/ # host google.com
;; connection timed out; no servers could be reached
To me, this means the CoreDNS pod can't see its upstream nameserver, which is 192.168.1.254, the IP of the host network. Am I on the right track?
But, what's even stranger, is that a pod running on that master node CAN reach that IP address just fine:
λ kubectl run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# ping 192.168.1.254
PING 192.168.1.254 (192.168.1.254): 56 data bytes
64 bytes from 192.168.1.254: seq=0 ttl=63 time=1.102 ms
Can you try with dig?
dig google.com @192.168.1.254
Also, systems with a valid IPv6 config will typically try the IPv6 resolver first; if that fails, they report the lookup as a failure. Take a look at the dig command first. If that works, check whether the system is configured dual-stack (IPv4 and IPv6) or not.
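For instance, forcing IPv4 transport with a short timeout makes it easy to see whether the upstream answers at all (-4, +time, and +tries are standard dig options):

# Force IPv4 and fail fast if the upstream doesn't answer
dig -4 google.com @192.168.1.254 +time=2 +tries=1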
Thanks again to @mauilion for spending so much time helping me diagnose this issue today!
My solution (albeit quite terrible for now) was just to disable the firewalld service on my host OS:
sudo systemctl stop firewalld
sudo systemctl disable firewalld
Keep in mind what that command is actually doing. Do so at your own risk.
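A less drastic alternative, if you'd rather keep firewalld running, is to open just what the cluster needs. This is only a sketch, assuming flannel's default VXLAN backend (UDP 8472) on a single-node kubeadm install; the exact port list depends on your setup:

sudo firewall-cmd --permanent --add-port=6443/tcp    # API server
sudo firewall-cmd --permanent --add-port=10250/tcp   # kubelet
sudo firewall-cmd --permanent --add-port=8472/udp    # flannel VXLAN overlay
sudo firewall-cmd --permanent --add-masquerade       # NAT for pod traffic
sudo firewall-cmd --reload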
I ran into the same issue with Kubernetes 1.11.2 and flannel 0.10.0 deployed to a CentOS 7 VM via kubeadm, with kube-proxy configured to use iptables. What I noticed is that I had no pod-to-pod or pod-to-service communication after the initial deployment. Looking at the FORWARD chain in iptables, kube-proxy had set up a KUBE-FORWARD chain as the first rule, which should, upon inspection, handle all the traffic I described above. Flannel appended two rules after the DROP and REJECT rules that are the defaults in the CentOS 7 FORWARD chain. I noticed that when I removed the REJECT rule, the rules added by flannel would process the traffic, and my pods could communicate with other pods and with service IPs.
Since kube-proxy monitors the KUBE-FORWARD chain and keeps it from changing, I added two rules after the KUBE-FORWARD rule that match a ctstate of NEW. Once I added these rules, internal traffic was processed as I expected.
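As a sketch of what that could look like (10.244.0.0/16 is flannel's default pod CIDR; these rules are my reading of the commenter's description, not an official fix):

# Inspect the FORWARD chain; stock CentOS 7 ends it with a REJECT rule
sudo iptables -L FORWARD -n --line-numbers

# Insert ACCEPT rules just after the KUBE-FORWARD jump (assumed to be rule 1),
# matching new connections to and from the pod network
sudo iptables -I FORWARD 2 -s 10.244.0.0/16 -m conntrack --ctstate NEW -j ACCEPT
sudo iptables -I FORWARD 3 -d 10.244.0.0/16 -m conntrack --ctstate NEW -j ACCEPT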
Please check the clusterDNS variable in /var/lib/kubelet/config.yaml. For our configuration this was set (incorrectly) to 10.96.0.10, whereas it should have been 10.244.240.10 (that's what we've bootstrapped our cluster with). Changing this and restarting kubelet fixed the issue for us. Your mileage may vary though.
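For reference, the relevant stanza of /var/lib/kubelet/config.yaml looks like this (10.96.0.10 is the kubeadm default; the value must match the clusterIP of the kube-dns service in your cluster):

clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local

Then restart the kubelet:

sudo systemctl restart kubelet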
@pkeuter, 10.244.0.0/16 is the default pod CIDR for flannel. If that's so in your case, then 10.244.240.10 would be a pod IP, which you shouldn't use as your cluster-dns IP setting (it could change, and there is no load balancing).
It is not: we've bootstrapped the cluster with --pod-network-cidr=10.244.0.0/16 --service-cidr=10.244.240.0/20. But as I see now there is some overlap, which I should change anyway :-) So thanks for that, @chrisohaver!
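For comparison, a non-overlapping pair of ranges would be the kubeadm and flannel defaults, shown here as an example rather than a recommendation:

kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12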
> Please check the clusterDNS variable in /var/lib/kubelet/config.yaml. For our configuration this was set (incorrectly) to 10.96.0.10, whereas it should have been 10.244.240.10 (that's what we've bootstrapped our cluster with). Changing this and restarting kubelet fixed the issue for us. Your mileage may vary though.
Thank you for this - it helped me track down why my internal DNS requests were not resolving.
For reference, I had to set my clusterDNS value to 192.168.0.10, as I ran kubeadm init with --service-cidr=192.168.0.0/16 and my kube-dns service has that as its cluster IP.
Also of note, simply restarting kubelet was not enough - I had to restart my pods so /etc/resolv.conf was updated. Once that was done, requests resolved as expected.
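On 1.11 there is no kubectl rollout restart yet, so deleting the pods and letting their controller recreate them is the simplest way to refresh /etc/resolv.conf (app=my-app is a hypothetical label; use whatever selects your pods):

kubectl delete pods -l app=my-app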
There were a number of conflating issues with CoreDNS that have since been resolved. Given the overloaded set of issues, I'm going to close this one.
If there are specific repros on 1.12+, feel free to open a new issue and we'll address it ASAP.
> Please check the clusterDNS variable in /var/lib/kubelet/config.yaml. For our configuration this was set (incorrectly) to 10.96.0.10, whereas it should have been 10.244.240.10 (that's what we've bootstrapped our cluster with). Changing this and restarting kubelet fixed the issue for us. Your mileage may vary though.
Great. I use Calico; which clusterDNS address should I set?
I did the same but am facing the same error: my CoreDNS pods won't start and are stuck in an error state.
I changed my clusterDNS but it still has no effect, @justlooks.
+1 - facing the same issue on CentOS 7 with kubeadm 1.11.
@timothysc Adding iptables -P FORWARD ACCEPT (note the capital -P: it sets the FORWARD chain's default policy to ACCEPT) fixed the issue.
+1 - facing the same issue on CentOS 7 with kubeadm 1.12.
Found the resolution for the issue: I removed the resource limits on the CoreDNS deployment, as it was hitting its CPU limit, which was making it restart.
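If you want to check for the same problem, the limits live in the CoreDNS deployment and can be edited in place (exact limit values vary by install; the stanza below is illustrative):

kubectl -n kube-system edit deployment coredns
# ...then remove or raise the cpu/memory entries under:
#   resources:
#     limits: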
Maybe it's a flannel problem. In my case, the Vagrant VM has multiple network interfaces, so you must specify the interface when deploying flannel (--iface=eth1); otherwise the same DNS problem is going to happen...
https://github.com/kubernetes/kubernetes/issues/39701
vim https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
modified as follows:
......
containers:
- name: kube-flannel
  image: quay.io/coreos/flannel:v0.11.0-amd64
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth1
......
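Then apply the modified manifest and let the flannel pods be recreated (this assumes the file was saved locally as kube-flannel.yml; app=flannel is the label used in the upstream manifest):

kubectl apply -f kube-flannel.yml
kubectl -n kube-system delete pod -l app=flannel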
Thanks @pkeuter, that fixed the issue; I had to delete the coredns pods and let them be recreated.