Kubernetes: LoadBalancer delayed routing to pods after RollingUpdate

Created on 5 Dec 2016  ·  3 Comments  ·  Source: kubernetes/kubernetes

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.):
No

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.):

https://github.com/kubernetes/kubernetes/issues?page=3&q=is%3Aissue+is%3Aopen+loadbalancer+update&utf8=%E2%9C%93


Is this a BUG REPORT or FEATURE REQUEST? (choose one):
Bug report

Kubernetes version (use kubectl version):
Client 1.4.4
Cluster 1.4.6

Environment:

  • Cloud provider or hardware configuration: GKE
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
    3-node cluster

What happened:
After applying a rolling update, the new pods come up to Running and Ready and the old pod(s) terminate.
Requests to the load balancer then time out. After a few minutes, traffic is routed correctly again.

What you expected to happen:
Traffic is seamlessly routed to the new pods.

How to reproduce it (as minimally and precisely as possible):
https://gist.github.com/1d668ba12b12f450e8feffb21383ba44

1. kubectl apply -f deployment.yaml
2. kubectl get svc, wait for the external IP to come up.
3. Observe that traffic is routed correctly.
4. Edit something (e.g. an environment variable) and run kubectl apply -f deployment.yaml again.
5. Wait for the old pods to terminate. Observe timeouts until the load balancer updates itself.
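
The deployment.yaml in the gist is not reproduced here; as a rough stand-in, a minimal manifest of the same shape (the image, names, replica count and ports below are assumptions, not the gist's contents) looks something like:

apiVersion: extensions/v1beta1   # Deployment API group as of 1.4.x
kind: Deployment
metadata:
  name: echo
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
      - name: echo
        image: gcr.io/google_containers/echoserver:1.4
        ports:
        - containerPort: 8080
        env:
        - name: VERSION
          value: "1"          # edit this and re-apply to trigger the RollingUpdate
---
apiVersion: v1
kind: Service
metadata:
  name: echo
spec:
  type: LoadBalancer
  selector:
    app: echo
  ports:
  - port: 80
    targetPort: 8080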

Anything else we need to know:


All 3 comments

I think terminating pods will lead to failing requests by default, as there is no "connection draining" in Kubernetes; you would have to manually toggle the app's readiness probe "just in time": https://github.com/RisingStack/kubernetes-graceful-shutdown-example

Not sure whether this is your issue (don't see the source of your app / Docker image).
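
The pattern in the linked example, translated into a rough Go sketch (not the reporter's app or the linked repo's code; the endpoints, port and timings are assumptions): on SIGTERM, start failing the readiness probe so the pod is removed from the Service endpoints, wait for that to propagate, then drain in-flight requests before exiting.

package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
	"time"
)

func main() {
	var shuttingDown atomic.Bool

	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello\n"))
	})
	// Readiness endpoint: starts returning 503 once shutdown begins, so the
	// endpoints controller removes the pod before its connections are cut.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		if shuttingDown.Load() {
			http.Error(w, "shutting down", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	srv := &http.Server{Addr: ":8080", Handler: mux}

	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
	go func() {
		<-stop
		shuttingDown.Store(true)
		// Give the readiness probe time to fail and the load balancer time to
		// stop sending new connections before closing the listener.
		time.Sleep(15 * time.Second)
		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
		defer cancel()
		_ = srv.Shutdown(ctx) // waits for in-flight requests to finish
	}()

	if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
		panic(err)
	}
}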

App source code:
https://gist.github.com/d68192f04e3ff50bf9bf7e90ee879077

I'll try altering the source code to drain requests. Makes sense that this might be the problem.

Confirmed. Combination of draining and the readiness check resulted in zero downtime:
https://gist.github.com/ac98158ccfd0c006de0bb0bc7d31a596

Sorry for the erroneous report.
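
For reference, the readiness-plus-drain combination described above usually ends up looking roughly like this fragment of the Deployment's pod spec (the probe path, port and timings are illustrative assumptions, not the contents of the gist):

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: app
        image: gcr.io/my-project/app:1.0   # hypothetical image
        readinessProbe:
          httpGet:
            path: /healthz     # starts failing once the app begins shutting down
            port: 8080
          periodSeconds: 2
          failureThreshold: 1
        lifecycle:
          preStop:
            exec:
              # delay SIGTERM so the failing readiness probe can propagate and
              # the load balancer stops routing to this pod first
              command: ["sleep", "15"]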
