Kubernetes: Is there a way to generate yml files that will produce the existing cluster?

Created on 27 Apr 2016  ·  24 Comments  ·  Source: kubernetes/kubernetes

Given a Kubernetes cluster that is running some number of pods, services, deployments, etc., I would like to generate one or more files (YAML format preferred) that would reproduce the current configuration when applied to a new cluster.

My use case is a promotion system. I have my 'stack files' as yml files in a git repo, but I need to allow humans to approve changes before they are applied to the cluster.

One way to do this is to use an 'open loop' system. I can use tags or other mechanisms to determine which versions have been applied to the cluster, and then compare the latest version available with the latest deployed version.

The problem with the open-loop system is that it does not consider that changes could have been made outside the files, or that changes applied could have had problems, etc.

If I could extract the 'equivalent' files from a running cluster, I could compare them with the ones that are about to be applied. This is a much stronger, 'closed loop' system-- it is able to correctly understand what will happen when the changes are applied, even if we have lost track of the real target state.

If there were such a thing as kubectl apply -f --dry-run, which lists only the changes that would be made rather than actually making them, that would work as well. That is already being discussed in issue https://github.com/kubernetes/kubernetes/issues/11488

Does anyone have thoughts on this? We are new to Kubernetes, but we have built the functionality I'm describing above for our Red Hat/Satellite RPM-based deployments, so I want to re-create it in k8s. Of course, in k8s we have the added complexity that the infrastructure itself can change, not just installed package versions!
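The closed-loop check described above can be approximated locally: strip the server-populated fields from a live export and diff the result against the files in git. The sketch below uses inline sample data instead of a real cluster, and the field list in the sed filter is an assumption, not exhaustive; newer kubectl versions also offer kubectl diff -f for this directly.

```shell
#!/usr/bin/env bash
# Closed-loop comparison sketch: remove fields the API server populates,
# then diff the live export against the manifest stored in git.
strip_runtime_fields() {
    sed -e '/^  *resourceVersion:/d' \
        -e '/^  *uid:/d' \
        -e '/^  *selfLink:/d' \
        -e '/^  *creationTimestamp:/d'
}

# Inline sample data standing in for `kubectl get ... -o yaml` output:
live=$(cat <<'EOF'
metadata:
  name: my-app
  uid: 1234-abcd
  resourceVersion: "42"
spec:
  replicas: 3
EOF
)
# ... and for the manifest checked into the repo:
repo=$(cat <<'EOF'
metadata:
  name: my-app
spec:
  replicas: 3
EOF
)
diff <(echo "$live" | strip_runtime_fields) <(echo "$repo") && echo "in sync"
```

With the runtime fields removed, the two documents match, so the diff is empty and the two clusters can be considered in sync.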

Most helpful comment

kubectl get po,deployment,rc,rs,ds,no,job -o yaml?

All 24 comments

kubectl get po,deployment,rc,rs,ds,no,job -o yaml?

Ah yes, of course! This works, but it is not what I was looking for. It answers my question, but it doesn't give me files that match the ones I used.

I learned that the answer to this question is to read the 'last-applied-configuration' annotation that kubectl adds. This will give the files that were used to produce the config.
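A minimal sketch of reading that annotation from a saved manifest (the sample data and resource name below are hypothetical); against a live cluster, kubectl apply view-last-applied deployment/my-app does the same thing directly:

```shell
#!/usr/bin/env bash
# Sample of what an exported manifest carrying the annotation looks like:
exported='metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"name":"my-app"}}'

# The annotation value is the exact JSON document that was last applied:
grep -A1 'last-applied-configuration' <<< "$exported" | tail -n1 | sed 's/^ *//'
```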

@dcowden also see kubectl get --export

Ah, that's even better! Thanks!

Combining other answers, this is what I came up with for bash:

for n in $(kubectl get -o=name pvc,configmap,serviceaccount,secret,ingress,service,deployment,statefulset,hpa,job,cronjob)
do
    mkdir -p "$(dirname "$n")"
    kubectl get -o=yaml --export "$n" > "$n.yaml"
done

k8s 1.8

kubectl get all --export=true -o yaml

For folks coming here from Google: on my test instance, the all in the previous comment doesn't appear to include ingresses, and you also have to pass --all-namespaces to get it to dump the other namespaces.

Related: https://github.com/kubernetes/kubernetes/issues/42885 and https://github.com/kubernetes/kubernetes/pull/42954#issuecomment-285949856 etc.

A variation on top of the solution provided by @alahijani

for n in $(kubectl get -o=name pvc,configmap,ingress,service,secret,deployment,statefulset,hpa,job,cronjob | grep -v 'secret/default-token')
do
    kubectl get -o=yaml --export $n > $(dirname $n)_$(basename $n).yaml
done

This puts all YAML files in a single directory for an easy kubectl apply -f. It also excludes the default service account secret, which cannot be exported.
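Since kubectl prints resource names in kind/name form, the dirname/basename pair above simply flattens the slash into an underscore; a quick local check of the naming scheme (the resource name is made up):

```shell
# kubectl prints names as kind/name; dirname and basename split on the
# slash, producing a flat file name such as service_my-app.yaml.
n="service/my-app"
echo "$(dirname "$n")_$(basename "$n").yaml"   # service_my-app.yaml
```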

Another version: exporting all YAMLs from all namespaces. A directory is made for each namespace.

  • including persistent volumes!

i=$((0))
for n in $(kubectl get -o=custom-columns=NAMESPACE:.metadata.namespace,KIND:.kind,NAME:.metadata.name pv,pvc,configmap,ingress,service,secret,deployment,statefulset,hpa,job,cronjob --all-namespaces | grep -v 'secrets/default-token')
do
    if (( $i < 1 )); then
        namespace=$n
        i=$(($i+1))
        if [[ "$namespace" == "PersistentVolume" ]]; then
            kind=$n
            i=$(($i+1))
        fi
    elif (( $i < 2 )); then
        kind=$n
        i=$(($i+1))
    elif (( $i < 3 )); then
        name=$n
        i=$((0))
        echo "saving ${namespace} ${kind} ${name}"
        if [[ "$namespace" != "NAMESPACE" ]]; then
            mkdir -p $namespace
            kubectl get $kind -o=yaml --export $name -n $namespace > $namespace/$kind.$name.yaml
        fi
    fi
done

and for importing again:

path=$(pwd)
for n in $(ls -d */)
do
    echo "Creating namespace ${n:0:-1}"
    kubectl create namespace ${n:0:-1}

    for yaml in $(ls $path/$n)
    do
        echo -e "\t Importing $yaml"
        kubectl apply -f $path/$n$yaml -n ${n:0:-1}
    done

done
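The ${n:0:-1} expansion in the import loop relies on ls -d */ printing directory names with a trailing slash; a minimal illustration of that substring trick (bash 4.2+, directory name made up):

```shell
# ${var:0:-1} takes everything except the last character, turning the
# "production/" that `ls -d */` prints back into the namespace name.
n="production/"
echo "${n:0:-1}"   # production
```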

Another little tweak to exclude service account tokens:

#!/usr/bin/env bash

## https://github.com/kubernetes/kubernetes/issues/24873#issuecomment-416189335

i=$((0))
for n in $(kubectl get -o=custom-columns=NAMESPACE:.metadata.namespace,KIND:.kind,NAME:.metadata.name pv,pvc,configmap,ingress,service,secret,deployment,statefulset,hpa,job,cronjob --all-namespaces | grep -v 'secrets/default-token')
do
    if (( $i < 1 )); then
        namespace=$n
        i=$(($i+1))
        if [[ "$namespace" == "PersistentVolume" ]]; then
            kind=$n
            i=$(($i+1))
        fi
    elif (( $i < 2 )); then
        kind=$n
        i=$(($i+1))
    elif (( $i < 3 )); then
        name=$n
        i=$((0))
        if [[ "$namespace" != "NAMESPACE" ]]; then
            mkdir -p $namespace

            yaml=$((kubectl get $kind -o=yaml $name -n $namespace ) 2>/dev/null)
            if [[ $kind != 'Secret' || $yaml != *"type: kubernetes.io/service-account-token"* ]]; then
                echo "Saving ${namespace}/${kind}.${name}.yaml"
                kubectl get $kind -o=yaml --export $name -n $namespace > $namespace/$kind.$name.yaml
            fi
        fi
    fi
done

For those who work in Windows PowerShell, here's a one-liner:
Foreach ($i in $(kubectl get -o=name pvc,configmap,ingress,service,secret,deployment,statefulset,hpa,job,cronjob)) {If($i -notmatch "default-token") {kubectl get -o=yaml --export $i | Out-File -filepath $($i.Replace("/", "-") + ".yaml")}}

@mrwulf @acondrat
I think grep -v 'secrets/default-token' should be changed to grep -v 'secret/default-token'.
secrets didn't work for me.

I'm using the following versions of kubectl and k8s cluster

Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:54:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.5", GitCommit:"32ac1c9073b132b8ba18aa830f46b77dcceb0723", GitTreeState:"clean", BuildDate:"2018-06-21T11:34:22Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

@4m3ndy you are right! Thanks!

Hey guys, I just made this Docker image for exporting the required YAML files for each component per namespace. These backups are exported, then encrypted with a password and uploaded to an S3 bucket.

If anyone would like to commit any changes or share any comments, you're more than welcome :+1:
ambient-innovation/k8s-backup

Example: generate PV YAML.

kubectl get pv -o yaml --export | sed -e '/resourceVersion: "[0-9]\+"/d' -e '/uid: [a-z0-9-]\+/d' -e '/selfLink: [a-z0-9A-Z/]\+/d' -e '/status:/d' -e '/phase:/d' -e '/creationTimestamp:/d' > pvList.yaml

@xiaoping378
https://github.com/ambient-innovation/k8s-backup/blob/01c1bfe750136648fd91e14dd691ba39bb05f282/k8s-backup.sh#L38

This script should generate all PVCs for each namespace, then export the YAML file for each PV; have a look.

Create a folder ${HOME}/clusterstate/, then run:
kubectl cluster-info dump --all-namespaces --output-directory=${HOME}/clusterstate/ -o yaml
All your entities will be in a separate folder structure corresponding to the namespaces.
The .json extensions, i.e. deployments.json, are misleading, as the -o yaml flag will create YAML exports.

FYI, this appears to need a decent amount of RAM for large deployments; my 2 GB RAM CLI jumpbox VM can't handle it (it probably needs 4 or 8, I'd imagine):

fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x1ab7c29, 0x16)
        /usr/local/go/src/runtime/panic.go:774 +0x72
runtime.sysMap(0xc068000000, 0x10000000, 0x2da7238)
        /usr/local/go/src/runtime/mem_linux.go:169 +0xc5
runtime.(*mheap).sysAlloc(0x2d8e9a0, 0x10000000, 0x0, 0x0)
        /usr/local/go/src/runtime/malloc.go:701 +0x1cd
runtime.(*mheap).grow(0x2d8e9a0, 0x8000, 0xffffffff)
        /usr/local/go/src/runtime/mheap.go:1255 +0xa3
runtime.(*mheap).allocSpanLocked(0x2d8e9a0, 0x8000, 0x2da7248, 0x42c7bc)
        /usr/local/go/src/runtime/mheap.go:1170 +0x266
runtime.(*mheap).alloc_m(0x2d8e9a0, 0x8000, 0x101, 0xc000103f18)
        /usr/local/go/src/runtime/mheap.go:1022 +0xc2

I reran on my desktop and tracked the kubectl process's memory usage; it peaked at just around 4 GB, so 8 GB it is!

Apparently that roughly matches the total output size of the dump, which includes logs, and some of our pods (90 of them) are putting out logs over 100 MB in size. This suggests the dump command keeps everything in RAM even as it writes to disk; it could probably be optimized to free memory as logs finish writing.

Can anyone tell me a command or script that will take a backup of a cluster (namespace, deployment, svc, secret, PV/PVC, and ConfigMap YAML files only) with all information and restore it in a new cluster?
I have tried the --export flag, but in the service YAML file the name of the service is missing, and if I take a backup without --export, it includes the clusterIP, nodePort, and loadBalancer IP, which prevents me from deploying it in a new cluster since those fields are immutable (clusterIP and loadBalancer).
Kubernetes cluster version 1.14 onward (1.15/16/17), trying to back up/restore in GCP GKE or AWS EKS.
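One workaround for the immutable-fields problem above is to delete the cluster-assigned Service fields before re-applying. This is a sketch on sample data, not a live export, and the sed patterns cover only the fields named in the question, not every server-assigned field:

```shell
#!/usr/bin/env bash
# Strip cluster-assigned, immutable Service fields (clusterIP, nodePort,
# loadBalancerIP) so an exported manifest can be applied to a new cluster.
strip_immutable() {
    sed -e '/clusterIP:/d' -e '/nodePort:/d' -e '/loadBalancerIP:/d'
}

# Sample exported Service spec standing in for `kubectl get svc ... -o yaml`:
strip_immutable <<'EOF'
spec:
  clusterIP: 10.96.0.10
  ports:
  - port: 80
    nodePort: 30080
  type: LoadBalancer
EOF
```

The remaining spec (ports and type) is portable; the new cluster assigns fresh values for the deleted fields on apply.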

Thanks to kubectl api-resources! I've been able to get the manifests (YAML files) of all resources in all namespaces in k8s using the following bash script:

#!/usr/bin/env bash

while read -r line
do
    output=$(kubectl get "$line" --all-namespaces -o yaml 2>/dev/null | grep '^items:')
    if ! grep -q "\[\]" <<< $output; then
        echo -e "\n======== "$line" manifests ========\n"
        kubectl get "$line" --all-namespaces -o yaml
    fi
done < <(kubectl api-resources | awk '{print $1}' | grep -v '^NAME')

The above bash script was tested with:

  • k8s v1.16.3
  • Ubuntu Bionic 18.04.3 OS
  • bash version 4.4.20(1)-release (x86_64-pc-linux-gnu)

@vhosakot Small simplification to your script.

You can replace: kubectl api-resources | awk '{print $1}' | grep -v '^NAME'
With: kubectl api-resources -o name

#!/usr/bin/env bash

while read -r namespace
do
    echo "scanning namespace '${namespace}'"
    mkdir -p "${HOME}/cluster-backup/${namespace}"
    while read -r resource
    do
        echo "  scanning resource '${resource}'"
        mkdir -p "${HOME}/cluster-backup/${namespace}/${resource}"
        while read -r item
        do
            echo "    exporting item '${item}'"
            kubectl get "$resource" -n "$namespace" "$item" -o yaml > "${HOME}/cluster-backup/${namespace}/${resource}/$item.yaml"
        done < <(kubectl get "$resource" -n "$namespace" 2>&1 | tail -n +2 | awk '{print $1}')
    done < <(kubectl api-resources --namespaced=true 2>/dev/null | tail -n +2 | awk '{print $1}')
done < <(kubectl get namespaces | tail -n +2 | awk '{print $1}')
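All three loops above use the same tail -n +2 | awk '{print $1}' idiom to drop kubectl's header row and keep only the first column; a local demonstration with fake kubectl output:

```shell
# tail -n +2 drops the NAME/READY header line; awk keeps the first column.
printf 'NAME        READY\nmy-app-1    1/1\nmy-app-2    1/1\n' \
    | tail -n +2 | awk '{print $1}'
# prints:
# my-app-1
# my-app-2
```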

I extended the script above a little (and slowed it down). It loads all namespaces, loads all resources in all namespaces, and then saves each config as a single file per resource per namespace. It is verbose and shows some errors, but the end result (the dump) should be clean.

#!/usr/bin/env bash
ROOT=${HOME}/clusterstate

while read -r resource
do
    echo "  scanning resource '${resource}'"
    while read -r namespace item x
    do
        mkdir -p "${ROOT}/${namespace}/${resource}"        
        echo "    exporting item '${namespace} ${item}'"
        kubectl get "$resource" -n "$namespace" "$item" -o yaml > "${ROOT}/${namespace}/${resource}/$item.yaml" &
    done < <(kubectl get "$resource" --all-namespaces 2>&1 | tail -n +2)
done < <(kubectl api-resources --namespaced=true 2>/dev/null | tail -n +2 | awk '{print $1}')

wait

Inspired by @scones, but runs a little quicker because of process forking and reduced loop nesting, which is useful if you have a lot of custom resource definitions!

Same as @nathan-c, but I removed events from the resources list to fix errors.

#!/usr/bin/env bash
ROOT=${HOME}/clusterstate
while read -r resource
do
    echo "  scanning resource '${resource}'"
    while read -r namespace item x
    do
        mkdir -p "${ROOT}/${namespace}/${resource}"        
        echo "    exporting item '${namespace} ${item}'"
        kubectl get "$resource" -n "$namespace" "$item" -o yaml > "${ROOT}/${namespace}/${resource}/$item.yaml" &
    done < <(kubectl get "$resource" --all-namespaces 2>&1  | tail -n +2)
done < <(kubectl api-resources --namespaced=true 2>/dev/null | grep -v "events" | tail -n +2 | awk '{print $1}')

wait
