Helm: Helm install v2.14.0 "validation failed" error when using a template variable with value ""

Created on 16 May 2019  ·  61 Comments  ·  Source: helm/helm

Hello,

After upgrading from v2.13.1 to v2.14.0, my chart now throws an error on helm install:

Error: validation failed: error validating "": error validating data: unknown object type "nil" in Deployment.spec.template.metadata.annotations.buildID

This seems to be due to the use, in the deployment.yaml file, of a template variable "buildID" that is never actually declared in values.yaml.
Extract from deployment.yaml:

template:
  metadata:
    labels:
      app: {{ template "gateway.name" . }}
      draft: {{ default "draft-app" .Values.draft }}
      release: {{ .Release.Name }}
    annotations:
      buildID: {{ .Values.buildID }}

If I set the buildID variable in the values.yaml file to "", I get the same error.
If I set the buildID variable in the values.yaml file to any other string, such as "a", then my install works.
If I set buildID to "" via a template expression in deployment.yaml (buildID: {{ "" }}), I get the same error.
If I set buildID to "" directly in deployment.yaml (buildID: ""), then my install works.
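The difference between the failing and working cases comes down to YAML semantics: a bare buildID: with nothing after it parses as null (hence the "unknown object type \"nil\"" error), while a literal buildID: "" is an empty string. A minimal Python sketch of what each case renders to (an illustration of the template substitution, not Helm's actual code):

```python
# Stand-in for Go template substitution of {{ .Values.buildID }}.
# When the value is unset or "", the rendered line is a bare key,
# which YAML interprets as null.

TEMPLATE = "buildID: {value}"

def render(value: str) -> str:
    """Substitute the value into the annotation line, like the template does."""
    return TEMPLATE.format(value=value)

unset = render("")       # buildID unset, "", or rendered via {{ "" }}
quoted = render('""')    # buildID: "" written literally in deployment.yaml

print(repr(unset))   # 'buildID: '   -> YAML null -> validation error
print(repr(quoted))  # 'buildID: ""' -> empty string -> passes validation
```

This matches the observations above: every case that renders a bare `buildID: ` fails, and only the literal quoted form passes.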

Could you please let me know if this is a known issue, or if I am missing anything here?

Thanks!


Output of helm version:
Client: &version.Version{SemVer:"v2.14.0", GitCommit:"05811b84a3f93603dd6c2fcfe57944dfa7ab7fd0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.0", GitCommit:"05811b84a3f93603dd6c2fcfe57944dfa7ab7fd0", GitTreeState:"clean"}

Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.11", GitCommit:"637c7e288581ee40ab4ca210618a89a555b6e7e9", GitTreeState:"clean", BuildDate:"2018-11-26T14:38:32Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.7", GitCommit:"6f482974b76db3f1e0f5d24605a9d1d38fad9a2b", GitTreeState:"clean", BuildDate:"2019-03-25T02:41:57Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}

Cloud Provider/Platform:
AKS

bug

Most helpful comment

Wouldn't the prudent course of action be to surface validation errors that were 'silently' failing before with a warning that, in a future release, becomes an error? Or at the least, honor a force flag or some such that allows the user to choose how to handle it?

All 61 comments

This bit me today as well.

Looks like it might be due to this commit:
https://github.com/helm/helm/commit/32d7f1a3fc226c1745a58b51e318b7362bc7a0bf

TL;DR - Fix manifest validation

Unfortunately, if you have something already deployed with an empty string, you can't deploy something that's "fixed", as your already-deployed components will fail validation. Your only recourse is to helm delete --purge to get rid of the offending template(s) from history and move forward, or to roll back helm/tiller.

As discussed with a community member earlier today, #5576 was the change. Prior to 2.14, Helm silently accepted schema validation errors, but as of 2.14, all manifests are validated, including ones that were previously accepted. The end result being that upgrading to 2.14 causes Tiller to fail manifest validation on charts that were previously accepted, preventing upgrades. Sorry about that!

The mitigation for this is easy: downgrade to 2.13.1 for now until a fix is released.

The change in #5643 should fix this, as validation will only occur for new manifests being added to the release, and we'd love to hear if that solves the issues raised here. If so, we may need to cut a 2.14.1 with the fix.

EDIT: #5576 was the PR that made the change. #5643 is the PR that should fix this. :)

I can try #5643 tomorrow unless someone beats me to it.

Wouldn't the prudent course of action be to surface validation errors that were 'silently' failing before with a warning that, in a future release, becomes an error? Or at the least, honor a force flag or some such that allows the user to choose how to handle it?

Thank you everyone for your replies!
I tried downloading the binaries from the latest Helm Canary build but the issue still reproduces. I'm not sure that these binaries correspond to the latest version of master though.
I'm having issues building Helm locally, so I'd be really interested in the outcome of your test @fooka03 !

I'm not sure that these binaries correspond to the latest version of master though.

Check the output of helm version - that should tell you what commit your helm client and tiller are running as. Because the patch in #5643 was a server-side patch, you'll have to ensure that tiller was updated.

Same, using k8s 1.8.4 :

Error: error validating "": error validating data: [ValidationError(Deployment.spec.template.spec.containers[1].ports[0]): unknown field "exec" in io.k8s.api.core.v1.ContainerPort, ValidationError(Deployment.spec.template.spec.containers[1].ports[0]): unknown field "initialDelaySeconds" in io.k8s.api.core.v1.ContainerPort]

Error: error validating "": error validating data: ValidationError(StatefulSet.spec): missing required field "serviceName" in io.k8s.api.apps.v1beta1.StatefulSetSpec

Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(StatefulSet.spec): missing required field "serviceName" in io.k8s.api.apps.v1beta1.StatefulSetSpec

I'm not sure that these binaries correspond to the latest version of master though.

Check the output of helm version - that should tell you what commit your helm client and tiller are running as. Because the patch in #5643 was a server-side patch, you'll have to ensure that tiller was updated.

Thanks @bacongobbler ! Following your comment I upgraded my tiller:

Client: &version.Version{SemVer:"v2.14+unreleased", GitCommit:"9fb19967bab21ecb9748440a99487f2fb0560f63", GitTreeState:"clean"}
Server: &version.Version{SemVer:"canary+unreleased", GitCommit:"9fb19967bab21ecb9748440a99487f2fb0560f63", GitTreeState:"clean"}

However, I'm still getting the exact same error when running helm install. :(
Error: validation failed: error validating "": error validating data: unknown object type "nil" in Deployment.spec.template.metadata.annotations.buildID

The commit corresponds to https://github.com/helm/helm/commit/9fb19967bab21ecb9748440a99487f2fb0560f63, so it looks like the issue still reproduces in my case despite the fix.

Can we get an ETA for the hotfix? I would really like to avoid patching my server with a self-built helm/tiller. Thank you!

@daniv-msft Could you try doing this in your template yaml?

template:
  metadata:
    labels:
      app: {{ template "gateway.name" . }}
      draft: {{ default "draft-app" .Values.draft }}
      release: {{ .Release.Name }}
    annotations:
      buildID: {{ .Values.buildID | quote }}

@SeriousM for now you can rollback to 2.13.1 and wait for 2.14.1 release. I tried the commit 9fb19967 and it works for me

@SeriousM for now you can rollback to 2.13.1 and wait for 2.14.1 release. I tried the commit 9fb19967 and it works for me

we have a lot of Azure DevOps release pipelines (30+), and each of them tries to keep helm at the latest stable build version. I could downgrade for now, but once the next build pipeline starts, the version would be back on 2.14.0, and I really don't want to go over all 30+ pipelines to disable the step and re-enable it later. Sorry, but I need to wait for the hotfix.

Is there any ETA on the hotfix?

@daniv-msft Could you try doing this in your template yaml?

template:
  metadata:
    labels:
      app: {{ template "gateway.name" . }}
      draft: {{ default "draft-app" .Values.draft }}
      release: {{ .Release.Name }}
    annotations:
      buildID: {{ .Values.buildID | quote }}

This is my content of the deployment.yaml file that matches the path Deployment.spec.template.metadata.annotations.buildID:

spec:
  template:
    metadata:
      annotations:
        buildID: {{ .Values.buildID }}
      labels:
        app: {{ template "fullname" . }}
        env: {{ .Values.labels.env }}

Do you think a | quote could fix the problem?

@SeriousM Oh. What error are you getting currently? And I tried the | quote with the master commit, not the released version. It basically surrounds the value with double quotes, which is useful when the value is empty, as the yaml is rendered as

  annotations:
    buildID: ""

If you don't use | quote, it will get rendered as

  annotations:
    buildID: 

which will lead to a validation error that is described in the issue. I verified this by using this dummy chart that I created:

issue-5750.tar.gz
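The effect of the | quote filter described above can be mimicked in plain Python (a stand-in for Sprig's quote function, not Helm's actual implementation):

```python
# Stand-in for Sprig's `quote` template function, which Helm charts use
# as `{{ .Values.buildID | quote }}`. Illustration only.

def quote(value) -> str:
    """Wrap a value in double quotes, as the `quote` filter does."""
    return '"{}"'.format(value)

# An empty build ID still renders as a valid, non-null YAML string:
print("buildID: " + quote(""))        # buildID: ""
# A real value is quoted too, which is harmless for string annotations:
print("buildID: " + quote("1a2b3c"))  # buildID: "1a2b3c"
```

Either way, the rendered line is never a bare key, so the manifest no longer contains a null where the schema expects a string.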

@SeriousM Oh. What error are you getting currently?

My error is Error: UPGRADE FAILED: error validating "": error validating data: unknown object type "nil" in Deployment.spec.template.metadata.annotations.buildID, but I don't see why. At first I thought the buildID was not passed into the CLI command from Azure DevOps, but since @daniv-msft got the same error, I guess it's because of the server-side validation.

@SeriousM Oh. What error are you getting currently?

I tried to modify my deployment.yaml by adding | quote, and even removed metadata/annotations/buildID entirely, but this didn't help.

This is the error I got when I removed the buildID annotation:

Error: failed decoding reader into objects: error validating "": error validating data: unknown object type "nil" in Deployment.spec.template.metadata.annotations.buildID

In regards to a 2.14.1 release, we probably won't be able to cut a release until after KubeCon.

In regards to a 2.14.1 release, we probably won't be able to cut a release until after KubeCon.

To be sure I get this right, you mean this KubeCon?

[screenshot of upcoming KubeCon dates]

KubeCon EU, which is next week, not November. :)

KubeCon EU, which is next week, not November. :)

So we can expect a fix in the form of v2.14.1 around 24.5.19?
Can I compile it myself somehow?

@SeriousM The first error you got, which is
Error: UPGRADE FAILED: error validating "": error validating data: unknown object type "nil" in Deployment.spec.template.metadata.annotations.buildID
is due to the issue in the chart template yamls, which you seem to have fixed with the | quote.

The next error
Error: failed decoding reader into objects: error validating "": error validating data: unknown object type "nil" in Deployment.spec.template.metadata.annotations.buildID
is due to the bad manifest in the existing release. To avoid this, tiller should not validate the release manifests and should only check the manifests given by the user; that's what the PR https://github.com/helm/helm/pull/5643 (the fix) does, and it has been merged into master. You could maybe use the canary image of tiller to check if it all works for you. Once the failing releases are fixed (by upgrading with proper manifests), there won't be any validation problems even if the pipeline uses the 2.14 version.

You could maybe use the canary image of tiller to check if it all works for you.

How can I deploy this image?

@SeriousM With any helm client version, you can use

helm init --tiller-namespace <namespace> --upgrade --canary-image

To get the latest helm client (master), you can use this : https://helm.sh/docs/using_helm/#from-canary-builds

So it looks like #5643 does fix the manifest validation issue:

helm version
Client: &version.Version{SemVer:"v2.14.0", GitCommit:"05811b84a3f93603dd6c2fcfe57944dfa7ab7fd0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"canary+unreleased", GitCommit:"9fb19967bab21ecb9748440a99487f2fb0560f63", GitTreeState:"clean"}
Release "belligerent-horse" has been upgraded.
LAST DEPLOYED: Fri May 17 10:32:12 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME                             DATA  AGE
<redacted>-belligerent-horse  1     181d
<redacted>                      2     181d

==> v1/Pod(related)
NAME                               READY  STATUS       RESTARTS  AGE
<redacted>-belligerent-horse-0  1/1    Running      0         5m16s
<redacted>-belligerent-horse-1  1/1    Terminating  0         17h

==> v1/Service
NAME           TYPE       CLUSTER-IP  EXTERNAL-IP  PORT(S)   AGE
<redacted>  ClusterIP  None        <none>       8080/TCP  181d

==> v1/StatefulSet
NAME                             READY  AGE
<redacted>-belligerent-horse  2/2    181d

Still have to set any missing validation fields in the "new" templates of course, but it will at least let you deploy over existing releases.

@SeriousM With any helm client version, you can use

helm init --tiller-namespace <namespace> --upgrade --canary-image

To get the latest helm client (master), you can use this : https://helm.sh/docs/using_helm/#from-canary-builds

Thank you very much, I will try it out asap

@fooka03 > Thanks for sharing the results of your test! Despite the fix, I'm still reproducing the issue with both my own chart and the chart provided by @karuppiah7890 (issue-5750.tar.gz). Could you please share the chart you're using, or let us know if you see any difference in your chart compared to this one?

@karuppiah7890 > With the chart you provided, for the helm install to be successful, I have to add the | quote to deployment.yaml, and the buildID must also be provided in values.yaml. If I comment out the line (#buildID: "") or remove the | quote, then the error reproduces, whereas it was working fine on v2.13.1.

Unfortunately, in my case it is not really possible to upgrade the chart's files, so a fix that doesn't require any changes to existing charts would be far easier to handle.

Am I the only one reproducing the issue despite the fix? Helm version seems to return the correct versions, but is there anything else I should verify to make sure I'm using the latest bits?

Client: &version.Version{SemVer:"v2.14+unreleased", GitCommit:"9fb19967bab21ecb9748440a99487f2fb0560f63", GitTreeState:"clean"}
Server: &version.Version{SemVer:"canary+unreleased", GitCommit:"9fb19967bab21ecb9748440a99487f2fb0560f63", GitTreeState:"clean"}

@daniv-msft In my chart, the invalid template was missing the key entirely (serviceName for a statefulset). Since helm had set this automagically to empty string (serviceName: "") in the deployed manifest I simply added serviceName: {{ .Values.serviceName | default "" | quote }} to my template. The default filter in this case means I don't have to supply a value for serviceName.
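The default/quote pipeline above can be sketched in plain Python to show why the rendered line always passes schema validation. These are hypothetical stand-ins for Sprig's default and quote functions; the argument order mirrors Go template pipelines, where the piped value arrives last:

```python
# Stand-ins for Sprig's `default` and `quote` filters, illustrating why
# `{{ .Values.serviceName | default "" | quote }}` always yields a valid
# YAML string. Illustration only -- not Helm's actual implementation.

def default(fallback, value):
    """Return fallback when value is empty, mirroring Sprig's `default`."""
    return value if value not in (None, "") else fallback

def quote(value) -> str:
    """Wrap a value in double quotes, mirroring Sprig's `quote`."""
    return '"{}"'.format(value)

# No serviceName supplied in values.yaml -- still valid:
print("serviceName: " + quote(default("", None)))      # serviceName: ""
# Value supplied -- quoted as usual:
print("serviceName: " + quote(default("", "my-svc")))  # serviceName: "my-svc"
```

The default filter is what removes the need to supply a value at all: even a missing key degrades to a quoted empty string rather than a null.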

As I mentioned in my comment, the fixed version still requires that you make changes to the new manifests in order to pass validation. The only thing different is that the fixed version does not also perform validation against already deployed charts which will allow you to deploy without having to purge first.

Ultimately I think in your scenario where you are unable to update your chart files you're going to be stuck using 2.13.1 until they are changed to pass validation.

@fooka03 > Thanks for your reply! I didn't understand previously that the fix would only apply to deployed charts being upgraded, and not new ones.

In my case unfortunately sticking to v2.13.1 isn't an option either. :(
The charts we created and that were working for v2.13.1 are provided to our users as part of a tool, and even though we would push a new version fixing this, we cannot force our users to use the latest version of our tool.
Pushing a new version of the tool/charts might reduce the impact, but we'll have in any case some of our users trying to install previous versions of our charts with Helm v2.14.0, and thus getting the validation error.

Even though I understand that my case is specific, isn't this change going to also break some of the other charts that were generated previously (I'm thinking of https://github.com/helm/charts/tree/master/stable)?

If this is the case, would it be conceivable to postpone the validation change to Helm v3, and/or apply @StephanX's suggestion above and display a warning for now?

Any thoughts on a mitigation strategy, @mortent? Perhaps we should pull this out and move it over to Helm 3. While it's certainly useful, it seems like it breaks existing use cases which were previously working. What do you think about that?

@bacongobbler I agree we should pull this out. Based on the discussion here, I'm not convinced that just restricting validation to new manifests (as in #5643) would be enough to help all users. I have a PR that disables schema validation in all cases: #5760.

It would be great if someone having problems can verify that this really resolves their issues.

I just used the canary image but got the same error as before: Error: failed decoding reader into objects: error validating "": error validating data: unknown object type "nil" in Deployment.spec.template.metadata.annotations.buildID.
So I did the closest thing I could and removed the annotations.buildID property from the deployment.yaml. This is probably not the way others can solve this issue, but if you can, just remove it.
I now run the 2.14.0 tiller image without problems.

@SeriousM it appears that the canary build is only based off the master branch, you would need to build #5760 yourself as it's not merged yet.

I'm in the middle of a massive outage here so I likely won't be able to test it myself today.

Thank you for your replies! I would like to test the bug fix as well, but I'm still having issues building Helm locally.
I looked at the pull-request and, even though I'm not familiar with the codebase, the change looks quite straightforward.

Would it be possible to complete the pull-request, so that I can give this a try in the next Canary build?
Worst case, if something really goes wrong, it would still be possible to revert this commit before it reaches actual users, right?

We have been using YAML forward references for the last 2 years in our values.yaml without any problems. It's not valid YAML, but it worked and made the lives of our developers easier.
YAML is way too simple and restrictive for more complex configurations, but that's the declarative language Helm has used so far.

Please consider making this validation either enabled by a flag, or provide an option to disable it with a flag in an upcoming Helm 3.0.

We're backing out this change and cutting 2.14.1.

Sounds good, thank you @bacongobbler and everyone involved in fixing this!
I'll try the fix as soon as it is available.

The fun thing is that I'm getting the same "" validation error for restartPolicy on AKS (same as the OP); on GCP, DigitalOcean, and locally with Kubernetes Desktop it doesn't happen at all.

Does everyone that has this issue use AKS?

Edit:

I can confirm in my case (without using helm or any other tool behind kubectl) that the problem is in AKS for some reason. And even though I commented out the field it was complaining about, the error still happened.

I'm not sure why this is happening, but it's only in AKS clusters.

Any idea when v2.14.1 will be available?

@WoLfulus I'm seeing this with 2.14.0 on GKE:

$ helm upgrade my-release --install --namespace test ./my-chart/ -f ./my-chart/values.yaml
UPGRADE FAILED
Error: failed decoding reader into objects: error validating "": error validating data: ValidationError(Deployment.spec.template): unknown field "annotations" in io.k8s.api.core.v1.PodTemplateSpec
Error: UPGRADE FAILED: failed decoding reader into objects: error validating "": error validating data: ValidationError(Deployment.spec.template): unknown field "annotations" in io.k8s.api.core.v1.PodTemplateSpec
$ helm version
Client: &version.Version{SemVer:"v2.14.0", GitCommit:"05811b84a3f93603dd6c2fcfe57944dfa7ab7fd0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.0", GitCommit:"05811b84a3f93603dd6c2fcfe57944dfa7ab7fd0", GitTreeState:"clean"}

@WoLfulus I'm using a handspun cluster, so it's decidedly not limited to AKS. I'm not sure how you went about your tests, so I can't give any insight as to why you ran into inconsistent behavior like that.

@andyast from what @bacongobbler was saying earlier, it should be sometime this week at the earliest. They were delayed due to kubecon which was last week.
https://github.com/helm/helm/issues/5750#issuecomment-493464958

@fooka03 and @achton

It's really weird actually.

I think it has something to do with AKS (in my case), but my problem was slightly different.

I was trying to set a restartPolicy for a deployment's initContainers (which doesn't exist in the spec of a Container, but some vendors (DigitalOcean) accepted it as if kubectl was called with --validate=false).

I had to remove restartPolicy in order to make it work on AKS.

So I think in my case it had nothing to do with Helm, even though the error format is exactly the same as this issue.

When can we expect a fix for this?

The pull-request https://github.com/helm/helm/pull/5760 disabling the validation has been merged to master by @bacongobbler (thanks!).
I assume that, as mentioned by @fooka03, version 2.14.1 won't be available for a couple of days, but I wanted to share the good news. :)

The changes have been merged onto master, so you should be able to pull the changes now.

For those that want to try the canary builds: https://helm.sh/docs/using_helm/#from-canary-builds

Has anyone been able to test the canary and confirm if the patches applied to master fixed this issue? I'd like to get this bug verified and fixed before having to cut a 2.14.1, avoiding the need for a 2.14.2. Thanks!

I will test the fix in canary build today and will let you know the outcome.

I can confirm that for us it worked

Same thing here: I tested the fix from the latest canary build and it works for us too.

@SeriousM With any helm client version, you can use

helm init --tiller-namespace <namespace> --upgrade --canary-image

To get the latest helm client (master), you can use this : https://helm.sh/docs/using_helm/#from-canary-builds

This worked for me, thanks very much

Does anyone know how to prevent GitLab pipelines from using helm:latest? We are deploying everything via our laptops since GitLab uses 2.14. It's taking us lots of time.

@pulpbill How do you install or get the helm client in GitLab? Do you download it from releases, or use a docker image? And you want to install 2.13.1, right?

I tried to find a docker image for helm, but couldn't find any official ones. If you want, you could build a docker image by downloading the helm binary and putting it in the image, then use that image in your GitLab CI config. You can find the URLs for downloading binaries (all versions) on the releases page: https://github.com/helm/helm/releases . You can also do the same (download and install into $PATH) inside your GitLab job, if you don't want to use a docker image and docker runner in GitLab.

Let's try to keep the topic on subject. @pulpbill if you don't mind sending an email to the helm-users mailing list or by asking the gitlab team directly that'd be great; this seems like an issue with gitlab moreso than with Helm, and it doesn't seem related to the issue present here.

AutoDevops downloads helm when it runs the deploy job, if you take a look here: https://gitlab.com/gitlab-org/gitlab-ce/blob/master/lib/gitlab/ci/templates/Jobs/Deploy.gitlab-ci.yml#L472
I think if you set the env var HELM_VERSION in the CI/CD variables it might allow you to override it, but I'm not sure.

Thank you @karuppiah7890 and @mitchellmaler for the tips! I remember I raised an issue to GitLab for AutoDevops. I will have to wait for the release of 2.14.1; I don't have the time right now to build a new pipeline :(

@bacongobbler Sorry for the off-topic!

Helm v2.14.1 has been released: https://github.com/helm/helm/releases/tag/v2.14.1

Hello team, we are facing this issue in the newly released v3 as well, but not in the beta version v3.0.0-beta.4. Kindly help with a resolution.

this is still happening on v3.2.4. v3.2.3 works fine though.

this is still happening on v3.2.4. v3.2.3 works fine though.

Happening on 3.2.3 too on Mac

helm version
version.BuildInfo{Version:"v3.2.3", GitCommit:"8f832046e258e2cb800894579b1b3b50c2d83492", GitTreeState:"clean", GoVersion:"go1.13.12"}

I don't have access to the old code, but I did have a real issue in my chart which resulted in an error on v3.3.0, and the error was gone when I fixed it.
