Helm: app-name has no deployed releases

Created on 12 Apr 2019 · 120 Comments · Source: helm/helm

Output of helm version:

$ helm version 
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}

Output of kubectl version:

$ kubectl version 
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.11", GitCommit:"637c7e288581ee40ab4ca210618a89a555b6e7e9", GitTreeState:"clean", BuildDate:"2018-11-26T14:25:46Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Cloud Provider/Platform (AKS, GKE, Minikube etc.): Amazon

What is happening:
After a few broken deployments, Helm (or Tiller) is broken and all subsequent deployments (no matter whether fixed or still broken) end with the following error: app-name has no deployed releases

How to reproduce:
We have

spec:
  revisionHistoryLimit: 1

but I think it is not relevant.

Path a:

  1. Deploy any service - working
  2. Break it, e.g. by exiting containers after startup, so the whole deployment will be broken
  3. Repeat it exactly 3 times
  4. All next deployments will have error, no matter if fixed or broken

Path b:

  1. Deploy a broken service - see step 2 above
  2. All next deployments will have error, no matter if fixed or broken
Labels: question/support

Most helpful comment

Cannot agree more. Our production is experiencing the same error. So deleting the chart is not an option, and forcing the install seems dangerous. This error is still present with Helm 3, so it would be good to include a fix or a safer workaround.

All 120 comments

Hi - can you give some more detail on how you're deploying? Are you using helm upgrade --install by any chance? And if so, what is the state of the deployment when it's broken (helm ls) - presumably it is Failed?

If this is the case, a helm delete --purge <deployment> should do the trick.

Hi, sorry for the missing info.
Yes, I am using helm upgrade --install.
And yes, the deployment stays in Failed forever.
Unfortunately, helm delete --purge <deployment> is not an option here at all. I cannot just delete production services because of that :)

The question is why helm cannot recover after 3 consecutive failures.

The only way to sort that out without deleting the release is to add --force.

--force to what? To helm upgrade --install?
And if so, does that mean the above issue is actually expected behaviour and we should use --force with every deployment? If yes, does it also mean that it will deploy broken releases forcibly?

Yes, of course to helm upgrade --install :)
And yes, you should use --force with every deployment.

Does it mean that --force will deploy broken releases forcibly as well? I mean, if a pod is broken and restarting all the time, will it delete the old pods and schedule new ones?
The flag is described as: --force - force resource update through delete/recreate if needed.
What is the delete condition? Can you elaborate on how it works exactly? The description is definitely too short for such a critical flag - I expect it does thousands of things under the hood.

BTW I really don't want to end up with deleted production services, so the --force flag is not an option for me.

And do you really think that it is not an issue?
Even the error message is wrong:
app-name has no deployed releases
It states that there are no deployed releases,
while there is one, but in state Failed, and Helm does not even try to fix it :( -- by fixing I mean just please try to deploy it, instead of giving up at the very beginning.

Cannot agree more. Our production is experiencing the same error. So deleting the chart is not an option, and forcing the install seems dangerous. This error is still present with Helm 3, so it would be good to include a fix or a safer workaround.

It can be fixed by removing "status": "deployed" in storage.go:136.

See: https://github.com/helm/helm/pull/6933/commits/638229c3d3646e78d0fd5157309f8aeadfd01af1

I will fix the pull request when I have time.

The code in place was originally correct. Removing status: deployed from the query results in Helm finding the latest release to upgrade from, regardless of the state it is currently in, which could lead to unintended results. It circumvents the problem temporarily, but it introduces much bigger issues further down the road.

If you can provide the output of helm history when you hit this bug, that would be helpful. It would help determine how one ends up in a case where the release ledger has no releases in the "deployed" state.

I'm encountering this issue when deploying for the first time to a new cluster. Should I use --force too?

I encountered this issue when I deleted the previous release without using --purge option.

helm delete --purge <release-name>

Helm Version

Client: &version.Version{SemVer:"v2.15.X"}
Server: &version.Version{SemVer:"v2.15.X"}

I am also encountering this issue.

@bacongobbler
I hit this with helm3. History is completely empty when this happens, although broken k8s resources are there since attempt 1.

Reproduction seems really easy:

  1. helm upgrade --install "something with a pod that has a container that exits with error"
  2. correct what caused the container to exit, e.g. value with invalid arg for the executable inside container, and try again
    -> Error: UPGRADE FAILED: "foo" has no deployed releases

Seems the --atomic flag may be a way forward in my (CI/CD) scenario. Since it cleans out initial failing release completely as if it never happened, I don't hit this issue on next attempt.
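
For illustration, a minimal sketch of what that could look like in a CI/CD step, assuming placeholder release name, chart path, namespace and timeout (none of these come from the thread):

```bash
# If the install/upgrade fails, --atomic rolls the release back (or removes a
# failed first install) so the next pipeline run starts from a clean state.
helm upgrade --install my-app ./charts/my-app \
  --namespace my-namespace \
  --atomic \
  --timeout 10m
```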

Same here. I don't see how using delete or --force can be advised, especially when there are persistent volumes in place. I've already lost all my Grafana dashboards because of this once, not doing it again :)

Update: btw in my case the release is failing because of:

Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims

even if I haven't changed anything in the grafana values

@alex88 can you provide the output from helm history? I need to know how others are hitting this case so we can try to nail down the root cause and find a solution.

@bacongobbler sure, I would really love to see this fixed, as I'm really cautious about using Helm after having lost persistent volumes a couple of times (probably my fault though)

REVISION    UPDATED                     STATUS  CHART           APP VERSION DESCRIPTION
4           Wed Dec  4 02:45:59 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
5           Mon Dec  9 12:27:22 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
6           Mon Dec  9 12:33:54 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
7           Mon Dec  9 12:36:02 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
8           Mon Dec  9 13:06:55 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
9           Mon Dec  9 13:38:19 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
10          Mon Dec  9 13:38:51 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
11          Mon Dec  9 13:41:30 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
12          Mon Dec  9 13:56:01 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims
13          Mon Dec  9 15:15:05 2019    failed  grafana-4.1.0   6.5.0       Upgrade "grafana" failed: cannot patch "grafana" with kind PersistentVolumeClaim: PersistentVolumeClaim "grafana" is invalid: spec: Forbidden: is immutable after creation except resources.requests for bound claims

Basically, I've tried multiple times to run the upgrade to change some env variables; since the env variables changed anyway despite the deploy error, I kept doing so and ignored the error.

how did you get into a state where every release has failed? Where's release 1, 2, and 3?

how did you get into a state where every release has failed? Where's release 1, 2, and 3?

Changing env variables (I had to make multiple changes) and running an upgrade every time: it kept changing the env variables, but I had no idea how to fix the persistent volume error.

Update: btw I'm using

version.BuildInfo{Version:"v3.0.0", GitCommit:"e29ce2a54e96cd02ccfce88bee4f58bb6e2a28b6", GitTreeState:"clean", GoVersion:"go1.13.4"}

Regarding the previous releases, Helm probably keeps only 10 of them.

Helm 3: I am having a similar issue while upgrading Istio. The release failed, and now I cannot redeploy it even though a small error in the templates is fixed. I can't delete the production release since that would also delete the ELB associated with the istio-ingress service.

Is there any future work planned to change the logic when the initial release ends up in a failed state?

What do I have to do if downtime is not accepted?

% helm upgrade prometheus-thanos --namespace metrics -f values.yaml . 
Error: UPGRADE FAILED: "prometheus-thanos" has no deployed releases
% helm install --atomic prometheus-thanos --namespace metrics -f values.yaml .                                                                                                               
Error: cannot re-use a name that is still in use
% helm version
version.BuildInfo{Version:"v3.0.1", GitCommit:"7c22ef9ce89e0ebeb7125ba2ebf7d421f3e82ffa", GitTreeState:"clean", GoVersion:"go1.13.4"}

What do I have to do if downtime is not accepted?

For now I just use Helm to generate the templates, save them locally, and apply them manually.

Seems the --atomic flag may be a way forward in my (CI/CD) scenario. Since it cleans out initial failing release completely as if it never happened, I don't hit this issue on next attempt.

@henrikb123 the above works only if you always used the --atomic flag. Otherwise it will not work. For example: try to install a broken chart without it and then run the same command with the --atomic flag. It will break. FYI, I'm using the latest Helm version -> 3.0.2

@alex88 can you provide the output from helm history? I need to know how others are hitting this case so we can try to nail down the root cause and find a solution.

@bacongobbler why don't you just do what @henrikb123 said here to simulate the problem? As pointed out by @henrikb123, the history is completely empty. I can confirm that as well. Take a look please:

$ helm upgrade --install --cleanup-on-fail --reset-values --force --namespace teleport --values values.test.yaml teleport ./
Release "teleport" does not exist. Installing it now.
Error: Secret "teleport-secrets" is invalid: metadata.labels: Invalid value: "helm.sh/chart:teleport-1.0.0app.kubernetes.io/managed-by": a qualified name must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName',  or 'my.name',  or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]') with an optional DNS subdomain prefix and '/' (e.g. 'example.com/MyName')

$ helm history teleport
Error: release: not found

$ helm upgrade --install --cleanup-on-fail --reset-values --force --namespace teleport --values values.test.yaml teleport ./
Error: UPGRADE FAILED: "teleport" has no deployed releases

I also ran into this with Istio.

There's an Istio issue with 1.4.3 where one of the jobs the install runs will fail if it can't get to the Kubernetes API server. It then leaves a job behind and if you try to re-run the Helm command it fails because the job exists already. I tried deleting the job, tweaking things, and re-running the upgrade but was never successful... and now I'm stuck.

(That's how you can get into an all-failed release state, since there was a question about that.)

REVISION    UPDATED                     STATUS  CHART       APP VERSION DESCRIPTION                                                                                                                                                                                                         
10          Tue Jan 14 09:17:00 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: timed out waiting for the condition
11          Tue Jan 14 09:22:21 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: warning: Hook post-upgrade istio/charts/security/templates/create-custom-resources-job.yaml failed: jobs.batch "istio-security-post-install-1.4.3" already exists
12          Tue Jan 14 09:23:10 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: warning: Hook post-upgrade istio/charts/security/templates/create-custom-resources-job.yaml failed: jobs.batch "istio-security-post-install-1.4.3" already exists
13          Tue Jan 14 09:25:58 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: timed out waiting for the condition 
14          Tue Jan 14 09:35:21 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: warning: Hook post-upgrade istio/charts/security/templates/create-custom-resources-job.yaml failed: jobs.batch "istio-security-post-install-1.4.3" already exists
15          Tue Jan 14 09:38:08 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: timed out waiting for the condition 
16          Tue Jan 14 14:02:47 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: timed out waiting for the condition
17          Tue Jan 14 14:19:44 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: timed out waiting for the condition
18          Tue Jan 14 14:33:36 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: warning: Hook post-upgrade istio/charts/security/templates/create-custom-resources-job.yaml failed: jobs.batch "istio-security-post-install-1.4.3" already exists
19          Tue Jan 14 14:36:59 2020    failed  istio-1.4.3 1.4.3       Upgrade "istio" failed: post-upgrade hooks failed: timed out waiting for the condition

This is with Helm 3.0.2.

IMO this is a critical issue that needs to be fixed ASAP. I saw many other similar issues that have been open for the same problem since version 2, and it still seems that it hasn't been fixed.

I just ask the developers to do exactly what @henrikb123 said in his comment to simulate this problem. It's a very simple way to simulate it. You can test it with any Helm version (2.x.x and 3.x.x). I'm almost sure that it will occur with all of them.

Maybe --atomic should be a hard requirement (not a command-line argument). It is almost redundant with --cleanup-on-fail. The difference is that --cleanup-on-fail doesn't fix this problem like --atomic does.

We have also just encountered this in production and downtime is not an option. We found a workaround by just patching the latest FAILED configmap to instead have the label STATUS: DEPLOYED using a command like...

kubectl -n kube-system patch configmap app-name.v123 --type=merge -p '{"metadata":{"labels":{"STATUS":"DEPLOYED"}}}'

In our case, we were sure that the last FAILED revision was actually eventually deployed successfully by kubernetes.

How did we get into this state?

Basically, our dev team ignored the FAILED upgrades because Kubernetes was still making the modifications after helm timed out.

Specifically, we are using Helm 2 and we set TILLER_HISTORY_MAX=20 on the tiller-deploy deployment. We were using helm upgrade --wait --timeout 1080 for all of our RollingUpdate upgrades which were taking longer over time. Then the helm upgrades started to time-out but no one was alarmed (just annoyed) because Kubernetes was still successfully making the modifications. After 20 upgrades timed out (today), then we were alarmed because we could no longer deploy because instead we were seeing app-name has no deployed releases.

Why does the patch work?

We figured out that we just needed to patch the STATUS label in the configmap because we realized that Helm was probably requesting configmaps using a request similar to...

kubectl -n kube-system get configmap -l NAME=app-name,STATUS=DEPLOYED

The clue was found when we viewed the configmap yaml and noticed the following labels...

$ kubectl -n kube-system describe configmap app-name.v123
Name:         app-name.v123
Namespace:    kube-system
Labels:       MODIFIED_AT=1579154404
              NAME=app-name
              OWNER=TILLER
              STATUS=FAILED
              VERSION=123
Annotations:  <none>
Data
====
release:
----
H4sIAAAAAAAC...snipped...

And this is consistent with https://github.com/helm/helm/issues/5595#issuecomment-552743196
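
Scripted, the Helm 2 workaround above might look like the following sketch, assuming the default ConfigMap storage backend in kube-system and a placeholder release name app-name. Only do this if you are confident the FAILED revision actually rolled out, as described above:

```bash
# Find the most recent FAILED revision of the release...
LATEST_FAILED=$(kubectl -n kube-system get configmap \
  -l NAME=app-name,STATUS=FAILED \
  --sort-by=.metadata.creationTimestamp \
  -o jsonpath='{.items[-1:].metadata.name}')

# ...and relabel it as DEPLOYED so the next helm upgrade can find it.
kubectl -n kube-system patch configmap "$LATEST_FAILED" --type=merge \
  -p '{"metadata":{"labels":{"STATUS":"DEPLOYED"}}}'
```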

@bacongobbler instead of wondering how you'd get into a failed state, you should consider providing a fix for upgrading a failed installation, which... should not fail.

But actually, to answer your concerns: a timeout is a perfectly good reason for a failed release. The release will also get stuck and can't be rolled back when an upgrade runs into a timeout.

Then there are volumes dynamically created by the claims: when deleting the claims (by deleting a chart), the volumes are also permanently deleted. That's not what you want. Many other developers and I have been stuck on this for months, trying to work around it.

You didn't like the idea of removing status: deployed from the query. So what about adding a new label that actually marks the latest release no matter if its status was deployed or failed? That would actually make sense. Because that's what you want to do, you want to get the latest release to upgrade from. And if there is none, you should probably check for failed releases instead. Or just use a new label that marks the latest one directly.

_I'm excited to hear your opinion on this._

Well put, @AmazingTurtle.

I'm not sure if this has already been noted, but this issue also crops up if the very first install of a chart fails for any reason (which is a very common occurrence especially for first time chart users who may need to iterate on their configuration to get things running).

I believe the only workaround for CLI users in this case is to delete the release tracking secret if using the secrets driver, as well as all resources that were created by the last release (to avoid running into Helm's resource ownership checks).

This is a real function from a tool I've written internally for handling this issue when it crops up:

package foo

import (
    "helm.sh/helm/v3/pkg/action"
    "helm.sh/helm/v3/pkg/release"
    "helm.sh/helm/v3/pkg/storage/driver"
)

// DangerouslyApplyRelease allows installing or upgrading any release from a failed state,
// but does not enforce Helm's standard resource ownership checks.
func DangerouslyApplyRelease(cfg *action.Configuration, rel *release.Release) error {
    // Forcibly mark the last release as successful and increment the version
    rel.Info = &release.Info{
        Status: release.StatusDeployed,
    }
    rel.Version++

    var err error

    // Attempt to create the release
    err = cfg.Releases.Create(rel)

    // If release already exists, update it
    if err == driver.ErrReleaseExists {
        err = cfg.Releases.Update(rel)
    }

    return err
}

@jlegrone Would using helm delete --purge (v2) or helm uninstall (v3) also work, as they are all failed releases?

What was pointed out by @jlegrone is true.
@hickeyma your proposal is a workaround that can work. But, I need a definitive solution.

It has been a harmful bug for the last 2 years and Helm is not going to fix it.
helm delete is not acceptable in most production cases.
With Helm 3 we cannot kubectl edit secret sh.helm.release.... because it is encrypted.
helm rollback <latest-successful> is the only correct workaround.

So if you have the default HISTORY_MAX=10 and you tried 10 times to get something working - you are completely lost...

And if you have some logic based on install vs upgrade, you cannot delete the sh.helm.release.....v* secrets.

Helm must die or fix it.

Found a workaround.
Helm 3 sets labels on its secrets:
kubectl get secrets --show-labels | grep sh.helm.release.v1

....
sh.helm.release.v1.helm-must-die.v34                 helm.sh/release.v1                    1         13h       modifiedAt=1580326073,name=helm-must-die,owner=helm,status=failed,version=34
sh.helm.release.v1.helm-must-die.v35                 helm.sh/release.v1                    1         13h       modifiedAt=1580326228,name=helm-must-die,owner=helm,status=failed,version=35
sh.helm.release.v1.helm-must-die.v36                 helm.sh/release.v1                    1         1h        modifiedAt=1580370043,name=helm-must-die,owner=helm,status=failed,version=36
...

So for the latest one, run kubectl edit secret sh.helm.release.v1.helm-must-die.v36 and set the label status=deployed,
and for the release before it (v35) set the label status=superseded.

The next helm upgrade --install ... will work.

@kosta709 Similar to my finding for Helm2, which stores releases as ConfigMaps in the kube-system namespace with labels that are all CAPS, Helm3 now stores releases as Secrets in the application's namespace with labels that are all lowercase.

So for Helm3, you can just use a slightly different kubectl patch command...

kubectl -n app-namespace patch secret app-name.v123 --type=merge -p '{"metadata":{"labels":{"status":"deployed"}}}'
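
A sketch of the same workaround that also finds the latest release secret automatically, assuming the default Helm 3 Secrets storage driver; RELEASE and NS are placeholders:

```bash
RELEASE=app-name
NS=app-namespace

# Find the most recent release secret for this release (Helm 3 labels are lowercase)...
LATEST=$(kubectl -n "$NS" get secret \
  -l owner=helm,name="$RELEASE" \
  --sort-by=.metadata.creationTimestamp \
  -o jsonpath='{.items[-1:].metadata.name}')

# ...and mark it as deployed so the next helm upgrade --install can proceed.
kubectl -n "$NS" patch secret "$LATEST" --type=merge \
  -p '{"metadata":{"labels":{"status":"deployed"}}}'
```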

I wish we didn't have to discuss these workarounds. Fixing this in the product should be top priority. A reminder of how bad this is (disregarding workarounds):

If a release failed the first time it was deployed OR if enough releases failed to rotate last success out of history, the release cannot be fixed without manual intervention.

Given Helm usage from a continuous deployment pipeline is probably a common pattern or at least a desired one, this is not workable.

I completely agree, but at least I wanted to clearly document the workaround, because when you get into this state it feels like there is no other option but to abandon the release and take an outage.

Along with the patches to avoid taking an outage, we also stopped using helm --wait and instead rely on our own polling logic to know when the release is successful or not. It is more work but we now have way more visibility, which is helpful when a release is taking longer than expected, and we can detect failures earlier than the timeout.

This wasn't an issue for me on older versions of helm, and there are no failed deployments, kubectl is showing running services and everything is working.

Now I am simply trying to run helm upgrade -f app.yaml --namespace prometheus prometheus prometheus and I just get the error: Error: UPGRADE FAILED: "prometheus" has no deployed releases, but I cannot try any of the workarounds because this is in prod...

@zrsm what we're doing for now is generating the yaml files using helm and using kubectl diff/dry-run to preview changes before applying them manually

@zrsm what we're doing for now is generating the yaml files using helm and using kubectl diff/dry-run to preview changes before applying them manually

Thanks for the reply. I downgraded to 2.15.1 but ran into similar issues; however, I tried deleting my ~/.helm and then reinitialized the Tiller serviceaccount with kubectl, and after doing this I was able to apply charts to Kubernetes. I will try testing this with Helm 3 later today and reply back with a fix. I have a feeling this might have been the issue.

Hi there, so I tested this out... and it looks like performing the following command, after also deleting my previous ~/.helm/, solved this...

helm init --service-account tiller --override spec.selector.matchLabels.'name'='tiller',spec.selector.matchLabels.'app'='helm' --output yaml | sed 's@apiVersion: extensions/v1beta1@apiVersion: apps/v1@' | kubectl apply -f -

I am thinking that if you install a new Helm version and your serviceaccount setup isn't in place (I wiped my laptop and restored it at some point), then this happens, and this was the fix. I hope it works for you also.

This bug is ongoing in Helm 3, is there a planned fix?

Also running into this issue with a fresh cluster and a fresh deployment due to a timeout. I don't like manually connecting to our cluster to fix this, but I guess that's the only option now.

Can we make sure this issue gets resolved asap?

this issue is so frustrating it's a reason to stop using helm altogether.

I agree. This is driving me nuts. I'm going to work on fixing it. Wish me luck.

I agree. This is driving me nuts. I'm going to work on fixing it. Wish me luck.

thanks and good luck!

I wouldn't mind getting a few of you to look at PR #7653 .

I believe this will solve the issues described above.

Can't believe it's still open with no reaction from maintainers

cc @bacongobbler @mattfarina

Would using helm delete --purge (v2) or helm uninstall (v3) also work, as they are all failed releases?

@hickeyma not always; this could also be the result of helm release metadata corruption so in some cases uninstalling could delete resources under load.

Sometimes the release has not failed but merely timed out, and Helm labels it as the failed one; the next time it shows up as has no deployed releases, but the app is actually fully functional. It happened to me many times, so I had to change the release label to deployed. It is not always an option to do helm delete --purge (v2) or helm uninstall (v3).

@rimusz how are you changing the release label?

@dudicoco by manually editing the latest Helm v3 release secret; you could automate that and use kubectl patch

I have moved over to https://github.com/k14s/kapp which works like a charm.

@rimusz that's what I thought, thanks.

I also back-ported my fix to helm 2 in #7668 but am still awaiting feedback on #7653

Same issue here,

A release deployed with --wait timed out, and is finally up and running. It is still marked as failed.
And so, later deployments are failing too.

This means the release status is not reliable information.

We use k8s in our company for many services in production.
A few times a month we hit the same problem with Helm on different applications ("* has no deployed releases.").
We have used different versions of Helm (from 2.7 to 3.0.3).
The problem is not fixed.
This causes a lot of discomfort for our users (developers that deploy applications in the cluster).
Each time we hit it, we just patch the latest release secret (status to deployed).
Is there any plan to add a behavior that ignores the last release's state and installs new releases?

With --history-max set to 10 (the default value), the first release succeeded.
Then the next 10 releases failed with:
Error: UPGRADE FAILED: timed out waiting for the condition (it was simulated, thus expected).
After that, the next (11th failed) release failed with:
Error: UPGRADE FAILED: "app" has no deployed releases (that's the problem!)

Would it be possible for Helm to always preserve the latest successful release in the history, in addition to the 10 most recent ones (whatever their status)?

I love the idea. Would need to modify the storage functionality but I think it could be done.

https://github.com/helm/helm/pull/4978 was merged for Helm 2. Perhaps it wasn't ported over to Helm 3. If someone has the time and wants to port it over, please feel free.

I tried my hand at porting this to Helm 3 with #7806, and would love to see it merged ASAP. Thanks, @ultimateboy!

What about releases that fail on _first_ install, i.e. have no past successful releases?
We are using upgrade --install for idempotent deployment of Helm releases, and when the first release fails, all subsequent invocations of upgrade --install fail with the "has no deployed releases" error (this issue).

The "first release failing" scenario is at least more manageable, because you usually run or monitor it manually (and can apply a fix right then and there)—as opposed to having helm run by a CI/CD system that just starts failing one day and doesn't recover even after fixing the code.

It should still be fixed of course.

There is also value in preserving the last successful release anyway, not just because of this bug. E.g. debugging issues with values file, etc.

@peterholak The "first release failing" scenario is sometimes also handled by CI/CD. For example, we have restricted access to our cluster and can't even run a "helm ls" - how are we supposed to "manage this"?

This issue should be a high-priority one, seeing that most people use Helm in production. I could run the helm install with --atomic, but what if I'd like to inspect the reason for the failure before redeploying? I would be time-boxed by the timeout before the installation fails and then reverts. If I could successfully upgrade, I wouldn't feel time-boxed when inspecting the failure.

We are also using upgrade --install for idempotent deployment of helm releases. Because that is how automated ci/cd pipelines work. We do not plan to manually fiddle around with helm because that would bypass our deployment pipeline.

In an automated deployment pipeline the first deployment will almost always fail. Subsequent deployments must not be triggered differently than the first attempt.

Please consider raising the priority of this issue considerably.

The experience is soooooooo bad. We can't simply delete the whole release, because it's in production! It would cause server downtime! How can we deal with this issue in the end?

Also, can somebody please remove the question/support label? This issue is not about missing documentation, but about the current behaviour of Helm which is not very supportive towards use in automated deployment pipelines.

The #7806 PR has been merged onto master. It will be released in 3.2. I am closing this issue accordingly.

Great! This solves most of our issues with Helm.

What is the current behavior if the first release fails though (no deployed releases yet)?

There was https://github.com/helm/helm/issues/3353 which was addressed by https://github.com/helm/helm/pull/3597 but only when --force is used.

--force has some issues in Helm 3 though (https://github.com/helm/helm/issues/6378), with a proposal to address it (https://github.com/helm/helm/issues/7082), plus as other comments in this thread mentioned, using --force is not always suitable anyway. So the whole situation is still somewhat unclear.

@technosophos thanks for the fix. Curious, when will the 3.2 release be available to install? I keep getting the app-name has no deployed releases error on an existing failed release. And it's kind of a blocker in CI/CD pipelines.

@peterholak see #7913.

3.2 will be discussed on the April 16 public dev call. I have triaged it down to just the ones that currently look like they can be wrapped up right away. Then we will start the beta release process (assuming maintainers all agree on the call tomorrow).

I was facing the same issue on AKS and fixed it with the following command:

Helm version: 3.1.2
I just deleted the release from the k8s cluster with the command
helm delete <release-name>

and re-ran the deployment cycle to fix the issue.

The issue is still there in version 3.2.0.

@deimosfr This is fixed in #7653 which will be in release 3.2.1. It is not yet released but you can get the fix if you want to build off master.

I'm on 3.2.1 and this is still happening

There are still reasons that this error can occur. 3.2.1 didn't simply remove the error. It removed some of the causes. If you're still experiencing it, your problem is something other than what the issue corrected.

@yinzara I have a classic case of "path b" from the original description on a fresh cluster with no issues. I can also reproduce this error in another cluster where Helm v2 works just fine. We can of course do the classic "this is caused by something else, open a new issue" dance, but I think it will be quicker if it's simply recognized that it's not really fixed.

What is the output of helm list? What is the "status" of the prior failed release? Helm 2 has this problem and it has not been fixed at all so I still think your issue is not what you think.

Still happens on version 3.2.1.

If the initial deploy fails 3 times, it all gets stuck... There is no way to fix it without deleting the chart and deploying a good one.

Details:

helm history t3-mac -n t3
REVISION        UPDATED                         STATUS          CHART           APP VERSION     DESCRIPTION
1               Fri May 22 18:55:11 2020        failed          t3-mac-2.13.0   2.13.0          Release "t3-mac" failed: timed out waiting for the condition
2               Fri May 22 19:33:44 2020        failed          t3-mac-2.13.0   2.13.0          Upgrade "t3-mac" failed: timed out waiting for the condition
3               Fri May 22 19:57:51 2020        pending-upgrade t3-mac-2.13.0   2.13.0          Preparing upgrade

helm.exe upgrade --namespace t3b --install --force --wait t3b-mac t3b-mac-2.13.0.tgz
2020-05-22T18:14:01.7103689Z Error: UPGRADE FAILED: "t3b-mac" has no deployed releases

I have the same issue on deployed chart and the pod is running fine

vm-victoria-metrics-single-server-0                    1/1     Running     0          2d18h

But I can't upgrade it.

$ helm version
version.BuildInfo{Version:"v3.1.2", GitCommit:"d878d4d45863e42fd5cff6743294a11d28a9abce", GitTreeState:"clean", GoVersion:"go1.13.8"}

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-26T06:16:15Z", GoVersion:"go1.14", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.8", GitCommit:"ec6eb119b81be488b030e849b9e64fda4caaf33c", GitTreeState:"clean", BuildDate:"2020-03-12T20:52:22Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}


ismail ~ $ helm list
NAME    NAMESPACE   REVISION    UPDATED                                 STATUS      CHART                                   APP VERSION    
vm      default     1           2020-05-23 16:20:35.243505 +0300 +03    deployed    victoria-metrics-single-0.5.3           1.35.6         

$ helm upgrade vm vm/victoria-metrics-single --set "-selfScrapeInterval=10" 
Error: UPGRADE FAILED: "vm" has no deployed releases


ismail ~ $ helm upgrade --install vm vm/victoria-metrics-single --set "-selfScrapeInterval=10" 
Release "vm" does not exist. Installing it now.
Error: rendered manifests contain a resource that already exists. Unable to continue with install: existing resource conflict: namespace: , name: vm-victoria-metrics-single, existing_kind: policy/v1beta1, Kind=PodSecurityPolicy, new_kind: policy/v1beta1, Kind=PodSecurityPolicy

I confirm it happened on my side as well

@zodraz Your helm status shows the reason for your error. The most recent release isn't showing as failed, it's showing as "pending install". This would imply the process that was managing the last upgrade was artificially terminated before it completed (i.e. before it errored or was successful).

It was the decision of the project maintainers to not include the pending install status as a valid error status to allow the upgrade. (i.e. this is working as designed)

I suggest you try to ascertain why your helm upgrade is being cancelled before it finishes. That should be an avoidable situation.

As for the victoria-metrics report above: your issue is quite perplexing to me. I can't see how that could have happened given the log output you have. The fix released in 3.2.1 certainly won't help your situation, as you don't have a failed release. I would guess that some of the secrets containing the Helm release information somehow got removed from Kubernetes. I would suggest completely uninstalling the release and reinstalling if you can.

Hi @yinzara,

The thing is that I did not cancel it... As far as I understand, the third time I launched it (and it errored... because I had errors in the deployments to make it fail), it reached that "corrupted state"...

This state is not recoverable... So the only way to fix it is to delete the chart... My workaround to avoid this is to use the atomic flag to always roll back and never reach this "corrupted state"...

I understand the decision of the maintainers... But this leads to confusion, no possible solution at all (other than deleting the chart), and, well, like I said, this state was reached after 3 errors happened... without cancelling anything...

Anyway, lesson learnt: I am making rollbacks through the atomic flag.

Hi @yinzara

I found the reason why it fails.

I set the wrong parameter: -selfScrapeInterval=10; it should be server.extraArgs.selfScrapeInterval=10.

So the problem was the - in the parameter.
Maybe the Helm error was not meaningful for this type of variable error?

Failing one:

ismail sf $ helm upgrade vm vm/victoria-metrics-single --set "-selfScrapeInterval=10" 
Error: UPGRADE FAILED: "vm" has no deployed releases

Success:

ismail sf $ helm upgrade vm vm/victoria-metrics-single --set "server.extraArgs.selfScrapeInterval=10" 
Release "vm" has been upgraded. Happy Helming!
NAME: vm
LAST DEPLOYED: Tue May 26 22:35:15 2020
NAMESPACE: default
STATUS: deployed
REVISION: 3
TEST SUITE: None
NOTES:
TBD

This also works:

ismail sf $ helm upgrade vm vm/victoria-metrics-single --set "selfScrapeInterval=10" 
Release "vm" has been upgraded. Happy Helming!
NAME: vm
LAST DEPLOYED: Tue May 26 22:37:43 2020
NAMESPACE: default
STATUS: deployed
REVISION: 4
TEST SUITE: None
NOTES:
TBD

I have the same problem :'( and I can't use purge because I will lose the data, and I can't do that. I know this issue is closed; I'm only expressing my pain.

We have to ditch Helm releases when we deploy critical workloads; even Istio's istioctl ditched Helm for this reason (I assume). We use helm template.... | kubectl apply -f - to avoid this issue, but this of course creates the problem of having to remember the deleted resources.

@GloriaPG can you share more information? How are you experiencing the same problem? As @yinzara mentioned earlier in the thread, you may be experiencing a case that #7652 does not fix. We need more information to help before we can come to that conclusion, however.

Hi @bacongobbler

We're using helm upgrade with --install and --force flags:

helm upgrade --install ${PROJECT_NAME} ${CHART_NAME} \
   --namespace $NAMESPACE_NAME \
   --values ${SECRETS} \
   --values ${CONFIG_VALUES} \
   --force \
   --wait \
   --timeout ${MAX_WAIT_SECONDS} || rollback

Unfortunately, when the release is in a failed state:

$ helm list
NAME                    NAMESPACE   REVISION    UPDATED                                 STATUS      CHART           APP VERSION
PROJECT_NAME                CHART_NAME      136         2020-07-09 14:13:09.192381483 +0000 UTC failed      CHART_NAME-0.1.0

it results in:

Error: UPGRADE FAILED: "PROJECT_NAME" has no deployed releases
Error: failed to replace object: Deployment.apps "PROJECT_NAME" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"PROJECT_NAME"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

How can it be solved? It seems that --force with the --install flag is not working.

As this is a production env, I'm not able to just purge the release and create it from scratch :(

Thanks for any suggestions

Your error seems to be related to https://github.com/kubernetes/client-go/issues/508
You cannot change the selector on a Deployment. You would have to undeploy and redeploy.

@yinzara the funny thing is that I'm not changing the selector on my deployment; everything works on 9 out of 10 releases. In one, something went wrong during deployment, the release is in a failed state, and I'm not able to recover from it in any way - the deployment itself is working, the Pods are running, but I'm no longer able to modify it.

It is a bit counterintuitive that after a release is in a failed state, I'm not able to change it anymore using Helm. I would expect the --force flag to allow me to replace the whole deployment or force-apply changes, but I couldn't find a way to fix the existing release and work with it.

Yeah unfortunately this doesn't actually seem to be a helm problem. Something failed about your release and it's in a bad state in kubernetes. More than likely the selector is messed up or something is not as you expect, but the error you're seeing about "app-name" has no deployed releases is just a red herring.

I've tried rolling back to the previous version; the release is now in deployed status. Unfortunately, it doesn't change anything, so I think the only way is to delete and deploy again.

So, my particular issue with this is easy to reproduce.

Start deploying something with helm3 (with --atomic and --cleanup-on-fail), and ctrl+c the process after it starts creating resources. Nothing is rolled back, the resources still exist, and any subsequent attempt to run upgrade --install results in the "has no deployed releases" error.

This ctrl+c is something that in essence happens when someone pushes a new commit to a branch in our CI system while there's already a build running - the helm upgrade will get cancelled, and then it's in a completely broken state.

Is there something we can do to fix it after this point? As with many others in this thread, deletion is not an option.

EDIT: once this is broken, helm ls does not show the release, helm history shows it in pending-install state.

Actually - nevermind. For those affected by this, there is one solution: delete the history record from kubernetes manually. It's stored as a secret. If I delete the offending pending-install state entry, then I can successfully run upgrade --install again!

@AirbornePorcine - Can you please elaborate on the changes required in kubernetes to delete the pending-install entries .

@tarunnarang0201 Helm creates a kubernetes secret for each deploy, in the same namespace you deployed to, you'll see it's of type 'helm.sh/release.v1', and named something like 'sh.helm.release.v1.release-name.v1'. You just have to delete the most recent secret (look at the 'v1' suffix in the example, it's incremented for each deploy), and that seemed to unblock things for me.

@AirbornePorcine thanks!

@AirbornePorcine @tarunnarang0201 @ninja- You can also just patch the status label ... especially, if you don't have any previous DEPLOYED releases.

For Helm 3, see my comment at https://github.com/helm/helm/issues/5595#issuecomment-580449247

For more details and instructions for Helm 2, see my comment at https://github.com/helm/helm/issues/5595#issuecomment-575024277

This conversation is too long... and each comment has a different solution... what's the conclusion?
We've been using old Helm 2.12 and never had issues, but now with v3.2.4 a previously failed deployment fails with this error.

We are using Terraform, by the way, with the latest Helm provider. So should we use --force or --replace?

@xbmono The conversation is long because:

  • there are quite a number of reasons your release can get into this state
  • this was possible on Helm 2 as well, and the solutions that worked there and on Helm 3 are different
  • there are different paths users in this issue took to get there
  • there are different options depending on what you are trying to do, and whether you are willing to risk/tolerate loss of PVCs and various possible combinations of downtime.

If you are at a "has no deployed releases" error I'm not sure install --replace nor upgrade --install --force will help you on its own.

A sensible suggestion can probably only be given

  • if you supply the helm history for the release so people can see what has happened
  • if you share the original reason for the failure/what you did to get there - and whether you feel that the original problem has been addressed

My summary of possible options

  • if you don't care about the existing k8s resources at all or downtime, helm uninstall && helm install may be an option
  • if it's a first-time chart install that failed, you can probably just delete the release secret metadata and helm install again. You may need to clean up k8s resources manually if cruft got left behind due to the failure, depending on whether you used --atomic etc.
  • if you abandoned a --waited install part way through and the helm history shows the last release is in pending-install you can delete the most recent release secret metadata or patch the release status
  • in certain other combinations of scenarios, it may also be possible to patch the release status of one or more of the release secrets and see if a subsequent upgrade can proceed, however to my knowledge, most of these cases were addressed by #7653 (to ensure there is a deployed release somewhere in the history to go back to) so I'd be surprised if this was useful now.

Since this is a closed issue, I suspect there is a root cause that would be good to debug and document in a different, more specific ticket anyway.

@chadlwilson Thanks for your response.

helm history returns no rows!

Error: release: not found

but helm list returns the failed deployment

M:\>helm3 list -n cluster171
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS  CHART                           APP VERSION
cluster171      cluster171      1               2020-09-01 04:45:26.108606381 +0000 UTC failed    mychart-prod-0.2.0-alpha.10    1.0

We are using Terraform, and our environments get deployed automatically every hour by Jenkins. With Terraform I can't use helm upgrade; that's what the Helm provider is doing.

In the Terraform code I set force_update to true - no luck; then I set replace to true - again no luck.

resource "helm_release" "productStack" {
  name = "${var.namespace}"
  namespace = "${var.namespace}"
  chart = "${var.product_stack}"
  force_update = true//"${var.helm_force_update}"
  max_history = 10
  replace = true

  wait = true
  timeout = "${var.timeout_in_seconds}"

}

So I wonder if it has to do with wait=true? The reason the previous deployment failed was that the cluster wasn't able to communicate with the Docker repository, so the timeout was reached and the status is failed; but we fixed the issue and the pods restarted successfully. Now obviously helm delete works, but if I were to do this each time, neither my managers nor the developers would be happy.

With Helm v2, if the deployment failed and the developers fixed it, the next deployment would upgrade the failed deployment.

M:\>helm3 list -n cluster171
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS  CHART                           APP VERSION
cluster171      cluster171      1               2020-09-01 04:45:26.108606381 +0000 UTC failed    mychart-prod-0.2.0-alpha.10    1.0

The helm history failure seems odd (typo? missed namespace? wrong helm version?), but given it's revision 1 in the list above it seems you are trying to do a first time installation of a new chart and the first time installation has failed. If you are trying to unblock things you can probably delete the release secret metadata as above or patch its status, and try again. That may indicate that the metadata is in a bad state from the perspective of either Helm or the Helm Terraform Provider, but not how it got there.

In any case, I don't have issues doing upgrade over failed first-time deploys with Helm 3.2.1 since #7653 was merged. You might want to double-check the specific Helm version the provider is actually using? It's also possible it may be to do with the way the Helm Terraform provider figures out the state of the release after an install failure. I don't have any experience with that provider, and personally am not in favour of wrapping Helm with another declarative abstraction such as TF because I find it even more opaque when things go wrong, but you might want to dig further there all the same.

In any case, as I said above, if the error you are stuck at is has no deployed releases after a failed first-time deployment, I don't think either replace or force is likely to help you resurrect the situation without some other intervention, and it would be best to debug it further and have any conversation elsewhere, as going back and forth on this old closed ticket with 51 participants doesn't seem so productive for all concerned.

No, there was no typo. Also, this happens regardless of whether it is the first deployment or a later one.

As I mentioned, we are using the --wait option to wait for the deployment in Jenkins and then report whether the deployment failed or not.

It seems that if the timeout is reached and the deployment isn't successful, Helm marks the deployment as failed and there is no way to recover other than manually deleting that release. And we don't want to delete the release automatically either, because that's scary.

So if we remove the --wait option, Helm will mark the deployment as successful regardless.

Workaround:

Now I found another solution. For those who have the same problem and want their automation to work as nicely as it used to, here is my workaround (a combined sketch follows the list):

  • Remove --wait option from helm deploy
  • Use this command to retrieve the list of deployment for that namespace that you are deploying against: kubectl get deployments -n ${namespace} -o jsonpath='{range .items[*].metadata}{.name}{","}{end}'
  • You can use split to turn the comma-separated list above into an array
  • Then you can run multiple commands in parallel (we use Jenkins so it's easy to do so), such as kubectl rollout status deployment ${deploymentName} --watch=true --timeout=${timeout} -n ${namespace}
  • If, after the timeout (for example 7m means 7 minutes), the deployment is still not successful, the command exits with an error
  • Problem solved.
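
A minimal bash sketch of that workaround, run sequentially here rather than in parallel Jenkins steps; NAMESPACE and TIMEOUT are placeholders:

```bash
NAMESPACE=my-namespace
TIMEOUT=7m

# Comma-separated list of Deployments in the namespace, as described above.
DEPLOYMENTS=$(kubectl get deployments -n "$NAMESPACE" \
  -o jsonpath='{range .items[*].metadata}{.name}{","}{end}')

# Watch each rollout; any Deployment that is not ready within TIMEOUT fails the build.
IFS=',' read -ra NAMES <<< "$DEPLOYMENTS"
for NAME in "${NAMES[@]}"; do
  [[ -z "$NAME" ]] && continue
  kubectl rollout status deployment "$NAME" \
    --watch=true --timeout="$TIMEOUT" -n "$NAMESPACE" || exit 1
done
```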

Actually - nevermind. For those affected by this, there is one solution: delete the history record from kubernetes manually. It's stored as a secret. If I delete the offending pending-install state entry, then I can successfully run upgrade --install again!

Alternatively, this worked for me:

helm uninstall {{release name}} -n {{namespace}}

Fixed by kubectl -n $namespace delete secret -l status=pending-upgrade.
Now run helm again.

I am not sure why this is closed; I've just hit it with brand-new Helm 3.3.4. If the initial install fails, a second helm upgrade --install --force still shows the same error. All those workarounds work, but they are manual; they don't help when you want completely, 100% automatic CI/CD where you can simply push the fix to trigger another deployment without doing manual cleanup.

Has anyone thought of simply adding a flag indicating that this is the first release, so it should be safe to just delete it automatically? Or adding something like "--force-delete-on-failure"? Ignoring the problem is not going to help.

@nick4fake AFAIK it was closed by PR #7653. @yinzara might be able to provide more details.

It was a decision by the maintainers not to allow overwriting a pending-upgrade release. But your statement that all the workarounds are manual and don't work in a CI/CD pipeline is not true. The last suggested workaround could be added as a build step before running your helm upgrade (I also would not use --force in a CI/CD pipeline). It has the same effect as what you've suggested, except that it deletes the release right before you install the next release instead of immediately afterwards, allowing you to debug the cause of the failure.

I have also used the following in my automated build to uninstall any "pending" releases before I run my upgrade command (make sure to set the NS_NAME environment variable to the namespace you're deploying to):
```bash
#!/usr/bin/env bash

RELEASES=$(helm list --namespace $NS_NAME --pending --output json | jq -r '.[] | select(.status=="pending-install")|.name')
if [[ ! -z "$RELEASES" ]]; then
  helm delete --namespace $NS_NAME $RELEASES
fi
```

@yinzara thank you for the snippet, it is very helpful for those finding this thread.

My point is still valid - it is not safe to simply delete a release. Why can't Helm force-upgrade a release if a single resource fails? Replacing a release with a new version seems a better solution than full deletion. I might not understand some core fundamentals of Helm (like how it manages state), so it might not be possible, but I still don't understand why it is better to force users to manually intervene if the first installation fails.

I mean, just check this discussion thread; people still face the issue. What do you think about adding some additional information to the Helm error message, with a link to this thread plus some suggestions on what to do?

@nick4fake I think you're mixing up "failed" with "pending-install".

The library maintainers agree with you about failed releases, that's why they accepted my PR.

A "failed" release CAN be upgraded. That's what my PR did. If a release fails because one of the resources failed, you can just upgrade that release (i.e. upgrade --install works too) and it will not give the "app-name" has no deployed releases error.

You're talking about a "pending-install" release. The maintainers do not think it is safe to allow you to upgrade a pending-install release (forced or otherwise) as it could possibly be in progress still or be in a partially complete state that they don't feel can be resolved automatically. My PR originally allowed this state and the maintainers asked me to remove it.

If you find your releases in this state, you might want to reconsider your deployment configuration. This should never happen in a properly configured CI/CD pipeline. It should either fail or succeed. "pending" implies the install was cancelled while it was still processing.

I am not a maintainer, so my opinion on your suggestion is irrelevant; however, I do not find any mention in the codebase of a GitHub issue that's actually printed in an error or message, so I'm betting they won't allow that, but you're welcome to put together a PR and see :-)

That being said, I don't agree with your statement that your point is still valid. My suggestion may remove the pending release; however, @abdennour's suggestion right before yours is just to delete the secret that describes the pending-install release. If you do that, you're not deleting any of the resources from the release and can upgrade the release.

What do you think about possibly adding some additional information to Helm error message with link to this thread + some suggestions on what to do?

+1 to this. We still have to Google around to find this thread and understand what a pending-install release is, so we can begin to reason about this error message.

I had issues with helm upgrade and it led me here. It was solved by adding -n <namespace>. Maybe it will help someone out there.

For Helm 3, it can be solved with a patch:
kubectl -n <namespace> patch secret <release-name>.<version> --type=merge -p '{"metadata":{"labels":{"status":"deployed"}}}'

The release-name and version can be seen with kubectl get secrets -n <namespace> | grep helm
