Terraform-provider-aws: data.aws_ecs_task_definition: Failed getting task definition

Created on 28 Jul 2017  ·  25 Comments  ·  Source: hashicorp/terraform-provider-aws

Terraform Version

0.9.11.

  • aws_ecs_task_definition

Terraform Configuration Files

data "aws_ecs_task_definition" "my-service" {
  task_definition = "${aws_ecs_task_definition.my-service.family}"
}

resource "aws_ecs_task_definition" "my-service" {
  family                = "${var.environment_name}-${var.service_name}-${var.instance_name}"
  network_mode          = "bridge"
  container_definitions = "${data.template_file.my-service.rendered}"
}

resource "aws_ecs_service" "my-service" {
 ...
  #Track the latest ACTIVE revision
  task_definition = "${aws_ecs_task_definition.my-service.family}:${max("${aws_ecs_task_definition.my-service.revision}", "${data.aws_ecs_task_definition.my-service.revision}")}"
...
}

Expected Behavior

If the resource does not exist, create a new aws_ecs_task_definition; otherwise, use the latest aws_ecs_task_definition revision.

This code works fine in Terraform v0.9.2.

Actual Behavior

: Failed getting task definition ClientException: Unable to describe task definition.
status code: 400, request id: "my-service"

Steps to Reproduce

  1. terraform apply
Labels: bug, service/ecs

All 25 comments

Also reproduced in Terraform 1.0.

I'm also experiencing the same issue! What's curious is that when attempting the search using a vanilla state (completely empty), the plan and apply work as expected. It's only when I have an existing state file that it doesn't work.

Even more curious, the resources don't exist in the statefile anyhow, and yet it fails? 🤔

Diving into debugging, I noticed that func dataSourceAwsEcsTaskDefinitionRead does not get called in a vanilla project, but does get called in an existing one. This appears to be a general Terraform pattern. I was able to reproduce it by first creating a simple resource (a security group) and then trying to perform a lookup. The plan failed once a resource (the security group in this case) was already present in the state file. I verified my hypothesis by also creating a different data source that looked up a non-existent security group. The plan for this failed as well.

If the arguments of a data instance contain no references to computed values, such as attributes of resources that have not yet been created, then the data instance will be read and its state updated during Terraform's "refresh" phase, which by default runs prior to creating a plan. This ensures that the retrieved data is available for use during planning and the diff will show the real values obtained.

Data instance arguments may refer to computed values, in which case the attributes of the instance itself cannot be resolved until all of its arguments are defined. In this case, refreshing the data instance will be deferred until the "apply" phase, and all interpolations of the data instance attributes will show as "computed" in the plan since the values are not yet known.
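To make the distinction in the quoted docs concrete, here is a rough sketch. All names ("app", "example-family", the container values) are hypothetical, and it uses the same interpolation style as the rest of this thread. It only illustrates the rule as the docs state it: an argument built from a value already written in the configuration is known at plan time, while an argument built from a computed attribute such as arn is not known until the resource has been created.

```hcl
# Sketch only: "app", "example-family", and the container values are hypothetical.
resource "aws_ecs_task_definition" "app" {
  family = "example-family"

  container_definitions = <<DEFINITION
[
  {
    "name": "app",
    "image": "nginx:latest",
    "memory": 128,
    "essential": true
  }
]
DEFINITION
}

# "family" is written literally in the configuration, so its value is known
# while planning. Per the quoted docs, this lookup is therefore not deferred,
# and it can fail if no task definition with that family exists yet.
data "aws_ecs_task_definition" "by_family" {
  task_definition = "${aws_ecs_task_definition.app.family}"
}

# "arn" is a computed attribute: it is unknown until the resource has been
# created. Per the quoted docs, this read is deferred to the apply phase on
# the first run. Note that a full ARN identifies one specific revision.
data "aws_ecs_task_definition" "by_arn" {
  task_definition = "${aws_ecs_task_definition.app.arn}"
}
```

The by_arn variant is only there to show a deferred read; because it resolves to the revision Terraform itself created, it is not a way to track the latest ACTIVE revision.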

This is doubly interesting to me. Based on the above docs, OP's config shouldn't be failing, because data.aws_ecs_task_definition.my-service depends on aws_ecs_task_definition.my-service.family, yet it fails in the plan* phase (my problem as well). Perhaps this is a Terraform-level bug rather than a provider-level one?

  • Edit: incorrectly said it failed in the apply phase instead of the plan phase.

@radeksimko could we get your eyes on this? I don't want to spam the main repo if it's not a Terraform issue.

I'm seeing this issue as well.

I actually don't need data and resource for the same thing in the same file. I commented out the data and now it seems to be working better.

I was able to get around this issue by adding a "depends_on" to the data source:

resource "aws_ecs_task_definition" "task" {
...
}
data "aws_ecs_task_definition" "task" {
  depends_on = [ "aws_ecs_task_definition.task" ]
  ...
}

Hope it helps.

It's not really a bug; the solution from @parruda is correct. Both the aws_ecs_service resource and the aws_ecs_task_definition data source expect the related aws_ecs_task_definition resource to already exist.

@KIVagant that makes sense, as I was also experiencing the same issue.

Though I would say the Terraform docs that show the data source and the resource being used together should be updated to reflect this. As it stands now, the docs imply that nothing should fail if the resource doesn't exist.

Otherwise, @parruda's solution makes sense to me.

Yeah, I probably should have tried the fix before replying. It works, but it causes a change to be detected on every run, which is not the expected or desired result.

@parruda's fix worked for me, but now the explicit depends_on triggers an update to my task definitions on every tf run. Is there a best practice to prevent that? I'm using Terraform v0.11.5 and provider.aws v1.10.0.

@dendrochronology, I use something like this:

data "aws_ecs_task_definition" "blabla" {
  task_definition = "${aws_ecs_task_definition.blabla.family}"
  depends_on = [ "aws_ecs_task_definition.blabla" ]
}


resource "aws_ecs_task_definition" "..." {
  family                = "..."
  task_role_arn         = "${aws_iam_role.blabla.arn}"

  container_definitions = "${data.template_file.task_definition.rendered}"

  depends_on = [
    "data.template_file.task_definition",
  ]

  lifecycle {
    ignore_changes = [
      "container_definitions" # if template file changed, do nothing, believe that human's changes are source of truth
    ]
  }
}


resource "aws_ecs_service" "blabla" {
  name            = "blabla"
  cluster         = "${aws_ecs_cluster.cluster_name.id}"
  task_definition = "${aws_ecs_task_definition.blabla.family}:${max("${aws_ecs_task_definition.blabla.revision}", "${data.aws_ecs_task_definition.blabla.revision}")}"
  desired_count   = 1
  iam_role        = "${aws_iam_role.ecs_service.name}"

// Not compatible with placement_constraints:distinctInstance, commented
//  placement_strategy {
//    type  = "binpack"
//    field = "cpu"
//  }

  placement_constraints {
    type  = "distinctInstance"
  }

  load_balancer {
    elb_name       = "${aws_elb.blabla.name}"
    container_name = "internal"
    container_port = "${var.blabla_port}"
  }

  depends_on = [
    "aws_iam_role.ecs_service",
    "aws_elb.blabla",
    "aws_iam_role.blabla",
    "aws_ecs_task_definition.blabla"
  ]

  lifecycle {
    ignore_changes = ["task_definition"] # the same here, do nothing if it was already installed
  }
}

@KIVagant ahhh, I'm going to play with the ignore_changes lifecycle hook!

Ah, nice, I'll play with that, too. Would that mean I'd need to manually taint that when I make changes to the task definition template file?

It depends on your goals. In our case, the template contains empty placeholders for secrets, which are filled in after Terraform's first install, and we don't want Terraform to change existing task definitions. After the first install we manage them manually.

@dendrochronology sorry for the lack of response. I actually never noticed the problem because we do want to update the task definition on every run. I hope you found a solution.

This still seems to be a problem. If you just use what is in the docs, you will get this:

Error: Error running plan: 1 error(s) occurred:

* module.frontshop_staging.data.aws_ecs_task_definition.frontshop: 1 error(s) occurred:

* module.frontshop_staging.data.aws_ecs_task_definition.frontshop: Resource 'aws_ecs_task_definition.frontshop' not found for variable 'aws_ecs_task_definition.frontshop.family'

The only differences are that this is inside a module and the name is frontshop. Could it be related to the module?
I also tried depends_on and it didn't work. I am thinking of applying a first version to create the resource and then using the data source with max to get the latest revision.

Actually, what I said was wrong. It looks like the problem appears when the JSON for the container definitions is invalid. Mine doesn't use the heredoc syntax but a JSON file rendered from a template; it should be an array of containers, and I had only a single top-level object.
Here is where I found out about it: https://github.com/terraform-providers/terraform-provider-aws/issues/2026
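For anyone hitting the same thing, here is a minimal sketch of the shape container_definitions expects. The container values are hypothetical and the JSON is inlined with a heredoc only for illustration; the same rule applies to a rendered template file.

```hcl
# Sketch only: the family and container values are hypothetical.
resource "aws_ecs_task_definition" "frontshop" {
  family = "frontshop"

  # container_definitions must be a JSON array of container objects,
  # even when there is only one container. A bare top-level object
  # ({ "name": ... }) is the mistake described in the comment above.
  container_definitions = <<DEFINITION
[
  {
    "name": "frontshop",
    "image": "nginx:latest",
    "memory": 256,
    "essential": true
  }
]
DEFINITION
}
```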

Nice one, @jaysonsantos. In my case, the error was also caused by a JSON syntax error.

With a provider upgrade to 1.59 and Terraform 11.11, I am still seeing this error.

If terraform destroy completes with no errors, it works fine without a depends_on.

However, if terraform destroy fails on something else, for instance:

 Error removing user, role, or group list from IAM Policy Detach bootstrap-iam-group-attach1:
– NoSuchEntity

That failure is unrelated to the ECS service, and running terraform destroy a second time would otherwise resolve it. On the second pass, however, the

Failed getting task definition ClientException: Unable to describe task definition.

error resurfaces and the state file is corrupt.

This issue isn't very clear to me. Some folks seem to claim that we should NOT use a depends_on in the data source for the task definition, but without it the first run always fails because the resource doesn't exist.

FYI for everybody else stumbling over this issue: in this MR, https://github.com/terraform-providers/terraform-provider-aws/pull/10247, @skorfmann illustrates a better workaround using aws_ecs_task_definition.self.revision and explains why the depends_on approach discussed here is not what you want!

This works around the issue of not having a task definition when the resources are initially rolled out. The documentation example of directly referencing "task_family" doesn't work and exits with an error when it is first applied. See also issue #1274.

The reason is that data sources don't handle missing data gracefully. Unfortunately, that's not going to be addressed, as stated here: hashicorp/terraform#16380 (comment). One of the suggested workarounds is to add an explicit depends_on. However, this can make the terraform plan output show a change even though nothing is actually going to change. Furthermore, it's discouraged by the Terraform documentation itself.

This thread mentions a few other workarounds, but none of them seems suitable: hashicorp/terraform#16380

aws_ecs_task_definition.self.revision can only be referenced once the resource has been created (in contrast to family, which is already present in the code). Apparently, this allows Terraform to resolve the dependencies correctly and makes the data source behave as expected.
