Ansible: make with_ loops configurable

Created on 25 Aug 2015  ·  90 Comments  ·  Source: ansible/ansible

ISSUE TYPE

Feature Idea

COMPONENT NAME

core

ANSIBLE VERSION

2.1

CONFIGURATION
OS / ENVIRONMENT
SUMMARY
how: 
    forks: 1
    pause: 0
    squash: name
    label: "{{item.name}}"
    end: on_fail
with_items: ...
  • forks: forks within the loop to do items in parallel, default 1, this needs warnings
  • pause: between loop executions, useful in throttled api scenario _Done in 2.2_
  • squash: join all items into a list and pass it to the provided option; works like the current hardcoded opts for apt, yum, etc.; by default it should be None _abandoned: reversed opinion, we should remove this feature_
  • end: when to interrupt the loop, default is 'last item', options? on_fail, on_success (first one)?
  • label: (#13710) what to display when outputting the item loop _Done in 2.2_

docs to current state at:

http://docs.ansible.com/ansible/playbooks_loops.html#loop-control
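The parts already implemented live under loop_control; a minimal sketch of the 2.2-era syntax (the module and item list are illustrative):

```yaml
# loop_control options shipped as of Ansible 2.2: label and pause.
- name: create users
  user:
    name: "{{ item.name }}"
    state: present
  with_items: "{{ users }}"
  loop_control:
    label: "{{ item.name }}"   # display this per item instead of the full dict
    pause: 2                   # seconds to wait between item executions
```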

STEPS TO REPRODUCE
EXPECTED RESULTS
ACTUAL RESULTS
affects_2.1 affects_2.3 feature core

Most helpful comment

+1 forks

Me waiting for openstack modules to iterate through with_ loops on 100+ items...

All 90 comments

Please let's not call it how. That's even worse to read than become: true. But the functionality under it looks great.

includes fix for #10695

Excellent. In the interests of bikeshedding, maybe call it looping:.

:+1:

+1

+1 especially for within host parallelization!

:+1:

:+1:
but let's not call it "how"

so here is a workaround for breaking a loop task after the first failure

- hosts: localhost
  vars:
    myvar:
        - 1
        - 2
        - 3
        - 4
        - 5
  tasks:
    - name: break loop after 3
      debug: msg={{item}}
      failed_when: item == 3
      register: myresults
      when: not (myresults|default({}))|failed
      with_items: "{{myvar}}"

@bcoca not working on my end (ansible 1.9.3, ubuntu)

TASK: [break loop after 3] ******************
ok: [localhost] => (item=1) => {
"failed": false,
"failed_when_result": false,
"item": 1,
"msg": "1"
}
ok: [localhost] => (item=2) => {
"failed": false,
"failed_when_result": false,
"item": 2,
"msg": "2"
}
failed: [localhost] => (item=3) => {"failed": true, "failed_when_result": true, "item": 3, "verbose_always": true}
msg: 3
ok: [localhost] => (item=4) => {
"failed": false,
"failed_when_result": false,
"item": 4,
"msg": "4"
}
ok: [localhost] => (item=5) => {
"failed": false,
"failed_when_result": false,
"item": 5,
"msg": "5"
}

ah, yes, it will work as-is in 2.0; in 1.9 the registration does not occur until after the loop is done.

+1 on forks

+1
perhaps instead of "how", loop_details or options?

+1

+1, using wait_for from localhost to wait for 100 EC2 hosts to come up is killing me...

+1 for similar reason to senderista

+1

:+1:

Both squash and forks would be awesome features which would speed up Ansible execution immensely.

I would also replace how with something like loop_details, loop_settings, loop_options, or anything similar.

loop_control , already in 2.1 with the label part implemented.

squash might just go away as it is easy to just pass a list to the modules that support it:

apt: name={{listofpackages}}

and avoid the loop completely
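For modules that accept a list, the difference looks like this (the variable name is a placeholder):

```yaml
# one apt transaction per package: slow, and what squash was papering over
- apt:
    name: "{{ item }}"
  with_items: "{{ listofpackages }}"

# one apt transaction for the whole list: no loop needed
- apt:
    name: "{{ listofpackages }}"
```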

+1 forks

+1 forks

I had a use case for a new config for a conditional break: break_when

+1 forks and I hope it'll also work for parallelizing sequences of tasks to run as in:
- include: service.yml
with_items: "{{services|default([])}}"

Otherwise, there's the async/async_status already.
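The async/async_status pattern mentioned above can approximate a forked loop today: fire every item without waiting, then poll for completion. A sketch (the command, timings, and variable names are placeholders):

```yaml
- name: kick off one job per item without waiting
  command: /usr/local/bin/provision {{ item }}   # placeholder command
  async: 600          # allow each job up to 10 minutes
  poll: 0             # fire and forget
  register: async_jobs
  with_items: "{{ services | default([]) }}"

- name: collect the results
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: job
  until: job.finished
  retries: 60
  delay: 10
  with_items: "{{ async_jobs.results }}"
```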

+1 forks

Me waiting for openstack modules to iterate through with_ loops on 100+ items...

+1 on forks. Could use this for copying AMIs to all the AWS regions.

+1 on forks. Please! make it part of loop_control

+1 on forks

+1 on forks

+1, need forks too :-)

+1 on forks

+1 on forks

Forks would be awesome, +1

I always sit in silence not wanting to contribute to the spam, because it's hard to gauge between projects whether it's welcomed or not, but it looks like the fork +1 train has left the station!

+1 on forks

+1 on forks

+1 on forks

:+1: on forks

@bcoca Greetings! Thanks for taking the time to open this issue. In order for the community to handle your issue effectively, we need a bit more information.

Here are the items we could not find in your description:

  • issue type
  • ansible version
  • component name

Please set the description of this issue with this template:
https://raw.githubusercontent.com/ansible/ansible/devel/.github/ISSUE_TEMPLATE.md


+1 on forks

+1 on forks!

+1 on Forks!

+1 on forks!

+1 on forks!

Any update on fork? When is it supposed to be ready?
My use case is instantiating about 20 containers on a server with with_sequence. Right now it takes ages :(
I would be glad to help, but I would need some tips on where to start

@bitliner no one has created a PR for it, if that is what you are asking; it's actually very hard to do correctly.

as for your issue, just declare X hosts in inventory and loop over hosts: instead of with_ to create them in parallel.

inventory:

[containers]
container[000:020]

playbook:

- hosts: containers
  gather_facts: false
  tasks:
    - container: state=present name={{inventory_hostname}}

i.e container is a 'made up' module.

@bcoca your solution is not clear to me. To be sure, is this what you mean?

hosts file

[containers]
192.168.1.100
192.168.1.100
192.168.1.100
192.168.1.100
... and so on based on the degree of parallelism I want to get ...

main.yml file

hosts: containers
gather_facts: false
tasks:
   - container: state=present name={{inventory_hostname}}

based on a container module that I should implement, correct? In this case, I would have all containers with the same name, and that is not acceptable, correct?

Furthermore, what are the challenges to implementing the loop in parallel correctly?

My use case needs to speed up this task:

- name: "Start clients"
  docker_container:
    name: "mycontainer-{{ item }}"
    image: myimage
    links: 
      - server-{{item}}:server-{{item}}
  with_sequence: count={{ scale }}

I can't use docker-compose scale because I need to route traffic among containers in a specific way (that is why I use with_sequence to generate different docker container names).

I could build a module that takes the declaration of a container and, based on a scale parameter, instantiates multiple remote containers in parallel. Does it make sense? Can you help me understand how to call/re-use the docker_container module inside my module, and what APIs Ansible offers to run something in parallel?

@bitliner you did not follow my instructions: I had unique names in inventory (using a range to work just like a sequence). Since names are unique in inventory and you declared the same address N times, you still have one host. That also sidesteps your 2nd issue of duplicate container names, as you only loop over 1 host.

In any case, if you want to follow up with your issue, use ML or IRC as it would be a bit off topic for this ticket.

@bcoca How can I have one host if you declared 20 hosts?

[containers]
container[000:020]

is going to connect to container001, container002, etc.

It works for having unique names, but what is not clear to me is why you say I still have one host (instead of 20).
hosts:containers means to me 20 hosts, not just one. What am I ignoring in this solution?

@bitliner cause this:

[containers]
192.168.1.100
192.168.1.100
192.168.1.100
192.168.1.100

is not 4 hosts, but 1

also at this point this is pretty much off topic, if you want to continue getting help on this go to IRC or ML

I've solved using

[containers]
ip[000:020] ansible_host=192.168.1.100

and

- name: "Start containers"
  docker_container:
    name: "my-container-{{ inventory_hostname }}"

A question: imagining we add a forks statement, would the changes consist of rewriting the run_loop method so that it manages the level of parallelism and the asynchrony?

and then it starts to get 'fun' ...:

  • does the loop fork count against the global --forks, make that per remote?
  • what to do with loops when items depend on previous items (not just task execution, but conditionals, etc)?
  • how to handle concurrency issues when multiple forks execute on same host? i.e they update same file
  • how do we handle cleanup? right now they can reuse tmp dirs .. but each execution cleans after itself, now this can cause issues.

And there are a few other issues that I know of .. and surely plenty more that I won't be aware of until someone tries to implement it. I have solutions for a few, but it starts getting out of hand pretty quickly.

@bcoca Loop forking should not be enabled by default. I would prefer to see the default set to 1 and introduce it as a parameter forks or serial, but include a warning. It will likely break some existing code. That having been said, I am very much looking forward to this feature (most especially for tasks which require delegate_to)

+1 for forks (false by default)

+1 forks

+1 forks

+1 forks

+1 forks

+1 forks

+1 for forks as well, however in the meantime there's also a new Ansible strategy plugin that gives a big performance increase in general, and also for with_items loops. Perhaps for those wanting forks for performance reasons it's worth looking at:

https://github.com/dw/mitogen
https://mitogen.readthedocs.io/en/latest/ansible.html

I can’t see how this will improve with_items loops exactly. This plugin
improves performance issues caused by using ssh as a connection method.
Especially over long distances and latent networks and with large numbers
of servers.

This doesn’t help with AWS or Azure cloud functions where the execution
happens on the ansible controller and just executes on a set of items in
that cloud system and doesn’t connect to hosts at all, which is the
primary issue with with_items being slow. It has nothing to with large set
of machines or latency or anything related to ssh. It’s simply the fact
that it executes cloud functions in a with_items loop in serial and nothing
can speed that up except the cloud provider improving its speed or a
parallel execution of those cloud functions by ansible.

It also doesn’t mention with_items in the article at all so I can’t see how
this will help even in the tiniest little bit. Can you explain a bit more
how this could help? Id like to know what I’m missing if I am missing
something here.


Indeed, it won't help in all cases. However the reason I'm looking for forks in with_items is because of the slowness with processing each item individually (even with pipelining). Sometimes I have to create a large number (several hundred) of directories based on host_vars, or template a few hundred files. So I'm looping over the file and template module mostly.

I once tested templating 100 files into 100 separate files through with_items vs looping over the items in the jinja template itself and merging everything into a single large file. The single file takes 5 seconds, but creating 100 separate config files takes 30 minutes.

The plugin I mentioned gave such a big improvement for me I thought it was worth mentioning it here.
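The single-file comparison above amounts to moving the loop out of the play and into jinja; roughly (template, variable, and path names are assumptions):

```yaml
# combined.conf.j2 iterates inside jinja instead of with_items:
#   {% for host in app_hosts %}
#   server {{ host.name }} {{ host.address }}:{{ host.port }}
#   {% endfor %}
# then a single template task renders everything in one pass:
- template:
    src: combined.conf.j2
    dest: /etc/app/combined.conf
```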

since loops just execute the same task once per item, any improvement in task execution speed should translate into faster loops. This happens to only affect 'remote tasks', so anything local will not see the gains.

Agreed. I'm using ansible to run only local tasks. In particular, to build a dozen or so docker images. At the moment, ansible builds them serially, so it takes a lot of time and underutilises the multi-core CPU. I would like to build multiple docker images in parallel.

@gjcarneiro then don't define them as data, define them as hosts and target them, then delegate_to: localhost to execute the actions in parallel

Hah, thanks for the neat trick :) But still, even if it works (I haven't tested), it is a rather convoluted way of running tasks in parallel.

Then again, I may be using ansible for completely different purpose than it was intended, so in a way it's my own fault :(

not really convoluted, it is how Ansible is meant to do parallelization: by host, not by variable.
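A sketch of that hosts-not-variables pattern for the docker-image build case (image names and the build command are made up; gather_facts must stay off so Ansible never tries to reach the fake hosts):

```yaml
# inventory: one 'host' per image to build
# [images]
# webapp
# worker
# scheduler

- hosts: images
  gather_facts: false
  tasks:
    - name: build each image on the controller, parallel across 'hosts'
      command: docker build -t {{ inventory_hostname }} ./{{ inventory_hostname }}
      delegate_to: localhost
```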

Yes, I understand, it's not Ansible's fault, it makes sense. But I'm using Ansible as build system (instead of e.g. make), because Ansible is nice as build system in most ways. But, in my frame of mind, thinking as a build system, "hosts" don't make sense. A build system like "make" doesn't care about "hosts", it only cares about files and tasks. I am forcing Ansible to be used as build system, and that causes a bit of cognitive dissonance, that's all.

Ansible only cares about Hosts and Tasks, consider the images you are building 'hosts' and suddenly it fits both paradigms.

Ansible is a configuration management tool for many other things: network
devices, both real and virtual, and a huge amount of cloud services such as
databases, web services like Elastic Beanstalk and Lambda, and all the
components that apply to them, like IAM security components. While Ansible is
good at hosts, if you're still running mostly VMs/hosts you're basically in
early-2000s IT. Not offending anyone here; there are sometimes important
reasons for running VMs or even docker containers, but they all stem back to
legacy reasons. In fact, hosts are going to become less and less of
what it automates. IMO if we don't get parallel with_items we might as
well scrap ansible altogether.

Having said that, I'm going to think positive here and try using
delegate_to for some cloud services. I never tried executing on 200+
cloud components this way; I guess just query the list
and dump it to a hosts file in hosts format with ansible, then try
delegate_to: localhost. I'll feed back my results here. If it does work, at
least we can do a documentation pull request on how to work around
with_items loop serial issues this way. We can make sure to link to it
from the cloud modules sections and the sections for docker.


@gjcarneiro then don't define them as data, define them as hosts and target them, then delegate_to: localhost to execute the actions in parallel

This is a very nice approach, but it doesn't seem to work inside the workaround for rolling restarts with serial=1 simulation (#12170). So an option for parallelization would add a lot more flexibility.

no doubt, but it also adds a huge layer of complexity and the need to deal with concurrent actions on a single host ala hosts:all + lineinfile + delegate_to: localhost

Hrrm, so far I have created a small test using delegate_to: 127.0.0.1 for
deletion tasks, as these are also a pain at mass scale.

My playbook looks like this:

- hosts: "{{ DeploymentGroup }}"
  tasks:
    - name: remove vm and all associated resources
      azure_rm_virtualmachine:
        resource_group: "{{ host_vars[item]['resource_group'] }}"
        name: "{{ inventory_hostname }}"
        state: absent
      delegate_to: 127.0.0.1


Unfortunately it still tries to connect to the machines listed in hosts to
execute the azure task azure_rm_virtualmachine.
Am I doing this correctly? It seems I'm missing something, but I did try this
previously in many different ways, so I just want to know whether you are able
to do this.

Does this actually even work? Hopefully this is just some syntax issue.


Okay, so disabling fact gathering fixes this issue; however, it causes
another one: host_vars no longer contains the azure dynamic inventory from
standard in.

So resource_group: "{{ host_vars[item]['resource_group'] }}" doesn't
work in the above and needs to be hard-coded to a resource group name.


Okay, so I have modified the playbook below to try a number of things.

  • 1st I tried setting delegate_facts: True in case this helped, but even based on the documentation I didn't really expect that to work.
  • 2nd I set gather_facts: no and tried running setup to reduce the fact gathering to nothing, hoping it would opt to not connect at all, but as expected it still tried to connect to the machine.
  • 3rd I tried setting connection: localhost, but strangely it still wants to connect remotely to the machine to gather the facts even though it knows it will execute the play locally. A bit annoying, but I get the logic: how else will it know the details of the host in question without doing this.

I can probably use the playbook to turn the machines on first and then let
ansible login to them and gather the unneeded facts. This would be so that
I can get host_vars to work and then delete the machines. I'd like to know
if anyone has a better solution here as that's also a huge time consuming
effort when I've got a hundred or more machines and I have to power them
all up just to then delete them.

So far I'm seeing using this as a solution instead of a with_items parallel
solution as having potential but the machines in question still need to be
up and reachable if you need any kind of facts from azure_rm.py while you
do this so there is at least one caveat there. That is unless someone knows
how to get access to host_vars from azure that are passed via standard in
when gather_facts: no

Actually I of course have the same problem when I run all this using a
with_items list, however I was hoping to avoid that work around if I'm
going to use hosts again. The work around is dumping the azure_rm.py to a
json file on the command line and then loading into a variable to get
access to them again.

If I look forward to my end goal here to modify hundreds or even thousands
of serverless components in parallel, perhaps this will be okay as I can
use things like azure_rm_functionapp_facts
http://docs.ansible.com/ansible/latest/azure_rm_functionapp_facts_module.html
to
gather facts about them and use them in the play in theory although this
has yet to be tested.

I still don't have any great logic on how to do this properly to create a
documentation pull request about it, as the method so far seems largely
dependent on what you're doing, and I'm not sure I want to suggest the
json dump hack in the documentation.

I'll wait for some feedback from anyone who happens to care about this on
this issue list to decide my next step. Meanwhile I'll use my hack to get
my immediate work done.


- hosts: "{{ DeploymentGroup }}"
  gather_facts: no
  tasks:
    - setup:
        gather_subset: "!all,!min"

    - name: remove vm and all associated resources
      azure_rm_virtualmachine:
        resource_group: "{{ host_vars[inventory_hostname]['resource_group'] }}"
        name: "{{ inventory_hostname }}"
        state: absent
      delegate_to: localhost
      delegate_facts: True


I have a use case for forks too, which would make this a lot easier. The playbook is deploying a bunch of openstack instances via terraform with randomly picked floating ips. Then I iterate over the ips to check that port 22 is open on each created host. Current method to do this is with a multiplay playbook:

- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
...
  - name: Run terraform
    terraform:
      plan_file: '{{tf_plan | default(omit)}}'
      project_path: '{{terraform_path}}/{{infra}}'
      state: '{{state}}'
      state_file: '{{stat_tfstate.stat.exists | ternary(stat_tfstate.stat.path, omit)}}'
      variables: '{{terraform_vars | default(omit)}}'
    register: tf_output

  - name: Add new hosts
    with_items: '{{tf_output.outputs.ip.value}}' # configured this in terraform to output a list of assigned ips.
    add_host:
      name: '{{item}}'
      groups: new_hosts

- hosts: new_hosts
  gather_facts: no
  connection: local
  tasks:
   - name: Wait for port 22 to become available
     wait_for:
       host: '{{ansible_host}}'
       port: 22
       state: started
       timeout: 60

This is run with: ansible-playbook -i localhost, deploy-test-clients.yml --extra-vars="infra=terraform_os_instances state=present"
This is of course a limited workaround since you don't always have a neatly-inventory-parseable list of ips to work with.

Since a lot of people seem to be struggling with the performance of templating files locally, maybe a specific template_local module could be created to solve this specific issue instead. At least it'd be a start... I'd have a go myself but won't have time for the foreseeable future.

30+ minutes to template 100 files that can be done in 5s with jinja is ridiculous.

@saplla templating always happens locally, the only thing that happens remotely is copying the template and setting permissions.

Just to clarify, I'm talking about those users who want to template files as local tasks, e.g. to feed into other build systems, or in my case, to deploy k8s resources using kubectl.

What I mean is to offload the looping and templating to jinja via a module that is a simple wrapper. The module could take some context and the loop definition (what would normally be put into with_nested and friends) and just cut out ansible entirely for this task (perhaps the wrapper could run jinja in parallel if it speeds things up).

It could be invoked like this:

    template_parallel:
      src: "{{ item[0] }}"
      dest: "{{ tempdir }}/{{ item[1] }}-{{ item[0] | basename }}"
      context: "{{ hostvars[inventory_hostname] }}"
      nested:
      - "{{ templates.stdout_lines }}"
      - "{{ namespaces.stdout_lines }}"

The above example takes all variables defined by ansible as the context, but any dict could be passed in.

As I say, I haven't got time to work on this right now, but does the approach sound feasible @bcoca ?

That assumes that each item is independent, that is not always the case, you can make the current item values depend on the previous ones and/or results of previous iterations, or they can just be cumulative.

Most of the time spent templating has to do with the vars, not the templates themselves, since they need to be consistent, you would not gain much in parallelization unless you are willing to change behaviours that would break current assumptions.

Also, templates are already parallel, by host, just not by item.

OK thanks for the thoughts. It'd actually be good enough for my use case and it sounds like it might be for some other people in this thread too. I'm just using ansible to load hierarchical configs and template files locally before invoking some binary that deploys them (kubectl, helm, etc). I'd be happy with a dead-simple, lightweight templating module if it was so performant it reduced templating times from minutes to seconds.

I'll try to look at this when it becomes an issue for us, unless someone beats me to it.

I originally filed #10695 but seeing that this is going to take a while to come together I ended up addressing these use cases with shell scripts (eg. just say I have to do something on 50 Git repos on a single host, I use Ansible to run a single script once that does the thing 50 times). Unfortunately, this means giving up some of the stuff that you get for free with Ansible, like very granular change reporting, and you also have to implement all of the "run only if" logic yourself and be very careful about error handling, but it is probably two orders of magnitude faster. As such, even if we wind up getting a "parallel" option in the future, it might not be as fast as my custom scripts and I probably won't bother switching to it.

@wincent a parallel loop will probably still always be slower than a shell script/dedicated program, as Ansible does much more than just 'apply the action'.

@bcoca: yep, that confirms my understanding.

@saplla k8s_raw is better than using template for this, you can inline the yaml in your inventory if needed :) (it's not the subject of this PR)
what is the current state about this ? Can we expect something in 2.6 @bcoca ?
I'm managing thousands of postgresql privileges on my DB clusters and 25 minutes is painfully slow

@nerzhul Thanks but it's not better for us. Too much magic. We need templating.

@saplla you could always create a host target per template to parallelize templating as much as possible and then use subsequent plays or delegation to deliver to the proper actual hosts.

@bcoca a little bit hacky :)

not at all, its a LOT hacky, but works today
