Ansible: make with_ loops configurable

Created on 25 Aug 2015  ·  90 Comments  ·  Source: ansible/ansible

ISSUE TYPE

Feature Idea

COMPONENT NAME

core

ANSIBLE VERSION

2.1

CONFIGURATION
OS / ENVIRONMENT
SUMMARY
how: 
    forks: 1
    pause: 0
    squash: name
    label: "{{item.name}}"
    end: on_fail
with_items: ...
  • forks: forks within the loop to do items in parallel, default 1, this needs warnings
  • pause: between loop executions, useful in throttled api scenario _Done in 2.2_
  • squash: join all items into a list and pass it to the provided option; works like the current hardcoded opts for apt, yum, etc.; by default it should be None _abandoned: reversed opinion, we should remove this feature_
  • end: when to interrupt the loop, default is 'last item', options? on_fail, on_success (first one)?
  • label: (#13710) what to display when outputting the item loop _Done in 2.2_

docs to current state at:

http://docs.ansible.com/ansible/playbooks_loops.html#loop-control
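The parts already implemented live under loop_control; a minimal sketch of the 2.2-era syntax (the module and item list are illustrative):

```yaml
# loop_control options shipped as of Ansible 2.2: label and pause.
- name: create users
  user:
    name: "{{ item.name }}"
    state: present
  with_items: "{{ users }}"
  loop_control:
    label: "{{ item.name }}"   # display this per item instead of the full dict
    pause: 2                   # seconds to wait between item executions
```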

STEPS TO REPRODUCE
EXPECTED RESULTS
ACTUAL RESULTS
affects_2.1 affects_2.3 feature core

Most helpful comment

+1 forks

Me waiting for openstack modules to iterate through with_ loops on 100+ items...

All 90 comments

Please let's not call it how. That's even worse to read than become: true. But the functionality under it looks great.

includes fix for #10695

Excellent. In the interests of bikeshedding, maybe call it looping:.

:+1:

+1

+1 especially for within host parallelization!

:+1:

:+1:
but let's not call it "how"

so here is a workaround for breaking a loop task after the first failure

- hosts: localhost
  vars:
    myvar:
        - 1
        - 2
        - 3
        - 4
        - 5
  tasks:
    - name: break loop after 3
      debug: msg={{item}}
      failed_when: item == 3
      register: myresults
      when: not (myresults|default({}))|failed
      with_items: "{{myvar}}"

@bcoca not working on my end (ansible 1.9.3, ubuntu)

TASK: [break loop after 3] ******************
ok: [localhost] => (item=1) => {
"failed": false,
"failed_when_result": false,
"item": 1,
"msg": "1"
}
ok: [localhost] => (item=2) => {
"failed": false,
"failed_when_result": false,
"item": 2,
"msg": "2"
}
failed: [localhost] => (item=3) => {"failed": true, "failed_when_result": true, "item": 3, "verbose_always": true}
msg: 3
ok: [localhost] => (item=4) => {
"failed": false,
"failed_when_result": false,
"item": 4,
"msg": "4"
}
ok: [localhost] => (item=5) => {
"failed": false,
"failed_when_result": false,
"item": 5,
"msg": "5"
}

ah, yes, it will work as-is in 2.0; in 1.9 the registration does not occur until after the loop is done.

+1 on forks

+1
perhaps instead of "how", loop_details or options?

+1

+1, using wait_for from localhost to wait for 100 EC2 hosts to come up is killing me...

+1 for similar reason to senderista

+1

:+1:

Both squash and forks would be awesome features which would speed up Ansible execution immensely.

I would also replace how with something like loop_details, loop_settings, loop_options, or anything similar.

loop_control , already in 2.1 with the label part implemented.

squash might just go away as it is easy to just pass a list to the modules that support it:

apt: name={{listofpackages}}

and avoid the loop completely
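For modules that accept a list, the difference looks like this (the variable name is a placeholder):

```yaml
# one apt transaction per package: slow, and what squash was papering over
- apt:
    name: "{{ item }}"
  with_items: "{{ listofpackages }}"

# one apt transaction for the whole list: no loop needed
- apt:
    name: "{{ listofpackages }}"
```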

+1 forks

+1 forks

I had a use case for a new config for a conditional break: break_when

+1 forks and I hope it'll also work for parallelizing sequences of tasks to run as in:
- include: service.yml
with_items: "{{services|default([])}}"

Otherwise, there's the async/async_status already.
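The async/async_status pattern mentioned above can approximate a forked loop today: fire every item without waiting, then poll for completion. A sketch (the command, timings, and variable names are placeholders):

```yaml
- name: kick off one job per item without waiting
  command: /usr/local/bin/provision {{ item }}   # placeholder command
  async: 600          # allow each job up to 10 minutes
  poll: 0             # fire and forget
  register: async_jobs
  with_items: "{{ services | default([]) }}"

- name: collect the results
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: job
  until: job.finished
  retries: 60
  delay: 10
  with_items: "{{ async_jobs.results }}"
```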

+1 forks

Me waiting for openstack modules to iterate through with_ loops on 100+ items...

+1 on forks. Could use this for copying AMIs to all the AWS regions.

+1 on forks. Please! make it part of loop_control

+1 on forks

+1 on forks

+1, need forks too :-)

+1 on forks

+1 on forks

Forks would be awesome, +1

I always sit in silence not wanting to contribute to the spam, because it's hard to gauge between projects whether it's welcomed or not, but it looks like the fork +1 train has left the station!

+1 on forks

+1 on forks

+1 on forks

:+1: on forks

@bcoca Greetings! Thanks for taking the time to open this issue. In order for the community to handle your issue effectively, we need a bit more information.

Here are the items we could not find in your description:

  • issue type
  • ansible version
  • component name

Please set the description of this issue with this template:
https://raw.githubusercontent.com/ansible/ansible/devel/.github/ISSUE_TEMPLATE.md


+1 on forks

+1 on forks!

+1 on Forks!

+1 on forks!

+1 on forks!

Any update on fork? When is it supposed to be ready?
My use case is instantiating about 20 containers on a server with with_sequence. Right now it takes ages :(
I would be glad to help, but I would need some tips on where to start

@bitliner no one has created a PR for it, if that is what you are asking; it's actually very hard to do correctly.

as for your issue, just declare X hosts in inventory and loop over hosts: instead of with_ to create them in parallel.

inventory:

[containers]
container[000:020]

playbook:

- hosts: containers
  gather_facts: false
  tasks:
    - container: state=present name={{inventory_hostname}}

i.e container is a 'made up' module.

@bcoca your solution is not clear to me. To be sure, is this what you mean?

hosts file

[containers]
192.168.1.100
192.168.1.100
192.168.1.100
192.168.1.100
... and so on based on the degree of parallelism I want to get ...

main.yml file

hosts: containers
gather_facts: false
tasks:
   - container: state=present name={{inventory_hostname}}

based on a container module that I should implement, correct? In this case, I would have all containers with the same name, and that is not acceptable, correct?

Furthermore, what are the challenges to implementing the loop in parallel correctly?

My use case needs to speed up this task:

- name: "Start clients"
  docker_container:
    name: "mycontainer-{{ item }}"
    image: myimage
    links: 
      - server-{{item}}:server-{{item}}
  with_sequence: count={{ scale }}

I can't use docker-compose scale because I need to route traffic among containers in a specific way (that is why I use with_sequence to generate different docker container names).

I could build a module that takes the declaration of a container and, based on a scale parameter, instantiates multiple remote containers in parallel. Does it make sense? Can you help me understand how to call/re-use the docker_container module inside my module, and what APIs Ansible offers to run something in parallel?

@bitliner you did not follow my instructions: I had unique names in inventory (using a range to work just like a sequence). Since names are unique in inventory and you declared the same address N times, you still have one host. That also sidesteps your 2nd issue of duplicate container names, as you only loop over 1 host.

In any case, if you want to follow up with your issue, use ML or IRC as it would be a bit off topic for this ticket.

@bcoca How can I have one host if you declared 20 hosts?

[containers]
container[000:020]

is going to connect to container001, container002, etc.

It works for having unique names, but what is not clear to me is why you say I still have one host (instead of 20).
hosts:containers means to me 20 hosts, not just one. What am I ignoring in this solution?

@bitliner cause this:

[containers]
192.168.1.100
192.168.1.100
192.168.1.100
192.168.1.100

is not 4 hosts, but 1

also at this point this is pretty much off topic, if you want to continue getting help on this go to IRC or ML

I've solved using

[containers]
ip[000:020] ansible_host=192.168.1.100

and

- name: "Start containers"
  docker_container:
    name: "my-container-{{ inventory_hostname }}"

A question: imagining we add a forks statement, would the changes consist of rewriting the run_loop method so that it manages the level of parallelism and the asynchrony?

and then it starts to get 'fun' ...:

  • does the loop fork count against the global --forks, make that per remote?
  • what to do with loops when items depend on previous items (not just task execution, but conditionals, etc)?
  • how to handle concurrency issues when multiple forks execute on same host? i.e they update same file
  • how do we handle cleanup? right now they can reuse tmp dirs .. but each execution cleans after itself, now this can cause issues.

And there are a few other issues that I know of .. and surely plenty more that I won't be aware of until someone tries to implement it. I have solutions for a few, but it starts getting out of hand pretty quickly.

@bcoca Loop forking should not be enabled by default. I would prefer to see the default set to 1 and introduce it as a parameter forks or serial, but include a warning. It will likely break some existing code. That having been said, I am very much looking forward to this feature (most especially for tasks which require delegate_to)

+1 for forks (false by default)

+1 forks

+1 forks

+1 forks

+1 forks

+1 forks

+1 for forks as well, however in the meantime there's also a new Ansible strategy plugin that gives a big performance increase in general, and also for with_items loops. Perhaps for those wanting forks for performance reasons it's worth looking at:

https://github.com/dw/mitogen
https://mitogen.readthedocs.io/en/latest/ansible.html

I can’t see how this will improve with_items loops exactly. This plugin
improves performance issues caused by using ssh as a connection method.
Especially over long distances and latent networks and with large numbers
of servers.

This doesn’t help with AWS or Azure cloud functions where the execution
happens on the ansible controller and just executes on a set of items in
that cloud system and doesn’t connect to hosts at all, which is the
primary issue with with_items being slow. It has nothing to with large set
of machines or latency or anything related to ssh. It’s simply the fact
that it executes cloud functions in a with_items loop in serial and nothing
can speed that up except the cloud provider improving its speed or a
parallel execution of those cloud functions by ansible.

It also doesn’t mention with_items in the article at all so I can’t see how
this will help even in the tiniest little bit. Can you explain a bit more
how this could help? Id like to know what I’m missing if I am missing
something here.


Indeed, it won't help in all cases. However the reason I'm looking for forks in with_items is because of the slowness with processing each item individually (even with pipelining). Sometimes I have to create a large number (several hundred) of directories based on host_vars, or template a few hundred files. So I'm looping over the file and template module mostly.

I once tested templating 100 files into 100 separate files through with_items vs looping over the items in the jinja template itself and merging everything into a single large file. The single file takes 5 seconds, but creating 100 separate config files takes 30 minutes.

The plugin I mentioned gave such a big improvement for me I thought it was worth mentioning it here.
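The single-file comparison above amounts to moving the loop out of the play and into jinja; roughly (template, variable, and path names are assumptions):

```yaml
# combined.conf.j2 iterates inside jinja instead of with_items:
#   {% for host in app_hosts %}
#   server {{ host.name }} {{ host.address }}:{{ host.port }}
#   {% endfor %}
# then a single template task renders everything in one pass:
- template:
    src: combined.conf.j2
    dest: /etc/app/combined.conf
```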

since loops just execute the same task once per item, any improvement in task execution speed should translate into faster loops. This happens to only affect 'remote tasks', so anything local will not see the gains.

Agreed. I'm using ansible to run only local tasks. In particular, to build a dozen or so docker images. At the moment, ansible builds them serially, so it takes a lot of time and underutilises the multi-core CPU. I would like to build multiple docker images in parallel.

@gjcarneiro then don't define them as data, define them as hosts and target them, then delegate_to: localhost to execute the actions in parallel

Hah, thanks for the neat trick :) But still, even if it works (I haven't tested), it is a rather convoluted way of running tasks in parallel.

Then again, I may be using ansible for completely different purpose than it was intended, so in a way it's my own fault :(

not really convoluted, it is how Ansible is meant to do parallelization: by host, not by variable.
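A sketch of that hosts-not-variables pattern for the docker-image build case (image names and the build command are made up; gather_facts must stay off so Ansible never tries to reach the fake hosts):

```yaml
# inventory: one 'host' per image to build
# [images]
# webapp
# worker
# scheduler

- hosts: images
  gather_facts: false
  tasks:
    - name: build each image on the controller, parallel across 'hosts'
      command: docker build -t {{ inventory_hostname }} ./{{ inventory_hostname }}
      delegate_to: localhost
```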

Yes, I understand, it's not Ansible's fault, it makes sense. But I'm using Ansible as build system (instead of e.g. make), because Ansible is nice as build system in most ways. But, in my frame of mind, thinking as a build system, "hosts" don't make sense. A build system like "make" doesn't care about "hosts", it only cares about files and tasks. I am forcing Ansible to be used as build system, and that causes a bit of cognitive dissonance, that's all.

Ansible only cares about Hosts and Tasks, consider the images you are building 'hosts' and suddenly it fits both paradigms.

Ansible is a configuration management tool for many other things: network
devices, both real and virtual, and a huge amount of cloud services such as
databases, web services like Elastic Beanstalk and Lambda, and all the
components that apply to them, like IAM security components. While Ansible is
good at hosts, if you're still running mostly VMs/hosts you're basically in
early-2000s IT. Not offending anyone here; there are sometimes important
reasons for running VMs or even docker containers, but they all stem back to
legacy reasons. In fact, hosts are going to become less and less of
what it automates. IMO if we don't get parallel with_items we might as
well scrap ansible altogether.

Having said that, I'm going to think positive here and try using
delegate_to for some cloud services. I never tried executing on 200+
cloud components this way; I guess just query the list
and dump it to a hosts file in hosts format with ansible, then try
delegate_to: localhost. I'll feed back my results here. If it does work, at
least we can do a documentation pull request on how to work around
with_items loop serial issues this way. We can make sure to link to it
from the cloud modules sections and the sections for docker.


@gjcarneiro then don't define them as data, define them as hosts and target them, then delegate_to: localhost to execute the actions in parallel

This is a very nice approach, but it doesn't seem to work inside the workaround for rolling restarts with serial=1 simulation (#12170). So an option for parallelization would add a lot more flexibility.

no doubt, but it also adds a huge layer of complexity and the need to deal with concurrent actions on a single host ala hosts:all + lineinfile + delegate_to: localhost

Hrrm, so far I have created a small test using delegate_to: 127.0.0.1 for
deletion tasks, as these are also a pain at mass scale.

My playbook looks like this:

- hosts: "{{ DeploymentGroup }}"
  tasks:
    - name: remove vm and all associated resources
      azure_rm_virtualmachine:
        resource_group: "{{ host_vars[item]['resource_group'] }}"
        name: "{{ inventory_hostname }}"
        state: absent
      delegate_to: 127.0.0.1


Unfortunately it still tries to connect to the machines listed in hosts to
execute the azure task azure_rm_virtualmachine.
Am I doing this correctly? It seems I'm missing something, but I did try this
previously in many different ways, so I just want to know whether you are able
to do this.

Does this actually even work? Hopefully this is just some syntax issue.


Okay, so disabling fact gathering fixes this issue; however, it causes
another one: host_vars no longer contains the azure dynamic inventory from
standard in.

So resource_group: "{{ host_vars[item]['resource_group'] }}" doesn't
work in the above and needs to be hard-coded to a resource group name.


Okay, so I have modified the playbook below to try a number of things.

  • 1st I tried setting delegate_facts: True in case this helped, but even based on the documentation I didn't really expect that to work.
  • 2nd I set gather_facts: no and tried running setup to reduce the fact gathering to nothing, hoping it would opt to not connect at all, but as expected it still tried to connect to the machine.
  • 3rd I tried setting connection: localhost, but strangely it still wants to connect remotely to the machine to gather the facts even though it knows it will execute the play locally. A bit annoying, but I get the logic: how else will it know the details of the host in question without doing this.

I can probably use the playbook to turn the machines on first and then let
ansible login to them and gather the unneeded facts. This would be so that
I can get host_vars to work and then delete the machines. I'd like to know
if anyone has a better solution here as that's also a huge time consuming
effort when I've got a hundred or more machines and I have to power them
all up just to then delete them.

So far I'm seeing using this as a solution instead of a with_items parallel
solution as having potential but the machines in question still need to be
up and reachable if you need any kind of facts from azure_rm.py while you
do this so there is at least one caveat there. That is unless someone knows
how to get access to host_vars from azure that are passed via standard in
when gather_facts: no

Actually I of course have the same problem when I run all this using a
with_items list, however I was hoping to avoid that work around if I'm
going to use hosts again. The work around is dumping the azure_rm.py to a
json file on the command line and then loading into a variable to get
access to them again.

If I look forward to my end goal here to modify hundreds or even thousands
of serverless components in parallel, perhaps this will be okay as I can
use things like azure_rm_functionapp_facts
http://docs.ansible.com/ansible/latest/azure_rm_functionapp_facts_module.html
to
gather facts about them and use them in the play in theory although this
has yet to be tested.

I still don't have any great logic on how to do this properly to create a
documentation pull request about it, as the method so far seems largely
dependent on what you're doing, and I'm not sure I want to suggest the
json dump hack in the documentation.

I'll wait for some feedback from anyone who happens to care about this on
this issue list to decide my next step. Meanwhile I'll use my hack to get
my immediate work done.


- hosts: "{{ DeploymentGroup }}"
  gather_facts: no
  tasks:
    - setup:
        gather_subset: "!all,!min"

    - name: remove vm and all associated resources
      azure_rm_virtualmachine:
        resource_group: "{{ host_vars[inventory_hostname]['resource_group'] }}"
        name: "{{ inventory_hostname }}"
        state: absent
      delegate_to: localhost
      delegate_facts: True


I have a use case for forks too, which would make this a lot easier. The playbook is deploying a bunch of openstack instances via terraform with randomly picked floating ips. Then I iterate over the ips to check that port 22 is open on each created host. Current method to do this is with a multiplay playbook:

- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
...
  - name: Run terraform
    terraform:
      plan_file: '{{tf_plan | default(omit)}}'
      project_path: '{{terraform_path}}/{{infra}}'
      state: '{{state}}'
      state_file: '{{stat_tfstate.stat.exists | ternary(stat_tfstate.stat.path, omit)}}'
      variables: '{{terraform_vars | default(omit)}}'
    register: tf_output

  - name: Add new hosts
    with_items: '{{tf_output.outputs.ip.value}}' # configured this in terraform to output a list of assigned ips.
    add_host:
      name: '{{item}}'
      groups: new_hosts

- hosts: new_hosts
  gather_facts: no
  connection: local
  tasks:
   - name: Wait for port 22 to become available
     wait_for:
       host: '{{ansible_host}}'
       port: 22
       state: started
       timeout: 60

This is run with: ansible-playbook -i localhost, deploy-test-clients.yml --extra-vars="infra=terraform_os_instances state=present"
This is of course a limited workaround since you don't always have a neatly-inventory-parseable list of ips to work with.

Since a lot of people seem to be struggling with the performance of templating files locally, maybe a specific template_local module could be created to solve this specific issue instead. At least it'd be a start... I'd have a go myself but won't have time for the foreseeable future.

30+ minutes to template 100 files that can be done in 5s with jinja is ridiculous.

@saplla templating always happens locally, the only thing that happens remotely is copying the template and setting permissions.

Just to clarify, I'm talking about those users who want to template files as local tasks, e.g. to feed into other build systems, or in my case, to deploy k8s resources using kubectl.

What I mean is to offload the looping and templating to jinja via a module that is a simple wrapper. The module could take some context and the loop definition (what would normally be put into with_nested and friends) and just cut out ansible entirely for this task (perhaps the wrapper could run jinja in parallel if it speeds things up).

It could be invoked like this:

    template_parallel:
      src: "{{ item[0] }}"
      dest: "{{ tempdir }}/{{ item[1] }}-{{ item[0] | basename }}"
      context: "{{ hostvars[inventory_hostname] }}"
      nested:
      - "{{ templates.stdout_lines }}"
      - "{{ namespaces.stdout_lines }}"

The above example takes all variables defined by ansible as the context, but any dict could be passed in.

As I say, I haven't got time to work on this right now, but does the approach sound feasible @bcoca ?

That assumes that each item is independent, that is not always the case, you can make the current item values depend on the previous ones and/or results of previous iterations, or they can just be cumulative.

Most of the time spent templating has to do with the vars, not the templates themselves, since they need to be consistent, you would not gain much in parallelization unless you are willing to change behaviours that would break current assumptions.

Also, templates are already parallel, by host, just not by item.

OK thanks for the thoughts. It'd actually be good enough for my use case and it sounds like it might be for some other people in this thread too. I'm just using ansible to load hierarchical configs and template files locally before invoking some binary that deploys them (kubectl, helm, etc). I'd be happy with a dead-simple, lightweight templating module if it was so performant it reduced templating times from minutes to seconds.

I'll try to look at this when it becomes an issue for us, unless someone beats me to it.

I originally filed #10695 but seeing that this is going to take a while to come together I ended up addressing these use cases with shell scripts (eg. just say I have to do something on 50 Git repos on a single host, I use Ansible to run a single script once that does the thing 50 times). Unfortunately, this means giving up some of the stuff that you get for free with Ansible, like very granular change reporting, and you also have to implement all of the "run only if" logic yourself and be very careful about error handling, but it is probably two orders of magnitude faster. As such, even if we wind up getting a "parallel" option in the future, it might not be as fast as my custom scripts and I probably won't bother switching to it.

@wincent a parallel loop will probably still always be slower than a shell script/dedicated program, as Ansible does much more than just 'apply the action'.

@bcoca: yep, that confirms my understanding.

@saplla k8s_raw is better than using template for this, you can inline the yaml in your inventory if needed :) (it's not the subject of this PR)
what is the current state about this ? Can we expect something in 2.6 @bcoca ?
I'm managing thousands of postgresql privileges on my DB clusters and 25 minutes is painfully slow

@nerzhul Thanks but it's not better for us. Too much magic. We need templating.

@saplla you could always create a host target per template to parallelize templating as much as possible and then use subsequent plays or delegation to deliver to the proper actual hosts.

@bcoca a little bit hacky :)

not at all, its a LOT hacky, but works today
