Ansible: separate group_vars being overwritten when deploying to same host

Created on 19 Sep 2014  ·  71 Comments  ·  Source: ansible/ansible

Issue Type:

Bug Report

Ansible Version:

ansible 1.6.1

Environment:

Mac OSX

Summary:

If we have a hosts file:

[service1]
www.host1.com

[service2]
www.host1.com

and we have these group_vars:

group_vars/service1:
    database_host: "www.database.com"
    database_port:  3306

group_vars/service2:
    database_host: "www.differentdatabase.com"
    database_port:  3306

and we are running this playbook:

- hosts: service1
  remote_user: root

the variables from group_vars/service1 and group_vars/service2 overwrite each other when we're deploying to the same server. This means service1 will get service2's group variables and end up with the incorrect database host and port.

To get around this, we've added DNS entries (aliases to www.host1-service.com), so our hosts file looks like:

[service1]
www.host1-service1.com

[service2]
www.host1-service2.com

but this is highly error-prone and not ideal. What are some different methods of getting around this issue (or is this a misunderstanding of group_vars)?

The way I'm doing multi environmental deployments is like this:

inventory/stage/hosts
inventory/stage/group_vars/
inventory/stage/group_vars/service1
inventory/stage/group_vars/service2

inventory/live/hosts
inventory/live/group_vars/
inventory/live/group_vars/service1
inventory/live/group_vars/service2
Steps To Reproduce:

Summary describes how to reproduce this.

Expected Results:

group_vars should not be overwritten by another group when sharing the same host

Actual Results:

group_vars are overwritten by another group when sharing the same host

bug

Most helpful comment

This is so stupid. I just spent many hours trying to debug a variable issue, only to discover that it was because of some stupid design philosophy (this). At the very least, there should be a warning when multiple group_vars files are loaded for the same host.

All 71 comments

I'm also having this issue; it seems like the vars get parsed and attached to the host rather than staying with the group.

Steps To Reproduce:

Hosts file:

[aaa]
host1
host2

[bbb]
host2
host3

[aaa:vars]
foo=aaa

[bbb:vars]
foo=bbb

Run a command on each group:

$ ansible -i hosts aaa -m shell -a "echo {{ foo }}"
host1 | success | rc=0 >>
aaa

host2 | success | rc=0 >>
bbb

$ ansible -i hosts bbb -m shell -a "echo {{ foo }}"
host2 | success | rc=0 >>
bbb

host3 | success | rc=0 >>
bbb

Expected results:

The variable foo=aaa is assigned to the group aaa and should be output when running commands against group aaa

Actual results:

host2 outputs the variable from group bbb

Have the same problem.

Example:

./group_vars/group1:

---
name: group1

./group_vars/group2

---
name: group2-default

hosts file:

[group1]
host1

[group2]
host1

So, when I'm trying to run the playbook for group1 only, I get this result:
the "name" variable has the value "group2-default", but it's expected to have the value "group1".

Is there any workaround for this?

Inventory vars get merged even if you are not using that group; the last group merged wins.

Brian Coca

@bcoca Why does it work that way? Is this expected behaviour, or an actual bug?

You could have vars/service(1,2,3,...) files

And pass service=1 as extra vars

Then, for the hosts you're targeting in the playbook, use the service variable to decide which vars file you are importing.

We do a similar thing for importing vars based on an environment extra var

I have same issue on 1.8.4 today.
Eventually, is this a correct behavior or not?
Does anyone know the progress?

Hmm. Seems like the 'bug_report' tag is wrong here. This is not a bug but... a feature :-)
So yes, it is correct behaviour.

As @hashnz said, "the vars get parsed and attached to the host rather than the staying with the group" which is totally correct.

Variables in different groups that share the same hosts only make sense when those groups are in a parent-child relation, the child winning and its vars being applied. If the groups are at the same level, there is no deterministic way to know which one will win.

At playbook time, targeting a group only determines which hosts are in the run; it doesn't change anything at the variable level.

The Ansible way here would be to set the database host and port in a list, and then iterate over it in tasks.
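A sketch of that suggestion, with hypothetical variable and file names (the `services` list below is not from this thread, just an illustration): keep one list of per-service settings on the host and loop over it, so nothing depends on group precedence.

```yaml
# host_vars/www.host1.com (hypothetical file) -- all services for this host
services:
  - name: service1
    database_host: www.database.com
    database_port: 3306
  - name: service2
    database_host: www.differentdatabase.com
    database_port: 3306
```

A task can then loop with `with_items: services` and template `item.database_host` / `item.database_port` into each service's config, regardless of how many groups the host happens to be in.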

@bcoca @jimi-c I think this issue can be closed.

@srvg Thanks a lot! I got it. I'll try without using group_vars.

Closing, as this is not a bug but by design: group vars get flattened per host and do not vary based on how you select the host.

Anyone got a workaround for this?

we can't use static vars in the playbook, as we need to differentiate between different environments (handled by different inventory files) and need to run with different variables depending on the group being run, even though they run on the same host as another group

Use a CNAME for the same hostname or an IP address. This is the only workaround I found. The problem appears when you are using Ansible Tower, because you're wasting a license with such CNAMEs. Ansible does not want to fix this bug, calling it a "feature", but in fact it's a real bug that must be fixed.


It has been fixed but rejected. https://github.com/ansible/ansible/pull/6666

The annoying thing is that the ansible guide does recommend to group them by roles

http://docs.ansible.com/playbooks_best_practices.html

It is suggested that you define groups based on purpose of the host (roles) and also geography or datacenter location (if applicable)

But if the same host has multiple roles with the same variables you can't use the group variables since you don't know which one will be used.

I still don't understand how to bind a variable to a specific host for a specific playbook. I want the ability to use the same template, but with different configuration values depending on the environment I'm deploying to.

We're already following http://docs.ansible.com/ansible/playbooks_best_practices.html#staging-vs-production, by making different inventories for our different environments

Why does this seem so difficult with Ansible? Is Ansible just bad at doing environment based deployments?

Please provide a full example, because "The Ansible way here would be to set the database host and port in a list, and then iterate over it in tasks" does not make any sense to me. Where is the environment specified here?

For example, I have an authentication service that has a variable called authentication_db_host: 127.0.0.1 for stage, and authentication_db_host: 192.168.2.4 for live. Do you want me to have 2 variables?

authentication_db_host_stage: 127.0.0.1
authentication_db_host_live: 192.168.2.4

Now it's difficult for me to feed that to a common template file. I need to add if/else checks throughout my j2 file (we went down that path, but it was too much duplication and we got rid of it).

I have not found any documentation that handles this case, and it seems like a thing that should not be difficult to do.

Here's how we do it:

AWS instance tagged as:

someapi: prod

playbook someapi.yml

- hosts: tag_someapi_{{ env }}
  gather_facts: true
  user: ec2-user
  vars_files:
    - vars/aws_{{ env }}
  tasks:
    - fail: msg="Env must be defined"
      when: env is not defined
  roles:
    - someapi
  ...

Execution with extra vars

ansible-playbook -vv someapi.yml -e "env=prod"

roles/someapi/tasks/main.yml

- name: Include vars specific only to someapi.
  include_vars: someapi_{{ env }}.yml
- name: Place config
  template: src=config.j2 dest={{someapi_install_directory}}/configuration.properties
  ...

roles/someapi/vars/someapi_prod.yml

authentication_db_host: 192.168.2.4

roles/someapi/vars/someapi_stag.yml

authentication_db_host: 127.0.0.1
...

roles/someapi/templates/config.j2

authentication_db_host = {{ authentication_db_host }}
...

HTH,

Iain Wright


Variables in different groups that share the same hosts only make sense when those groups are in a parent-child relation, the child winning and its vars being applied. If the groups are at the same level, there is no deterministic way to know which one will win.

Can Ansible detect such cases and make a warning with a possible solution hint?

If I understand correctly how it works -- for future readers like myself who struggle to understand:

  1. Host groups in inventory files define groups of hosts. Group variables define what variables should be set for the hosts in this group.
  2. When I specify a group to be used to play a playbook, this means to take all the hosts which belong to that group.
  3. Each play is played separately for each host -- there is no context which group is being deployed. Groups are used to group hosts and group vars -- to set common variables to those hosts.
  4. So if a host is mentioned in several groups and these groups have the same variable with different values: because the host belongs to the first group, the variable is set to that group's value; then, because the host also belongs to another group, the same variable is set to a different value, shadowing the previous one.

The main problem is that Ansible merges variables even from groups I don't want to play right now. I don't need variables from groups I'm not playing at this time. Maybe a partial solution would be to consider variables only from the groups scheduled to run the playbook.


This seems very counter-intuitive.

I have an inventory which just has my Vagrant box in each group for testing purposes.

But even when I run a playbook which just references a single group, Ansible is pulling in variables from the other groups too which is breaking my testing.

On 28 April 2016 at 17:55, Philip Wigg wrote:

This seems very counter-intuitive.

That is because the mental model you use to look at inventory is not the correct one.

But even when I run a playbook which just references a single group, Ansible is pulling in variables from the other groups too, which is breaking my testing.

Do not look at the inventory as tightly coupled with playbooks. Inventory is pretty much a separate thing, where:

  • hosts can be members of several groups, and as such be targeted from ansible (ad-hoc) or ansible-playbook (playbooks) by pointing to one of the groups that host is a member of.
  • resolving variables is done inventory-wide; whether you target a specific group or the almighty built-in 'all' group doesn't change the fact that variables are inherited, resolved and calculated by looking at all group definitions and all group variables that exist in the inventory.

HTH,

Serge

I'm bumping into this too. In my case I have a role which installs vhosts. I set the details for the multiple vhosts as an array in my inventory for the server groups, in other words each group can have multiple vhosts. This works well, I define the vhosts for each group and everything seems neat and tidy. The problem comes up when the host is in more than one group, in that case the vhost array in the inventory is overwritten and the last group wins, so only half my vhosts end up being provisioned.

An alternative would be to put the vhosts in the hosts instead of in the group, but that doesn't seem clean because I'll have to duplicate for all servers which are in the same group.

Can anybody point me in the direction of a better way to do this or to work around it? I feel like I may be abusing the concept of roles or inventory by having my "vhosts" role accept an array of vhosts (a vhost is not exactly a "role" in the sense that you don't say to a server "you are a vhost"). Can anyone offer a better way to do it or share their thoughts?

By the way, @srvg, thank you for taking the time to respond to me - it was much appreciated and has clarified my understanding.

My solution is.

[service1]
host1 ansible_ssh_host=host1.com

[service2]
host2 ansible_ssh_host=host1.com

This is so stupid. I just spent many hours trying to debug a variable issue, only to discover that it was because of some stupid design philosophy (this). At the very least, there should be a warning when multiple group_vars files are loaded for the same host.

It is very annoying to have group_vars loaded randomly. It makes it impossible to have inheritance in a case such as all > datacenter > environment (integ, devel, prod) > application > host.

It's very easy to do with Puppet's Hiera, but with Ansible it's a pain, and when you have > 100 playbooks/roles to maintain, loading variables explicitly is not workable, because you would have to maintain a vars_file plus a group_vars file for the same function.
Maybe having a tag in the file (vars_priority), or using the filename to fix the loading order (like Apache does), would be good. For example:

  • all.yml
  • 10_datacenter_a.yml
  • 10_datacenter_b.yml
  • 20_env_production.yml
  • 40_app_gitlab.yml

then the gitlab app loads group vars by name: first the datacenters, then the env, and last the app.
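For later readers: Ansible 2.4 and newer (well after the 1.x releases discussed here) added `ansible_group_priority`, which does roughly what this comment asks for: among groups at the same depth, the group with the higher priority wins when their variables collide. Note that it has to be set in the inventory source itself, not in a group_vars/ file, since it is consulted while group vars are being loaded. A sketch with made-up group names:

```ini
# inventory -- requires Ansible >= 2.4
[env_production:vars]
ansible_group_priority=20

[app_gitlab:vars]
# higher number wins among same-depth groups (default priority is 1),
# so app_gitlab's vars override env_production's for shared hosts
ansible_group_priority=40
```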

I'm really happy to see that I'm not alone in this nonsense! Group vars should apply to THEIR group of hosts, period. How in the world should the --limit directive be explained, then? The natural way of thinking is: OK, I "limit" this playbook to hosts in THAT group, so THAT group's vars get picked up! Not variables defined in other groups for that same host; that doesn't make any sense!

I hope this gets fixed; in the meantime, I fall back to using multiple DNS entries for the same host.

Just spent several hours banging my head on the table trying to get around this issue. With the amount of chatter this issue is getting I'm really surprised Ansible (or Red Hat) is allowing this to continue.

Why rely on the group_vars abstraction to merge vars and determine what is
needed in the first place?

I can't think of a good reason not to explicitly define and include vars
(simplest example is included above in the thread)

Iain Wright


Part of the issue is that it's not immediately clear that this is how Ansible parses variables. One would think (as evidenced by the number of people having this problem) that assigning a host to a group, and running only that group, would inherit the variables for only that group and not those of all the groups that host is in.

Running all hosts is a different story, but at least then it would be a clear assumption that Ansible grabs all variables for all groups the host is in. Ansible should either do a better job of making these assumptions clear or provide a flag that changes this behavior.

Why rely on the group_vars abstraction to merge vars and determine what is
needed in the first place?

That's EXACTLY the point of an abstraction! If the tool lets me abstract my environment into different groups and then, by applying clear rules of inference, deduce which variables should be used, why not use it?! So there are two problems here: the rules are not clear, or the abstraction is not working. Otherwise, explain the purpose of group_vars to me if I need to explicitly define and include vars.

I must say that the documentation does not mention this specific case of the same host being included in multiple group_vars files. But this case is very real on hosts where multiple instances of a service exist (e.g. Apache). I really hope this problem gets solved soon.

We have found a way to solve our app/env problem with variable inheritance by cross-defining variables between groups using a prefix.
For example:

app_collectd.yml
collectd_host: "{{ env_collectd_host }}"

env_production.yml
env_collectd_host: 1.1.1.1

Then it's not a problem any more: we have clear app variables plus env variables, using cross-referenced variables.

I'm coming from Puppet-land where doing abstraction and code reuse is simple and clear. Maybe I don't understand correctly how Ansible approaches this so if there is a better way that doesn't involve group_vars I'm all ears. What I don't want to have to do is maintain another set of variables for each host or group to specify where it should be getting its variables. To me that should be the responsibility of the framework. This is the beauty of hiera in Puppet.

@hyojinbae

My solution is.

[service1]
host1 ansible_ssh_host=host1.com

[service2]
host2 ansible_ssh_host=host1.com

can you explain it in detail?

I tested it like this; it's not the same as yours, but it works!

[service1]
host1 ansible_host=192.168.0.10

[service2]
host2 ansible_host=192.168.0.10

One more victim here.

Anyone figure out a good solution for the following scenario?

Let's say I have a list of web applications (app-a, app-b, app-c). And there are 4 hosts in 2 DC for DR/Load-balancing. I want to use ansible to simplify deployment.

My first instinct is to use group vars, obviously I failed, but what alternative Ansible provides?

inventory

[dc1]
host1
host2

[dc2]
host3
host4

[cluster:children]
dc1
dc2

[app-a:children]
cluster

[app-b:children]
cluster

[app-c:children]
cluster

groups_vars
app-a

http_port: 8081

app-b

http_port: 8082

app-c

http_port: 8083

playbook basically expect hosts passed in via commandline:

- hosts: "{{ app-name }}"
  tasks:
    # deploy application with given {{http_port}}....

This is what I do to get around it

- hosts: "{{ app-name }}"
  vars_files:
    - "vars/{{ app-name }}.yml"

@hanej Thanks, that certainly points to a direction I wasn't aware of.
However, I am now more inclined to use roles for code reuse and to declare variables directly in the playbook. Even though that means I will have one playbook per app, declaring the variable right next to the role that will use it makes the entire config much easier to understand.

@hanej Thanks, this workaround is brilliant. You saved me.

@hanej Genius!

I'm so surprised that this is even an issue. I'm running into it because I have a single role I use with multiple inventory groups that have associated group_var files. In QA/Prod, this is not an issue, because each inventory group has different servers. But for integration level environments, we tend to smash everything together on to single servers (which is why I'm seeing this issue).

Following hanej's advice, I modified my playbook yml to just explicitly load the group_var file:

- hosts: publish
  vars_files:
    - group_vars/publish.yml

"This is so stupid. I just spent many hours trying to debug a variable issue, only to discover that it was because of some stupid design philosophy (this). At the very least, there should be a warning when multiple group_vars files are loaded for the same host."

Just to update this "issue"/"limitation":
OK, we can use a workaround to eliminate the conflict (with host aliases or some other way),
but the best way would be a setting in ansible.cfg to choose between
"enclose group_vars" and "merge group_vars".
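For what it's worth, ansible.cfg does have a related (though not equivalent) knob: `hash_behaviour`. It controls whether dictionary variables defined in several places replace each other wholesale or get merged recursively. It does not let you choose which group wins, and it only affects hashes (not scalars or lists), but it is the closest existing config switch:

```ini
# ansible.cfg
[defaults]
# default is 'replace'; 'merge' combines dict vars key-by-key instead of
# the last-loaded definition replacing the whole dict
hash_behaviour = merge
```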

I can't use group_vars just for this reason. We have hosts in multiple groups (DC, environment, application, etc) and we have no control over how the variables are read. Coming over from Puppet using Hiera this is a huge deal. In my opinion, Ansible should adopt something like Hiera using the "roles and profiles" pattern where you can specify the order of how variables are consumed based on facts or groups. I kind of do this now using the approach above.

For instance, if you have a host in a role called "/Linux Hosts/Applications/Java/Website" you should be able to inherit variables from Linux Hosts (global level variables like sysctl, DNS, NTP, etc), Applications (global application level variables), Java (java specific variables like JDK version, JAVA_HOME, etc), and Website (variables related specifically to the website application). The further you go down in the tree the more specific the variables become, which can override the variables higher up.

Using Ansible for event driven or adhoc runs is great and works very well, however I'm finding that using Ansible for configuration management is not as clean or easy to use as Puppet and Hiera mostly for this reason.

2 years later, I'm still receiving a steady stream of e-mails on this issue (it's also one of the most commented issues in this GitHub repo). Do the Ansible maintainers still feel that the current behaviour is correct, even though it's very unintuitive?

You guys need to seriously document on how to do configuration management with Ansible because anyone trying to do application deployment with this project is having a terrible time.

We've decided to stop using Ansible entirely. It sucks at configuration management, period.

Good luck with the project.

Yeah, I just ran into the same problem when configuring the deployment process for config files for different environments while testing on the same host. I'm going to adopt the whole {{ app-name }}-{{ env }}.yml method proposed by @hanej, but would love for the group_vars behaviour to at least be configurable.

Happy I found this issue after banging my head against the wall for countless hours... And I'm extremely surprised by the maintainers' dismissive attitude regarding this bug, calling it a "feature"; it's stupid beyond recognition.

I'm trying to set up a generic deployment playbook/role using a dynamic inventory. The inventory output looks like this:

{
  "resource-service_ci": {
    "vars": {
      "nginxpool": "resource_pool",
      "httpport": "8800"
    },
    "hosts": ["host1.example.com", "host2.example.com"]
  },
  "example-service_ci": {
    "vars": {
      "nginxpool": "example_pool",
      "httpport": "8100"
    },
    "hosts": ["host1.example.com", "host3.example.com"]
  }
}

Unfortunately, there can be multiple applications on a single host. The whole point of using a dynamic inventory evaporates if the group vars are 'randomly' mashed up and returned, instead of inheriting the variables for only the group I'm running --limit with.

I tried @hanej's workaround by implementing the JSON like this:

"hosts": ["host1 ansible_host=host1.example.com", "host2 ansible_host=host2.example.com"]

but Ansible clearly doesn't understand that, and tries to ssh to "host1 ansible_host=host1.example.com" literally.

Anyone else facing the same problem, and if so have a possible workaround?

I could pass the port and pool variables on the command line, as they take precedence, but then I have to do some ugly lookups beforehand.

Was having a great time with Ansible until hitting this bug.

I had a perfect abstraction that would allow configuration of the same apps on a multi-tiered environment, and also on an all-in-one test environment - alas, I was wrong.

What seemed like one of the most powerful features of Ansible doesn't work like everyone would expect it to.

Please, lets get this fixed.

Spent hours banging my head on the table trying to get around this issue. Please fix this issue.

This "feature" wasted the better part of my day. Not amused.

@hanej Thanks for the rather elegant workaround. We're now setting an env variable in our environment/group_vars/all files to the name of the environment.

In our playbooks we then do this:

- hosts: group
  vars_files:
    - "{{ env }}/group_vars/group.yml"
  roles:
    - ...

I know that it's been 2.5 years and that Ansible will probably never look at this, but I'm just going to add my frustration with this. Why allow users to put a host in multiple groups if you can't use the groups separately?

Why allow users to put a host in multiple groups

That's my problem with the Ansible devs. It looks like they think only about themselves. They could at least try to make the frustration less by issuing some warnings. Instead, they prefer to spit.

We filter out closed issues and don't see comments on them, but someone pinged me to respond to some of these questions. If you want attention from devs the mailing list or IRC are better conduits than closed tickets.

Ansible looks at groups as properties of a host. For example, a host can be part of the webserver group (to install webserver reqs), part of the northeast_datacenter group (so you know the network gateway, ntp and dns servers to set), and also part of the dev_group to install dev tools and point to the non-production database, etc.

You CAN use the groups as a way to target hosts but that does not mean the other 'host properties' disappear. I am in the 'males' group and in the 'programmers' group, just because you 'select me' as a programmer I do not stop being male.

In the end this is a design decision (not a bug) on how to tackle groups, a way to classify and add properties to hosts. I know other tools use them differently but that is not a reason to change how they work in Ansible, as there is no uniform way ALL other tools use groups, many others look at them this way many don't.

This has the limitation that you cannot reuse variable names and have only the 'current' group apply, but it has the advantage of always seeing the bigger picture of how the host is defined, and it makes conflicts easy to spot and resolve. In most of the cases above, vars_files or include_vars seems more appropriate than setting things up as groups, when talking about application-specific variables.

Doing include_vars on group_vars files... well, that is not how we designed it, but if it works for you... The caveat is that you are 'double loading', as group_vars are automatically pulled in; this can lead to performance issues on large inventories. We recommend something more specific, e.g. vars_files: vars/app1.vars.yml, or similar with include_vars.

I hope this clarifies why groups work the way they do in Ansible and why that is different than 'another specific tool', let me know if there is anything I can add to our docs http://docs.ansible.com/ansible/intro_inventory.html to clarify and avoid frustration.

I would love to accommodate everyone's way of working, Ansible is very flexible in some areas, but not in this respect and I don't see a good way to make it so nor a sane way to share playbooks afterwards.
If I'm wrong, I look forward to your pull request.
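A minimal sketch of the vars_files/include_vars pattern recommended above (the group name, paths, and the `env` variable are illustrative, not taken from this thread):

```yaml
- hosts: app1_servers
  vars_files:
    - vars/app1.vars.yml                      # loaded statically at play start
  tasks:
    - name: Load environment-specific vars at runtime
      include_vars: "vars/app1.{{ env }}.yml"
```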

@bcoca is it possible to detect cases where group vars are being used in a conflicting way (sorry, cannot formulate this better -- haven't used this for a long time) and emit warnings to help users save their time?

@bcoca What's unclear to me, and I think most of the rest of the commentators on this issue, is what value the Ansible team is looking to preserve by keeping group_var precedence/resolution the way that it is.

Can you give an example of a case which is satisfied well by the current solution but would cease to be so if, say, group_vars are only loaded/considered for the groups selected for a given play, rather than for the entire inventory?

It seems quite easy to come up with examples which are broken by the way that it is now.

I understand @bcoca's point, as we use it that way too. My biggest frustration with Ansible is that it works mostly well for either configuration management or ad-hoc workflows; it doesn't do a great job of either, but it's getting better, and it has plugin support, which I think I will use to improve another issue I have with it. Most days I still miss Puppet solely for configuration management...

Probably the best use case for group_vars is multiple datacenters. Let's say your inventory file looks like this:

[dc1]
host1
host2

[dc2]
host3
host4

[app1]
host1
host3

[app2]
host2
host4

You could use group_vars to set datacenter specific variables that would be inherited by all hosts in that datacenter.

group_vars/
    dc1.yml
    dc2.yml
    app1.yml
    app2.yml

dc1.yml

---
ntp_servers:
  - 10.10.10.10
  - 10.10.10.11

dc2.yml

---
ntp_servers:
  - 10.11.10.10
  - 10.11.10.11

Using group_vars like this is fine until you need to merge or override a variable from another group_vars file; that's another place where group_vars breaks down. There's no way to set the order in which group_vars are read. A hierarchical layout (like Hiera's), determined by facts rather than by groups, comes to mind here for things like environmental variables, but I digress.

From my perspective, this use case is mostly good for straight configuration management but it's not so great at deployments or ad-hoc scripts. This is why I would love to see a switch to disable this feature for certain playbooks where you don't want to do this.

@hanej I'm not sure how that example breaks by the design change I suggested in my previous comment?

Given the inventory you've listed, suppose I run ansible-playbook -i /my-inventory -v some_playbook.yml -l dc1:&app2. Any some_playbook.yml you'd write against that inventory would have to be fairly contrived in order to break when only the variables from dc1.yml and app2.yml were loaded.

I agree that deployments are an issue. Multi-tenanted deployments, specifically.

I have two concerns with the way things work now:

First, lack of configuration namespacing. Specifically, if I'm maintaining configuration for some applications in a multi-tenanted environment, how do I ensure that someone maintaining a different app which happens to run on the same machines isn't going to add a variable name which conflicts with mine, or that I won't add one which conflicts with theirs?

Second, the case everyone keeps talking about here. I'd expect that with the below inventory, I'd never wind up with files written to /data/prod when I run ansible-playbook -i inventory -v data_deploy.yml -l app1_staging, and that files would never be written to /data/staging when I run ansible-playbook -i inventory -v data_deploy.yml -l app1_prod

These two issues are bad enough on their own, but in a large company where many people in different timezones might be editing the same inventory, the current solution seems more than a little risky.

Example inventory.

[dc1]
host1
host2

[dc2]
host3
host4

[app1_staging]
host1
host2
host3
host4

[app1_prod]
host1
host2
host3
host4

app1_staging.yml:

---
  data_path: /data/staging

app1_prod.yml:

---
  data_path: /data/prod

Don't know if the edit worked to adjust the notification, but my last comment was meant to be directed to @hanej.

@benjamincburns You asked for an example of using group_vars that satisfies the current loading method but breaks if only the immediate groups are used. I provided that with the datacenter example. If Ansible doesn't load variables for all groups, then you cannot set variables at, for example, the datacenter level.

I believe this should be a toggle in ansible.cfg or on the command line so you can specify how to load group_vars so it doesn't break the existing use cases. We have multi-tenant environments here and used to get bit by this until I stopped using group_vars.

I'm not sure if you've heard of the roles and profiles pattern but it's a really good read and I highly recommend this approach if you're using any kind of configuration management tool. It was written for Puppet but it applies for all tools. You probably wouldn't use this for deployments or ad-hoc stuff but for straight configuration management it's great.

http://www.craigdunn.org/2012/05/239/

I use docker hosts so it is quite normal to have different instances of containers in different environments spread across hosts and this bit me.

Thanks to @hanej for the workaround. I would also add that group_vars do not work as advertised and seem to be broken by design.

My workaround is to define a play as follows:

- hosts: "{{ group }}"
  vars_files:
    - "vars/{{ group }}.yml" 
  roles:
    - role1 etc

Put a vars directory in the playbook dir with group1.yml, group2.yml in there.

It would be awful if your configurations changed simply because they were _accessed differently_. Just think of the effect that would have on attempts at convergent architecture. +1 to the Ansible team for keeping variables as consistent as possible.

The more interesting question is how you can apply the same role multiple times, sharing some configurations across hosts while _extending_ (and not replacing) others. I think role dependencies allow that, namely by:

  • Create "wrapper roles" that include roles you want to extend as dependencies.
  • Specify allow_duplicates: true in your wrapper's meta/main.yml.
  • Pass your config extensions to the role declaration (see below).
  • Set shared, stable configs in group/host_vars.

So in essence, your wrapper role would include a meta/main.yml like:

---
# roles/my-app/meta/main.yml
allow_duplicates: true
dependencies:
  - role: nginx
    nginx_servers:
      - name: server-myapp
        contents: |
          ...

When you apply the wrapper role, only the "extendable" configurations are actually changed. The burden is on developers to design their roles in a way that supports this pattern.

Ansible team: was this your intention for how roles would be used? I honestly haven't seen a better use-case for dependencies; everything else seems to lead to code soup.

I just joined the victim list today after an hour of debugging, and was fortunate enough to find this post.
I think it's only a documentation problem; no one would complain if it were clearly noted here:
http://docs.ansible.com/ansible/intro_inventory.html#group-variables
that variables are associated with hosts, and that groups are just a convenient way to set variables on all hosts belonging to them. Variables will be overridden if the same host is used in multiple groups.
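The behaviour being described can be illustrated with a few lines of Python that mimic, in deliberately simplified form, how flattening group vars onto a host makes the last-loaded group win. This is a sketch of the semantics only, not Ansible's actual implementation:

```python
# Simplified model: a host's vars are built by merging its groups' vars
# in load order, so later groups overwrite earlier ones (last group wins).
group_vars = {
    "service1": {"database_host": "www.database.com", "database_port": 3306},
    "service2": {"database_host": "www.differentdatabase.com", "database_port": 3306},
}

def host_vars(groups):
    merged = {}
    for g in groups:  # order matters: each update() clobbers earlier keys
        merged.update(group_vars[g])
    return merged

# www.host1.com is in both groups, so service2's value overwrites service1's
print(host_vars(["service1", "service2"])["database_host"])
# -> www.differentdatabase.com
```

Reversing the group order flips which value survives, which is exactly why a play targeting service1 can end up with service2's database host.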

+1 This issue affects our software team.

Unfortunately we also ran into this issue today. Same story: though we have a generic "tomcat" role, we vary variables like port numbers based upon the group that is used.
However, several tomcat instances can be deployed to the same physical servers, so we created multiple groups in the inventories that contain the same set of servers.

However, this approach doesn't seem to work, and I have no idea yet how to design around it. I can't create wrapper roles that hold the variables, because some of the variables are actually specific to the inventory. Think, for example, of a database password.

I recently found this which eased my pain with this issue:
https://github.com/leapfrogonline/ansible-merge-vars

It allows merging of variables no matter where they were defined, so you can merge two dicts/lists in two different groups for example, as opposed to one group winning and overwriting the variable defined in the other groups as is the current behavior. The plugin only works with Ansible 2+.

A good article explaining the concept and the use case can be found here.

We recently come into this issue as well. We have multiple applications running on a single host as different users. For instance:

[group_a]
host1

[group_b]
host1

We have no way to specify ansible_user with vars properly, neither with host_vars nor group_vars. The latest definition always wins in both cases.

Our current workaround is to define multiple plays with different ansible_user in the playbook:

- hosts: group_a
  ansible_user: user_a
  roles:
    - my-role

- hosts: group_b
  ansible_user: user_b
  roles:
    - my-role

@hyojinbae suggested a smart way above to work around this:

[group_a]
user_a@host1 ansible_user=user_a ansible_host=host1

[group_b]
user_b@host1 ansible_user=user_b ansible_host=host1

So now each host has a unique name in the inventory. I'm not a big fan of putting host vars in the inventory, but this is the best workaround I have seen so far.

@hyojinbae I think you are missing the point of hostvars and directives:

- hosts: group_a
  remote_user: user_a
  roles:
    - my-role

- hosts: group_b
  remote_user: user_b
  roles:
    - my-role

should work as long as you don't set ansible_user. The groups are a property of the host, not an entity to themselves.

There was a proposal to make it possible to write proper plugins for inventory management. I suspect now that this proposal was closed/finished, it should become possible to implement a dynamic inventory plugin which works around this point of confusion.

Going to +1 to question the ansible design here.
In the real world, many different applications can run on the same host.
Scoping group_vars at the host-level simply doesn't cover this use-case.

Just as a thought experiment:

  • Let's assume that the default behavior was modified so that running ansible-playbook with the "-l" option uses the group_vars child file (if it exists).
  • You could still handle the "datacenter" use-case that was mentioned through parent groups.
  • All that would change is that the "named" child group for a host match would win the inheritance battle, instead of the "last" child group for a host match.

Making this small change would make hundreds of DevOps-y people happy, and would hurt nobody.

From my previous comment, see https://github.com/ansible/proposals/issues/41 for a potential avenue to allow more pluggable behaviour here.

Can't believe that Ansible does not fix this (or at least give hints or warnings), or reply more on this issue.
It took me a few hours to debug.

I encountered this as well. One approach is to use nested variables. Take inventory group my_app_servers, for example: it lines up with the file my_app_servers.yml inside your group_vars, and you've got more than one application to manage on a common host. I nested them like this.

Example: my_app_servers.yml

application_env:
  - application1
  - application2

application_environments:
  application1: 
    app_port: 8080
    app_conf_dir: "/somedirectory/conf"
    app_logs_dir: "/somedirectory/logs"
    app_install_name: app1prd

  application2:
    app_port: 8081
    app_conf_dir: "/somedirectory/conf"
    app_logs_dir: "/somedirectory/logs"
    app_install_name: app2prd

In your main.yml, define a loop, and iterate over the nested variable list like this,

with_items: "{{ application_env }}"
loop_control:
  loop_var: application_env_item

Then in your tasks, you can grab the nested variable that you want in order to execute something:
{{ application_environments[application_env_item]['app_install_name'] }}
(If you can pull one var out of that dict, you can get the rest.) Hope that helps.
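Put together, a task using that nested-variable pattern might look like the following; the task itself (creating each app's config directory) is an illustrative example, not from the thread:

```yaml
# tasks/main.yml -- iterate over the application list and index into the
# nested dict for each application's settings (task shown is hypothetical)
- name: Create the config directory for each application
  file:
    path: "{{ application_environments[application_env_item]['app_conf_dir'] }}"
    state: directory
  with_items: "{{ application_env }}"
  loop_control:
    loop_var: application_env_item
```

Each iteration binds application_env_item to one application name, which then keys into application_environments for that app's settings.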

This issue (#9065) was closed back in 2015. Same for #6538. Same for #6666. And many more.
Here it is 2018, 4 years after many of those issues were opened, and apparently the topic is hotter than it was at the beginning, with everyone losing hours to a counter-intuitive design.
IMHO this sounds like it needs a review by the design team.

If everyone continues using the hacks proposed here, it will make Ansible code ugly. IMHO these hacks make it less attractive and not fully aligned with Ansible's "Simple IT Automation" motto ;)

@bcoca -> Kindly advise: what should be the trigger (from the community) to have someone reevaluate PR #6666 or #17236, or something similar that "just works intuitively"?

I think group_vars themselves should be able to be prioritized. Hiera for Puppet got it right. You can specify how lookups are done based on facts or groups. Let the end user be able to customize how group_vars should be processed and provide a mechanism to easily see the order in which Ansible is applying those variables.

FYI, we don't normally see notifications on closed issues; I'm responding here because someone pointed me at this. I'm also going to lock the thread so people don't keep commenting in a forum maintainers won't see.

@hanej I think you want ansible_group_priority and the [defaults] precedence configuration. As for introspection, right now you have ansible-inventory, which shows you the final result; we are still working on giving people a better view into where/what set the final value of a variable.
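For reference, a hedged sketch of how ansible_group_priority (added in Ansible 2.4) can break the tie in an INI inventory; the group and variable names echo the original report:

```ini
# inventory -- host1 is in both groups; among groups at the same depth,
# vars from the group with the higher ansible_group_priority win (default 1).
# Note: ansible_group_priority must be set in the inventory source itself,
# not in group_vars/ files.
[service1]
host1

[service1:vars]
ansible_group_priority=10
database_host=www.database.com

[service2]
host1

[service2:vars]
database_host=www.differentdatabase.com
```

With this priority, host1 resolves database_host to www.database.com regardless of the order the groups are loaded.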

@ReSearchITEng open a proposal (http://github.com/ansible/proposals), but this is about the fundamental focus of each different app. Ansible focuses on a task for a host; the other tools you mention focus on a much more complex structure.

For Ansible, 'groups' are just a property of a host, not a principal entity; the other tools treat groups as 'the main context' and everything else is added onto that.

I don't see either approach as 'wrong', just different. Trying to make Ansible work in such a way would break the simplicity paradigm of focusing on A host and A task for what is just an organizational concern, and would transpose what I see as the 'accustomed way of thinking' people bring from other tools.

I would argue that this is just a matter of perception about data organization and habits, but that is just my opinion. Feel free to submit a proposal; if you gain enough traction with maintainers you might get the change you want. I just don't think it is a change we need.
