proxmox inventory: refresh_inventory not working with cache enabled #9710

Closed
1 task done
iqt4 opened this issue Feb 9, 2025 · 15 comments · Fixed by #9760
Labels
bug (This issue/PR relates to a bug), has_pr, inventory (inventory plugin), plugins (plugin (any type))

Comments

@iqt4 (Contributor) commented Feb 9, 2025

Summary

I have a dynamic Proxmox inventory with cache enabled.
During a play I create an LXC container with the community.general.proxmox module.
After successfully creating the container I update the inventory with ansible.builtin.meta: refresh_inventory.
However, the inventory gets updated in the play, but not the cache.

Issue Type

Bug Report

Component Name

proxmox inventory

Ansible Version

$ ansible --version
ansible [core 2.17.7]

Community.general Version

$ ansible-galaxy collection list community.general
community.general 10.3.0

Configuration

$ ansible-config dump --only-changed
CONFIG_FILE() = /Users/dirk/Development/Ansible/ansible.cfg
DEFAULT_CONNECTION_PLUGIN_PATH(/Users/dirk/Development/Ansible/ansible.cfg) = ['/Users/dirk/Development/Ansible/plugins/connection']
DEFAULT_HOST_LIST(/Users/dirk/Development/Ansible/ansible.cfg) = ['/Users/dirk/Development/Ansible/inventory']
DEFAULT_VAULT_IDENTITY_LIST(/Users/dirk/Development/Ansible/ansible.cfg) = ['default@files/keychain-client.sh']

OS / Environment

macOS Sequoia 15.3

Steps to Reproduce

proxmox.yml

plugin: community.general.proxmox
# Dynamic Proxmox API URL from files/pve_vars.yml
url: "{% set vars = lookup('ansible.builtin.file', 'pve_vars.yml') | from_yaml %}
  https://{{ vars.pve_api_host }}:{{ vars.pve_api_port | default(8006) }}"

# Set REQUESTS_CA_BUNDLE or disable certificate validation
# validate_certs: false

# Read the credentials for the Proxmox API
# files/pve_vault links to inventory/host_vars/localhost/pve_vault
user: "{{ (lookup('ansible.builtin.unvault', 'pve_vault') | from_yaml)['vault_api_user'] }}"
token_id: "{{ (lookup('ansible.builtin.unvault', 'pve_vault') | from_yaml)['vault_api_token_id'] }}"
token_secret: "{{ (lookup('ansible.builtin.unvault', 'pve_vault') | from_yaml)['vault_api_token_secret'] }}"

# Cache for 1 hour (default)
cache: true
cache_plugin: jsonfile
cache_connection: .cache

Example playbook
- name: Create Debian container on Proxmox VE
  hosts: localhost
  gather_facts: false
  module_defaults:
    group/community.general.proxmox:
      api_user: "{{ pve_api_user }}"
      api_token_id: "{{ pve_api_token_id }}"
      api_token_secret: "{{ pve_api_token_secret }}"
      api_host: "{{ pve_api_host }}"
  tasks:    
    - name: Assign variables
      ansible.builtin.set_fact:
        pve_node: pve-1
        ct_name: debian-t
        ostemplate: qnap:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst
    - name: Create Debian container # noqa: args[module]
      community.general.proxmox:
        node: "{{ pve_node }}"
        hostname: "{{ ct_name }}"
        password: "{{ lookup('password', '/dev/null', length=8, chars='ascii_letters', seed=ct_name) }}"
        ostemplate: "{{ ostemplate }}"
        disk: local-zfs:4
        features:
          - "nesting=1"
        cores: 1
        memory: 512
        netif:
          net0: "name=eth0,bridge=vmbr0,tag=1000,ip=dhcp"
        # update: true
        state: present
      notify: Refresh inventory

  handlers:
#    - name: Invalidate cache as a workaround
#      ansible.builtin.file:
#        path: "../.cache"
#        state: absent
#      listen: Refresh inventory
    - name: Refresh inventory
      ansible.builtin.meta: refresh_inventory
      listen: Refresh inventory

  post_tasks:
    - name: Wait for the container to be started # noqa: args[module]
      community.general.proxmox:
        vmid: "{{ hostvars[ct_name].proxmox_vmid }}"
        state: started

Expected Results

If you run this play for the first time, the LXC container is set up. The line vmid: "{{ hostvars[ct_name].proxmox_vmid }}" does not raise an error and the container is started.

The cache should be updated as well, so that a second run of the play (which involves no cache update) completes without error.

Actual Results

The timestamp (and content) of the cache file under the .cache directory does not change. The play fails with an error:

The task includes an option with an undefined variable.. "hostvars['debian-t']" is undefined 

@ansibullbot (Collaborator) commented:

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.

ansibullbot added the labels bug (This issue/PR relates to a bug), inventory (inventory plugin), and plugins (plugin (any type)) on Feb 9, 2025
@Thulium-Drake (Contributor) commented:

As a first question, did you also see this behavior with other dynamic inventory plugins? From what I read in the code, there's no special handling of cached resources (other than the plugin being compatible with caching).

Also, you can't change the inventory mid-play; the same goes for the add_host module, see https://docs.ansible.com/ansible/latest/collections/ansible/builtin/add_host_module.html (first line of the synopsis). Changing hosts is fine, but adding new ones requires a new play in the same playbook.

Which, unless I'm mistaken, should mean that it'll be fixed with this:

---
- name: Create Debian container on Proxmox VE
  hosts: localhost
  gather_facts: false
  module_defaults:
    group/community.general.proxmox:
      api_user: "{{ pve_api_user }}"
      api_token_id: "{{ pve_api_token_id }}"
      api_token_secret: "{{ pve_api_token_secret }}"
      api_host: "{{ pve_api_host }}"
  tasks:    
    - name: Assign variables
      ansible.builtin.set_fact:
        pve_node: pve-1
        ct_name: debian-t
        ostemplate: qnap:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst
    - name: Create Debian container # noqa: args[module]
      community.general.proxmox:
        node: "{{ pve_node }}"
        hostname: "{{ ct_name }}"
        password: "{{ lookup('password', '/dev/null', length=8, chars='ascii_letters', seed=ct_name) }}"
        ostemplate: "{{ ostemplate }}"
        disk: local-zfs:4
        features:
          - "nesting=1"
        cores: 1
        memory: 512
        netif:
          net0: "name=eth0,bridge=vmbr0,tag=1000,ip=dhcp"
        # update: true
        state: present
      notify: Refresh inventory

  handlers:
#    - name: Invalidate cache as a workaround
#      ansible.builtin.file:
#        path: "../.cache"
#        state: absent
#      listen: Refresh inventory
    - name: Refresh inventory
      ansible.builtin.meta: refresh_inventory
      listen: Refresh inventory

- name: Wait for containers to start
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Wait for the container to be started # noqa: args[module]
      community.general.proxmox:
        vmid: "{{ hostvars[ct_name].proxmox_vmid }}"
        state: started

@felixfontein (Collaborator) commented:

Basically you can update the inventory, but you cannot change the hosts the current play works on. After using meta: refresh_inventory or ansible.builtin.add_host, the new host(s) are there and you can use them, for example with delegate_to, and you can see them in groups['name-of-group'], etc. But the hosts selected for this play are still the same, as the selection is made at the beginning of the play, when the new hosts weren't there yet.

Here's a simple test with the ini inventory plugin:

# Save as test-inventory.yml; also create an empty file test-inventory.ini.
# Run with:
# ansible-playbook test-inventory.yml -i test-inventory.ini

- name: Reset state
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Reset inventory file
      ansible.builtin.copy:
        dest: test-inventory.ini
        content: |
          [group]
          primary ansible_connection=local
    - name: Refresh inventory
      ansible.builtin.meta: refresh_inventory

- name: Do something
  hosts: group
  gather_facts: false
  tasks:
    - name: Run command only on primary
      ansible.builtin.command:
        cmd: ls
    - name: Update inventory file
      ansible.builtin.copy:
        dest: test-inventory.ini
        content: |
          [group]
          primary ansible_connection=local
          secondary ansible_connection=local
    - name: Show group 'group', should still have one element
      ansible.builtin.debug:
        msg: "{{ groups.group }}"
      delegate_to: secondary
    - name: Refresh inventory
      ansible.builtin.meta: refresh_inventory
    - name: Show group 'group', should now have two elements
      ansible.builtin.debug:
        msg: "{{ groups.group }}"
      delegate_to: secondary
    - name: Run a command on all hosts, which happens to be only primary
      ansible.builtin.command:
        cmd: ls
    - name: Run a command on all hosts, which happens to be only primary
      ansible.builtin.command:
        cmd: ls
      delegate_to: secondary

- name: Do more
  hosts: group
  gather_facts: false
  tasks:
    - name: Run a command on primary and secondary
      ansible.builtin.command:
        cmd: ls
    - name: Reset inventory file
      ansible.builtin.copy:
        dest: test-inventory.ini
        content: |
          [group]
          primary ansible_connection=local
      run_once: true

I also tried this out by removing primary from the updated inventory (commenting out line 30); the result is that the middle play basically stops after the inventory refresh, since the host is gone. So while you cannot add new hosts, you can remove existing ones with immediate effect :) (I just checked that because I was curious; I know this wasn't the question here.)

@Thulium-Drake (Contributor) commented:

Basically you can update the inventory, but you cannot change the hosts the current play works on. After using meta: refresh_inventory or ansible.builtin.add_host, the new host(s) are there and you can use them, for example with delegate_to, and you can see them in groups['name-of-group'], etc. But the hosts selected for this play are still the same, as the selection is made at the beginning of the play, when the new hosts weren't there yet.

A minor, yet very impactful detail I had wrong 😁 I didn't know that!

This sounds like a really neat trick that can be useful in some edge cases; I'll be sure to keep it in mind for the documentation effort gundalow and I are trying to set up :-)

I also tried this out by removing primary from the updated inventory (commenting out line 30); the result is that the middle play basically stops after the inventory refresh, since the host is gone. So while you cannot add new hosts, you can remove existing ones with immediate effect :) (I just checked that because I was curious; I know this wasn't the question here.)

Same goes here as well: I think it makes sense from a how-the-program-works point of view, though it feels just weird that you can footgun hosts out of your own play. I don't see a use case for fault handling here; there are plenty of better-documented ways to do that.

@Thulium-Drake (Contributor) commented:

But anyway, back on topic: @iqt4, the solution to your inventory woes is to either use delegate_to as Felix mentioned, or to move the tasks you want to run on these machines to a different play in the playbook, as in the example I made earlier :-)

Can you let us know if that fixes it?

@felixfontein (Collaborator) commented:

I also tried this out by removing primary from the updated inventory (commenting out line 30); the result is that the middle play basically stops after the inventory refresh, since the host is gone. So while you cannot add new hosts, you can remove existing ones with immediate effect :) (I just checked that because I was curious; I know this wasn't the question here.)

Same goes here as well: I think it makes sense from a how-the-program-works point of view, though it feels just weird that you can footgun hosts out of your own play. I don't see a use case for fault handling here; there are plenty of better-documented ways to do that.

Maybe there should be an ansible-lint rule that meta: refresh_inventory should always be the last task in a play (and not appear anywhere else). That way nothing can really go wrong. I might well be missing some sensible real-world usage of meta: refresh_inventory elsewhere, though, so maybe this isn't such a good idea :)

@felixfontein (Collaborator) commented:

I created ansible/ansible#84691 to improve the docs on meta: refresh_inventory. Comments/improvements are welcome :)

@iqt4 (Contributor, Author) commented Feb 10, 2025

Thanks for your quick responses. Felix, I understand your code, but I believe the topic here is the cache handling of the proxmox inventory plugin.

  • The play I have written runs as expected: if the task Create Debian container succeeds, the handler is notified and the inventory (in memory) is refreshed. The assignment vmid: "{{ hostvars[ct_name].proxmox_vmid }}" does not raise an error.

  • If you run the play a second time (the handler is not notified), exactly this line fails, because the newly created host is not in the inventory (read from the cache).

  • If I delete the cache (commented out in the example code), then the cache gets updated correctly in the first run.

From the ansible.builtin.meta module: ... you can disable the cache or, if available for your specific inventory datasource (e.g. aws), you can use an inventory plugin instead of an inventory script. This is mainly useful when additional hosts are created and users wish to use them instead of using the ansible.builtin.add_host module.

From inventory cache: ... the cache should be updated with the new inventory if the user has enabled caching.

I hope that clarifies my observation and expectation.

@Thulium-Drake: I have not tried any other dynamic inventory plugins.
@felixfontein: From the documentation I understand that meta: refresh_inventory can appear anywhere in the play, and that a dynamic plugin (as opposed to a script) should handle the cache.

@felixfontein (Collaborator) commented:

Thanks for your quick responses. Felix, I understand your code, but I believe the topic here is the cache handling of the proxmox inventory plugin.

As all the experiments by @Thulium-Drake and me above have shown, your problem has nothing to do with caching. It's inherent behavior of how inventories work in Ansible: you need to start a new play to include new hosts in whatever you specified in hosts. If you don't do that, nothing will ever update the selected host list (except potentially making it smaller).

@iqt4 (Contributor, Author) commented Feb 10, 2025

The problem does not appear if I switch off the cache in the inventory plugin. The play then always runs without error, regardless of whether the new host is created during the play (inventory changed) or already existed before the play (inventory unchanged).

From my understanding there is an inventory (in memory, created at the beginning of/before the playbook run) and a cache (a JSON file). According to the documentation, meta: refresh_inventory should update both (see the links above).

I found this description: https://steampunk.si/blog/cloud-automation-with-ansible-series-inventory-caching/ (I don't have access to this service, so I can't test it, but it confirms my expectations; please also look at the fix described there).

Conclusion: the issue remains; the cache file is unchanged if I create a new host and refresh the inventory with meta: refresh_inventory.

I will try another inventory plugin (maybe nmap?), which will need some time.

Thanks again for your patience and support

@felixfontein (Collaborator) commented:

One thing this plugin does wrong is how it handles the cache: internally it only has one knob, "use cache or not", which cannot express "do not read the cache, but refresh it" (self.get_option('cache') is True, but the cache argument is False).
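
For reference, the pattern from the Ansible inventory developer documentation distinguishes the user's cache option from the cache argument that Ansible passes to parse(); the combination "option enabled, argument False" is exactly the "refresh and rewrite the cache" case that meta: refresh_inventory triggers. A minimal sketch of that pattern (get_inventory() and populate() are placeholder helpers, not real plugin methods):

def parse(self, inventory, loader, path, cache=True):
    super(InventoryModule, self).parse(inventory, loader, path)
    self._read_config_data(path)

    self.load_cache_plugin()
    cache_key = self.get_cache_key(path)

    # What the user configured (cache: true in proxmox.yml) ...
    user_cache_setting = self.get_option('cache')
    # ... versus what Ansible asks for: `cache` is False on a refresh.
    attempt_to_read_cache = user_cache_setting and cache
    cache_needs_update = user_cache_setting and not cache

    if attempt_to_read_cache:
        try:
            results = self._cache[cache_key]
        except KeyError:
            # Cache entry missing or expired: fetch fresh data and rewrite it.
            cache_needs_update = True

    if not attempt_to_read_cache or cache_needs_update:
        results = self.get_inventory()  # placeholder: query the Proxmox API

    if cache_needs_update:
        # Write the fresh results back to the cache in one go.
        self._cache[cache_key] = results

    self.populate(results)  # placeholder: fill self.inventory from results

With this structure, meta: refresh_inventory (which calls parse() with cache=False) both bypasses the stale cache and rewrites it.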

@iqt4 (Contributor, Author) commented Feb 13, 2025

I tested with the nmap inventory plugin and can confirm that it runs without error. I started to debug the code:

  • According to the reference documentation on the inventory cache, the self._cache variable is updated "in one go", like this:

    if cache_needs_update:
        self._cache[cache_key] = results

  • The proxmox plugin takes a different, iterative approach; see the function:

    def _get_json(self, url, ignore_errors=None):
        if not self.use_cache or url not in self._cache.get(self.cache_key, {}):
            if self.cache_key not in self._cache:
                self._cache[self.cache_key] = {'url': ''}

    Two observations follow:
    • The condition if self.cache_key not in self._cache: evaluates to false once self._cache[self.cache_key] exists, which is the case after the initial population. That is, existing cache entries are never deleted; URLs are only added or updated. This explains the observation above in this issue (#9710).
    • If the statement self._cache[self.cache_key] = {'url': ''} is meant to empty the cache entry, the correct statement would be self._cache[self.cache_key] = {}.

I need more time to test this, but some of the cache logic should probably live outside the _get_json function.
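
One way that could look (a hedged sketch of the idea only, not the actual fix that landed in #9760; _fetch_from_api, _cached_data, and _fetched_data are hypothetical names): keep _get_json free of cache mutations, collect everything fetched during one run, and replace the cache entry in a single assignment afterwards.

def _get_json(self, url, ignore_errors=None):
    # Serve from a snapshot of the old cache when allowed and present.
    if self.use_cache and url in self._cached_data:
        return self._cached_data[url]
    # Otherwise fetch fresh data and remember it for the final cache write.
    data = self._fetch_from_api(url, ignore_errors)  # hypothetical API call
    self._fetched_data[url] = data
    return data

def parse(self, inventory, loader, path, cache=True):
    super(InventoryModule, self).parse(inventory, loader, path)
    self._read_config_data(path)
    self.load_cache_plugin()
    self.cache_key = self.get_cache_key(path)
    self.use_cache = self.get_option('cache') and cache
    cache_needs_update = self.get_option('cache') and not cache
    # Snapshot the old cache entry for reads; collect fresh fetches separately.
    self._cached_data = self._cache.get(self.cache_key, {}) if self.use_cache else {}
    self._fetched_data = {}
    self._populate()  # walks the cluster, calling _get_json() as needed
    if cache_needs_update:
        # Replace the entry wholesale so stale URLs from earlier runs disappear;
        # on a refresh use_cache is False, so everything was just fetched.
        self._cache[self.cache_key] = self._fetched_data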

@iqt4 (Contributor, Author) commented Feb 14, 2025

After some further analysis, I would like to make the attached proposal. The flow chart is not perfect but should give an idea of a solution.
Please let me know if I should continue with coding / a pull request.

Inventory.drawio.pdf

iqt4 added a commit to iqt4/community.general that referenced this issue Feb 16, 2025
@iqt4 (Contributor, Author) commented Feb 16, 2025

Simplified approach: Inventory-Proxmox v2.drawio.pdf

felixfontein pushed a commit that referenced this issue on Feb 17, 2025 ("proxmox inventory: proposal for #9710 (caching)", #9760):

* Proposal for #9710
* Fixed comments
* Fixed trailing whitespace
* Fixed changelog fragment

The patchback bot pushed cherry-picks of this commit (from d696bb7) on Feb 17, 2025, and felixfontein pushed the corresponding backports (#9769, #9770), co-authored by Dirk S. <[email protected]>. rt-vnx pushed a commit to rt-vnx/community.general that referenced this issue on Feb 20, 2025.