Job cache exhausting inodes #54924

Closed · OrangeDog opened this issue on Oct 8, 2019 · 8 comments
Labels: Pending-Discussion, stale
Milestone: Blocked

OrangeDog (Contributor) commented on Oct 8, 2019

Description of Issue

On a system with eight minions running state.apply every 15 minutes, the /var/cache/salt/master/jobs tree used up the majority of the inodes on the filesystem within a year.

$ df -hi /var
Filesystem           Inodes  IUsed IFree IUse% Mounted on
/dev/mapper/vg1-var    299K   299K     0  100% /var
$ sudo du --inodes -x /var | sort -n | tail
3895    /var/lib/dpkg
4000    /var/cache/salt/master/gitfs/refs
4473    /var/lib
5709    /var/cache/salt/master/gitfs/hash
12985   /var/cache/salt/master/gitfs
280775  /var/cache/salt/master/jobs
297959  /var/cache/salt/master
300225  /var/cache/salt
300356  /var/cache
305205  /var

I'm afraid I didn't save any info about the files before deleting them.

Setup

All job-cache configuration is at the defaults:

#cachedir: /var/cache/salt/master
#keep_jobs: 24
#job_cache: True
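
For comparison, a minimal sketch of what tightening retention would look like in /etc/salt/master (the 6-hour value is illustrative, not a recommendation):

# illustrative override of the defaults above
keep_jobs: 6          # purge cached job data after 6 hours instead of 24
job_cache: True       # leave the master-side job cache enabled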

Versions Report

Salt Version:
           Salt: 2019.2.1

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: 2.0.3
      gitpython: 2.1.8
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: 0.26.0
        libnacl: Not Installed
       M2Crypto: 0.32.0
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.26.2
         Python: 3.6.8 (default, Aug 20 2019, 17:12:48)
   python-gnupg: 0.4.1
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.3
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.2.5

System Versions:
           dist: Ubuntu 18.04 bionic
         locale: ISO-8859-1
        machine: x86_64
        release: 4.15.0-65-generic
         system: Linux
        version: Ubuntu 18.04 bionic
OrangeDog (Author) commented

I have mostly run 2018.3.x, having reverted after trying the 2019.2.0 release.

arizvisa (Contributor) commented on Oct 9, 2019

There might not be much that can be done, since the local_cache returner stores each job as a directory of files on disk and so consumes inodes by design. I had to switch to the etcd returner for that reason. Maybe you can find an alternative returner that fits your setup.
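
For example, a sketch of that switch in /etc/salt/master, assuming an etcd instance is already running (the host and port here are illustrative):

# route the master job cache to the etcd returner
etcd.host: 127.0.0.1
etcd.port: 2379
master_job_cache: etcd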

You might also be able to create another filesystem with more inodes (on a ramdisk or otherwise) and mount it at that path.
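
A sketch of that workaround, assuming a spare block device /dev/sdb1 (the device name and inode count are illustrative):

$ sudo mkfs.ext4 -N 2000000 /dev/sdb1
$ sudo mount /dev/sdb1 /var/cache/salt/master/jobs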

OrangeDog (Author) commented

Except there should be fewer than 1,000 jobs retained with my setup: eight minions running state.apply every 15 minutes is 8 × 96 = 768 jobs per 24-hour window.
Three days later, /var/cache/salt/master is back up to 209614 inodes.

$ sudo du --inodes -x /var/cache/salt/master/jobs | sort -n | tail
923     /var/cache/salt/master/jobs/12
923     /var/cache/salt/master/jobs/3e
924     /var/cache/salt/master/jobs/cb
925     /var/cache/salt/master/jobs/9c
930     /var/cache/salt/master/jobs/d8
939     /var/cache/salt/master/jobs/0a
943     /var/cache/salt/master/jobs/15
956     /var/cache/salt/master/jobs/23
970     /var/cache/salt/master/jobs/af
209614  /var/cache/salt/master/jobs

The oldest file is from 2019-10-10 10:14:57 (24 hours ago), so the keep_jobs purge itself appears to be working.

OrangeDog (Author) commented

After some random sampling, I think this may be caused (or at least exacerbated) by #54941.
The cached jobs I looked at were all runner.mine.get calls, which would be invoked repeatedly while building pillar data.
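
For anyone repeating the sampling, something like this should show how many of the cached jobs are mine.get calls (the search_function filter may vary by Salt version):

$ sudo salt-run jobs.list_jobs search_function='mine.get' | grep -c 'Function:'
$ sudo salt-run jobs.list_jobs | grep -c 'Function:'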

dwoz added the Pending-Discussion label and removed the needs-triage label on Oct 14, 2019
dwoz (Contributor) commented on Oct 14, 2019

It still might make sense to nest the jobs in a deeper directory structure.

OrangeDog (Author) commented

> a deeper directory structure

would use even more inodes, no?

OrangeDog (Author) commented

After upgrading to 2019.2.2 (and waiting 24 hours), the usage is much reduced.

$ df -hi /var
Filesystem          Inodes IUsed IFree IUse% Mounted on
/dev/mapper/vg1-var   299K   88K  212K   30% /var
$ sudo du --inodes -x /var | sort -n | tail
3901    /var/lib/dpkg
4029    /var/cache/salt/master/gitfs/refs
4481    /var/lib
5709    /var/cache/salt/master/gitfs/hash
13217   /var/cache/salt/master/gitfs
64723   /var/cache/salt/master/jobs
81812   /var/cache/salt/master
84080   /var/cache/salt
84218   /var/cache
89056   /var

waynew added this to the Blocked milestone on Dec 13, 2019
stale bot commented on Jan 12, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale bot added the stale label on Jan 12, 2020
stale bot closed this as completed on Jan 19, 2020