master auth delay #49586

Open
cavepopo opened this issue Sep 9, 2018 · 72 comments

Labels: Bug (broken, incorrect, or confusing behavior), gitfs, Performance

@cavepopo (Contributor) commented Sep 9, 2018

Description of Issue/Question

Using gitfs (pygit2-based) for states and external pillar, we are experiencing a major performance drawback: the same setup takes roughly 5x longer with gitfs than with locally stored states and pillar. Local storage takes about 7-8 minutes, while the gitfs-based master takes about 45 minutes to complete the exact same setup. :(

For now I am just wondering (though without much real doubt) whether I am the only one experiencing this. I am quite sure I am, but I wanted to check first. I will then add debug logs and related details to this issue to help track down the root cause.

So: is anyone else experiencing this kind of performance when using gitfs for states and external pillar?
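
For context, a master-side configuration for this kind of setup (gitfs as the fileserver backend plus a git external pillar, using the pygit2 provider) would look roughly like the sketch below; the repository URLs are placeholders, not the ones actually in use here:

# /etc/salt/master (sketch, placeholder URLs)
fileserver_backend:
  - gitfs                # serve states from git instead of local file_roots

gitfs_provider: pygit2
gitfs_remotes:
  - https://git.example.com/salt/states.git

ext_pillar:
  - git:                 # git external pillar, same pygit2 provider
    - master https://git.example.com/salt/pillar.git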

Steps to Reproduce Issue

(Include debug logs if possible and relevant.)

Versions Report

(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)
Master and minion are both on the latest release (2018.3.2):

Salt Version:
           Salt: 2018.3.2
 
Dependency Versions:
           cffi: 1.11.4
       cherrypy: Not Installed
       dateutil: 2.2
      docker-py: Not Installed
          gitdb: 0.5.4
      gitpython: 0.3.2 RC1
          ioflo: Not Installed
         Jinja2: 2.9.4
        libgit2: 0.26.0
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.2
   mysql-python: 1.2.3
      pycparser: 2.10
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.26.3
         Python: 2.7.9 (default, Jun 29 2016, 13:08:31)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 14.4.0
           RAET: Not Installed
          smmap: 0.8.2
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5
 
System Versions:
           dist: debian 8.11 
         locale: UTF-8
        machine: x86_64
        release: 4.15.18-1-pve
         system: Linux
        version: debian 8.11 
 
@cachedout (Contributor)

This is not typical performance, but of course Salt's gitfs performance is tied to the performance of the git server itself. Do you have large objects in your repo? There are fairly well-known performance problems when serving large objects out of a git repo.

@cavepopo (Contributor, Author)

Hi @cachedout, thanks for the answer. How would you define large objects? We do have some old WAR files and the whole repo is about 2 GB; is that too big?
I can see that for every request (state fetch) there is about 3 minutes of "inactivity".
I'll gather some debug info and come back here.

@Ch3LL (Contributor) commented Sep 10, 2018

Setting the log level to trace instead of debug might show more information if there is inactivity and debug is not showing anything, just FYI. Looking forward to the logs :)

@Ch3LL added the info-needed (waiting for more info) label on Sep 10, 2018
@Ch3LL added this to the Blocked milestone on Sep 10, 2018
@cavepopo (Contributor, Author)

Hi,
Sorry for the late reply. Here is the log for a fairly slow operation: a simple state apply that takes 20 seconds on a non-gitfs master takes more than 1 minute here.
https://gist.github.com/cavepopo/9090a45b7b408cfba34315cb82bdc53a

Do you have any clues about what causes such a big difference between gitfs and non-gitfs?
Ping from the master to the GitLab server is normal (time=0.020 ms).

@Ch3LL (Contributor) commented Sep 28, 2018

Thanks for the logs. I'm not seeing anything that jumps out at me, so I'm wondering if it is the WAR files / 2 GB size of the repo.

Ping @terminalmage: do you know if that sort of size could slow things down? The original question was: "We do have some old WAR files and the whole repo is about 2 GB; is that too big?"

@terminalmage (Contributor)

Unfortunately, that doesn't really tell us anything about fileclient requests; for that, you'd need some logging from the minion. Trace logging is overkill; debug is fine. Please add the following to /etc/salt/minion on that machine:

log_level: '%(asctime)s,%(msecs)03d [%(name)-17s:%(lineno)-4d][%(levelname)-8s][%(process)d] %(message)s'

Next, log in to the minion and run time salt-call -l debug state.apply common.nslcd. It's not necessary to restart the salt-minion service after adding the above configuration, since salt-call runs in a new process, separate from the minion daemon.

@cavepopo (Contributor, Author) commented Oct 2, 2018

Here we go:

root@opendj1 [~]: time salt-call -l debug state.apply common.nslcd |tee -a /tmp/minion.log
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: opendj1.admin.ptf1
[DEBUG   ] Configuration file path: /etc/salt/minion
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] The `lspci` binary is not available on the system. GPU grains will not be available.
[DEBUG   ] Please install 'virt-what' to improve results of the 'virtual' grain.
[DEBUG   ] Connecting to master. Attempt 1 of 1
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
[DEBUG   ] Generated random reconnect delay between '1000ms' and '11000ms' (5904)
[DEBUG   ] Setting zmq_reconnect_ivl to '5904ms'
[DEBUG   ] Setting zmq_reconnect_ivl_max to '11000ms'
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'clear')
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_2150d1a384_pub.ipc
[DEBUG   ] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_2150d1a384_pull.ipc
[DEBUG   ] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_2150d1a384_pull.ipc
[DEBUG   ] Sending event: tag = salt/auth/creds; data = {'_stamp': '2018-10-02T11:58:13.308831', 'creds': {'publish_port': 4505, 'aes': 'ZwDt8zohMtCSpFlgPL7gQovFEV1NyC0C0xsZOb1VBBVF/eYQofOXYK3g9UZbXbyzaIBwk2Zf/AM=', 'master_uri': 'tcp://10.101.10.56:4506'}, 'key': ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')}
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Determining pillar cache
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] LazyLoaded state.apply
[DEBUG   ] LazyLoaded saltutil.is_running
[DEBUG   ] LazyLoaded grains.get
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')


[INFO    ] Loading fresh modules for state activity
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] In saltenv 'base', looking at rel_path 'common/nslcd.sls' to resolve 'salt://common/nslcd.sls'
[DEBUG   ] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/common/nslcd.sls' to resolve 'salt://common/nslcd.sls'
[DEBUG   ] Fetching file from saltenv 'base', ** attempting ** 'salt://common/nslcd.sls'
[DEBUG   ] No dest file found
[INFO    ] Fetching file from saltenv 'base', ** done ** 'common/nslcd.sls'
[DEBUG   ] compile template: /var/cache/salt/minion/files/base/common/nslcd.sls
[DEBUG   ] Jinja search path: ['/var/cache/salt/minion/files/base']
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/common/nslcd.sls' using 'jinja' renderer: 0.00323104858398
[DEBUG   ] Rendered data from file: /var/cache/salt/minion/files/base/common/nslcd.sls:
####### REQUIREMENTS
include:
  - common.nsswitch


######## NSLCD AUTH MODULE full state
#### pkg
nslcd:
  pkg.installed:
    - name: nslcd
  service:
    - enable: True
    - running

#### config file
nslcd_config_file:
  file.managed:
    - name: /etc/nslcd.conf
    - source: salt://common/files/auth/nslcd.conf
    - template: jinja
    - user: root
    - group: root
    - mode: 640
    - require:
      - pkg: nslcd
    - watch_in:
      - service: nslcd

[DEBUG   ] LazyLoaded config.get
[DEBUG   ] Results of YAML rendering: 
OrderedDict([('include', ['common.nsswitch']), ('nslcd', OrderedDict([('pkg.installed', [OrderedDict([('name', 'nslcd')])]), ('service', [OrderedDict([('enable', True)]), 'running'])])), ('nslcd_config_file', OrderedDict([('file.managed', [OrderedDict([('name', '/etc/nslcd.conf')]), OrderedDict([('source', 'salt://common/files/auth/nslcd.conf')]), OrderedDict([('template', 'jinja')]), OrderedDict([('user', 'root')]), OrderedDict([('group', 'root')]), OrderedDict([('mode', 640)]), OrderedDict([('require', [OrderedDict([('pkg', 'nslcd')])])]), OrderedDict([('watch_in', [OrderedDict([('service', 'nslcd')])])])])]))])
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/common/nslcd.sls' using 'yaml' renderer: 0.00685811042786
[DEBUG   ] In saltenv 'base', looking at rel_path 'common/nsswitch.sls' to resolve 'salt://common/nsswitch.sls'
[DEBUG   ] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/common/nsswitch.sls' to resolve 'salt://common/nsswitch.sls'
[DEBUG   ] compile template: /var/cache/salt/minion/files/base/common/nsswitch.sls
[DEBUG   ] Jinja search path: ['/var/cache/salt/minion/files/base']
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/common/nsswitch.sls' using 'jinja' renderer: 0.000956058502197
[DEBUG   ] Rendered data from file: /var/cache/salt/minion/files/base/common/nsswitch.sls:
######## NSSWITCH config file
nsswitch_config_file:
    file.managed:
      - name: /etc/nsswitch.conf
      - source: salt://common/files/auth/nsswitch.conf
      - user: root
      - group: root
      - mode: 644
[DEBUG   ] Results of YAML rendering: 
OrderedDict([('nsswitch_config_file', OrderedDict([('file.managed', [OrderedDict([('name', '/etc/nsswitch.conf')]), OrderedDict([('source', 'salt://common/files/auth/nsswitch.conf')]), OrderedDict([('user', 'root')]), OrderedDict([('group', 'root')]), OrderedDict([('mode', 644)])])]))])
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/common/nsswitch.sls' using 'yaml' renderer: 0.00209498405457
[DEBUG   ] LazyLoaded file.managed
[INFO    ] Running state [/etc/nsswitch.conf] at time 13:58:56.766240
[INFO    ] Executing state file.managed for /etc/nsswitch.conf
[DEBUG   ] LazyLoaded file.user_to_uid
[DEBUG   ] LazyLoaded cp.hash_file
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
[DEBUG   ] In saltenv 'base', looking at rel_path 'common/files/auth/nsswitch.conf' to resolve 'salt://common/files/auth/nsswitch.conf'
[DEBUG   ] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/common/files/auth/nsswitch.conf' to resolve 'salt://common/files/auth/nsswitch.conf'
[INFO    ] File /etc/nsswitch.conf is in the correct state
[INFO    ] Completed state [/etc/nsswitch.conf] at time 13:58:56.785831 duration_in_ms=19.591
[DEBUG   ] Module PSGet: Only available on Windows systems
[DEBUG   ] You should upgrade pyOpenSSL to at least 0.15.1 to enable the full use of X509 extensions in the tls module
[DEBUG   ] LazyLoaded service.start
[DEBUG   ] LazyLoaded service.running
[DEBUG   ] LazyLoaded pkg.installed
[DEBUG   ] Module DSC: Only available on Windows systems
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Generating function for user module
[DEBUG   ] Could not LazyLoad pkg.ex_mod_init: 'pkg.ex_mod_init' is not available.
[INFO    ] Running state [nslcd] at time 13:58:57.237898
[INFO    ] Executing state pkg.installed for nslcd
[INFO    ] Executing command ['dpkg-query', '--showformat', '${Status} ${Package} ${Version} ${Architecture}\n', '-W'] in directory '/root'
[DEBUG   ] Could not LazyLoad pkg.normalize_name: 'pkg.normalize_name' is not available.
[INFO    ] All specified packages are already installed
[INFO    ] Completed state [nslcd] at time 13:58:57.674831 duration_in_ms=436.933
[INFO    ] Running state [/etc/nslcd.conf] at time 13:58:57.675092
[INFO    ] Executing state file.managed for /etc/nslcd.conf
[DEBUG   ] In saltenv 'base', looking at rel_path 'common/files/auth/nslcd.conf' to resolve 'salt://common/files/auth/nslcd.conf'
[DEBUG   ] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/common/files/auth/nslcd.conf' to resolve 'salt://common/files/auth/nslcd.conf'
[DEBUG   ] Jinja search path: ['/var/cache/salt/minion/files/base']
[INFO    ] File /etc/nslcd.conf is in the correct state
[INFO    ] Completed state [/etc/nslcd.conf] at time 13:58:57.696569 duration_in_ms=21.477
[INFO    ] Running state [nslcd] at time 13:58:57.696746
[INFO    ] Executing state service.running for nslcd
[INFO    ] Executing command ['systemctl', 'status', 'nslcd.service', '-n', '0'] in directory '/root'
[DEBUG   ] stdout: ● nslcd.service - LSB: LDAP connection daemon
   Loaded: loaded (/etc/init.d/nslcd)
   Active: active (running) since Thu 2018-09-27 17:36:06 CEST; 4 days ago
  Process: 26339 ExecStop=/etc/init.d/nslcd stop (code=exited, status=0/SUCCESS)
  Process: 15190 ExecStart=/etc/init.d/nslcd start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/nslcd.service
           └─15198 /usr/sbin/nslcd
[INFO    ] Executing command ['systemctl', 'is-active', 'nslcd.service'] in directory '/root'
[DEBUG   ] output: active
[INFO    ] Executing command ['systemctl', 'is-enabled', 'nslcd.service'] in directory '/root'
[DEBUG   ] output: enabled
[INFO    ] The service nslcd is already running
[INFO    ] Completed state [nslcd] at time 13:58:57.809551 duration_in_ms=112.804
[DEBUG   ] File /var/cache/salt/minion/accumulator/140172865696720 does not exist, no need to cleanup.
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
[DEBUG   ] LazyLoaded highstate.output
local:
----------
          ID: nsswitch_config_file
    Function: file.managed
        Name: /etc/nsswitch.conf
      Result: True
     Comment: File /etc/nsswitch.conf is in the correct state
     Started: 13:58:56.766240
    Duration: 19.591 ms
     Changes:   
----------
          ID: nslcd
    Function: pkg.installed
      Result: True
     Comment: All specified packages are already installed
     Started: 13:58:57.237898
    Duration: 436.933 ms
     Changes:   
----------
          ID: nslcd_config_file
    Function: file.managed
        Name: /etc/nslcd.conf
      Result: True
     Comment: File /etc/nslcd.conf is in the correct state
     Started: 13:58:57.675092
    Duration: 21.477 ms
     Changes:   
----------
          ID: nslcd
    Function: service.running
      Result: True
     Comment: The service nslcd is already running
     Started: 13:58:57.696747
    Duration: 112.804 ms
     Changes:   

Summary for local
------------
Succeeded: 4
Failed:    0
------------
Total states run:     4
Total run time: 590.805 ms

real	0m45.907s
user	0m1.751s
sys	0m0.752s

@terminalmage (Contributor)

I'm sorry, I mistyped above. Instead of setting log_level to that value, I needed you to set log_fmt_console. Doing that will make salt-call add in timestamps that will help us see where any slowdown may have occurred.

Could you repeat the above after setting log_fmt_console? Sorry again.
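
That is, the line to add in /etc/salt/minion is the same format string as above, just under the log_fmt_console key:

# /etc/salt/minion -- adds timestamps to salt-call console output
log_fmt_console: '%(asctime)s,%(msecs)03d [%(name)-17s:%(lineno)-4d][%(levelname)-8s][%(process)d] %(message)s'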

@cavepopo (Contributor, Author) commented Oct 3, 2018

Yep, I saw that, but I guessed you were looking for something specific in that log. Anyway, here is the timestamped log:

 root@opendj1 [~]: time salt-call -l debug state.apply common.nslcd
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: opendj1.admin.ptf2
09:47:51,055 [salt.utils.parsers:234 ][DEBUG   ][5147] Configuration file path: /etc/salt/minion
09:47:51,055 [salt.utils.verify:535 ][WARNING ][5147] Insecure logging configuration detected! Sensitive data may be logged.
09:47:51,056 [salt.loader      :697 ][DEBUG   ][5147] Grains refresh requested. Refreshing grains.
09:47:51,056 [salt.config      :2117][DEBUG   ][5147] Reading configuration from /etc/salt/minion
09:47:51,057 [salt.config      :2260][DEBUG   ][5147] Including configuration from '/etc/salt/minion.d/_schedule.conf'
09:47:51,057 [salt.config      :2117][DEBUG   ][5147] Reading configuration from /etc/salt/minion.d/_schedule.conf
09:47:51,510 [salt.loaded.int.grains.core                           :179 ][DEBUG   ][5147] The `lspci` binary is not available on the system. GPU grains will not be available.
09:47:51,538 [salt.loaded.int.grains.core                           :598 ][DEBUG   ][5147] Please install 'virt-what' to improve results of the 'virtual' grain.
09:47:51,603 [salt.loaded.int.grains.extra                          :70  ][DEBUG   ][5147] Loading static grains from /etc/salt/grains
09:47:51,632 [salt.minion                                           :674 ][DEBUG   ][5147] Connecting to master. Attempt 1 of 1
09:47:51,633 [salt.minion                                           :237 ][DEBUG   ][5147] Master URI: tcp://10.201.10.56:4506
09:47:51,646 [salt.crypt                                            :463 ][DEBUG   ][5147] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506')
09:47:51,652 [salt.transport.zeromq                                 :388 ][DEBUG   ][5147] Generated random reconnect delay between '1000ms' and '11000ms' (1183)
09:47:51,652 [salt.transport.zeromq                                 :391 ][DEBUG   ][5147] Setting zmq_reconnect_ivl to '1183ms'
09:47:51,652 [salt.transport.zeromq                                 :397 ][DEBUG   ][5147] Setting zmq_reconnect_ivl_max to '11000ms'
09:47:51,652 [salt.transport.zeromq                                 :128 ][DEBUG   ][5147] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506', 'clear')
09:47:51,652 [salt.transport.zeromq                                 :194 ][DEBUG   ][5147] Connecting the Minion to the Master URI (for the return server): tcp://10.201.10.56:4506
09:47:51,653 [salt.transport.zeromq                                 :1029][DEBUG   ][5147] Trying to connect to: tcp://10.201.10.56:4506
09:47:51,653 [salt.crypt                                            :207 ][DEBUG   ][5147] salt.crypt.get_rsa_pub_key: Loading public key
09:47:51,668 [salt.crypt                                            :847 ][DEBUG   ][5147] Decrypting the current master AES key
09:47:51,668 [salt.crypt                                            :199 ][DEBUG   ][5147] salt.crypt.get_rsa_key: Loading private key
09:47:51,668 [salt.crypt                                            :179 ][DEBUG   ][5147] salt.crypt._get_key_with_evict: Loading private key
09:47:51,668 [salt.crypt                                            :776 ][DEBUG   ][5147] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:47:51,696 [salt.crypt                                            :207 ][DEBUG   ][5147] salt.crypt.get_rsa_pub_key: Loading public key
09:47:51,724 [salt.transport.zeromq                                 :437 ][DEBUG   ][5147] Connecting the Minion to the Master publish port, using the URI: tcp://10.201.10.56:4505
09:47:51,724 [salt.crypt                                            :199 ][DEBUG   ][5147] salt.crypt.get_rsa_key: Loading private key
09:47:51,724 [salt.crypt                                            :776 ][DEBUG   ][5147] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:47:51,751 [salt.pillar                                           :58  ][DEBUG   ][5147] Determining pillar cache
09:47:51,751 [salt.transport.zeromq                                 :128 ][DEBUG   ][5147] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506', u'aes')
09:47:51,752 [salt.crypt                                            :463 ][DEBUG   ][5147] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506')
09:47:51,752 [salt.transport.zeromq                                 :194 ][DEBUG   ][5147] Connecting the Minion to the Master URI (for the return server): tcp://10.201.10.56:4506
09:47:51,756 [salt.transport.zeromq                                 :1029][DEBUG   ][5147] Trying to connect to: tcp://10.201.10.56:4506
09:47:51,981 [salt.crypt                                            :199 ][DEBUG   ][5147] salt.crypt.get_rsa_key: Loading private key
09:47:51,981 [salt.crypt                                            :776 ][DEBUG   ][5147] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:47:52,039 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded jinja.render
09:47:52,039 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded yaml.render
09:47:52,042 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded state.apply
09:47:52,044 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded saltutil.is_running
09:47:52,044 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded grains.get
09:47:52,049 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded config.get
09:47:52,053 [salt.transport.zeromq                                 :128 ][DEBUG   ][5147] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506', u'aes')
09:47:52,053 [salt.crypt                                            :463 ][DEBUG   ][5147] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506')
09:47:52,054 [salt.transport.zeromq                                 :194 ][DEBUG   ][5147] Connecting the Minion to the Master URI (for the return server): tcp://10.201.10.56:4506
09:47:52,058 [salt.transport.zeromq                                 :1029][DEBUG   ][5147] Trying to connect to: tcp://10.201.10.56:4506


09:48:36,943 [salt.state                                            :923 ][INFO    ][5147] Loading fresh modules for state activity
09:48:36,976 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded jinja.render
09:48:36,976 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded yaml.render
09:48:36,989 [salt.fileclient                                       :1124][DEBUG   ][5147] In saltenv 'base', looking at rel_path 'common/nslcd.sls' to resolve 'salt://common/nslcd.sls'
09:48:36,990 [salt.fileclient                                       :1132][DEBUG   ][5147] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/common/nslcd.sls' to resolve 'salt://common/nslcd.sls'
09:48:36,990 [salt.template                                         :51  ][DEBUG   ][5147] compile template: /var/cache/salt/minion/files/base/common/nslcd.sls
09:48:36,991 [salt.utils.jinja                                      :69  ][DEBUG   ][5147] Jinja search path: [u'/var/cache/salt/minion/files/base']
09:48:36,996 [salt.transport.zeromq                                 :128 ][DEBUG   ][5147] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506', u'aes')
09:48:36,996 [salt.crypt                                            :463 ][DEBUG   ][5147] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506')
09:48:36,997 [salt.transport.zeromq                                 :194 ][DEBUG   ][5147] Connecting the Minion to the Master URI (for the return server): tcp://10.201.10.56:4506
09:48:36,998 [salt.transport.zeromq                                 :1029][DEBUG   ][5147] Trying to connect to: tcp://10.201.10.56:4506
09:48:36,998 [salt.utils.templates                                  :345 ][DEBUG   ][5147] Jinja2 environment trim_blocks was set to True by jinja_sls_env
09:48:36,998 [salt.utils.templates                                  :345 ][DEBUG   ][5147] Jinja2 environment lstrip_blocks was set to True by jinja_sls_env
09:48:37,004 [salt.template                                         :26  ][PROFILE ][5147] Time (in seconds) to render '/var/cache/salt/minion/files/base/common/nslcd.sls' using 'jinja' renderer: 0.0129718780518
09:48:37,004 [salt.template                                         :113 ][DEBUG   ][5147] Rendered data from file: /var/cache/salt/minion/files/base/common/nslcd.sls:
####### REQUIREMENTS
include:
  - common.nsswitch


######## NSLCD AUTH MODULE full state
#### pkg
nslcd:
  pkg.installed:
    - name: nslcd
  service:
    - enable: True
    - running

#### config file
nslcd_config_file:
  file.managed:
    - name: /etc/nslcd.conf
    - source: salt://common/files/auth/nslcd.conf
    - template: jinja
    - user: root
    - group: root
    - mode: 640
    - require:
      - pkg: nslcd
    - watch_in:
      - service: nslcd

09:48:37,012 [salt.loaded.int.render.yaml                           :68  ][DEBUG   ][5147] Results of YAML rendering: 
OrderedDict([(u'include', [u'common.nsswitch']), (u'nslcd', OrderedDict([(u'pkg.installed', [OrderedDict([(u'name', u'nslcd')])]), (u'service', [OrderedDict([(u'enable', True)]), u'running'])])), (u'nslcd_config_file', OrderedDict([(u'file.managed', [OrderedDict([(u'name', u'/etc/nslcd.conf')]), OrderedDict([(u'source', u'salt://common/files/auth/nslcd.conf')]), OrderedDict([(u'template', u'jinja')]), OrderedDict([(u'user', u'root')]), OrderedDict([(u'group', u'root')]), OrderedDict([(u'mode', 640)]), OrderedDict([(u'require', [OrderedDict([(u'pkg', u'nslcd')])])]), OrderedDict([(u'watch_in', [OrderedDict([(u'service', u'nslcd')])])])])]))])
09:48:37,012 [salt.template                                         :26  ][PROFILE ][5147] Time (in seconds) to render '/var/cache/salt/minion/files/base/common/nslcd.sls' using 'yaml' renderer: 0.00806307792664
09:48:37,025 [salt.fileclient                                       :1124][DEBUG   ][5147] In saltenv 'base', looking at rel_path 'common/nsswitch.sls' to resolve 'salt://common/nsswitch.sls'
09:48:37,025 [salt.fileclient                                       :1132][DEBUG   ][5147] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/common/nsswitch.sls' to resolve 'salt://common/nsswitch.sls'
09:48:37,025 [salt.template                                         :51  ][DEBUG   ][5147] compile template: /var/cache/salt/minion/files/base/common/nsswitch.sls
09:48:37,026 [salt.utils.jinja                                      :69  ][DEBUG   ][5147] Jinja search path: [u'/var/cache/salt/minion/files/base']
09:48:37,030 [salt.transport.zeromq                                 :128 ][DEBUG   ][5147] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506', u'aes')
09:48:37,030 [salt.crypt                                            :463 ][DEBUG   ][5147] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506')
09:48:37,031 [salt.transport.zeromq                                 :194 ][DEBUG   ][5147] Connecting the Minion to the Master URI (for the return server): tcp://10.201.10.56:4506
09:48:37,033 [salt.transport.zeromq                                 :1029][DEBUG   ][5147] Trying to connect to: tcp://10.201.10.56:4506
09:48:37,033 [salt.utils.templates                                  :345 ][DEBUG   ][5147] Jinja2 environment trim_blocks was set to True by jinja_sls_env
09:48:37,033 [salt.utils.templates                                  :345 ][DEBUG   ][5147] Jinja2 environment lstrip_blocks was set to True by jinja_sls_env
09:48:37,034 [salt.template                                         :26  ][PROFILE ][5147] Time (in seconds) to render '/var/cache/salt/minion/files/base/common/nsswitch.sls' using 'jinja' renderer: 0.00836706161499
09:48:37,034 [salt.template                                         :113 ][DEBUG   ][5147] Rendered data from file: /var/cache/salt/minion/files/base/common/nsswitch.sls:
######## NSSWITCH config file
nsswitch_config_file:
    file.managed:
      - name: /etc/nsswitch.conf
      - source: salt://common/files/auth/nsswitch.conf
      - user: root
      - group: root
      - mode: 644
09:48:37,036 [salt.loaded.int.render.yaml                           :68  ][DEBUG   ][5147] Results of YAML rendering: 
OrderedDict([(u'nsswitch_config_file', OrderedDict([(u'file.managed', [OrderedDict([(u'name', u'/etc/nsswitch.conf')]), OrderedDict([(u'source', u'salt://common/files/auth/nsswitch.conf')]), OrderedDict([(u'user', u'root')]), OrderedDict([(u'group', u'root')]), OrderedDict([(u'mode', 644)])])]))])
09:48:37,036 [salt.template                                         :26  ][PROFILE ][5147] Time (in seconds) to render '/var/cache/salt/minion/files/base/common/nsswitch.sls' using 'yaml' renderer: 0.00191283226013
09:48:37,037 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded config.option
09:48:37,043 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded file.managed
09:48:37,043 [salt.state                                            :1789][INFO    ][5147] Running state [/etc/nsswitch.conf] at time 09:48:37.043616
09:48:37,043 [salt.state                                            :1822][INFO    ][5147] Executing state file.managed for [/etc/nsswitch.conf]
09:48:37,049 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded file.user_to_uid
09:48:37,050 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded cp.hash_file
09:48:37,054 [salt.transport.zeromq                                 :128 ][DEBUG   ][5147] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506', u'aes')
09:48:37,054 [salt.crypt                                            :463 ][DEBUG   ][5147] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506')
09:48:37,055 [salt.transport.zeromq                                 :194 ][DEBUG   ][5147] Connecting the Minion to the Master URI (for the return server): tcp://10.201.10.56:4506
09:48:37,055 [salt.transport.zeromq                                 :1029][DEBUG   ][5147] Trying to connect to: tcp://10.201.10.56:4506
09:48:37,072 [salt.fileclient                                       :1124][DEBUG   ][5147] In saltenv 'base', looking at rel_path 'common/files/auth/nsswitch.conf' to resolve 'salt://common/files/auth/nsswitch.conf'
09:48:37,072 [salt.fileclient                                       :1132][DEBUG   ][5147] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/common/files/auth/nsswitch.conf' to resolve 'salt://common/files/auth/nsswitch.conf'
09:48:37,074 [salt.state                                            :308 ][INFO    ][5147] File /etc/nsswitch.conf is in the correct state
09:48:37,074 [salt.state                                            :1968][INFO    ][5147] Completed state [/etc/nsswitch.conf] at time 09:48:37.074297 (duration_in_ms=30.685)
09:48:37,080 [salt.utils.lazy                                       :100 ][DEBUG   ][5147] LazyLoaded systemd.booted
09:48:37,462 [salt.utils.lazy                                                  :100 ][DEBUG   ][5147] LazyLoaded service.start
09:48:37,462 [salt.utils.lazy                                                  :100 ][DEBUG   ][5147] LazyLoaded service.running
09:48:37,464 [salt.utils.lazy                                                  :100 ][DEBUG   ][5147] LazyLoaded pkg.installed
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.iptables
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.puppet_resource
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.mount_point
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.group
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.socket
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.service
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.package
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.process
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.sudo
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.pip_package
09:48:37,477 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.sysctl
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.system_info
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.ansible
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.command
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.facter
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.file
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.interface
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.supervisor
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.salt
09:48:37,478 [salt.loaded.int.module.testinframod                              :220 ][DEBUG   ][5147] Generating function for testinfra.user
09:48:37,481 [salt.loaded.int.module.tls                                       :151 ][DEBUG   ][5147] You should upgrade pyOpenSSL to at least 0.15.1 to enable the full use of X509 extensions in the tls module
09:48:37,593 [salt.loaded.int.module.win_dsc                                   :43  ][DEBUG   ][5147] DSC: Only available on Windows systems
09:48:37,600 [salt.loaded.int.module.win_psget                                 :36  ][DEBUG   ][5147] Module PSGet: Only available on Windows systems
09:48:37,632 [salt.utils.lazy                                                  :103 ][DEBUG   ][5147] Could not LazyLoad pkg.ex_mod_init: 'pkg.ex_mod_init' is not available.
09:48:37,632 [salt.state                                                       :1789][INFO    ][5147] Running state [nslcd] at time 09:48:37.632681
09:48:37,632 [salt.state                                                       :1822][INFO    ][5147] Executing state pkg.installed for [nslcd]
09:48:37,639 [salt.utils.lazy                                                  :103 ][DEBUG   ][5147] Could not LazyLoad pkg.resolve_capabilities: 'pkg.resolve_capabilities' is not available.
09:48:37,639 [salt.loaded.int.module.cmdmod                                    :392 ][INFO    ][5147] Executing command [u'dpkg-query', u'--showformat', u'${Status} ${Package} ${Version} ${Architecture}', u'-W'] in directory '/root'
09:48:38,004 [salt.utils.lazy                                                  :103 ][DEBUG   ][5147] Could not LazyLoad pkg.normalize_name: 'pkg.normalize_name' is not available.
09:48:38,004 [salt.state                                                       :308 ][INFO    ][5147] All specified packages are already installed
09:48:38,005 [salt.state                                                       :1968][INFO    ][5147] Completed state [nslcd] at time 09:48:38.004964 (duration_in_ms=372.284)
09:48:38,005 [salt.state                                                       :1789][INFO    ][5147] Running state [/etc/nslcd.conf] at time 09:48:38.005292
09:48:38,005 [salt.state                                                       :1822][INFO    ][5147] Executing state file.managed for [/etc/nslcd.conf]
09:48:38,023 [salt.fileclient                                                  :1124][DEBUG   ][5147] In saltenv 'base', looking at rel_path 'common/files/auth/nslcd.conf' to resolve 'salt://common/files/auth/nslcd.conf'
09:48:38,024 [salt.fileclient                                                  :1132][DEBUG   ][5147] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/common/files/auth/nslcd.conf' to resolve 'salt://common/files/auth/nslcd.conf'
09:48:38,024 [salt.utils.jinja                                                 :69  ][DEBUG   ][5147] Jinja search path: [u'/var/cache/salt/minion/files/base']
09:48:38,029 [salt.transport.zeromq                                            :128 ][DEBUG   ][5147] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506', u'aes')
09:48:38,029 [salt.crypt                                                       :463 ][DEBUG   ][5147] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506')
09:48:38,029 [salt.transport.zeromq                                            :194 ][DEBUG   ][5147] Connecting the Minion to the Master URI (for the return server): tcp://10.201.10.56:4506
09:48:38,030 [salt.transport.zeromq                                            :1029][DEBUG   ][5147] Trying to connect to: tcp://10.201.10.56:4506
09:48:38,030 [salt.utils.templates                                             :345 ][DEBUG   ][5147] Jinja2 environment trim_blocks was set to True by jinja_env
09:48:38,030 [salt.utils.templates                                             :345 ][DEBUG   ][5147] Jinja2 environment lstrip_blocks was set to True by jinja_env
09:48:38,039 [salt.state                                                       :308 ][INFO    ][5147] File /etc/nslcd.conf is in the correct state
09:48:38,039 [salt.state                                                       :1968][INFO    ][5147] Completed state [/etc/nslcd.conf] at time 09:48:38.039289 (duration_in_ms=33.997)
09:48:38,039 [salt.state                                                       :1789][INFO    ][5147] Running state [nslcd] at time 09:48:38.039535
09:48:38,039 [salt.state                                                       :1822][INFO    ][5147] Executing state service.running for [nslcd]
09:48:38,040 [salt.loaded.int.module.cmdmod                                    :392 ][INFO    ][5147] Executing command [u'systemctl', 'status', u'nslcd.service', u'-n', u'0'] in directory '/root'
09:48:38,070 [salt.loaded.int.module.cmdmod                                    :755 ][DEBUG   ][5147] stdout: ● nslcd.service - LSB: LDAP connection daemon
   Loaded: loaded (/etc/init.d/nslcd)
   Active: active (running) since Mon 2018-10-01 11:14:56 CEST; 1 day 22h ago
   CGroup: /system.slice/nslcd.service
           └─7570 /usr/sbin/nslcd
09:48:38,100 [salt.loaded.int.module.cmdmod                                    :392 ][INFO    ][5147] Executing command [u'systemctl', 'is-active', u'nslcd.service'] in directory '/root'
09:48:38,128 [salt.loaded.int.module.cmdmod                                    :755 ][DEBUG   ][5147] stdout: active
09:48:38,129 [salt.loaded.int.module.cmdmod                                    :392 ][INFO    ][5147] Executing command [u'systemctl', 'is-enabled', u'nslcd.service'] in directory '/root'
09:48:38,156 [salt.loaded.int.module.cmdmod                                    :755 ][DEBUG   ][5147] stdout: enabled
09:48:38,156 [salt.state                                                       :308 ][INFO    ][5147] The service nslcd is already running
09:48:38,156 [salt.state                                                       :1968][INFO    ][5147] Completed state [nslcd] at time 09:48:38.156793 (duration_in_ms=117.258)
09:48:38,157 [salt.state                                                       :2768][DEBUG   ][5147] File /var/cache/salt/minion/accumulator/139738542539536 does not exist, no need to cleanup
09:48:38,158 [salt.utils.lazy                                                  :100 ][DEBUG   ][5147] LazyLoaded state.check_result
09:48:38,158 [salt.transport.zeromq                                            :128 ][DEBUG   ][5147] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506', u'aes')
09:48:38,159 [salt.crypt                                                       :463 ][DEBUG   ][5147] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf2', u'tcp://10.201.10.56:4506')
09:48:38,159 [salt.transport.zeromq                                            :194 ][DEBUG   ][5147] Connecting the Minion to the Master URI (for the return server): tcp://10.201.10.56:4506
09:48:38,160 [salt.transport.zeromq                                            :1029][DEBUG   ][5147] Trying to connect to: tcp://10.201.10.56:4506
09:48:38,170 [salt.utils.lazy                                                  :100 ][DEBUG   ][5147] LazyLoaded highstate.output
local:
----------
          ID: nsswitch_config_file
    Function: file.managed
        Name: /etc/nsswitch.conf
      Result: True
     Comment: File /etc/nsswitch.conf is in the correct state
     Started: 09:48:37.043612
    Duration: 30.685 ms
     Changes:   
----------
          ID: nslcd
    Function: pkg.installed
      Result: True
     Comment: All specified packages are already installed
     Started: 09:48:37.632680
    Duration: 372.284 ms
     Changes:   
----------
          ID: nslcd_config_file
    Function: file.managed
        Name: /etc/nslcd.conf
      Result: True
     Comment: File /etc/nslcd.conf is in the correct state
     Started: 09:48:38.005292
    Duration: 33.997 ms
     Changes:   
----------
          ID: nslcd
    Function: service.running
      Result: True
     Comment: The service nslcd is already running
     Started: 09:48:38.039535
    Duration: 117.258 ms
     Changes:   

Summary for local
------------
Succeeded: 4
Failed:    0
------------
Total states run:     4
Total run time: 554.224 ms

real	0m47.891s
user	0m2.017s
sys	0m0.716s

I can see that 90% of the total time is spent waiting for the connection to the master's port 4506... The ping to the master is not bad:

root@opendj1 [~]: ping 10.201.10.56
PING 10.201.10.56 (10.201.10.56) 56(84) bytes of data.
64 bytes from 10.201.10.56: icmp_seq=1 ttl=64 time=0.284 ms
64 bytes from 10.201.10.56: icmp_seq=2 ttl=64 time=0.087 ms
64 bytes from 10.201.10.56: icmp_seq=3 ttl=64 time=0.122 ms
64 bytes from 10.201.10.56: icmp_seq=4 ttl=64 time=0.117 ms
64 bytes from 10.201.10.56: icmp_seq=5 ttl=64 time=0.097 ms
64 bytes from 10.201.10.56: icmp_seq=6 ttl=64 time=0.093 ms
64 bytes from 10.201.10.56: icmp_seq=7 ttl=64 time=0.138 ms

A telnet to this port is instantaneous...

Any suggestions?

@terminalmage (Contributor)

It looks like this may have nothing to do with the fileserver.

Can you try a couple of salt-call commands that do not involve a fileclient request (e.g. salt-call test.ping -l debug or salt-call pkg.version bash -l debug) and see if you get a similar delay? If you do see a similar delay, try those same salt-call commands with -l trace and see if we get anything additional.

@cavepopo (Contributor, Author) commented Oct 8, 2018

Hi,

The delays involved here are definitely not the same; see:

root@opendj1 [~]: time salt-call pkg.version bash -l debug
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: opendj1.admin.ptf1
09:21:33,535 [salt.utils.parsers:225 ][DEBUG   ][3867] Configuration file path: /etc/salt/minion
09:21:33,536 [salt.utils.verify:524 ][WARNING ][3867] Insecure logging configuration detected! Sensitive data may be logged.
09:21:33,536 [salt.config      :1764][DEBUG   ][3867] Reading configuration from /etc/salt/minion
09:21:33,537 [salt.config      :1913][DEBUG   ][3867] Including configuration from '/etc/salt/minion.d/_schedule.conf'
09:21:33,537 [salt.config      :1764][DEBUG   ][3867] Reading configuration from /etc/salt/minion.d/_schedule.conf
09:21:33,811 [salt.loaded.int.grains.core                          :172 ][DEBUG   ][3867] The `lspci` binary is not available on the system. GPU grains will not be available.
09:21:34,050 [salt.loaded.int.grains.core                          :501 ][DEBUG   ][3867] Please install 'virt-what' to improve results of the 'virtual' grain.
09:21:34,140 [salt.minion                                          :614 ][DEBUG   ][3867] Connecting to master. Attempt 1 of 1
09:21:34,149 [salt.crypt                                           :377 ][DEBUG   ][3867] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
09:21:34,154 [salt.transport.zeromq                                :334 ][DEBUG   ][3867] Generated random reconnect delay between '1000ms' and '11000ms' (10215)
09:21:34,154 [salt.transport.zeromq                                :337 ][DEBUG   ][3867] Setting zmq_reconnect_ivl to '10215ms'
09:21:34,154 [salt.transport.zeromq                                :342 ][DEBUG   ][3867] Setting zmq_reconnect_ivl_max to '11000ms'
09:21:34,155 [salt.transport.zeromq                                :85  ][DEBUG   ][3867] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'clear')
09:21:34,168 [salt.crypt                                           :758 ][DEBUG   ][3867] Decrypting the current master AES key
09:21:34,169 [salt.crypt                                           :682 ][DEBUG   ][3867] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:21:34,175 [salt.utils.event                                     :311 ][DEBUG   ][3867] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_2150d1a384_pub.ipc
09:21:34,175 [salt.utils.event                                     :314 ][DEBUG   ][3867] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_2150d1a384_pull.ipc
09:21:34,176 [salt.transport.ipc                                   :253 ][DEBUG   ][3867] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_2150d1a384_pull.ipc
09:21:34,176 [salt.utils.event                                     :740 ][DEBUG   ][3867] Sending event: tag = salt/auth/creds; data = {'_stamp': '2018-10-08T07:21:34.176722', 'creds': {'publish_port': 4505, 'aes': '75+B1PxGs1vmB8fP3clmI2rUx/PtMeFEFRRikpPOgLbdhuKbEQxvHMqz1a2yWzqGIEpcXbDMl9Y=', 'master_uri': 'tcp://10.101.10.56:4506'}, 'key': ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')}
09:21:34,178 [salt.crypt                                           :682 ][DEBUG   ][3867] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:21:34,181 [salt.pillar                                          :47  ][DEBUG   ][3867] Determining pillar cache
09:21:34,181 [salt.transport.zeromq                                :85  ][DEBUG   ][3867] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'aes')
09:21:34,182 [salt.crypt                                           :377 ][DEBUG   ][3867] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
09:21:34,401 [salt.crypt                                           :682 ][DEBUG   ][3867] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:21:34,421 [salt.utils.lazy                                      :97  ][DEBUG   ][3867] LazyLoaded jinja.render
09:21:34,422 [salt.utils.lazy                                      :97  ][DEBUG   ][3867] LazyLoaded yaml.render
09:21:34,457 [salt.utils.lazy                                      :97  ][DEBUG   ][3867] LazyLoaded pkg.version
09:21:34,458 [salt.utils.lazy                                      :97  ][DEBUG   ][3867] LazyLoaded pkg_resource.version
09:21:34,459 [salt.utils.lazy                                      :97  ][DEBUG   ][3867] LazyLoaded cmd.run_stdout
09:21:34,460 [salt.loaded.int.module.cmdmod                        :370 ][INFO    ][3867] Executing command ['dpkg-query', '--showformat', '${Status} ${Package} ${Version} ${Architecture}\n', '-W'] in directory '/root'
09:21:34,853 [salt.transport.zeromq                                :85  ][DEBUG   ][3867] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'aes')
09:21:34,853 [salt.crypt                                           :377 ][DEBUG   ][3867] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
09:21:34,870 [salt.utils.lazy                                      :97  ][DEBUG   ][3867] LazyLoaded nested.output
local:
    4.3-11+deb8u1

real	0m2.023s
user	0m1.094s
sys	0m0.623s

And :

root@opendj1 [~]: time salt-call test.ping -l debug
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: opendj1.admin.ptf1
09:21:42,447 [salt.utils.parsers:225 ][DEBUG   ][3917] Configuration file path: /etc/salt/minion
09:21:42,447 [salt.utils.verify:524 ][WARNING ][3917] Insecure logging configuration detected! Sensitive data may be logged.
09:21:42,447 [salt.config      :1764][DEBUG   ][3917] Reading configuration from /etc/salt/minion
09:21:42,448 [salt.config      :1913][DEBUG   ][3917] Including configuration from '/etc/salt/minion.d/_schedule.conf'
09:21:42,448 [salt.config      :1764][DEBUG   ][3917] Reading configuration from /etc/salt/minion.d/_schedule.conf
09:21:42,723 [salt.loaded.int.grains.core                          :172 ][DEBUG   ][3917] The `lspci` binary is not available on the system. GPU grains will not be available.
09:21:42,951 [salt.loaded.int.grains.core                          :501 ][DEBUG   ][3917] Please install 'virt-what' to improve results of the 'virtual' grain.
09:21:43,042 [salt.minion                                          :614 ][DEBUG   ][3917] Connecting to master. Attempt 1 of 1
09:21:43,051 [salt.crypt                                           :377 ][DEBUG   ][3917] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
09:21:43,056 [salt.transport.zeromq                                :334 ][DEBUG   ][3917] Generated random reconnect delay between '1000ms' and '11000ms' (2536)
09:21:43,057 [salt.transport.zeromq                                :337 ][DEBUG   ][3917] Setting zmq_reconnect_ivl to '2536ms'
09:21:43,057 [salt.transport.zeromq                                :342 ][DEBUG   ][3917] Setting zmq_reconnect_ivl_max to '11000ms'
09:21:43,058 [salt.transport.zeromq                                :85  ][DEBUG   ][3917] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'clear')
09:21:43,072 [salt.crypt                                           :758 ][DEBUG   ][3917] Decrypting the current master AES key
09:21:43,073 [salt.crypt                                           :682 ][DEBUG   ][3917] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:21:43,081 [salt.utils.event                                     :311 ][DEBUG   ][3917] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_2150d1a384_pub.ipc
09:21:43,081 [salt.utils.event                                     :314 ][DEBUG   ][3917] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_2150d1a384_pull.ipc
09:21:43,081 [salt.transport.ipc                                   :253 ][DEBUG   ][3917] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_2150d1a384_pull.ipc
09:21:43,082 [salt.utils.event                                     :740 ][DEBUG   ][3917] Sending event: tag = salt/auth/creds; data = {'_stamp': '2018-10-08T07:21:43.081940', 'creds': {'publish_port': 4505, 'aes': '75+B1PxGs1vmB8fP3clmI2rUx/PtMeFEFRRikpPOgLbdhuKbEQxvHMqz1a2yWzqGIEpcXbDMl9Y=', 'master_uri': 'tcp://10.101.10.56:4506'}, 'key': ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')}
09:21:43,083 [salt.crypt                                           :682 ][DEBUG   ][3917] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:21:43,087 [salt.pillar                                          :47  ][DEBUG   ][3917] Determining pillar cache
09:21:43,087 [salt.transport.zeromq                                :85  ][DEBUG   ][3917] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'aes')
09:21:43,087 [salt.crypt                                           :377 ][DEBUG   ][3917] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
09:21:43,300 [salt.crypt                                           :682 ][DEBUG   ][3917] Loaded minion key: /etc/salt/pki/minion/minion.pem
09:21:43,322 [salt.utils.lazy                                      :97  ][DEBUG   ][3917] LazyLoaded jinja.render
09:21:43,322 [salt.utils.lazy                                      :97  ][DEBUG   ][3917] LazyLoaded yaml.render
09:21:43,349 [salt.utils.lazy                                      :97  ][DEBUG   ][3917] LazyLoaded test.ping
09:21:43,350 [salt.transport.zeromq                                :85  ][DEBUG   ][3917] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506', 'aes')
09:21:43,350 [salt.crypt                                           :377 ][DEBUG   ][3917] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'opendj1.admin.ptf1', 'tcp://10.101.10.56:4506')
09:21:43,366 [salt.utils.lazy                                      :97  ][DEBUG   ][3917] LazyLoaded nested.output
local:
    True

real	0m1.620s
user	0m0.849s
sys	0m0.494s

@terminalmage
Copy link
Contributor

OK, this does seem to be specific to fileclient requests. And it doesn't seem to be a connection latency issue either, since the test.ping returned quickly.

Can you please up the verbosity in the master log? You can do this by setting log_level_logfile: debug, then restarting the salt-master service. You can then initiate your test using either salt or salt-call; it doesn't matter. I don't think that you'll get what you need logged to the console using salt -l debug, since the master's worker threads (which carry out the fileclient requests) don't log to the console. Hence the change to the log level and examining /var/log/salt/master.

The master log file should already include timestamps; its default log format matches the one I asked you to set for the salt-call testing.

@cavepopo
Copy link
Contributor Author

Here we go :
https://gist.github.com/cavepopo/49788f84223052364a22eea98a2f8f36
Thanks a lot for your time @terminalmage

@terminalmage
Copy link
Contributor

OK, according to the log, the job was published to the minion at 19:46:04, but the minion didn't even start running states until 19:47:05, almost exactly one minute later. Once started, they took just over 1 second to run (19:47:05.288697 to 19:47:06.426968), which suggests that fileclient requests are being fulfilled quickly and are not what is causing the delay.

Can you bump up the loglevel on the minion and repeat the same test, and then check /var/log/salt/minion to see at which time it received the job from the master and at which time it begins to run the states? If we compare that to the master log from the same job, then we can begin narrowing down where the delay occurs. Right now it's unclear whether the delay is in the minion receiving the job, or whether it occurs during the state run.

I know I had you try test.ping and a couple other examples before, but those didn't involve running states. Could you also try adding an SLS file to your gitfs server with the following contents?

successful_test:
  test.succeed_with_changes

This will use one of our test states which doesn't actually do anything (and is used simply to test the state compiler).

In the past we have had issues with certain grains causing a timeout and delaying jobs. Are you using any custom-written grains, particularly ones which make outbound network connections?
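
For illustration, a grain along these lines is the usual offender (the module name, grain name, and hostname here are all made up, not taken from your setup):

# Hypothetical custom grains module, e.g. dropped in the fileserver as _grains/site_info.py.
# Every grains refresh runs this function, so if the remote host is unreachable
# the whole job waits for the socket timeout before anything else happens.
import logging
import socket

log = logging.getLogger(__name__)


def datacenter():
    '''
    Ask an internal metadata service which datacenter this minion is in.
    '''
    try:
        sock = socket.create_connection(('metadata.internal.example.com', 80), timeout=60)
        sock.close()
        return {'datacenter': 'dc1'}
    except (socket.timeout, socket.error):
        log.debug('metadata service unreachable, skipping datacenter grain')
        return {}

With a 60-second timeout, an unreachable metadata host stalls every job by a minute even though nothing in the job itself is slow, which is why it is worth ruling out.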

@cavepopo
Copy link
Contributor Author

Wow, that's a lot of log to handle, and I am not sure where to look for a job starting (either on the master or the minion). Do you mind checking it? I really don't want to mess up the diagnosis with a wrong analysis...

Let's first start with the very same state.apply :
master log = https://gist.github.com/cavepopo/a2feff43b686a02e87e0096d0b951612
minion log = https://gist.github.com/cavepopo/c778f83eb4a867b8fc3a37d4911cba61

Now the test state you asked for :
master log = https://gist.github.com/cavepopo/7c9f2b4f0c1c327734610b9a0d74478a
minion log = https://gist.github.com/cavepopo/e7a9632044ca8263795b6963156050c6

Now regarding your last question: nope, no hand-written grains here. I am defining custom grains for some minions, but they contain only strings or ints; they are totally simple and defined inside the /etc/salt/grains file on the minions. There are no custom grains at all on the server on which the above tests were run.

Hope you'll get the info you needed, thanks again for your time.

@terminalmage
Copy link
Contributor

terminalmage commented Oct 17, 2018

OK, the delay appears to come after the minion auths to the master, but it's unclear if the auth is what is hanging. When we see delays like this, they are usually related to grains in some way. In the past we have had to remove grains that made outbound connections, since any latency in connecting to those hosts will delay the job while the grains are unable to finish.

If you run a salt-call using -l trace, it will emit a separate log message at the TRACE loglevel for each grains function that is run. If we see a delay between two of these functions, that will suggest that this is somehow related to grains.

@cavepopo
Copy link
Contributor Author

cavepopo commented Oct 19, 2018

Here we go (sorry for the delay) :
https://gist.github.com/cavepopo/5f25a74d0b74102bc7d7bff9b669afb4

Thank you @terminalmage

@terminalmage
Copy link
Contributor

@cavepopo OK, it's definitely not the grains, and the trace logging shows that we did complete the auth. Given that we loaded the minion functions right after the delay ended, the next place I would like to look is when the minion requests Pillar data as it runs states. Would you be able to find salt/state.py on the minion (likely somewhere like /usr/lib/python2.7/site-packages/salt/state.py) and add in some logging before and after this block? So, this block of code would now resemble something like:

        log.debug('Gathering pillar data for state run')
        if initial_pillar and not self._pillar_override:
            self.opts['pillar'] = initial_pillar
        else:
            # Compile pillar data
            self.opts['pillar'] = self._gather_pillar()
            # Reapply overrides on top of compiled pillar
            if self._pillar_override:
                self.opts['pillar'] = salt.utils.dictupdate.merge(
                    self.opts['pillar'],
                    self._pillar_override,
                    self.opts.get('pillar_source_merging_strategy', 'smart'),
                    self.opts.get('renderer', 'yaml'),
                    self.opts.get('pillar_merge_lists', False))
        log.debug('Finished gathering pillar data for state run')

Once you do this, if you can repeat that salt-call, with debug-level logging this time, this will help tell us if the delay happens waiting for pillar data from the Master.

Thanks for continuing to bear with me as we troubleshoot, I unfortunately can't reproduce this behavior on my end.

@cavepopo cavepopo reopened this Oct 20, 2018
@cavepopo
Copy link
Contributor Author

cavepopo commented Oct 20, 2018

Here, the minion log (full log https://gist.github.com/cavepopo/6c9c2e67a556655a4621222bc061b64d) :

11:10:29,961 [salt.crypt                                            :463 ][DEBUG   ][29061] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'opendj1.admin.ptf1', u'tcp://10.101.10.56:4506')
11:10:29,962 [salt.transport.zeromq                                 :194 ][DEBUG   ][29061] Connecting the Minion to the Master URI (for the return server): tcp://10.101.10.56:4506
11:10:29,966 [salt.transport.zeromq                                 :1029][DEBUG   ][29061] Trying to connect to: tcp://10.101.10.56:4506
11:10:29,966 [salt.transport.zeromq                                 :35  ][TRACE   ][29061] Inserted key into loop_instance_map id 139939152412824 for key (u'/etc/salt/pki/minion', u'opendj1.admin.ptf1', u'tcp://10.101.10.56:4506', u'aes') and process 29061
11:11:31,257 [salt.state                                            :724 ][DEBUG   ][29061] Gathering pillar data for state run
11:11:31,257 [salt.state                                            :738 ][DEBUG   ][29061] Finished gathering pillar data for state run
11:11:31,257 [salt.state                                            :925 ][INFO    ][29061] Loading fresh modules for state activity
11:11:31,288 [salt.utils.lazy                                       :100 ][DEBUG   ][29061] LazyLoaded jinja.render
11:11:31,289 [salt.utils.lazy                                       :100 ][DEBUG   ][29061] LazyLoaded yaml.render

Comparing this one to its master counterpart shows that the whole one-minute delay (between 11:10:29 and 11:11:31) is spent running gitfs-related operations on the master side, mostly:

2018-10-20 11:10:29,968 [salt.utils.gitfs :2664][DEBUG   ][10777] Created gitfs object with uninitialized remotes
2018-10-20 11:10:29,968 [salt.utils.lazy  :100 ][DEBUG   ][10777] LazyLoaded gitfs.envs
2018-10-20 11:10:29,970 [salt.utils.gitfs :679 ][DEBUG   ][10777] Current fetch URL for gitfs remote 'ssh://[email protected]/Ops/salt-eloquant.git': ssh://[email protected]/Ops/salt-eloquant.git (desired: ssh://[email protected]/Ops/salt-eloquant.git)
2018-10-20 11:10:29,970 [salt.utils.gitfs :701 ][DEBUG   ][10777] Current refspecs for gitfs remote 'ssh://[email protected]/Ops/salt-eloquant.git': [u'+refs/heads/*:refs/remotes/origin/*', u'+refs/tags/*:refs/tags/*'] (desired: [u'+refs/heads/*:refs/remotes/origin/*', u'+refs/tags/*:refs/tags/*'])
2018-10-20 11:10:29,970 [salt.utils.gitfs :722 ][DEBUG   ][10777] Current http.sslVerify for gitfs remote 'ssh://[email protected]/Ops/salt-eloquant.git': true (desired: true)
2018-10-20 11:10:29,971 [salt.utils.gitfs :679 ][DEBUG   ][10777] Current fetch URL for gitfs remote 'ssh://[email protected]/Ops/salt-unit_tests.git': ssh://[email protected]/Ops/salt-unit_tests.git (desired: ssh://[email protected]/Ops/salt-unit_tests.git)
2018-10-20 11:10:29,971 [salt.utils.gitfs :701 ][DEBUG   ][10777] Current refspecs for gitfs remote 'ssh://[email protected]/Ops/salt-unit_tests.git': [u'+refs/heads/*:refs/remotes/origin/*', u'+refs/tags/*:refs/tags/*'] (desired: [u'+refs/heads/*:refs/remotes/origin/*', u'+refs/tags/*:refs/tags/*'])
2018-10-20 11:10:29,971 [salt.utils.gitfs :722 ][DEBUG   ][10777] Current http.sslVerify for gitfs remote 'ssh://[email protected]/Ops/salt-unit_tests.git': true (desired: true)
2018-10-20 11:10:29,971 [salt.utils.gitfs :2666][DEBUG   ][10777] Created gitfs object for process 10777
2018-10-20 11:10:29,977 [salt.utils.lazy  :100 ][DEBUG   ][10774] LazyLoaded roots.envs
2018-10-20 11:10:29,977 [salt.utils.gitfs :2664][DEBUG   ][10774] Created gitfs object with uninitialized remotes
2018-10-20 11:10:29,977 [salt.utils.lazy  :100 ][DEBUG   ][10774] LazyLoaded gitfs.envs
2018-10-20 11:10:29,982 [salt.utils.gitfs :679 ][DEBUG   ][10774] Current fetch URL for gitfs remote 'ssh://[email protected]/Ops/salt-eloquant.git': ssh://[email protected]/Ops/salt-eloquant.git (desired: ssh://[email protected]/Ops/salt-eloquant.git)
2018-10-20 11:10:29,982 [salt.utils.gitfs :701 ][DEBUG   ][10774] Current refspecs for gitfs remote 'ssh://[email protected]/Ops/salt-eloquant.git': [u'+refs/heads/*:refs/remotes/origin/*', u'+refs/tags/*:refs/tags/*'] (desired: [u'+refs/heads/*:refs/remotes/origin/*', u'+refs/tags/*:refs/tags/*'])
2018-10-20 11:10:29,982 [salt.utils.gitfs :722 ][DEBUG   ][10774] Current http.sslVerify for gitfs remote 'ssh://[email protected]/Ops/salt-eloquant.git': true (desired: true)
2018-10-20 11:10:29,983 [salt.utils.gitfs :679 ][DEBUG   ][10774] Current fetch URL for gitfs remote 'ssh://[email protected]/Ops/salt-unit_tests.git': ssh://[email protected]/Ops/salt-unit_tests.git (desired: ssh://[email protected]/Ops/salt-unit_tests.git)

@terminalmage
Copy link
Contributor

The master stuff is just the update thread fetching; that has nothing to do with pillar. What that debug logging does show, though, is that the pillar compilation inside the state run is not what is causing the delay, since it happens after the delay.

OK, more debug logging (you can remove the stuff from salt/state.py if you like). What I need you to do is add some sort of logging message both before and after this line in salt/cli/caller.py and repeat your salt-call.
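
The idea is simply to bracket the suspect call with timestamped debug messages so the log shows which side of it the time is spent on. A minimal, self-contained sketch of the technique (the function and logger names are placeholders, not Salt internals):

# Bracket a suspect call with timestamped debug logging; the gap between the
# two timestamps in the log tells you whether the delay is inside the call.
import logging
import time

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s [%(name)s][%(levelname)s] %(message)s',
)
log = logging.getLogger('delay.probe')


def suspect_call():
    # Stand-in for the code path being investigated.
    time.sleep(2)


log.debug('Before suspect call')
suspect_call()
log.debug('After suspect call')

The "Before line 134" / "After line 134" messages in the log pasted in the next comment are exactly this pattern applied to caller.py.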

@cavepopo
Copy link
Contributor Author

cavepopo commented Oct 20, 2018

Here we go :

root@opendj1 [~]: time salt-call -l debug state.apply  successful_test 2>&1 |tee /tmp/log.pbe 
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: server
20:08:21,376 [salt.utils.parsers:234 ][DEBUG   ][3285] Configuration file path: /etc/salt/minion
20:08:21,376 [salt.utils.verify:535 ][WARNING ][3285] Insecure logging configuration detected! Sensitive data may be logged.
20:08:21,376 [salt.loader      :697 ][DEBUG   ][3285] Grains refresh requested. Refreshing grains.
20:08:21,376 [salt.config      :2117][DEBUG   ][3285] Reading configuration from /etc/salt/minion
20:08:21,378 [salt.config      :2260][DEBUG   ][3285] Including configuration from '/etc/salt/minion.d/_schedule.conf'
20:08:21,378 [salt.config      :2117][DEBUG   ][3285] Reading configuration from /etc/salt/minion.d/_schedule.conf
20:08:21,832 [salt.loaded.int.grains.core                           :179 ][DEBUG   ][3285] The `lspci` binary is not available on the system. GPU grains will not be available.
20:08:21,859 [salt.loaded.int.grains.core                           :598 ][DEBUG   ][3285] Please install 'virt-what' to improve results of the 'virtual' grain.
20:08:21,959 [salt.minion                                           :674 ][DEBUG   ][3285] Connecting to master. Attempt 1 of 1
20:08:21,960 [salt.minion                                           :237 ][DEBUG   ][3285] Master URI: tcp://[hidden]:4506
20:08:21,977 [salt.crypt                                            :463 ][DEBUG   ][3285] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506')
20:08:21,983 [salt.transport.zeromq                                 :388 ][DEBUG   ][3285] Generated random reconnect delay between '1000ms' and '11000ms' (3612)
20:08:21,984 [salt.transport.zeromq                                 :391 ][DEBUG   ][3285] Setting zmq_reconnect_ivl to '3612ms'
20:08:21,984 [salt.transport.zeromq                                 :397 ][DEBUG   ][3285] Setting zmq_reconnect_ivl_max to '11000ms'
20:08:21,985 [salt.transport.zeromq                                 :128 ][DEBUG   ][3285] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506', 'clear')
20:08:21,985 [salt.transport.zeromq                                 :194 ][DEBUG   ][3285] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
20:08:21,986 [salt.transport.zeromq                                 :1029][DEBUG   ][3285] Trying to connect to: tcp://[hidden]:4506
20:08:21,986 [salt.crypt                                            :207 ][DEBUG   ][3285] salt.crypt.get_rsa_pub_key: Loading public key
20:08:21,999 [salt.crypt                                            :847 ][DEBUG   ][3285] Decrypting the current master AES key
20:08:21,999 [salt.crypt                                            :199 ][DEBUG   ][3285] salt.crypt.get_rsa_key: Loading private key
20:08:22,000 [salt.crypt                                            :179 ][DEBUG   ][3285] salt.crypt._get_key_with_evict: Loading private key
20:08:22,001 [salt.crypt                                            :776 ][DEBUG   ][3285] Loaded minion key: /etc/salt/pki/minion/minion.pem
20:08:22,003 [salt.crypt                                            :207 ][DEBUG   ][3285] salt.crypt.get_rsa_pub_key: Loading public key
20:08:22,008 [salt.transport.zeromq                                 :437 ][DEBUG   ][3285] Connecting the Minion to the Master publish port, using the URI: tcp://[hidden]:4505
20:08:22,009 [salt.crypt                                            :199 ][DEBUG   ][3285] salt.crypt.get_rsa_key: Loading private key
20:08:22,009 [salt.crypt                                            :776 ][DEBUG   ][3285] Loaded minion key: /etc/salt/pki/minion/minion.pem
20:08:22,012 [salt.pillar                                           :58  ][DEBUG   ][3285] Determining pillar cache
20:08:22,013 [salt.transport.zeromq                                 :128 ][DEBUG   ][3285] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506', u'aes')
20:08:22,013 [salt.crypt                                            :463 ][DEBUG   ][3285] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506')
20:08:22,013 [salt.transport.zeromq                                 :194 ][DEBUG   ][3285] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
20:08:22,018 [salt.transport.zeromq                                 :1029][DEBUG   ][3285] Trying to connect to: tcp://[hidden]:4506
20:08:22,242 [salt.crypt                                            :199 ][DEBUG   ][3285] salt.crypt.get_rsa_key: Loading private key
20:08:22,243 [salt.crypt                                            :776 ][DEBUG   ][3285] Loaded minion key: /etc/salt/pki/minion/minion.pem
20:08:22,277 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded jinja.render
20:08:22,278 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded yaml.render
20:08:22,279 [salt.cli.caller                                       :134 ][DEBUG   ][3285] Before line 134
20:08:22,280 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded state.apply
20:08:22,282 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded saltutil.is_running
20:08:22,284 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded grains.get
20:08:22,288 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded config.get
20:08:22,293 [salt.transport.zeromq                                 :128 ][DEBUG   ][3285] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506', u'aes')
20:08:22,293 [salt.crypt                                            :463 ][DEBUG   ][3285] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506')
20:08:22,294 [salt.transport.zeromq                                 :194 ][DEBUG   ][3285] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
20:08:22,298 [salt.transport.zeromq                                 :1029][DEBUG   ][3285] Trying to connect to: tcp://[hidden]:4506
20:09:22,659 [salt.state                                            :724 ][DEBUG   ][3285] Gathering pillar data for state run
20:09:22,659 [salt.state                                            :738 ][DEBUG   ][3285] Finished gathering pillar data for state run
20:09:22,659 [salt.state                                            :925 ][INFO    ][3285] Loading fresh modules for state activity
20:09:22,692 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded jinja.render
20:09:22,693 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded yaml.render
20:09:22,702 [salt.fileclient                                       :1124][DEBUG   ][3285] In saltenv 'base', looking at rel_path 'successful_test.sls' to resolve 'salt://successful_test.sls'
20:09:22,702 [salt.fileclient                                       :1132][DEBUG   ][3285] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/successful_test.sls' to resolve 'salt://successful_test.sls'
20:09:22,703 [salt.template                                         :51  ][DEBUG   ][3285] compile template: /var/cache/salt/minion/files/base/successful_test.sls
20:09:22,703 [salt.utils.jinja                                      :69  ][DEBUG   ][3285] Jinja search path: [u'/var/cache/salt/minion/files/base']
20:09:22,708 [salt.transport.zeromq                                 :128 ][DEBUG   ][3285] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506', u'aes')
20:09:22,709 [salt.crypt                                            :463 ][DEBUG   ][3285] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506')
20:09:22,710 [salt.transport.zeromq                                 :194 ][DEBUG   ][3285] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
20:09:22,710 [salt.transport.zeromq                                 :1029][DEBUG   ][3285] Trying to connect to: tcp://[hidden]:4506
20:09:22,711 [salt.utils.templates                                  :345 ][DEBUG   ][3285] Jinja2 environment trim_blocks was set to True by jinja_sls_env
20:09:22,711 [salt.utils.templates                                  :345 ][DEBUG   ][3285] Jinja2 environment lstrip_blocks was set to True by jinja_sls_env
20:09:22,756 [salt.template                                         :26  ][PROFILE ][3285] Time (in seconds) to render '/var/cache/salt/minion/files/base/successful_test.sls' using 'jinja' renderer: 0.0526170730591
20:09:22,756 [salt.template                                         :113 ][DEBUG   ][3285] Rendered data from file: /var/cache/salt/minion/files/base/successful_test.sls:
successful_test:
  test.succeed_with_changes
20:09:22,757 [salt.loaded.int.render.yaml                           :68  ][DEBUG   ][3285] Results of YAML rendering: 
OrderedDict([(u'successful_test', u'test.succeed_with_changes')])
20:09:22,758 [salt.template                                         :26  ][PROFILE ][3285] Time (in seconds) to render '/var/cache/salt/minion/files/base/successful_test.sls' using 'yaml' renderer: 0.00121307373047
20:09:22,759 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded config.option
20:09:22,760 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded test.succeed_with_changes
20:09:22,760 [salt.state                                            :1791][INFO    ][3285] Running state [successful_test] at time 20:09:22.760841
20:09:22,761 [salt.state                                            :1824][INFO    ][3285] Executing state test.succeed_with_changes for [successful_test]
20:09:22,761 [salt.state                                            :308 ][INFO    ][3285] Made the following changes:
'testing' changed from 'Unchanged' to 'Something pretended to change'

20:09:22,761 [salt.state                                            :1970][INFO    ][3285] Completed state [successful_test] at time 20:09:22.761791 (duration_in_ms=0.951)
20:09:22,762 [salt.state                                            :2770][DEBUG   ][3285] File /var/cache/salt/minion/accumulator/140330852206480 does not exist, no need to cleanup
20:09:22,763 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded state.check_result
20:09:22,764 [salt.transport.zeromq                                 :128 ][DEBUG   ][3285] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506', u'aes')
20:09:22,765 [salt.crypt                                            :463 ][DEBUG   ][3285] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'server', u'tcp://[hidden]:4506')
20:09:22,765 [salt.transport.zeromq                                 :194 ][DEBUG   ][3285] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
20:09:22,790 [salt.transport.zeromq                                 :1029][DEBUG   ][3285] Trying to connect to: tcp://[hidden]:4506
20:09:22,801 [salt.cli.caller                                       :136 ][DEBUG   ][3285] After line 134
20:09:22,804 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded highstate.output
20:09:22,806 [salt.utils.lazy                                       :100 ][DEBUG   ][3285] LazyLoaded nested.output
local:
----------
          ID: successful_test
    Function: test.succeed_with_changes
      Result: True
     Comment: Success!
     Started: 20:09:22.760840
    Duration: 0.951 ms
     Changes:   
              ----------
              testing:
                  ----------
                  new:
                      Something pretended to change
                  old:
                      Unchanged

Summary for local
------------
Succeeded: 1 (changed=1)
Failed:    0
------------
Total states run:     1
Total run time:   0.951 ms

real	1m2.150s
user	0m1.139s
sys	0m0.395s

@terminalmage
Copy link
Contributor

@cavepopo We are getting closer now. Since we know that the delay is happening during the call() function, I've created this patch (link to raw content here). Please revert the logging you added in the prior test, then apply that patch and re-attempt the salt-call. The logging I added there should further narrow the point at which the delay happens.

NOTE: Github is in the midst of a major outage right now, so the gist I created is currently showing a 404. It should show up once they recover from the outage.

@terminalmage
Copy link
Contributor

terminalmage commented Oct 23, 2018

Bumping this comment thread just in case no email notification went out, since my last comment to @cavepopo was submitted during their (Github's) outage yesterday.

@cavepopo
Copy link
Contributor Author

Here :

root@opendj1 [~]: time salt-call -l debug state.apply  successful_test 2>&1 |tee /tmp/log.pbe 
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: [hidden]
12:11:43,710 [salt.utils.parsers:234 ][DEBUG   ][451] Configuration file path: /etc/salt/minion
12:11:43,711 [salt.utils.verify:535 ][WARNING ][451] Insecure logging configuration detected! Sensitive data may be logged.
12:11:43,711 [salt.loader      :697 ][DEBUG   ][451] Grains refresh requested. Refreshing grains.
12:11:43,711 [salt.config      :2117][DEBUG   ][451] Reading configuration from /etc/salt/minion
12:11:43,712 [salt.config      :2260][DEBUG   ][451] Including configuration from '/etc/salt/minion.d/_schedule.conf'
12:11:43,713 [salt.config      :2117][DEBUG   ][451] Reading configuration from /etc/salt/minion.d/_schedule.conf
12:11:44,200 [salt.loaded.int.grains.core                           :179 ][DEBUG   ][451] The `lspci` binary is not available on the system. GPU grains will not be available.
12:11:44,226 [salt.loaded.int.grains.core                           :598 ][DEBUG   ][451] Please install 'virt-what' to improve results of the 'virtual' grain.
12:11:44,338 [salt.minion                                           :674 ][DEBUG   ][451] Connecting to master. Attempt 1 of 1
12:11:44,339 [salt.minion                                           :237 ][DEBUG   ][451] Master URI: tcp://[hidden]:4506
12:11:44,389 [salt.crypt                                            :463 ][DEBUG   ][451] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506')
12:11:45,396 [salt.transport.zeromq                                 :388 ][DEBUG   ][451] Generated random reconnect delay between '1000ms' and '11000ms' (6042)
12:11:45,396 [salt.transport.zeromq                                 :391 ][DEBUG   ][451] Setting zmq_reconnect_ivl to '6042ms'
12:11:45,397 [salt.transport.zeromq                                 :397 ][DEBUG   ][451] Setting zmq_reconnect_ivl_max to '11000ms'
12:11:45,397 [salt.transport.zeromq                                 :128 ][DEBUG   ][451] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506', 'clear')
12:11:45,397 [salt.transport.zeromq                                 :194 ][DEBUG   ][451] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
12:11:45,398 [salt.transport.zeromq                                 :1029][DEBUG   ][451] Trying to connect to: tcp://[hidden]:4506
12:11:45,398 [salt.crypt                                            :207 ][DEBUG   ][451] salt.crypt.get_rsa_pub_key: Loading public key
12:11:45,410 [salt.crypt                                            :847 ][DEBUG   ][451] Decrypting the current master AES key
12:11:45,410 [salt.crypt                                            :199 ][DEBUG   ][451] salt.crypt.get_rsa_key: Loading private key
12:11:45,411 [salt.crypt                                            :179 ][DEBUG   ][451] salt.crypt._get_key_with_evict: Loading private key
12:11:45,413 [salt.crypt                                            :776 ][DEBUG   ][451] Loaded minion key: /etc/salt/pki/minion/minion.pem
12:11:45,452 [salt.crypt                                            :207 ][DEBUG   ][451] salt.crypt.get_rsa_pub_key: Loading public key
12:11:45,457 [salt.transport.zeromq                                 :437 ][DEBUG   ][451] Connecting the Minion to the Master publish port, using the URI: tcp://[hidden]:4505
12:11:45,457 [salt.crypt                                            :199 ][DEBUG   ][451] salt.crypt.get_rsa_key: Loading private key
12:11:45,458 [salt.crypt                                            :776 ][DEBUG   ][451] Loaded minion key: /etc/salt/pki/minion/minion.pem
12:11:45,461 [salt.pillar                                           :58  ][DEBUG   ][451] Determining pillar cache
12:11:45,461 [salt.transport.zeromq                                 :128 ][DEBUG   ][451] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506', u'aes')
12:11:45,461 [salt.crypt                                            :463 ][DEBUG   ][451] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506')
12:11:45,462 [salt.transport.zeromq                                 :194 ][DEBUG   ][451] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
12:11:45,465 [salt.transport.zeromq                                 :1029][DEBUG   ][451] Trying to connect to: tcp://[hidden]:4506
12:11:45,670 [salt.crypt                                            :199 ][DEBUG   ][451] salt.crypt.get_rsa_key: Loading private key
12:11:45,671 [salt.crypt                                            :776 ][DEBUG   ][451] Loaded minion key: /etc/salt/pki/minion/minion.pem
12:11:45,707 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded jinja.render
12:11:45,708 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded yaml.render
12:11:45,709 [salt.cli.caller                                       :163 ][DEBUG   ][451] GENERATING JID
12:11:45,709 [salt.cli.caller                                       :169 ][DEBUG   ][451] CHECKING FUNCTION AVAILABILITY
12:11:45,713 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded state.apply
12:11:45,713 [salt.cli.caller                                       :196 ][DEBUG   ][451] LOADING CLI ARGUMENTS
12:11:45,714 [salt.cli.caller                                       :206 ][DEBUG   ][451] WRITING PROCFILE
12:11:45,714 [salt.cli.caller                                       :215 ][DEBUG   ][451] GETTING FUNCTION
12:11:45,714 [salt.cli.caller                                       :218 ][DEBUG   ][451] RUNNING FUNCTION
12:11:45,717 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded saltutil.is_running
12:11:45,718 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded grains.get
12:11:45,725 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded config.get
12:11:45,730 [salt.transport.zeromq                                 :128 ][DEBUG   ][451] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506', u'aes')
12:11:45,731 [salt.crypt                                            :463 ][DEBUG   ][451] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506')
12:11:45,731 [salt.transport.zeromq                                 :194 ][DEBUG   ][451] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
12:11:45,735 [salt.transport.zeromq                                 :1029][DEBUG   ][451] Trying to connect to: tcp://[hidden]:4506
12:12:47,924 [salt.state                                            :923 ][INFO    ][451] Loading fresh modules for state activity
12:12:47,955 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded jinja.render
12:12:47,956 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded yaml.render
12:12:47,964 [salt.fileclient                                       :1124][DEBUG   ][451] In saltenv 'base', looking at rel_path 'successful_test.sls' to resolve 'salt://successful_test.sls'
12:12:47,964 [salt.fileclient                                       :1132][DEBUG   ][451] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/successful_test.sls' to resolve 'salt://successful_test.sls'
12:12:47,965 [salt.template                                         :51  ][DEBUG   ][451] compile template: /var/cache/salt/minion/files/base/successful_test.sls
12:12:47,965 [salt.utils.jinja                                      :69  ][DEBUG   ][451] Jinja search path: [u'/var/cache/salt/minion/files/base']
12:12:47,970 [salt.transport.zeromq                                 :128 ][DEBUG   ][451] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506', u'aes')
12:12:47,970 [salt.crypt                                            :463 ][DEBUG   ][451] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506')
12:12:47,971 [salt.transport.zeromq                                 :194 ][DEBUG   ][451] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
12:12:47,971 [salt.transport.zeromq                                 :1029][DEBUG   ][451] Trying to connect to: tcp://[hidden]:4506
12:12:47,972 [salt.utils.templates                                  :345 ][DEBUG   ][451] Jinja2 environment trim_blocks was set to True by jinja_sls_env
12:12:47,972 [salt.utils.templates                                  :345 ][DEBUG   ][451] Jinja2 environment lstrip_blocks was set to True by jinja_sls_env
12:12:47,977 [salt.template                                         :26  ][PROFILE ][451] Time (in seconds) to render '/var/cache/salt/minion/files/base/successful_test.sls' using 'jinja' renderer: 0.011873960495
12:12:47,977 [salt.template                                         :113 ][DEBUG   ][451] Rendered data from file: /var/cache/salt/minion/files/base/successful_test.sls:
successful_test:
  test.succeed_with_changes
12:12:47,978 [salt.loaded.int.render.yaml                           :68  ][DEBUG   ][451] Results of YAML rendering: 
OrderedDict([(u'successful_test', u'test.succeed_with_changes')])
12:12:47,978 [salt.template                                         :26  ][PROFILE ][451] Time (in seconds) to render '/var/cache/salt/minion/files/base/successful_test.sls' using 'yaml' renderer: 0.000998973846436
12:12:47,980 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded config.option
12:12:47,981 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded test.succeed_with_changes
12:12:47,981 [salt.state                                            :1789][INFO    ][451] Running state [successful_test] at time 12:12:47.981630
12:12:47,981 [salt.state                                            :1822][INFO    ][451] Executing state test.succeed_with_changes for [successful_test]
12:12:47,982 [salt.state                                            :308 ][INFO    ][451] Made the following changes:
'testing' changed from 'Unchanged' to 'Something pretended to change'

12:12:47,982 [salt.state                                            :1968][INFO    ][451] Completed state [successful_test] at time 12:12:47.982484 (duration_in_ms=0.855)
12:12:47,982 [salt.state                                            :2768][DEBUG   ][451] File /var/cache/salt/minion/accumulator/139862776676048 does not exist, no need to cleanup
12:12:47,983 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded state.check_result
12:12:47,984 [salt.cli.caller                                       :220 ][DEBUG   ][451] DONE RUNNING FUNCTION
12:12:47,984 [salt.cli.caller                                       :249 ][DEBUG   ][451] REMOVED PROCFILE
12:12:47,985 [salt.cli.caller                                       :268 ][DEBUG   ][451] PROCESSING RETURNER ""
12:12:47,985 [salt.cli.caller                                       :282 ][DEBUG   ][451] RETURNING DATA TO MASTER
12:12:47,985 [salt.transport.zeromq                                 :128 ][DEBUG   ][451] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506', u'aes')
12:12:47,985 [salt.crypt                                            :463 ][DEBUG   ][451] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'[hidden]', u'tcp://[hidden]:4506')
12:12:47,986 [salt.transport.zeromq                                 :194 ][DEBUG   ][451] Connecting the Minion to the Master URI (for the return server): tcp://[hidden]:4506
12:12:47,986 [salt.transport.zeromq                                 :1029][DEBUG   ][451] Trying to connect to: tcp://[hidden]:4506
12:12:47,996 [salt.cli.caller                                       :284 ][DEBUG   ][451] DONE RETURNING DATA TO MASTER
12:12:48,000 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded highstate.output
12:12:48,002 [salt.utils.lazy                                       :100 ][DEBUG   ][451] LazyLoaded nested.output
local:
----------
          ID: successful_test
    Function: test.succeed_with_changes
      Result: True
     Comment: Success!
     Started: 12:12:47.981629
    Duration: 0.855 ms
     Changes:   
              ----------
              testing:
                  ----------
                  new:
                      Something pretended to change
                  old:
                      Unchanged

Summary for local
------------
Succeeded: 1 (changed=1)
Failed:    0
------------
Total states run:     1
Total run time:   0.855 ms

real	1m5.269s
user	0m1.122s
sys	0m0.492s

@terminalmage
Copy link
Contributor

Sorry for the delay, I was on vacation. I will look at this soon.

@cavepopo
Copy link
Contributor Author

cavepopo commented Nov 8, 2018

No worries ! Hope you enjoyed your vacation :)

@Ch3LL
Copy link
Contributor

Ch3LL commented Feb 24, 2020

And if you switch to gitpython, do you still see the issue? I'm wondering if these memory leaks are the cause of the performance slowdown.

@esilva-everbridge
Copy link

The memory leaks are certainly having an impact on performance, but it isn't clear whether they are the cause of the auth performance issues. It looks like we are using gitpython now, as using pygit2 made everything much worse.

@Ch3LL
Copy link
Contributor

Ch3LL commented Feb 26, 2020

Can you report back on whether or not you are still seeing the auth issues after a while on gitpython?

@esilva-everbridge
Copy link

We do see minion auth delays with gitpython.

@Ch3LL
Copy link
Contributor

Ch3LL commented Feb 27, 2020

And is performance better or not?

@esilva-everbridge
Copy link

After restarting the masters, it initially seems improved, but then it degrades until we have to restart them again.

@Ch3LL
Copy link
Contributor

Ch3LL commented Mar 2, 2020

Interesting, I thought this was specific to pygit2. Can you share a sanitized version of your configurations? I'll try to replicate.

@RobHayes512
Copy link

I'm also having a problem that this thread seems to fit with. I've got version 3000 of the master/minion with Python 2. The master is RHEL7 and I'm only running against a single Windows minion. Gitfs is running pygit2. I've changed a variety of configuration settings on the master trying to help, but I've set everything back to a basic install, except the gitfs entries needed to make it work. I haven't touched any configuration on the minion.
We have a repo with some large zip files that we copy out and unzip on the minion. Right now the repo is at 1.8 GB and no file is any bigger than 180 MB. If I comment out this specific repo, things are very fast. As soon as I add it back in, everything slows to a crawl, even if I'm running a state that doesn't use any files from this specific repo. cmd.run and test.ping return instantly, but any state.apply sends the salt-master process to 100% and then the minion starts timing out connecting back to the master. There are times when the job won't even show in jobs.active, but eventually I can see the results by looking up the jid.
Unfortunately my work blocks GitHub login, so I couldn't comment at work. I can read the posts, I just can't log in to comment. If you need any extra info, please let me know.

@Reiner030
Copy link

I think this is the "git file handler equivalent" of a bug I found:

Saltstack bug:
Default fileserver backend recurses the whole /srv/salt (and I guess also /srv/pillar) folder structure on every request.

Maybe you can test this by checking out / exporting the git repository (or some "dummy" data) into a local folder instead of using the URL, to see if the response times change.

As far as I remember, I couldn't see even in debug mode that the salt-master was scanning the whole folder structure (maybe I caught it with lsof | grep /path/to/mountpoint), but mounting/unmounting these folders showed the change clearly.
For a git remote you could maybe monitor your network connections to GitHub to verify the high usage...

@RobHayes512
Copy link

I've somewhat already done this. We added the entire Visual Studio Build Tools network layout (like 15 GB and no telling how many files) to the local filesystem on the master and didn't see any noticeable performance difference. And I'm the same way with the logs: even at trace level I'm not seeing anything that gives me an idea of what's going on.

@Reiner030
Copy link

In my case it was a GlusterFS share mounted into one subfolder, so it seems to be a more general non-local-filesystem problem...

@duckfez
Copy link
Contributor

duckfez commented Mar 10, 2020

@RobHayes512 sounds similar to some of my own issues in #55419. I am not sure though that this is the same problem that @cavepopo has documented in THIS issue. His seems different somehow and I don't have the skill to find the difference.

When you run a state.apply the minions all call back to the master to ask for a file list from the fileserver. You can emulate this call exclusively on the master using salt-run fileserver.file_list -- the runtime of that will be similar (but not identical) to what the minions themselves are seeing when they make the call to the master.
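
If you want to time that from Python on the master instead of the CLI, something roughly like this should work (a sketch using the runner client; adjust the master config path if yours differs):

# Rough sketch: time the fileserver.file_list runner on the master.
# Assumes the default master config path and permission to read it.
import time

import salt.config
import salt.runner

opts = salt.config.master_config('/etc/salt/master')
runner = salt.runner.RunnerClient(opts)

start = time.time()
files = runner.cmd('fileserver.file_list', [])
print('fileserver.file_list returned %d paths in %.2f seconds'
      % (len(files), time.time() - start))

That gives you a number you can compare before and after changing providers or tuning the cache settings.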

When using gitfs + pygit2, fileserver.file_list is incredibly slow, for reasons I've been able to instrument but whose root cause I haven't been able to locate. There is some caching at play via the master parameter fileserver_list_cache_time -- and it does help, some.

But something else I've noticed is that, the way the file list cache locking appears to work, multiple master workers can independently be rebuilding the file list cache, so you still wind up with a degree of thundering herd against your master.

Bottom line: If you're using gitfs+pygit2, switch to gitfs+gitpython. If that makes your state applies faster, then your problem is #55419. If not, it could be this issue or something else entirely.

@RobHayes512
Copy link

I tried gitpython yesterday, but we're using something authentication-wise that isn't available in gitpython.

@sagetherage
Copy link
Contributor

Reviewed in Grooming 2021-JUN-21; we believe this PR may address the leaks: #60386. We would like someone here to try this PR as a fix.

@sagetherage sagetherage self-assigned this Jun 21, 2021
@dwoz
Copy link
Contributor

dwoz commented Jun 21, 2021

@sagetherage #60386 will address leaks but won't resolve the performance issues referenced in this issue.

@sagetherage
Copy link
Contributor

@dwoz yes, only the leaks. We left this as still in grooming since the original report is on a now-unsupported version of Salt; the next steps would be to also attempt to address performance. Not closing at this time.

@sagetherage sagetherage removed their assignment Jul 13, 2021
@sagetherage sagetherage removed this from the Blocked milestone Jul 19, 2021
@sagetherage sagetherage removed the info-needed waiting for more info label Jul 19, 2021
@sagetherage sagetherage added this to the Approved milestone Jul 19, 2021
@sagetherage sagetherage added the Bug broken, incorrect, or confusing behavior label Jul 19, 2021