Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] SaltStack 3006.x file Function State Execution Performance Degradation #65450

Closed
cianyleow opened this issue Oct 24, 2023 · 10 comments
Closed
Assignees
Labels
Bug broken, incorrect, or confusing behavior Performance Regression The issue is a bug that breaks functionality known to work in previous releases.

Comments

@cianyleow
Copy link
Contributor

Description of the tech debt to be addressed, include links and screenshots

When upgrading to SaltStack 3006.x, a significant performance regression was identified in state execution, resulting in large increases to execution time for certain state functions.

A side-by-side analysis of the same state application between 3005.1 and 3006.3 below highlights the performance regression seen:

State Function Average 3005.1 Timing Average 3006.3 Timing Average % Increase
cmd.run 451.12ms 447.01ms -0.91%
cmd.wait 1.22ms 0.77ms -37.16%
file.directory 1.10ms 1.14ms 2.81%
file.managed 77.56ms 484.13ms 524.20%
file.serialize 5.94ms 248.95ms 4,093.13%
pkg.latest 2.50s 2.55s 1.81%
service.dead 52.55ms 54.87ms 4.41%

As can be seen in the data, the file.managed and file.serialize functions are 5-40x slower (on average) after this change, which was traced back to the re-initialization of the file client every time it was required.

In particular, this change (#63984), which adjusted object lifecycle management for the fileclient, was identified as the initial driver of the performance impact. Coupled with a bug fix (#64113), caching of file client initialisation in the cp and defaults modules as well as the jinja utils was removed, which accounted for the increased state execution timings seen.

For reference, an internal patch was applied to reintroduce file client initialisation caching, and it mitigated the performance impact completely.

Versions Report

Salt: 3006.3
Debian: Bookworm (12)

@OrangeDog OrangeDog added Regression The issue is a bug that breaks functionality known to work in previous releases. Performance labels Oct 24, 2023
@OrangeDog
Copy link
Contributor

Can you provide the salt --versions report and some indication of your fileserver config?
This is a bug, not technical debt.

@OrangeDog OrangeDog added Bug broken, incorrect, or confusing behavior info-needed waiting for more info needs-triage and removed tech-debt labels Oct 24, 2023
@cianyleow cianyleow changed the title [TECH DEBT] SaltStack 3006.x file Function State Execution Performance Degradation [BUG] SaltStack 3006.x file Function State Execution Performance Degradation Oct 26, 2023
@cianyleow
Copy link
Contributor Author

cianyleow commented Oct 26, 2023

We're using the default fileserver configuration (we don't set anything as far as I'm aware), with no changes between our setup for 3005.1 and 3006.3.

Here's the salt --versions report for the 3005.1 and 3006.3 Master/Minion pairs used for the testing:

salt@old-master: salt --versions
Salt Version:
          Salt: 3005.1

Dependency Versions:
          cffi: 1.15.1
      cherrypy: unknown
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
        Python: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0]
  python-gnupg: 0.4.9
        PyYAML: 6.0
         PyZMQ: 24.0.1
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: debian 12 bookworm
        locale: utf-8
       machine: x86_64
       release: 6.4.16-linuxkit
        system: Linux
       version: Debian GNU/Linux 12 bookworm
------------------------------

salt@old-minion$ salt-call --versions
Salt Version:
          Salt: 3005.1

Dependency Versions:
          cffi: 1.15.1
      cherrypy: Not Installed
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
        Python: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0]
  python-gnupg: Not Installed
        PyYAML: 6.0
         PyZMQ: 24.0.1
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: debian 12 bookworm
        locale: utf-8
       machine: aarch64
       release: 6.4.16-linuxkit
        system: Linux
       version: Debian GNU/Linux 12 bookworm
------------------------------

salt@new-master: salt --versions
Salt Version:
          Salt: 3006.3

Python Version:
        Python: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0]

Dependency Versions:
          cffi: 1.15.1
      cherrypy: Not Installed
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
  looseversion: 1.0.3
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 23.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
  python-gnupg: 0.4.9
        PyYAML: 6.0
         PyZMQ: 24.0.1
        relenv: Not Installed
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: debian 12.0 bookworm
        locale: utf-8
       machine: x86_64
       release: 6.4.16-linuxkit
        system: Linux
       version: Debian GNU/Linux 12.0 bookworm
------------------------------

salt@new-minion$ salt-call --versions
Salt Version:
          Salt: 3006.3

Python Version:
        Python: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0]

Dependency Versions:
          cffi: 1.15.1
      cherrypy: Not Installed
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
  looseversion: 1.0.3
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 23.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
  python-gnupg: Not Installed
        PyYAML: 6.0
         PyZMQ: 24.0.1
        relenv: Not Installed
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: debian 2023.10.3 bookworm
        locale: utf-8
       machine: aarch64
       release: 6.4.16-linuxkit
        system: Linux
       version: Debian GNU/Linux 2023.10.3 bookworm

@OrangeDog OrangeDog removed the info-needed waiting for more info label Oct 26, 2023
@AVKarelin
Copy link

The same problem with 3006.4

@cianyleow
Copy link
Contributor Author

Thanks for the fix - the __file_client__ dunder is ingenious. That being said, I was still seeing a bit of a performance regression during template rendering as fileclient wasn't being passed through in the context here.

I've attempted to fix it by explicitly passing through the file client where this is invoked (the file module and pillar/state implementations) and we use it - would that be helpful for me to try and submit as a patch?

@dwoz
Copy link
Contributor

dwoz commented Feb 6, 2024

Thanks for the fix - the __file_client__ dunder is ingenious. That being said, I was still seeing a bit of a performance regression during template rendering as fileclient wasn't being passed through in the context here.

I've attempted to fix it by explicitly passing through the file client where this is invoked (the file module and pillar/state implementations) and we use it - would that be helpful for me to try and submit as a patch?

If you are able to submit a patch that would be great, otherwise I will take a look at it.

@dwoz dwoz modified the milestones: Approved, Sulfur v3006.8 Feb 6, 2024
@dwoz
Copy link
Contributor

dwoz commented Feb 9, 2024

Thanks for the fix - the __file_client__ dunder is ingenious. That being said, I was still seeing a bit of a performance regression during template rendering as fileclient wasn't being passed through in the context here.
I've attempted to fix it by explicitly passing through the file client where this is invoked (the file module and pillar/state implementations) and we use it - would that be helpful for me to try and submit as a patch?

If you are able to submit a patch that would be great, otherwise I will take a look at it.

@cianyleow I'm going to take a look at this for 3006.8

@cianyleow
Copy link
Contributor Author

@dwoz this is where I got to with my patch: cianyleow@7620c4d

Hopefully this helps!

@dwoz
Copy link
Contributor

dwoz commented Feb 13, 2024

@cianyleow
Copy link
Contributor Author

@cianyleow Are you able to verify this patch resolves the issue? https://github.com/saltstack/salt/pull/66058/commits/317a6dd44be18441a9b9b579d7ff99d497a074f8s

@dwoz I won't be able to get back to this for a couple of weeks, but I think its conceptually similar to how I patched (I went for a more explicit approach for the use cases I have).

@dwoz
Copy link
Contributor

dwoz commented May 1, 2024

Fixed in 3006.8

@dwoz dwoz closed this as completed May 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior Performance Regression The issue is a bug that breaks functionality known to work in previous releases.
Projects
None yet
Development

No branches or pull requests

4 participants