Hi All,
I'm blocked by a failure in our Shinken monitoring setup. Could someone please help me with this? I can provide any information and logs needed to sort out the issue.
As part of an OS upgrade, we are planning to run container-based Shinken components to monitor Windows 2008, 2012 and 2019 nodes.
Individual containers have been created for each component (Arbiter, Broker, Poller, Scheduler, etc.).
WebUI and Thruk are configured for the UI.
The configuration looks good and connectivity is in place, but the Host and Service checks time out after staying in the Pending state for quite some time. The checks are also not executed on schedule.
I enabled the Debug log level and can see that the queued tasks are not being picked up by the Poller. I could not work out the reason.
However, when I logged in to the Poller container and ran the check command manually, it executed instantly.
System Details
OS - Alma 9.3
CPU - 4
RAM - 8GB
Shinken Version - 2.4.3
Even though the system has sufficient resources and there are only a few simple services to check, the Poller service consumes most of the CPU while the tasks do not get done.
In the Poller log, the queued tasks are not getting picked up.
docker logs shinken-scheduler-1 |grep WARNING
[1710405731] WARNING: [Shinken] 4 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710405791] WARNING: [Shinken] 4 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710405851] WARNING: [Shinken] 4 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710405911] WARNING: [Shinken] 6 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710405971] WARNING: [Shinken] 6 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710406031] WARNING: [Shinken] 6 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710406091] WARNING: [Shinken] 6 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710406151] WARNING: [Shinken] 6 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710406211] WARNING: [Shinken] 6 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710406932] WARNING: [Shinken] 6 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710406992] WARNING: [Shinken] 5 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710407052] WARNING: [Shinken] 5 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710407112] WARNING: [Shinken] 5 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
[1710407173] WARNING: [Shinken] 5 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling
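To make the pattern in these warnings easier to see, here is a small sketch that parses the scheduler lines and computes the gap between consecutive warnings. The regular expression is written against the log format shown above; the function names `parse_warnings` and `intervals` are just illustrative helpers, not part of Shinken.

```python
import re

# Matches scheduler lines such as:
# [1710405731] WARNING: [Shinken] 4 actions never came back for the satellite 'shinken-poller-1'...
PATTERN = re.compile(
    r"\[(\d+)\] WARNING: \[Shinken\] (\d+) actions never came back "
    r"for the satellite '([^']+)'"
)


def parse_warnings(lines):
    """Return a list of (epoch_timestamp, action_count, satellite_name)."""
    entries = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            entries.append((int(match.group(1)), int(match.group(2)), match.group(3)))
    return entries


def intervals(entries):
    """Return the number of seconds between consecutive warnings."""
    timestamps = [entry[0] for entry in entries]
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


if __name__ == "__main__":
    # First three warnings copied from the scheduler log above.
    sample = [
        "[1710405731] WARNING: [Shinken] 4 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling",
        "[1710405791] WARNING: [Shinken] 4 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling",
        "[1710405851] WARNING: [Shinken] 4 actions never came back for the satellite 'shinken-poller-1'.I reenable them for polling",
    ]
    entries = parse_warnings(sample)
    print(intervals(entries))  # prints [60, 60]
```

On the sample lines the warnings are 60 seconds apart, which suggests the same handful of actions is handed to the poller again each minute without ever coming back.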
sjose1x changed the title from "Poller is not picking up the Queued checks, the Host and Service checks are getting timed out." to "Poller is not picking up the Queued tasks, the Host and Service checks are getting timed out." on Mar 14, 2024
I could narrow the issue down: it appears that multiprocessing does not work inside the Docker container. The Poller uses multiprocessing to run its workers, so I need to figure out how to mitigate this.
The workers are created, but the call to work.start() gets stuck, and there is no progress from there on.
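One way to check this independently of Shinken is to run a bare multiprocessing round trip inside the Poller container. The sketch below is not Shinken's worker code; `worker` and `check_worker_roundtrip` are hypothetical names, and the script is written for Python 3 (Shinken 2.4.3 itself runs on Python 2, so it would need small changes there). It uses the `fork` start method, which is what Python 2 uses on Linux.

```python
import multiprocessing
import queue
import time


def worker(task_queue, result_queue):
    """Take one task from the queue and send back a result."""
    task = task_queue.get()
    result_queue.put("done: %s" % task)


def check_worker_roundtrip(timeout=10):
    """Start one worker process and return (ok, seconds_in_start, result)."""
    ctx = multiprocessing.get_context("fork")
    task_queue = ctx.Queue()
    result_queue = ctx.Queue()
    proc = ctx.Process(target=worker, args=(task_queue, result_queue))

    # Time the start() call itself, since that is where the hang was seen.
    t0 = time.time()
    proc.start()
    start_seconds = time.time() - t0

    task_queue.put("ping")
    try:
        result = result_queue.get(timeout=timeout)
        ok = True
    except queue.Empty:
        # The worker never answered within the timeout.
        result = None
        ok = False

    proc.join(timeout)
    if proc.is_alive():
        proc.terminate()
    return ok, start_seconds, result


if __name__ == "__main__":
    ok, start_seconds, result = check_worker_roundtrip()
    print("start() returned after %.2fs" % start_seconds)
    print("round trip ok: %s, result: %r" % (ok, result))
```

If this script also hangs or takes very long in `start()` inside the container but runs fine on the host, the problem is likely in the container environment rather than in the Shinken configuration. If it completes quickly, the cause is more likely specific to how the Poller starts its workers.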
@geektophe Could you please help me with this? ;)