[Major Bug] Memory leak causing increased CPU/Memory usage over time on idle host #71
Comments
Sorry you are having issues, @mlehner616! Are you able to provide logs for the lifecycled instances? It's very odd that they are consuming so much CPU, and also that there are many processes running 🤔
My logs look completely normal. I ran pprof and have a suspicion that something in SpotListener is the culprit. I've added …, and I did find that SpotListener was creating a new instance of ….
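(For readers following along: the exact code isn't quoted in this thread, but a common Go pattern that produces this kind of slow leak is allocating a new ticker on every pass through a polling loop. Tickers created with `time.Tick` can never be stopped, so each one pins a runtime timer for the life of the process. The sketch below shows that leaky shape with illustrative names; it is not lifecycled's actual code.)

```go
package main

import (
	"fmt"
	"time"
)

// pollForTermination is an illustrative stand-in for a listener's polling
// loop (not lifecycled's actual code). Calling time.Tick inside the loop
// allocates a brand-new ticker on every iteration, and tickers created with
// time.Tick can never be stopped, so they accumulate for as long as the
// process runs.
func pollForTermination(stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		case <-time.Tick(5 * time.Second): // leaks one ticker per iteration
			fmt.Println("checking for a termination notice...")
		}
	}
}

func main() {
	stop := make(chan struct{})
	go pollForTermination(stop)
	time.Sleep(30 * time.Second)
	close(stop)
}
```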
I'm also going to patch it with my branch on one instance and leave …. Other than this issue, this package works like a dream, so 👍.
I am seeing higher-than-expected CPU usage on one of our longer-lived instances. I'm running a patched version of lifecycled with pprof hooks and will see how it goes. At a guess, this is going to be the CloudWatch logging code.
I'm seeing the same upwards trajectory in CPU utilisation over time on a couple of instances running lifecycled v3.0.1 (not using …). According to the docs here, I think @mlehner616 is correct about our use of ….
Added a PR with the tiny fix discussed above.
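(The PR diff isn't reproduced in this copy. The sketch below shows the usual fix for this shape of leak, creating a single ticker outside the loop and stopping it when the listener shuts down; this is my reading of the "tiny fix" mentioned above, not the actual patch.)

```go
package main

import (
	"fmt"
	"time"
)

// Same illustrative polling loop as the earlier sketch, with the ticker
// hoisted out of the loop. One ticker is allocated for the lifetime of the
// loop and released via Stop when the function returns.
func pollForTermination(stop <-chan struct{}) {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			fmt.Println("checking for a termination notice...")
		}
	}
}

func main() {
	stop := make(chan struct{})
	go pollForTermination(stop)
	time.Sleep(30 * time.Second)
	close(stop)
}
```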
Is the CPU usage perhaps too low in the last screenshot? It only peaks at 0.03% CPU usage when we have timers, looping API requests, and all log entries being dumped directly to CloudWatch. Ref my comments in #74, I don't understand how ….
Yeah, I was a bit surprised myself about this one 🤗, I'll keep digging. I have a couple more things to try, and I'm starting to wonder if I accidentally uncovered something going on with my version of journald itself.
Might it be possible to run a CPU trace and see what the culprit is, @mlehner616? (https://blog.golang.org/profiling-go-programs)
The new trace stuff is kind of amazing too: https://medium.com/@cep21/using-go-1-10-new-trace-features-to-debug-an-integration-test-1dc39e4e812d
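(The exact pprof hooks patched into lifecycled for this thread aren't shown, but the standard way to get a CPU profile or an execution trace out of a long-running Go daemon is to expose the net/http/pprof endpoints on a local port, roughly like this:)

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
	// Serve the profiling endpoints on localhost only; the daemon's real
	// work would continue elsewhere in the program.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	select {} // stand-in for the daemon's main loop
}
```

With that in place, `go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30` captures a 30-second CPU profile, and `curl -o trace.out 'http://localhost:6060/debug/pprof/trace?seconds=5' && go tool trace trace.out` produces the execution-trace view described in the post above.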
I fired up …. How are things looking on your end, @mlehner616?
Having run the example above for ~4 days now, I can't see any increased CPU usage. I think #72 has solved this, and that we can consider this issue closed without any further code changes, or are you seeing something else on your end @mlehner616?
I've confirmed at this point that #72 and #73 together resolve this issue. I've also confirmed that adding …. Looking forward to a 3.0.2 release.
Thanks @mlehner616 and @itsdalmo, putting out a 3.0.2 release now!
Environment:
I have two completely idle EC2 Amazon Linux 2 instances, with lifecycled v3.0.1 and Docker installed but no containers running. As a control, one instance is without lifecycled installed but still has Docker installed with no containers running.
Behavior:
See attached metrics
htop snapshot from the affected instance, which shows heavy CPU/memory usage by lifecycled.
Command line used in systemd unit:
The instance with lifecycled running gradually consumes more memory and CPU over time. The control instance with just Docker installed remains idle/flat over time.
Expected behavior:
Instance CPU and memory usage should remain mostly idle/flat over time.