Sumo Collector spams NoSuchContainer to logs #96

Open
cptera opened this issue Nov 1, 2021 · 5 comments

@cptera

cptera commented Nov 1, 2021

We deploy the sumologic/collector:latest image in our customers' production environment to collect logs for our product (a set of running containers) and send them to our Sumo Logic account. Over the last few months we've noticed a lot of log spam caused by the Sumo Logic collector throwing errors like:

jvm 1 | Exception in thread "onError:DockerLogInput:000000005BF8C7E3:'Docker-logs':cee113241350c24c5fbc416b8e953dd441c3aa957fb058d0597a554fea98c2db:connector_healthcheck.1.r6cze5gx0o50en9a713u7h860:1040812" java.lang.RuntimeException: com.github.dockerjava.api.exception.NotFoundException: No such container: cee113241350c24c5fbc416b8e953dd441c3aa957fb058d0597a554fea98c2db

From our investigation, this mostly happens with collector 19.351-4, but it also happens with 19.361-4. If I had to guess, it may be due to the switch from the forked docker-java dependency to the open-source one, as listed in the release notes for https://github.com/SumoLogic/sumologic-collector-docker/releases/tag/v19.351-4

Our main issue is that this uses up a lot of our Sumo Logic quota; log collection itself seems to be functioning fine.
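
To gauge how much of the quota this noise accounts for, the error lines can be tallied per container ID. The following is only a minimal sketch: it assumes the collector's output has been exported as plain-text lines in the format shown above, and the function name `count_no_such_container` is hypothetical, not part of any Sumo Logic or Docker tooling.

```python
import re
from collections import Counter

# Matches the 64-hex-character container ID at the end of the
# "No such container" message shown in the log line above.
NO_SUCH_CONTAINER = re.compile(
    r"NotFoundException: No such container: ([0-9a-f]{64})"
)


def count_no_such_container(lines):
    """Return a Counter mapping container ID -> number of matching error lines."""
    counts = Counter()
    for line in lines:
        match = NO_SUCH_CONTAINER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file (or any iterable of strings) to the function and calling `.most_common()` on the result lists the container IDs that generate the most spam.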

@maimaisie
Collaborator

@yuting-liu can you take a look? This is suspected to be related to the removal of the forked docker-java. Thanks

@yuting-liu
Contributor

yuting-liu commented Nov 1, 2021

@cptera the error itself seems to be related to the container not being restarted properly. Did you happen to see any errors in the container log? It might include more details about the error.

@cptera
Author

cptera commented Nov 1, 2021

Hmm, you're right, it looks like the container connector_healthcheck.1.r6cze5gx0o50en9a713u7h860 restarted. But its logs are still being collected after the restart, so I don't think the issue is with Docker or with the container itself failing to restart.

@cptera
Author

cptera commented Nov 1, 2021

To clarify, we deploy everything in a Docker swarm, and we designed our containers to restart on certain errors; in this case the container restarted on an error that it should restart on. We've been running it this way for several years, and we don't make major changes very often.

@cptera
Author

cptera commented Dec 13, 2021

With the log4j bug, our fix of "just using an older version" is no longer viable.
