Helm chart logsCollection Failed to process entry #718
Comments
@jonaskello please share the log that caused the error. |
Yes, the whole file is quite big, but here is a sample of it:
|
It appears your timestamps have too many digits after the |
The logs are from a worker node in the k8s cluster where the chart is deployed. The worker node has this config:
So the logs are from containerd. These are the date/time settings on the node (not sure if they are relevant):
|
The regex for containerd logs, which is here, seems to assume
So this string does not match:
Because of the
|
Interesting. From the original error we can see that the filelog receiver routed the log to

    routes:
      - output: parser-docker
        expr: 'body matches "^\\{"'
      - output: parser-crio
        expr: 'body matches "^[^ Z]+ "'
      - output: parser-containerd
        expr: 'body matches "^[^ Z]+Z"'

So it is expecting containerd logs to use
I believe there are a couple possibilities here:
The cri-o parser would be able to handle your timezone offset if the millisecond precision matched.
@djaglowski @dmitryax do you see anything out of the ordinary with our operators?
opentelemetry-helm-charts/charts/opentelemetry-collector/templates/_config.tpl, lines 195 to 229 in dcdcfaf
|
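To make the routing above concrete, here is a small Go sketch that applies the three route expressions, in the order listed, to a few log lines. The sample lines are made up for illustration, the `classify` helper is hypothetical, and the first-match-wins ordering is an assumption about how the router behaves rather than something quoted from the chart.

```go
package main

import (
	"fmt"
	"regexp"
)

// route pairs an output name with the pattern from the routes list above.
type route struct {
	output  string
	pattern *regexp.Regexp
}

// Patterns copied from the routes list; evaluation order is an assumption.
var routes = []route{
	{"parser-docker", regexp.MustCompile(`^\{`)},
	{"parser-crio", regexp.MustCompile(`^[^ Z]+ `)},
	{"parser-containerd", regexp.MustCompile(`^[^ Z]+Z`)},
}

// classify returns the output of the first route whose pattern matches,
// or "unmatched" if none does.
func classify(line string) string {
	for _, r := range routes {
		if r.pattern.MatchString(line) {
			return r.output
		}
	}
	return "unmatched"
}

func main() {
	// Hypothetical sample lines, not taken from the reporter's node.
	samples := []string{
		`{"log":"hello\n","stream":"stdout"}`,
		"2023-04-03T10:15:30.123456789+02:00 stdout F hello",
		"2023-04-03T10:15:30.123456789Z stdout F hello",
	}
	for _, s := range samples {
		fmt.Printf("%-20s <- %s\n", classify(s), s)
	}
}
```

A line whose timestamp carries a numeric offset such as `+02:00` has no `Z` before the first space, so it lands on `parser-crio` even when the runtime is containerd. The route is chosen by the timestamp shape, not by the runtime name.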
While researching this issue I found this which may be of interest: https://github.com/kubernetes/design-proposals-archive/blob/main/node/kubelet-cri-logging.md
It gives this example:
Original found linked in this issue. |
Found this in one of the worker's logs: Something weird seems to be going on with the way containerd is creating the logs on our workers. For some entries, the fractional-second part of the timestamp has 8 digits instead of 9. So I think otel-collector is correct here: it is choosing the cri logging format, which is also what containerd is using. Got thrown off by the |
I researched the format for the
Comparing that to the format layout used in the helm chart here which is:
I could see they did not look exactly the same, so I switched the one in the otel collector config to match the one defined in the Go time package exactly, and now the error no longer appears. I have no clue why, but I'm happy it is working now :-) |
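One possible explanation, sketched in Go: in the `time` package, a layout whose fractional part is written with zeros requires exactly that many digits when parsing, while a fractional part written with nines (as in `time.RFC3339Nano`) accepts any number of digits and drops trailing zeros when formatting. A writer using the nines form would therefore emit 8 digits whenever the ninth is zero. The `strictLayout` constant below is a hypothetical stand-in for a zero-padded layout, since the chart's actual layout is not quoted in this thread, and the timestamps are invented.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// Hypothetical zero-padded layout: exactly 9 fractional digits required.
	strictLayout = "2006-01-02T15:04:05.000000000-07:00"
	// Layout defined by the Go time package: fractional digits are optional.
	nanoLayout = time.RFC3339Nano
)

// parses reports whether value can be parsed with the given layout.
func parses(layout, value string) bool {
	_, err := time.Parse(layout, value)
	return err == nil
}

// formatNano formats a fixed, made-up instant with the given nanoseconds
// using time.RFC3339Nano, in a zone two hours ahead of UTC.
func formatNano(nsec int) string {
	zone := time.FixedZone("", 2*60*60)
	return time.Date(2023, 4, 3, 10, 15, 30, nsec, zone).Format(nanoLayout)
}

func main() {
	// A trailing zero in the nanoseconds is trimmed on output.
	fmt.Println(formatNano(123456789))
	fmt.Println(formatNano(123456780))

	for _, ts := range []string{formatNano(123456789), formatNano(123456780)} {
		fmt.Printf("%s strict=%v nano=%v\n",
			ts, parses(strictLayout, ts), parses(nanoLayout, ts))
	}
}
```

Under these assumptions, the 8-digit timestamp fails only against the zero-padded layout, which would match the observation that the error appeared for some entries and disappeared after switching layouts.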
@jonaskello, thanks for digging into this. It looks like a change we should make in the chart. Using Using |
Made a PR in #721 with the change that worked for me. |
See related discussion and PR in open-telemetry/opentelemetry-helm-charts#718
* Update time layout for cri-o log format. Fixes #718
* bump chart version and generate examples
* Fix cri-o log format time layout. See related discussion and PR in open-telemetry/opentelemetry-helm-charts#718
* Create fix-cri-o-log-format-time-layout.yaml: Add changelog file
* Update fix-cri-o-log-format-time-layout.yaml: Remove external issue as it seems it does not validate
* Update fix-cri-o-log-format-time-layout.yaml: Add issue number
When using the Helm chart with this value enabled:
The following error is logged in the otel pods:
This seems related to #376, which was never resolved.