Segmentation fault after updating from version 1.16.3 to 1.16.4 #378
Customize Dockerfile like this:
Then build the image. Running the custom build may cause a SEGV.
|
Workaround: disable jemalloc in the customized container image by setting LD_PRELOAD to an empty value (LD_PRELOAD="").
|
MEMO: It seems that it crashed here:
def self.read_and_free_outstr(ptr)
  str = ptr.read_string
  LibC.free(ptr)
  str
end
LibC.free is called via read_and_free_outstr in:
def cursor
  out_ptr = FFI::MemoryPointer.new(:pointer, 1)
  if (rc = Native.sd_journal_get_cursor(@ptr, out_ptr)) < 0
    raise JournalError, rc
  end
  Journal.read_and_free_outstr(out_ptr.read_pointer)
end
It was assumed that out_ptr is allocated and should be freed. With jemalloc, this mechanism may not work as expected. |
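To illustrate the kind of failure described above, here is a toy model in plain Ruby. It assumes (this is an interpretation, not something confirmed in this thread) that the string is allocated by one allocator and released by another. ToyAllocator and read_and_free are hypothetical names invented for this sketch; no real FFI, libc, or jemalloc call is involved.

```ruby
# Toy model only: ToyAllocator is hypothetical and stands in for two
# independent heaps (for example jemalloc and the stock libc malloc).
class ToyAllocator
  class InvalidFree < StandardError; end

  attr_reader :name

  def initialize(name, base)
    @name = name
    @next_addr = base
    @live = {} # fake address => payload owned by this allocator
  end

  # Hand out a fake "address" and remember that this allocator owns it.
  def malloc(payload)
    addr = @next_addr
    @next_addr += 16
    @live[addr] = payload
    addr
  end

  def read(addr)
    @live.fetch(addr)
  end

  # Freeing an address this allocator never handed out is rejected,
  # loosely analogous to "free(): invalid pointer" or a SEGV.
  def free(addr)
    raise InvalidFree, "#{name}: #{addr} was not allocated here" unless @live.key?(addr)

    @live.delete(addr)
    nil
  end
end

# Same shape as read_and_free_outstr: copy the string, then free the buffer.
def read_and_free(addr, owner:, freer:)
  str = owner.read(addr).dup
  freer.free(addr)
  str
end

jemalloc = ToyAllocator.new("jemalloc", 0x1000)
libc     = ToyAllocator.new("libc", 0x9000)

# Allocate and free with the same allocator: succeeds.
cursor = jemalloc.malloc("example-cursor")
puts read_and_free(cursor, owner: jemalloc, freer: jemalloc)

# Allocate with one allocator, free with the other: rejected.
cursor = jemalloc.malloc("example-cursor")
begin
  read_and_free(cursor, owner: jemalloc, freer: libc)
rescue ToyAllocator::InvalidFree => e
  puts "invalid free: #{e.message}"
end
```

In a real process there is no such bookkeeping to raise a clean exception, which is why the symptom is a crash rather than an error message.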
would it make sense to open a bug report on ledbettj systemd-journal project? |
I got this issue as well when updating from 1.16.3 to 1.17.0. I'm rolling back to 1.16.3 instead of disabling jemalloc because it sounds like a memory bug that's probably still there, it's just that it crashes under jemalloc and not the stock malloc. It would be great if someone familiar with the code could open an issue in systemd-journal if that's where the problem is. |
There is a long-standing known bug: the combination of jemalloc and fluent-plugin-systemd causes a "free(): invalid" crash. The problematic code has been identified, but the root cause is not fixed yet.

There is a workaround for this: disable jemalloc explicitly. Setting LD_PRELOAD= (empty) stops jemalloc from being used.

If you want to use jemalloc, set it via env like this:

    containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-forward
        env:
          - name: K8S_NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: FLUENT_FORWARD_HOST
            value: "REMOTE_ENDPOINT"
          - name: FLUENT_FORWARD_PORT
            value: "18080"
          - name: LD_PRELOAD
            value: "/usr/lib/libjemalloc.so.2"

Related issues:
fluent/fluentd-docker-image#378
fluent/fluent-package-builder#369
fluent-plugins-nursery/fluent-plugin-systemd#110
ledbettj/systemd-journal#93
fluent#1478

Signed-off-by: Kentaro Hayashi <[email protected]>
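When applying the LD_PRELOAD workaround, it can help to confirm what the process actually sees. The following is a minimal sketch: jemalloc_preloaded? is a hypothetical helper written for this thread, not part of Fluentd or any plugin. It only inspects the environment variable, so it does not prove which allocator is really in use.

```ruby
# Hypothetical helper: reports whether an LD_PRELOAD value mentions jemalloc.
# Entries in LD_PRELOAD may be separated by colons or spaces.
def jemalloc_preloaded?(ld_preload)
  return false if ld_preload.nil?

  ld_preload.split(/[:\s]+/).any? do |path|
    File.basename(path).start_with?("libjemalloc")
  end
end

puts "jemalloc preloaded: #{jemalloc_preloaded?(ENV["LD_PRELOAD"])}"
```

With LD_PRELOAD="" (the workaround) this prints false. With a value such as the /usr/lib/libjemalloc.so.2 path from the manifest above it prints true.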
I tried a more recent version of jemalloc to investigate this SEGV.
The problem still reproduces.
|
I have probably found the reason.
The systemd-journal gem calls libc's free (LibC.free in the code below):
# some sd_journal_* functions return strings that we're expected to free
# ourselves. This function copies the string from a char* to a ruby string,
# frees the char*, and returns the ruby string.
def self.read_and_free_outstr(ptr)
  str = ptr.read_string
  LibC.free(ptr)
  str
end
When jemalloc is used,
There was a pull request that fixes this issue:
It looks good, and the gem author also seems positive about this patch. |
It seems that it will not crash anymore. Test case: changed to use ptr.free(ptr) in systemd-journal
|
I've created a PR for upstream. |
Checking the alternative implementation in ledbettj/systemd-journal#97.
.so is installed under:
Instead, shim/shim succeeds.
|
It should be:
diff --git a/ext/shim/extconf.rb b/ext/shim/extconf.rb
index 94abd76..a53b749 100644
--- a/ext/shim/extconf.rb
+++ b/ext/shim/extconf.rb
@@ -7,4 +7,4 @@ require "mkmf"
 # selectively, or entirely remove this flag.
 append_cflags("-fvisibility=hidden")
-create_makefile("shim/shim")
+create_makefile("systemd/journal/shim")
|
Observing changes...
diff --git a/v1.17/debian/Dockerfile b/v1.17/debian/Dockerfile
index 4a245d1..43849c6 100644
--- a/v1.17/debian/Dockerfile
+++ b/v1.17/debian/Dockerfile
@@ -6,6 +6,8 @@ LABEL maintainer "Fluentd developers <[email protected]>"
LABEL Description="Fluentd docker image" Vendor="Fluent Organization" Version="1.17.1"
ENV TINI_VERSION=0.18.0
+COPY systemd-journal-1.4.2.1.gem /fluentd/
+
# Do not split this into multiple RUN!
# Docker creates a layer for every RUN-Statement
# therefore an 'apt-get purge' has no effect
@@ -24,6 +26,10 @@ RUN apt-get update \
&& gem install async -v 1.32.1 \
&& gem install async-http -v 0.64.2 \
&& gem install fluentd -v 1.17.1 \
+ && gem install ffi \
+ && gem install --local /fluentd/systemd-journal-1.4.2.1.gem \
+ && gem install fluent-plugin-systemd \
+ && gem install fluent-plugin-watch-objectspace \
&& dpkgArch="$(dpkg --print-architecture | awk -F- '{ print $NF }')" \
&& wget -O /usr/local/bin/tini "https://github.com/krallin/tini/releases/download/v$TINI_VERSION/tini-$dpkgArch" \
&& wget -O /usr/local/bin/tini.asc "https://github.com/krallin/tini/releases/download/v$TINI_VERSION/tini-$dpkgArch.asc" \
@@ -53,7 +59,6 @@ RUN groupadd -r fluent && useradd -r -g fluent fluent \
&& mkdir -p /fluentd/etc /fluentd/plugins \
&& chown -R fluent /fluentd && chgrp -R fluent /fluentd
-
COPY fluent.conf /fluentd/etc/
COPY entrypoint.sh /bin/
The threshold is a bit strict (x1.1), so an error notification was observed to fire.
Running docker image with:
Checking objectspace with modified version of systemd-journal (jemalloc)
Configure fluent.conf using objectspace:
Run with
Checking objectspace with modified version of systemd-journal (without jemalloc)
|
There is a known bug: systemd-journal 1.4.2 and older do not work with a custom memory allocator such as jemalloc because allocated memory is handled inappropriately. As a result, it causes a SEGV.

This bug was frequently reported by users of the fluent-docker-image and fluentd-kubernetes-daemonset images. [1]

[1] fluent/fluentd-docker-image#378

Recently, this bug was fixed [2] and released as systemd-journal 2.0.0. Now we should switch to it.

[2] ledbettj/systemd-journal#97

NOTE: systemd-journal requires Ruby 3.0.0 or later, so we need to bump the base image to a version that provides Ruby 3.x. Use ubuntu:jammy for testing.

Signed-off-by: Kentaro Hayashi <[email protected]>
I've checked again with the fixed version (systemd-journal 2.0.0) integrated into the test container, with and without jemalloc. The attached logs show the same tendency in both cases. fix-with-jemalloc-2.0.0.log.gz So, it was resolved in systemd-journal 2.0.0. |
I've sent feedback asking to adopt systemd-journal 2.0.0. |
systemd-journal 2.0.0 fixes segmentation fault with jemalloc memory allocator. fluent-plugin-systemd 1.1.0 adopts systemd-journal 2.0.0 or later. fluent/fluentd-docker-image#378 fluent/fluentd-docker-image#385 fluent#1517 Signed-off-by: Kentaro Hayashi <[email protected]>
This issue was fixed via fluent-plugin-systemd 1.1.0. (which uses systemd-journal 2.0.0) Please use fluent-plugin-systemd 1.1.0. https://github.com/fluent-plugins-nursery/fluent-plugin-systemd/releases/tag/v1.1.0 |
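To check whether an image already carries the fixed versions named above, something like the following sketch could be run inside the container. The helper names (includes_fix?, installed_version) and the constants are hypothetical. Only the version thresholds, systemd-journal 2.0.0 and fluent-plugin-systemd 1.1.0, come from this thread.

```ruby
require "rubygems"

# Thresholds taken from this thread; the constant names are illustrative.
FIXED_SYSTEMD_JOURNAL = Gem::Requirement.new(">= 2.0.0")
FIXED_PLUGIN_SYSTEMD  = Gem::Requirement.new(">= 1.1.0")

# True when the given version string satisfies the requirement.
def includes_fix?(requirement, version_string)
  requirement.satisfied_by?(Gem::Version.new(version_string))
end

# Returns the installed version of a gem as a string, or nil if it is absent.
def installed_version(gem_name)
  Gem::Specification.find_by_name(gem_name).version.to_s
rescue Gem::LoadError
  nil
end

{
  "systemd-journal" => FIXED_SYSTEMD_JOURNAL,
  "fluent-plugin-systemd" => FIXED_PLUGIN_SYSTEMD
}.each do |gem_name, requirement|
  version = installed_version(gem_name)
  status =
    if version.nil?
      "not installed"
    elsif includes_fix?(requirement, version)
      "#{version} (includes the fix)"
    else
      "#{version} (older than the fixed release)"
    end
  puts "#{gem_name}: #{status}"
end
```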
Describe the bug
After updating the official Docker container from fluent/fluentd:v1.16.3 to fluent/fluentd:v1.16.4, I got a segmentation fault during startup, which ends in an endless restart loop.
Additionally I have the following gem modules installed
To Reproduce
just update and restart
Expected behavior
should not crash
Your Environment
Your Configuration
Your Error Log
Additional context
No response