RFE: provide realistic runAsNonRoot security context values for fluent-bit #330
Comments
I was able to get aws-for-fluent-bit running with the following permissions. I have not seen any issues yet but will let you know if I do. I was also unable to get it running as non-root, as it does not appear that fluent-bit can run unless it runs as user 0.
|
So running as a non-root user isn't an option at the moment? Can we confirm this? |
I'd love to be able to tune securityContext to run the process as non-root, most importantly as the non-root/nobody user present in the distroless image. Would this potential feature prevent fluent-bit from reading log files, or is there additional complexity I'm not aware of? |
I think I have been able to get Fluent Bit running as a non-root user AND still use a hostPath volume for the tail database and buffering. But I'd like some feedback on my approach in case I'm missing something. Implementing this required 3 sets of changes to the Fluent Bit Helm chart.
In my Fluent Bit configuration, I just pointed to the mounted volume in the

Fluent Bit has been running in this configuration for the last few hours without any problems as far as I can tell. Log messages are being collected and forwarded onto their destination (OpenSearch) with no obvious regression in the number of log messages processed. The Fluent Bit pod logs don't show any new ERROR or WARNING messages. I've SSH'ed onto the Kubernetes nodes and things look "right":

Hmmm, just noticed that the files within the FB storage directory are owned by user '3301' but the group is 'root'. I thought the

Anyone see something wrong about this approach? Any hidden things I may be missing?

NOTE: I'm working with Fluent Bit 2.2.2 and Fluent Bit Helm chart version 0.43.0.

@joebowbeer If you get some time, please give this a try and see if it works for you. |
Hello, could you please confirm whether the solution has undergone testing and validation? Or are there any other solutions for this issue? Thank you. |
@onap4105 I think I've tried something equivalent to this before, except I ran the chown command via ssh/exec and it did not work. |
Thank you @PettitWesley |
@PettitWesley I wonder if you ran into a timing issue: the pod has to be up and running before you can ssh/exec into it. Wouldn't Fluent Bit have already come up and failed (due to file permissions) before you ssh'ed in and had a chance to change the file permissions? Or is it possible that the issue was caused by differences between the AWS version of Fluent Bit and (non-AWS) Fluent Bit?

I continued to play around with my approach after posting this, and Fluent Bit continued to work as expected/desired for several days. I believe I was even able to remove the grant back of the FOWNER capability in the securityContext. So, from my week or two of testing, this approach seems to work. I've held off on moving to this in a more production-like environment, hoping to get some feedback, preferably validation (or clear evidence of problems), from the wider Fluent Bit community. It's always helpful to have someone completely new try things out.

@onap4105 I'm just a Fluent Bit user so I can't offer official support or validation. Give it a try and let us know whether it works in your use case. Thanks. |
@gsmith-sas Below are my changes and the initial results based on your suggestions. I am still verifying and understanding the outcomes. Please let me know if you have any advice. I used https://github.com/fluent/fluent-operator/releases/tag/v2.8.0
# initContainers test run as non root user
initContainers:
  - name: chowner-fb-storage
    image: registry.hub.docker.com/library/alpine:3.12.0
    command: ["chown", "3301:3301", "/fluent-bit"]
    securityContext:
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["all"]
        add: ["CHOWN"]
      runAsUser: 0
      runAsNonRoot: false
    volumeMounts:
      - name: positions
        mountPath: /fluent-bit
        # Note: I think this is hardcoded in the fluent-bit image, I use it instead of creating a new fb-storage.
Volumes:
  positions:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/fluent-bit/
    HostPathType:
$ helm install fluent-operator -n fluentbit ./fluent-operator/
W0430 21:57:57.912852 19520 warnings.go:70] unknown field "spec.securityContext.capabilities"
W0430 21:57:57.912852 19520 warnings.go:70] unknown field "spec.securityContext.privileged"
W0430 21:57:57.912852 19520 warnings.go:70] unknown field "spec.securityContext.readOnlyRootFilesystem"
Error: INSTALLATION FAILED: failed to refresh resource information: fluentbits.fluentbit.fluent.io "fluent-bit" not found
$ helm list -n fluentbit
NAME             NAMESPACE  REVISION  UPDATED                                  STATUS  CHART                  APP VERSION
fluent-operator  fluentbit  1         2024-04-30 21:57:43.0906769 -0400 EDT    failed  fluent-operator-2.8.0  2.8.0
$ kubectl get all -n fluentbit
NAME READY STATUS RESTARTS AGE
pod/fluent-bit-8sdnh 1/1 Running 0 9h
pod/fluent-bit-9xgm2 1/1 Running 0 9h
pod/fluent-bit-dtqw9 1/1 Running 0 9h
pod/fluent-bit-fdm9f 1/1 Running 0 9h
pod/fluent-bit-g54tw 1/1 Running 0 9h
pod/fluent-bit-t7dw9 1/1 Running 0 9h
pod/fluent-bit-vk27g 1/1 Running 0 9h
pod/fluent-bit-wlhvz 1/1 Running 0 9h
pod/fluent-bit-xx5g4 1/1 Running 0 9h
pod/fluent-operator-5d466549cb-s8cn6 1/1 Running 0 9h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/fluent-bit ClusterIP x.x.x.x <none> 2020/TCP 9h
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/fluent-bit 9 9 9 9 9 <none> 9h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/fluent-operator 1/1 1 1 9h
NAME DESIRED CURRENT READY AGE
replicaset.apps/fluent-operator-5d466549cb 1 1 1 9h
$ kubectl logs -n fluentbit fluent-bit-wlhvz
level=info time=2024-05-01T01:58:00Z msg="fluent-bit started"
Fluent Bit v2.2.2
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
____________________
< Fluent Bit v2.2.2
-------------------
\
\
\ __---__
_- /--______
__--( / \ )XXXXXXXXXXX\v.
.-XXX( O O )XXXXXXXXXXXXXXX-
/XXX( U ) XXXXXXX\
/XXXXX( )--_ XXXXXXXXXXX\
/XXXXX/ ( O ) XXXXXX \XXXXX\
XXXXX/ / XXXXXX \__ \XXXXX
XXXXXX__/ XXXXXX \__---->
---___ XXX__/ XXXXXX \__ /
\- --__/ ___/\ XXXXXX / ___--/=
\-\ ___/ XXXXXX '--- XXXXXX
\-\/XXX\ XXXXXX /XXXXX
\XXXXXXXXX \ /XXXXX/
\XXXXXX > _/XXXXX/
\XXXXX--__/ __-- XXXX/
-XXXXXXXX--------------- XXXXXX-
\XXXXXXXXXXXXXXXXXXXXXXXXXX/
""VXXXXXXXXXXXXXXXXXXV""
[2024/05/01 01:58:00] [ info] [fluent bit] version=2.2.2, commit=eeea396e88, pid=13
[2024/05/01 01:58:00] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/05/01 01:58:00] [ info] [cmetrics] version=0.6.6
[2024/05/01 01:58:00] [ info] [ctraces ] version=0.4.0
[2024/05/01 01:58:00] [ info] [input:systemd:systemd.0] initializing
[2024/05/01 01:58:00] [ info] [input:systemd:systemd.0] storage_strategy='memory' (memory only)
[2024/05/01 01:58:00] [ info] [input:tail:tail.1] initializing
[2024/05/01 01:58:00] [ info] [input:tail:tail.1] storage_strategy='memory' (memory only)
[2024/05/01 01:58:00] [error] [input:tail:tail.1] parser 'cri' is not registered
[2024/05/01 01:58:00] [ info] [filter:kubernetes:kubernetes.1] https=1 host=kubernetes.default.svc port=443
[2024/05/01 01:58:00] [ info] [filter:kubernetes:kubernetes.1] token updated
[2024/05/01 01:58:00] [ info] [filter:kubernetes:kubernetes.1] local POD info OK
[2024/05/01 01:58:00] [ info] [filter:kubernetes:kubernetes.1] testing connectivity with API server...
[2024/05/01 01:58:00] [ info] [filter:kubernetes:kubernetes.1] connectivity OK
[2024/05/01 01:58:00] [ info] [output:stdout:stdout.0] worker #0 started
[2024/05/01 01:58:00] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2024/05/01 01:58:00] [ info] [sp] stream processor started
$ id
uid=3301 gid=0(root) groups=0(root),3301
$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
3301 1 0.0 0.0 711144 11944 ? Ssl 01:58 0:00 /fluent-bit/bin/fluent-bit-watcher
3301 13 0.2 0.0 120000 45676 ? Sl 01:58 1:24 /fluent-bit/bin/fluent-bit --enable-hot-reload -c /fluent-bit/etc/f3301
$ ls -lrt / | grep fluent
drwxr-xr-x 1 root root 4096 May 1 01:57 fluent-bit
$ ls -lrt /fluent-bit
total 16
drwxr-xr-x 2 root root 4096 Jan 14 16:22 log
drwxr-xr-x 1 root root 4096 Feb 18 07:53 etc
drwxr-xr-x 1 root root 4096 Feb 18 07:53 bin
drwxrwsrwt 3 root 3301 180 May 1 01:57 config
drwxr-xr-x 2 3301 3301 4096 May 1 01:57 tail
$ ls -lrt ./tail
total 4084
-rw-r--r-- 1 3301 root 8192 May 1 01:58 systemd.db
-rw-r--r-- 1 3301 root 16384 May 1 11:22 pos.db
-rw-r--r-- 1 3301 root 32768 May 1 12:21 pos.db-shm
-rw-r--r-- 1 3301 root 4120032 May 1 12:21 pos.db-wal
/var/lib# ls -lrt | grep fluent
drwxr-xr-x 2 3301 3301 4096 May 1 01:57 fluent-bit
/var/lib/fluent-bit# ls -lrt
total 4088
-rw-r--r-- 1 3301 root 8192 May 1 01:57 systemd.db
-rw-r--r-- 1 3301 root 24576 May 1 02:04 pos.db
-rw-r--r-- 1 3301 root 32768 May 1 02:05 pos.db-shm
-rw-r--r-- 1 3301 root 4120032 May 1 02:05 pos.db-wal
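
For anyone repeating this test, the ownership check shown in the listings above can be scripted rather than read off `ls` output by eye. This is only a minimal sketch, assuming `stat -c` is available (GNU coreutils or BusyBox); `check_owner` is a hypothetical helper name and is not part of Fluent Bit or either Helm chart.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if DIR is owned by EXPECTED_UID.
# Assumes `stat -c '%u'` works (GNU coreutils and BusyBox both support it).
check_owner() {
    dir=$1
    expected_uid=$2
    # Read the numeric owner uid of the directory; return 2 if stat fails.
    actual_uid=$(stat -c '%u' "$dir") || return 2
    if [ "$actual_uid" = "$expected_uid" ]; then
        echo "OK: $dir is owned by uid $actual_uid"
    else
        echo "MISMATCH: $dir is owned by uid $actual_uid, expected $expected_uid"
        return 1
    fi
}

# Example (run on the node, or in the pod against /fluent-bit/tail):
#   check_owner /var/lib/fluent-bit 3301
```

It only checks the directory owner, which is what the init container's `chown` changes; it says nothing about the group of the files Fluent Bit creates afterwards, which in the listings above is still `root`.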
|
Provide realistic values for running fluent-bit as a non-root user.
The security context comments in values.yaml are not usable:
Issues:
nonroot user id (65532:65532).
Related: