The default value for rotate_wait in the tail input plugin does not work on s390x #8828

Closed
rightblank opened this issue May 16, 2024 · 2 comments


rightblank commented May 16, 2024

Bug Report

Describe the bug

On s390x, the rotate_wait property of the tail input plugin defaults to 0 rather than 5 as specified in the documentation.

  • Fluent Bit works fine on x86 with the same docker command shown in the Steps to reproduce the problem section.
  • Specifying the value in the configuration file avoids the issue, but it is unknown how many other configs have the same issue.

To Reproduce

  • Rubular link if applicable: None
  • Example log message if applicable:
docker run -ti --rm cr.fluentbit.io/fluent/fluent-bit:3.0.3 /fluent-bit/bin/fluent-bit -i tail  -p path=/var/log/syslog -o stdout
Fluent Bit v3.0.3
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io/

___________.__                        __    __________.__  __          ________  
\_   _____/|  |  __ __   ____   _____/  |_  \______   \__|/  |_  ___  _\_____  \ 
 |    __)  |  | |  |  \_/ __ \ /    \   __\  |    |  _/  \   __\ \  \/ / _(__  < 
 |     \   |  |_|  |  /\  ___/|   |  \  |    |    |   \  ||  |    \   / /       \
 \___  /   |____/____/  \___  >___|  /__|    |______  /__||__|     \_/ /______  /
     \/                     \/     \/               \/                        \/ 

[2024/05/15 05:40:55] [ info] [fluent bit] version=3.0.3, commit=3529bbb132, pid=1
[2024/05/15 05:40:55] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/05/15 05:40:55] [ info] [cmetrics] version=0.9.0
[2024/05/15 05:40:55] [ info] [ctraces ] version=0.5.1
[2024/05/15 05:40:55] [ info] [input:tail:tail.0] initializing
[2024/05/15 05:40:55] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2024/05/15 05:40:55] [error] [input:tail:tail.0] invalid 'rotate_wait' config value
[2024/05/15 05:40:55] [error] failed initialize input tail.0
[2024/05/15 05:40:55] [error] [engine] input initialization failed
  • Steps to reproduce the problem:

Start the fluent-bit:3.0.3 docker image on an s390x Ubuntu machine with the command below:

docker run -ti --rm cr.fluentbit.io/fluent/fluent-bit:3.0.3 /fluent-bit/bin/fluent-bit -i tail  -p path=/var/log/syslog -o stdout

Expected behavior

Screenshots

Your Environment

  • Version used: 3.0.3
  • Configuration: None,
  • Environment name and version (e.g. Kubernetes? What version?): docker
$  docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.10.5)
  compose: Docker Compose (Docker Inc., v2.18.1)

Server:
 Containers: 7
  Running: 3
  Paused: 0
  Stopped: 4
 Images: 149
 Server Version: 20.10.21
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 
 runc version: 
 init version: 
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.15.0-213-generic
 Operating System: Ubuntu 18.04.6 LTS
 OSType: linux
 Architecture: s390x
 CPUs: 8
 Total Memory: 15.54GiB
 Name: pok1-qz1-sr1-rk010-s20
 ID: SLPK:Q4W4:PPSB:HKYT:7GFN:TLQK:FXED:DN7G:FGKH:NHZP:RZ4N:4KWL
 Docker Root Dir: /data-vol/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  192.168.0.18:5000
  de.icr.io
  marina.eden.cloudlab.austin.ibm.com:5000
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
  • Server type and version: z14

  • Operating System and version:

$  uname -a
Linux pok1-qz1-sr1-rk010-s20 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:29:31 UTC 2023 s390x s390x s390x GNU/Linux
$  cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.6 LTS"
  • Filters and plugins: None

Additional context


rightblank commented May 31, 2024

It turns out to be an endianness issue; the change below in src/flb_config_map.c fixes it.

--- a/src/flb_config_map.c
+++ b/src/flb_config_map.c
@@ -649,7 +649,7 @@ int flb_config_map_set(struct mk_list *properties, struct mk_list *map, void *co
         }
         else if (m->type == FLB_CONFIG_MAP_TIME) {
             m_i_num = (int *) (base + m->offset);
-            *m_i_num = m->value.val.s_num;
+            *m_i_num = m->value.val.i_num;
         }

The root cause is that

  • rotate_wait is stored as a 4-byte int in the config_map.
  • value.val.s_num reads that same storage as an 8-byte size_t, so on big-endian systems like s390x the value read is effectively num << 32.
  • when that value is assigned back to the int *m_i_num, only the lower 4 bytes, which are all 0, are kept, so the result is 0.
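The three steps above can be reproduced on any host with a minimal sketch that models the 8-byte slot as a byte array instead of using the real Fluent Bit union. The helper names (store_u32, load_u64, read_time_via_s_num) are hypothetical and exist only for this illustration; the big_endian flag selects which byte order is simulated.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a 32-bit value into the first 4 bytes of an
 * 8-byte slot, as a 4-byte int member at offset 0 would be laid out. */
static void store_u32(uint8_t slot[8], uint32_t v, int big_endian)
{
    memset(slot, 0, 8);
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? (24 - 8 * i) : (8 * i);
        slot[i] = (uint8_t) (v >> shift);
    }
}

/* Hypothetical helper: read all 8 bytes back as one 64-bit value, as an
 * 8-byte size_t member at the same offset would be read. */
static uint64_t load_u64(const uint8_t slot[8], int big_endian)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? (56 - 8 * i) : (8 * i);
        v |= (uint64_t) slot[i] << shift;
    }
    return v;
}

/* Store as a 4-byte int, read as an 8-byte value, then truncate to int:
 * the same sequence as "*m_i_num = m->value.val.s_num". */
static int read_time_via_s_num(int value, int big_endian)
{
    uint8_t slot[8];

    store_u32(slot, (uint32_t) value, big_endian);
    return (int) (uint32_t) load_u64(slot, big_endian);
}
```

With the default of 5, read_time_via_s_num(5, 0) keeps the value on a simulated little-endian layout, while read_time_via_s_num(5, 1) yields 0, which matches the invalid 'rotate_wait' error seen on s390x.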


rightblank commented May 31, 2024

A similar endianness issue occurs at flb_config_map.c#L778-L787:

  • plugins like systemd declare switch variables such as lowercase as int,
  • if flb_utils_bool(kv->val) returns 1 in the code below, executing *m_bool = ret sets only the first byte of the integer to 1; on big-endian systems the resulting integer is 1<<24, because the 3 remaining bytes sit to its right.
            else if (m->type == FLB_CONFIG_MAP_BOOL) {
                m_bool = (char *) (base + m->offset);
                ret = flb_utils_bool(kv->val);
                if (ret == -1) {
                    flb_error("[config map] invalid value for boolean property '%s=%s'",
                              m->name, kv->val);
                    return -1;
                }
                *m_bool = ret;
            }
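
The effect of writing through a char pointer into an int field can be sketched the same way, again with a byte array standing in for the plugin's int member. The function name int_after_char_write is hypothetical, and the sketch assumes the int field starts out zeroed.

```c
#include <stdint.h>

/* Hypothetical helper: model a zeroed 4-byte int whose first byte is set
 * through a char pointer (as "*m_bool = ret" does), then read the whole
 * int back in the selected byte order. */
static uint32_t int_after_char_write(int flag, int big_endian)
{
    uint8_t slot[4] = {0, 0, 0, 0};
    uint32_t v = 0;

    slot[0] = (uint8_t) flag;   /* only the first byte is touched */

    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? (24 - 8 * i) : (8 * i);
        v |= (uint32_t) slot[i] << shift;
    }
    return v;
}
```

On the simulated little-endian layout the int reads back as 1; on the big-endian layout it reads back as 1<<24 (16777216). That value is still non-zero, so a plain truthiness check would pass, but any comparison against exactly 1 would not.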
