[Help] default-telegraf-2ggl7 and default-host-5kbs7 fail to start #21956
Comments
The most likely cause is that the host service fails to start, which in turn keeps telegraf from starting.
lscpu
Update sys_info to use longtext: #22041
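To illustrate why widening the column helps, here is a minimal sketch that compares a payload size against the capacity of MySQL's text types. The issue does not state the previous type of `sys_info`, so treat the idea that it was a plain `TEXT` column as an assumption. The only number taken from the issue is the `Content-Length` of 169586 on the failing request, which is an upper bound on the size of `sys_info`, not its exact size. The helper name `smallest_text_type` is hypothetical.

```python
# Maximum payload sizes, in bytes, for MySQL's text column types.
MYSQL_TEXT_LIMITS = [
    ("TINYTEXT", 2**8 - 1),     # 255
    ("TEXT", 2**16 - 1),        # 65,535
    ("MEDIUMTEXT", 2**24 - 1),  # 16,777,215
    ("LONGTEXT", 2**32 - 1),    # 4,294,967,295
]


def smallest_text_type(payload_bytes: int) -> str:
    """Return the smallest MySQL text type that can hold payload_bytes bytes."""
    for name, limit in MYSQL_TEXT_LIMITS:
        if payload_bytes <= limit:
            return name
    raise ValueError("payload exceeds LONGTEXT capacity")


# Content-Length of the failing POST in the host log (upper bound on sys_info).
request_bytes = 169586
print(smallest_text_type(request_bytes))  # prints "MEDIUMTEXT"
```

A payload of that size would not fit in `TEXT`, so on a host with several hundred logical processors the serialized topology can plausibly overflow a smaller column; `longtext`, as used in #22041, leaves ample headroom.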
Version: ocboot-master-v3.11.9
host and telegraf fail to start
kubectl -n onecloud get pods default-telegraf-2ggl7 default-host-5kbs7
NAME READY STATUS RESTARTS AGE
default-telegraf-2ggl7 0/1 Init:CrashLoopBackOff 25 (100s ago) 104m
default-host-5kbs7 2/3 Running 1 (2m47s ago) 7m47s
`Name: default-telegraf-d2wqh
Namespace: onecloud
Priority: 0
Service Account: onecloud-operator
Node: h100/192.168.50.198
Start Time: Sat, 11 Jan 2025 07:31:08 +0000
Labels: app=telegraf
app.kubernetes.io/component=telegraf
app.kubernetes.io/instance=onecloud-cluster-2b9b
app.kubernetes.io/managed-by=onecloud-operator
app.kubernetes.io/name=onecloud-cluster
controller-revision-hash=56c4c685f8
pod-template-generation=2
Annotations: onecloud.yunion.io/last-applied-configuration:
{"volumes":[{"name":"etc-telegraf","hostPath":{"path":"/etc/telegraf","type":"DirectoryOrCreate"}},{"name":"root","hostPath":{"path":"/","...
Status: Pending
IP: 192.168.50.198
IPs:
IP: 192.168.50.198
Controlled By: DaemonSet/default-telegraf
Init Containers:
telegraf-init:
Container ID: containerd://aca77256a0f40a3ddccb375108df6786e8ff6577c24c286ae02ab735011970df
Image: registry.cn-beijing.aliyuncs.com/yunionio/telegraf-init:release-1.19.2-0
Image ID: registry.cn-beijing.aliyuncs.com/yunionio/telegraf-init@sha256:dbda0b59b2506e76fd33547de7e13bc701b6571b4134485d1d96493b269b770e
Port:
Host Port:
Command:
/bin/telegraf-init
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sat, 11 Jan 2025 07:31:22 +0000
Finished: Sat, 11 Jan 2025 07:31:22 +0000
Ready: False
Restart Count: 2
Environment:
NODENAME: (v1:spec.nodeName)
INFLUXDB_URL: https://default-influxdb:30086
Mounts:
/etc/telegraf from etc-telegraf (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rjdfk (ro)
Containers:
telegraf:
Container ID:
Image: registry.cn-beijing.aliyuncs.com/yunionio/telegraf:release-1.19.2-9
Image ID:
Port:
Host Port:
Args:
/usr/bin/telegraf
-config
/etc/telegraf/telegraf.conf
-config-directory
/etc/telegraf/telegraf.d
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
HOST_ETC: /hostfs/etc
HOST_PROC: /hostfs/proc
HOST_SYS: /hostfs/sys
HOST_VAR: /hostfs/var
HOST_RUN: /hostfs/run
HOST_MOUNT_PREFIX: /hostfs
Mounts:
/etc/telegraf from etc-telegraf (rw)
/hostfs from root (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rjdfk (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
etc-telegraf:
Type: HostPath (bare host directory volume)
Path: /etc/telegraf
HostPathType: DirectoryOrCreate
root:
Type: HostPath (bare host directory volume)
Path: /
HostPathType: Directory
kube-api-access-rjdfk:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors:
Tolerations: node-role.kubernetes.io/controlplane:NoSchedule
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
Normal Scheduled 36s default-scheduler Successfully assigned onecloud/default-telegraf-d2wqh to h100
Normal Pulled 23s (x3 over 36s) kubelet Container image "registry.cn-beijing.aliyuncs.com/yunionio/telegraf-init:release-1.19.2-0" already present on machine
Normal Created 23s (x3 over 36s) kubelet Created container telegraf-init
Normal Started 22s (x3 over 36s) kubelet Started container telegraf-init
Warning BackOff 8s (x3 over 34s) kubelet Back-off restarting failed container telegraf-init in pod default-telegraf-d2wqh_onecloud(0e215719-6673-4a92-872e-0c93906c371d)
`
host
[warning 250111 07:30:43 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument start-host-ignore-sys-error
[error 2025-01-11 07:31:02 fileutils2.GetAllBlkdevsIoSchedulers(fileutils.go:175)] no block device avaiable
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "cat " , error: exit status 1 , output: cat: '': No such file or directory
[error 2025-01-11 07:31:03 hostinfo.(*SHostInfo).detectOsDist(hostinfo.go:814)] exit status 1
[error 2025-01-11 07:31:03 hostinfo.(*SHostInfo).detectOsDist(hostinfo.go:826)] Failed to detect distribution info
[debug 2025-01-11 07:31:03 procutils.(*Command).Run(procutils.go:89)] Execute command "systemctl cat -- openvswitch" , error: exit status 1
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl status yunion-host-sdnagent" , error: exit status 4 , output: Unit yunion-host-sdnagent.service could not be found.
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled yunion-host-sdnagent" , error: exit status 1 , output: Failed to get unit file state for yunion-host-sdnagent.service: No such file or directory
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl status yunion-host-sdnagent" , error: exit status 4 , output: Unit yunion-host-sdnagent.service could not be found.
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled yunion-host-sdnagent" , error: exit status 1 , output: Failed to get unit file state for yunion-host-sdnagent.service: No such file or directory
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl status yunion-host-deployer" , error: exit status 4 , output: Unit yunion-host-deployer.service could not be found.
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled yunion-host-deployer" , error: exit status 1 , output: Failed to get unit file state for yunion-host-deployer.service: No such file or directory
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl status yunion-host-deployer" , error: exit status 4 , output: Unit yunion-host-deployer.service could not be found.
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled yunion-host-deployer" , error: exit status 1 , output: Failed to get unit file state for yunion-host-deployer.service: No such file or directory
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl status telegraf" , error: exit status 4 , output: Unit telegraf.service could not be found.
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled telegraf" , error: exit status 1 , output: Failed to get unit file state for telegraf.service: No such file or directory
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl status telegraf" , error: exit status 4 , output: Unit telegraf.service could not be found.
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled telegraf" , error: exit status 1 , output: Failed to get unit file state for telegraf.service: No such file or directory
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled openvswitch-switch" , error: exit status 1 , output: disabled
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled openvswitch-switch" , error: exit status 1 , output: disabled
[debug 2025-01-11 07:31:03 procutils.(*Command).Output(procutils.go:98)] Execute command "systemctl is-enabled openvswitch-switch" , error: exit status 1 , output: disabled
[error 2025-01-11 07:31:03 auth.(*authManager).startRefreshRevokeTokens(auth.go:193)] refreshRevokeTokens: No valid admin token credential
[debug 2025-01-11 07:31:04 procutils.(*Command).Output(procutils.go:98)] Execute command "which python" , error: exit status 1 , output:
.......
processors":[174,366],"total_threads":2},{"id":15,"index":75,"logical_processors":[175,367],"total_threads":2},{"id":40,"index":76,"logical_processors":[176,368],"total_threads":2},{"id":41,"index":77,"logical_processors":[177,369],"total_threads":2},{"id":42,"index":78,"logical_processors":[178,370],"total_threads":2},{"id":43,"index":79,"logical_processors":[179,371],"total_threads":2},{"id":44,"index":80,"logical_processors":[180,372],"total_threads":2},{"id":45,"index":81,"logical_processors":[181,373],"total_threads":2},{"id":46,"index":82,"logical_processors":[182,374],"total_threads":2},{"id":47,"index":83,"logical_processors":[183,375],"total_threads":2},{"id":72,"index":84,"logical_processors":[184,376],"total_threads":2},{"id":73,"index":85,"logical_processors":[185,377],"total_threads":2},{"id":74,"index":86,"logical_processors":[186,378],"total_threads":2},{"id":75,"index":87,"logical_processors":[187,379],"total_threads":2},{"id":76,"index":88,"logical_processors":[188,380],"total_threads":2},{"id":77,"index":89,"logical_processors":[189,381],"total_threads":2},{"id":78,"index":90,"logical_processors":[190,382],"total_threads":2},{"id":79,"index":91,"logical_processors":[191,383],"total_threads":2},{"id":0,"index":92,"logical_processors":[288,96],"total_threads":2},{"id":1,"index":93,"logical_processors":[289,97],"total_threads":2},{"id":2,"index":94,"logical_processors":[290,98],"total_threads":2},{"id":3,"index":95,"logical_processors":[291,99],"total_threads":2}],"distances":[32,10],"id":1,"memory":{"supported_page_sizes":[1073741824,2097152],"total_physical_bytes":826781204480,"total_usable_bytes":811614797824}}]},"version":"03"},"version":"release/3.11.9(60778a6a7724122408)"}: {"error":{"class":"UnclassifiedError","code":500,"details":"TxExec: Error 1406: Data too long for column 'sys_info' at row 
1","request":{"body":"{"host":{"meta":{"cpu_info":"{\"processors\":[{\"capabilities\":[\"fpu\",\"vme\",\"de\",\"pse\",....9(60778a6a7724122408)"}}","headers":{"Content-Length":"169586","Content-Type":"application/json","User-Agent":"yunioncloud-go/201708","X-Auth-Token":"","X-Yunion-Parent-Id":"","X-Yunion-Peer-Service-Name":"host","X-Yunion-Remote-Addr":"default-region:30888","X-Yunion-Span-Id":"0","X-Yunion-Span-Name":"","X-Yunion-Strace-Debug":"true","X-Yunion-Strace-Id":"c010d555"},"method":"POST","url":"https://default-region:30888/zones/21d90935-8db8-4f3a-87f0-60581c7e6052/hosts"}}}
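When scanning long host logs like the one above, the relevant part is the MySQL error embedded in the `details` field. The following is a rough sketch for pulling the error number, column, and row out of that string. The function and pattern names are hypothetical, and the sample string is copied from the response above rather than obtained from any API.

```python
import re

# Matches a MySQL "Data too long" error as it appears in the "details" field.
# \s+ tolerates line breaks introduced when the log is wrapped or pasted.
DATA_TOO_LONG = re.compile(
    r"Error\s+(?P<code>\d+):\s+Data too long for column\s+"
    r"'(?P<column>[^']+)'\s+at row\s+(?P<row>\d+)"
)


def parse_data_too_long(details: str):
    """Return (error_code, column, row), or None if the text does not match."""
    m = DATA_TOO_LONG.search(details)
    if m is None:
        return None
    return int(m["code"]), m["column"], int(m["row"])


details = "TxExec: Error 1406: Data too long for column 'sys_info' at row 1"
print(parse_data_too_long(details))  # prints (1406, 'sys_info', 1)
```

Error 1406 points at the database schema rather than at the host itself, which is consistent with the fix in #22041 changing the column type.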