embedded etcd: dropped internal Raft message since sending buffer is full (overloaded network) #3511

iameli · 2021-06-24T18:28:15Z

Environmental Info:
K3s Version:

# k3s -v
k3s version v1.20.6+k3s1 (8d043282)
go version go1.15.10

Node(s) CPU architecture, OS, and Version:
Linux dp4605 5.4.0-72-generic #80-Ubuntu SMP Mon Apr 12 17:35:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
Three servers, zero agents. Using embedded etcd, wireguard backend, embedded containerd runtime.

Describe the bug:
One of the servers in the cluster had a hardware failure. Since that happened, this message has been spammed over and over to the k3s journald logs on one of the two remaining machines:

Jun 24 18:20:04 dp4605 k3s[855297]: {"level":"warn","ts":"2021-06-24T18:20:04.739Z","caller":"rafthttp/peer.go:267","msg":"dropped internal Raft message since sending buffer is full (overloaded network)","message-type":"MsgHeartbeat","local-member-id":"d1c004e84980efe9","from":"d1c004e84980efe9","remote-peer-id":"865cdac463481cfd","remote-peer-active":false}

Digging through the rest of the logs for other pertinent messages, there's this, which makes sense considering the node is down:

Jun 24 18:20:03 dp4605 k3s[855297]: {"level":"warn","ts":"2021-06-24T18:20:03.847Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"865cdac463481cfd","rtt":"0s","error":"dial tcp REDACTED:2380: connect: no route to host"}
Jun 24 18:20:03 dp4605 k3s[855297]: {"level":"warn","ts":"2021-06-24T18:20:03.850Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"865cdac463481cfd","rtt":"0s","error":"dial tcp REDACTED:2380: connect: no route to host"}

The other

Steps To Reproduce:

Unsure yet. Presumably this happens when a node gets taken out of the cluster in some kind of unhealthy way?

brandond · 2021-06-24T18:32:38Z

These messages are coming from the embedded etcd; there's not really any way to turn them off. You will see these messages until the node comes back up, or is deleted from the cluster.
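
Since the messages cannot be silenced at the source, one workaround is to filter them out when reading the logs. The following is a minimal sketch, assuming the journald line format shown in the report above (a syslog-style prefix followed by a JSON payload). The function names are hypothetical and are not part of k3s or etcd.

```python
import json

# Message prefix taken from the log line quoted in the report.
DROPPED = "dropped internal Raft message since sending buffer is full"


def parse_etcd_line(line):
    """Return the JSON payload of a journald line as a dict, or None."""
    start = line.find("{")
    if start == -1:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None


def is_dead_peer_noise(line):
    """True for 'dropped internal Raft message' warnings aimed at an inactive peer."""
    entry = parse_etcd_line(line)
    if not isinstance(entry, dict):
        return False
    return (
        str(entry.get("msg", "")).startswith(DROPPED)
        and entry.get("remote-peer-active") is False
    )


def filter_lines(lines):
    """Drop the dead-peer warnings and keep every other line unchanged."""
    return [line for line in lines if not is_dead_peer_noise(line)]
```

Applied to the output of something like `journalctl -u k3s` (assuming the unit is named `k3s`), `filter_lines` would hide the heartbeat warnings for the unreachable peer while keeping the "prober detected unhealthy status" warnings, which still say why the peer is unreachable.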

stale · 2021-12-21T19:28:45Z

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

stale bot added the status/stale label Dec 21, 2021

stale bot closed this as completed Jan 4, 2022