wireguard timeout problem #187

Closed
dnlldl opened this issue Jan 17, 2025 · 4 comments

Comments

@dnlldl

dnlldl commented Jan 17, 2025

By default, with no timeout configured, all peers are considered active at all times, which is not suitable for most monitoring situations. When a timeout is set, peers are considered inactive once that timeout is reached, which allows for more accurate charts. However, we then get warnings and criticals whenever peers are not connected, which may not suit every situation.

Here is an example:

Image

In this case, I have one tunnel set up with 2 different peers. On the first service I can see 2 configured peers, 1 of which is considered connected. However, I'm getting a warning and a critical because the peers are inactive. In my situation, I want the peers to stay OK as long as they're configured; it is not a problem if a peer isn't connected.

A ruleset could be added where the state reported for a peer once its timeout is exceeded can be chosen per peer. For example, I could have the first peer always OK and the second peer go to WARN if the timeout is exceeded, because I expect that peer to be always on.
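To make the request concrete, here is a minimal sketch of the per-peer behaviour described above. It is plain Python, not Checkmk plugin code; the peer names, the `PEER_RULES` structure, and the `state_if_exceeded` parameter are all hypothetical and only illustrate the idea.

```python
# Conventional monitoring state codes.
OK, WARN, CRIT = 0, 1, 2

# Hypothetical per-peer rules: a timeout in seconds and the state to report
# once that timeout is exceeded. Names and values are made up for illustration.
PEER_RULES = {
    "peer1": {"timeout": 300, "state_if_exceeded": OK},    # may be offline
    "peer2": {"timeout": 300, "state_if_exceeded": WARN},  # expected always on
}


def peer_state(peer: str, seconds_since_handshake: float) -> int:
    """Return the state for a peer given the time since its last handshake."""
    rule = PEER_RULES.get(peer)
    if rule is None:
        # No rule for this peer: a configured peer simply stays OK.
        return OK
    if seconds_since_handshake <= rule["timeout"]:
        return OK
    return rule["state_if_exceeded"]
```

With these rules, `peer1` stays OK no matter how long it has been silent, while `peer2` turns to WARN once it has been silent for more than 300 seconds.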

@gurubert
Member

Then just set a timeout for the second peer.

WireGuard does not tell us whether a client is active or not. We only have the time of the last activity. Hence a timeout is needed if you want to determine whether a peer is still active.

We first had a global built-in timeout of 300 seconds. This did not suit every user, so a ruleset was introduced to set individual timeouts. Due to how Checkmk is implemented, the parameter is set for each individual service check. Hence the "Wireguard wg0" check has its own timeout parameters to determine activity.
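As a rough illustration of the timeout logic described here (a sketch, not the plugin's actual code): `wg show <interface> latest-handshakes` prints one line per peer with the public key and the Unix timestamp of the last handshake, 0 if there has been none. The function names below and the sample keys are assumptions made for the example.

```python
from typing import Dict, Optional


def parse_latest_handshakes(output: str) -> Dict[str, int]:
    """Parse 'public-key <whitespace> unix-timestamp' lines into a dict."""
    handshakes: Dict[str, int] = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, timestamp = parts
        handshakes[key] = int(timestamp)
    return handshakes


def peer_is_active(latest_handshake: int, timeout: Optional[int], now: float) -> bool:
    """Decide whether a peer counts as active.

    timeout=None models "no timeout configured": every configured peer
    is treated as active, as described at the top of this issue.
    """
    if timeout is None:
        return True
    if latest_handshake == 0:
        return False  # the peer has never completed a handshake
    return (now - latest_handshake) <= timeout


# Made-up sample output; real keys are base64 strings.
sample = "KEY_A\t1000\nKEY_B\t0\n"
peers = parse_latest_handshakes(sample)
active = {key: peer_is_active(ts, timeout=300, now=1200) for key, ts in peers.items()}
```

Because only the last handshake time is available, "active" is always a judgement relative to the chosen timeout, which is why the timeout ends up as a parameter of each service.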

There is not much we can do here.

@dnlldl
Author

dnlldl commented Jan 18, 2025

> Then just set a timeout for the second peer.

How? There is no service filter in the actual ruleset... it's either everything or nothing.

Image

Or is this just bugged somehow?

@dnlldl
Author

dnlldl commented Jan 18, 2025

Yeah OK, the field with the empty label between "Explicit hosts" and "Service labels" is for an "Item".

I tried "wg0" in there without success; the timeout still applied to all peers.

Also, clicking "Wireguard" or "Rule 1 in Main" leads to an error:

Image

Image

@dnlldl
Author

dnlldl commented Jan 18, 2025

Let me know if it's something specific to my install.
