Add cluster.peers refresh #383
Conversation
How does this impact …
@domgreen it does (https://github.com/improbable-eng/thanos/blob/master/cmd/thanos/query.go#L183), which looks for members already connected via gossip and tries to make a gRPC connection to those nodes (at least as I understand that code). So this means that once the DNS refresh happens, new members join via gossip, then later on …
This change makes sense, I just do not immediately know whether it is safe for the memberlist library to rejoin explicitly multiple times. Need to double-check - or maybe you know the answer?
Another way of improving this is to set "waitIfEmpty" to true (or put it behind a flag) for clusters.New for the other components that run alone, such as store. (: What do you think?
pkg/cluster/cluster.go
Outdated
```go
}

prev := p.mlist.NumMembers()
curr, err := p.mlist.Join(p.knownPeers)
```
It is not trivial here -> basically you join a member that has already joined. I need to double-check the consequences, but I am kind of worried that:
- the peer is not closed until you create a new one (join)
- even if we avoid the above, it will cause a little bit of disruption.

Have you looked closely at what the behaviour of this is for memberlist?
@Bplotka I've looked through the code and it won't close the peer. But you are right about the disruption, as it calls pushPullNode (https://github.com/hashicorp/memberlist/blob/master/memberlist.go#L225), which is a rather expensive operation, per the docs:

> pushPull is invoked periodically to randomly perform a complete state
> exchange. Used to ensure a high level of convergence, but is also
> reasonably expensive as the entire state of this node is exchanged
> with the other node.

I'll change it so that we only join new members, by resolving DNS and looping through the current members.
@domgreen So the following issue is possible here: we join the gossip cluster based on knownPeers, which comes from some DNS domain. Let's say all components start simultaneously. At the moment of startup, query resolved N components. But we could have N+10 components in the system that are already running; Kubernetes just has not caught up yet, so they are not yet resolvable. On the other hand, the not-yet-resolvable component cannot see query either, because query is not yet resolvable as well (it will be in some time). It's a tricky thing :/
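The startup race described above is why a one-shot join is not enough: if DNS is empty at the moment of joining, the node never learns about peers that become resolvable later. A minimal sketch of one mitigation is to retry resolution until at least one address appears. This is a hypothetical helper for illustration only (`resolveWithRetry` is not the Thanos API; the PR itself solves this with a periodic refresh instead):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// resolveWithRetry calls resolve until it returns at least one address
// or attempts are exhausted. Hypothetical helper, not Thanos code.
func resolveWithRetry(resolve func() []string, attempts int, wait time.Duration) ([]string, error) {
	for i := 0; i < attempts; i++ {
		if addrs := resolve(); len(addrs) > 0 {
			return addrs, nil
		}
		time.Sleep(wait)
	}
	return nil, errors.New("no peers resolved")
}

func main() {
	calls := 0
	// Simulate DNS being empty for the first two lookups, as happens
	// when Kubernetes has not yet published the pod IPs.
	resolve := func() []string {
		calls++
		if calls < 3 {
			return nil
		}
		return []string{"10.0.0.1:10900"}
	}
	addrs, err := resolveWithRetry(resolve, 5, time.Millisecond)
	fmt.Println(addrs, err) // prints [10.0.0.1:10900] <nil>
}
```

Retrying only helps at startup, though; a periodic refresh (as in this PR) also covers peers that appear long after the process has joined.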
Force-pushed from 42b6b7e to 4713645 (compare)
@Bplotka here is an example log after these changes were applied.
Minor nit, but looks nice.
pkg/cluster/cluster.go
Outdated
```go
}

if !isPeerFound {
	curr, err := p.mlist.Join([]string{peer})
```
Why not loop through resolvedPeers and currMembers to find the addresses that are not connected yet, and just do p.mlist.Join(notConnected)? Might be easier to grasp. What do you think?
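The suggested diff-and-join approach can be sketched as a pure helper that computes the set difference between the DNS-resolved addresses and the current memberlist members. This is a minimal illustration under assumed names (`peersToJoin`, and plain `[]string` address lists); the actual Thanos implementation in pkg/cluster/cluster.go differs:

```go
package main

import "fmt"

// peersToJoin returns the resolved peer addresses that are not yet
// members of the gossip cluster. Hypothetical helper illustrating the
// "join only new members" idea from the review.
func peersToJoin(resolved, members []string) []string {
	known := make(map[string]struct{}, len(members))
	for _, m := range members {
		known[m] = struct{}{}
	}
	var notConnected []string
	for _, p := range resolved {
		if _, ok := known[p]; !ok {
			notConnected = append(notConnected, p)
		}
	}
	return notConnected
}

func main() {
	resolved := []string{"10.0.0.1:10900", "10.0.0.2:10900", "10.0.0.3:10900"}
	members := []string{"10.0.0.1:10900", "10.0.0.2:10900"}
	fmt.Println(peersToJoin(resolved, members)) // prints [10.0.0.3:10900]
}
```

Joining only the result of such a diff avoids re-running the expensive pushPull state exchange against nodes that are already members.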
Yeah, makes sense.
pkg/cluster/cluster.go
Outdated
```diff
@@ -43,6 +45,7 @@ type Peer struct {
 const (
 	DefaultPushPullInterval = 5 * time.Second
 	DefaultGossipInterval   = 5 * time.Second
+	DefaultRefreshInterval  = 30 * time.Second
```
Maybe a minute for it? Or something a bit longer? Not sure.
Yeah, a minute should be fine. We are acquiring a lock (https://github.com/improbable-eng/thanos/pull/383/files#diff-1a365fc2cc3058f4efcd7c1457c216c2R239), so less frequent runs make sense. And it shouldn't increase startup time, as it takes a while for every component to join the cluster anyway.
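The "periodic job under a lock" pattern discussed here can be sketched with a ticker goroutine guarding shared peer state with a mutex. All names below (`Refresher`, `Run`) are hypothetical and only illustrate the shape of the refresh loop, not the actual Thanos code:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Refresher periodically re-resolves peers, holding a lock while it
// updates shared state. A minimal sketch of the periodic-refresh idea.
type Refresher struct {
	mu      sync.Mutex
	resolve func() []string // e.g. a DNS lookup of the known peers
	peers   []string
}

func (r *Refresher) refresh() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.peers = r.resolve()
}

// Run refreshes on every tick until stop is closed.
func (r *Refresher) Run(interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			r.refresh()
		case <-stop:
			return
		}
	}
}

func main() {
	r := &Refresher{resolve: func() []string { return []string{"10.0.0.1:10900"} }}
	stop := make(chan struct{})
	go r.Run(10*time.Millisecond, stop)
	time.Sleep(50 * time.Millisecond)
	close(stop)
	r.mu.Lock()
	fmt.Println(len(r.peers))
	r.mu.Unlock()
}
```

Because each tick takes the lock, a longer interval (a minute rather than 30 seconds) keeps contention low, which matches the reasoning above.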
Somewhat related: there have been discussions upstream in Prometheus/Alertmanager about reusing the Prometheus service discovery in Alertmanager for discovering memberlist peers. If that happens, we should reuse the same piece here.
Yeah, similar flow, I am up for it @brancz 👍 Can you link me to the PRs once there are some?
Thanks @povilasv!
* Add cluster.peers refresh
* Fix test
* Change to list current members and join only new ones
* Fixes after review
Changes
REF #372
Adds a job which runs periodically and refreshes peer state (re-queries the DNS server to get peer IP addresses).

The issue is that if at startup DNS has no records, "islands" of single Thanos members form and DNS never gets refreshed.

An example case where this happens is when you run Thanos in Kubernetes with liveness probes configured and restart all Thanos pods at once. On startup, no IPs will be added to DNS because the livenessProbe is not yet passing, and once membership.Join() happens it will never refresh DNS.

Verification

This is a log showing that the refresh actually works: before=2 after=4, meaning the Thanos store gateway was added and the cluster formed. In this case I explicitly set the refresh to happen after 5 minutes, to check that gossip does not heal itself on its own; only after the refresh happened did the cluster heal.