-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🏃 Proposal to extract cluster-specifics out of the Manager #1075
Conversation
`runnables` are started. | ||
|
||
|
||
The new `ClusterConnector` interface will look like this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideas for a better name are very welcome :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybePeerCluster
or ClusterPeer
? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding naming: What about just Cluster
? Its pretty much what this represents
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cluster seems fine (it's bugging me a little bit that it doesn't say that it's not super clear that this type doesn't set up a cluster, like envtest, but rather connects to a cluster, but don't consider that comment blocking)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cluster it is now
|
||
```go | ||
if cc, isClusterConnector:= runnable.(clusterconnector.ClusterConnector); isClusterConnector { | ||
m.caches = append(m.caches, cc.GetCache()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the whole thing a bit weird, because we will end up starting both the ClusterConnectors
Cache and the ClusterConnector
separately. The alternate would be to require ppl to do a mgr.Add(clusterConnector.GetCache)
which I would like to avoid because its unintuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably fine to do type assertion here, since this's internal implementation details.
The alternate would be to require ppl to do a mgr.Add(clusterConnector.GetCache) which I would like to avoid because its unintuitive.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we just do:
type HasCaches interface {
GetCache() ...
}
if getter, hasCaches := runnable.(hasCaches); hasCaches {
...
}
(could make a additional method to avoid accidentally conflicting too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
GetWebhookServer() *webhook.Server | ||
} | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to separate this into 3 interfaces? In my point of view, this will make the API clearer. In multicluster environments, you have to call MultiClusterManager#Connect
once for each connected cluster in contrast to have one implicit connection and having n-1
explicit connections.
type cluserconnector.ClusterConnector{
// same as above
...
}}
type MultiClusterManager interface {
// Create a new connection to a cluster. Creates a new instance of ClusterConnector and adds it to this manager
Connect(config *rest.Config, name string, opts ...Option) cluserconnector.ClusterConnector
// same as Manager from proposal except cluserconnector.ClusterConnector
Add(Runnable) error
Elected() <-chan struct{}
SetFields(interface{}) error
AddMetricsExtraHandler(path string, handler http.Handler) error
AddHealthzCheck(name string, check healthz.Checker) error
AddReadyzCheck(name string, check healthz.Checker) error
Start(<-chan struct{}) error
GetWebhookServer() *webhook.Server
}
type Manager interface {
cluserconnector.ClusterConnector
MultiClusterManager
}
Introducing a factory method Connect
inside MultiClusterManager
would also solve the problems of having to use type assertions (see comment of @alvaroaleman).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kramerul Generally I like the idea (although I would probably embedd the Manager
ito the MultiClusterManager
and not the other way round) but it opens up an interesting question: What config will we use for leader election?
Right now LeaderElection is the reason why it IMHO makes sense to define one primary
cluster which we implicitly do by requiring a config to construct a manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alvaroaleman, I haven't thought about this aspect. But it's absolutely correct, that you need to have a primary connection for leader election. In this case I would agree to have only one Manager
interface.
The Connect
method could ease the usage of the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of having an explicit interface for MultiCluster manager.
I would probably embedd the Manager ito the MultiClusterManager and not the other way round)
+1
A possible alternative is:
type ClusterConnector{
// same as above
}
// For single cluster use-case only
type Manager interface {
// same as above
}
type MultiClusterManager interface {
Manager
// Create a new connection to a secondary cluster. Creates a new instance of ClusterConnector and adds it to this manager. Each ClusterConnector should have its unique name.
ConnectSecondaryCluster(config *rest.Config, name string, opts ...Option) (cluserconnector.ClusterConnector, error)
// Look up the ClusterConnector by its unique name.
GetClusterConnectorFor(name string) (cluserconnector.ClusterConnector, error)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added it like this:
type MultiClusterManager interface {
Manager
// Add another named cluster to the MultiClusterManager. The name
// must be unique. The MultiClusterManager will wait for all clusters
// caches to be started before starting anytything else.
AddCluster(config *rest.Config, name string) error
// GetClusters returns all Clusters this MultiClusterManager knows about.
// The cluster used to construct the MultiClusterManager is named `primary`
// and will be used for LeaderElection, if enabled.
GetClusters() map[string]clusterconnector.ClusterConnector
}
In all the multi-cluster controllers I've built so far, the actual controller just wants to watch in all clusters but doesn't necessarily know the number or names of them ahead of time which is why a method that returns a map of all clusters is IMHO more useful. If there is one that gets a special treatment, this is still possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A number of the patterns I've seen for multicluster end up having clusters come & go over time. Even if we're not going to tackle that now, we should keep that in mind while designing.
I'm also not certain about the whole "primary" / "secondary" cluster thing. Perhaps modeling this as there's a "leader election" cluster and "functional" clusters (one of which may be the leader election cluster)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A number of the patterns I've seen for multicluster end up having clusters come & go over time. Even if we're not going to tackle that now, we should keep that in mind while designing.
IMHO this proposal should do nothing to prevent that but it does not need to do anything yet to simplify that. Once we want such a functionality, we would probably add a RemoveCluster
to the interface.
I'm also not certain about the whole "primary" / "secondary" cluster thing. Perhaps modeling this as there's a "leader election" cluster and "functional" clusters (one of which may be the leader election cluster)?
I guess we could probably change AddCluster
to have a UseForLeaderElection
bool opt if leader election is enabled, use that cluster (and error out when starting and note exactly one cluster is configured to be used for leader election). WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, after trying to update the sample, I would like to not make a MultiClusterManager
part of this proposal, because it opens up a set of questions I would like to not answer just yet:
- The
Builder
needs to be extended to have something likeWatchesFromCluster(source, "cluster-name", handler)
, otherwise we need to blindly dereference elements from themap[clusterName]cluster.Cluster
, potentially getting NPDs - The
Builder
needs to be extended to have something likeWatchesFromAllClustersExcept(source, "exepcted-cluster-name", handler)
- Probably more use cases for watches I didn't think about
- There should be a nice way to pass on all clients for all clusters to a controller that doesn't involve writing a loop (probably the multi cluster client abstraction that Solly suggested?)
- When adding official support for building a controller that watches an arbitraty number of clusters, we also need an official way of encoding the cluster name into the
reconcile.Request
And probably more.
IMHO, extracting the clusterspecifics out of the manager won't block any of the work mentioned above but allows us to start working on the topic without requiring us to already anticipate all use cases (And is already a big improvement over the status quo).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, fine with doing that in a follow-up
|
||
// SetFields will set any dependencies on an object for which the object has implemented the inject | ||
// interface - e.g. inject.Client. | ||
SetFields(interface{}) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what will happen when the embeded ClusterConnector
also has SetFields
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK this is fine as long as they have the same signature
GetWebhookServer() *webhook.Server | ||
} | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of having an explicit interface for MultiCluster manager.
I would probably embedd the Manager ito the MultiClusterManager and not the other way round)
+1
A possible alternative is:
type ClusterConnector{
// same as above
}
// For single cluster use-case only
type Manager interface {
// same as above
}
type MultiClusterManager interface {
Manager
// Create a new connection to a secondary cluster. Creates a new instance of ClusterConnector and adds it to this manager. Each ClusterConnector should have its unique name.
ConnectSecondaryCluster(config *rest.Config, name string, opts ...Option) (cluserconnector.ClusterConnector, error)
// Look up the ClusterConnector by its unique name.
GetClusterConnectorFor(name string) (cluserconnector.ClusterConnector, error)
}
|
||
```go | ||
if cc, isClusterConnector:= runnable.(clusterconnector.ClusterConnector); isClusterConnector { | ||
m.caches = append(m.caches, cc.GetCache()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably fine to do type assertion here, since this's internal implementation details.
The alternate would be to require ppl to do a mgr.Add(clusterConnector.GetCache) which I would like to avoid because its unintuitive.
+1
/assign @DirectXMan12 Thoughts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late comments. Mostly looks good, couple of things inline
`runnables` are started. | ||
|
||
|
||
The new `ClusterConnector` interface will look like this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cluster seems fine (it's bugging me a little bit that it doesn't say that it's not super clear that this type doesn't set up a cluster, like envtest, but rather connects to a cluster, but don't consider that comment blocking)
type ClusterConnector interface { | ||
// SetFields will set cluster-specific dependencies on an object for which the object has implemented the inject | ||
// interface, specifically inject.Client, inject.Cache, inject.Scheme, inject.Config and inject.APIReader | ||
SetFields(interface{}) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tangentially related aside: I'd like to see if we can maybe refactor the internal DI stuff a bit before 1.0 -- the complete lack of type signature, etc bugs me a bit, but I've yet to figure out a better answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its an unrelated topic but IMHO we should try to get rid of all of it, because it makes changes to it runtime errors and not compile-time errors which is very bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. I'd like to see a proposal for how to cleanly do that (this is not to sound dismissive or snarky -- I genuinely would like to see a proposal :-) )
GetScheme() *runtime.Scheme | ||
|
||
// Start starts the ClusterConnector | ||
Start(<-chan struct{}) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we going to integrate the context work here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this proposal is orthogonal to the context work, the MultiClusterManager
will make use of the same Runnable
interface as the normal Manager
, whatever that is at time of implementation
GetWebhookServer() *webhook.Server | ||
} | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A number of the patterns I've seen for multicluster end up having clusters come & go over time. Even if we're not going to tackle that now, we should keep that in mind while designing.
I'm also not certain about the whole "primary" / "secondary" cluster thing. Perhaps modeling this as there's a "leader election" cluster and "functional" clusters (one of which may be the leader election cluster)
?
|
||
```go | ||
if cc, isClusterConnector:= runnable.(clusterconnector.ClusterConnector); isClusterConnector { | ||
m.caches = append(m.caches, cc.GetCache()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we just do:
type HasCaches interface {
GetCache() ...
}
if getter, hasCaches := runnable.(hasCaches); hasCaches {
...
}
(could make a additional method to avoid accidentally conflicting too)
return reconcile.Result, err | ||
} | ||
|
||
if err := r.mirrorClusterClient.Get(context.TODO(), r.NamespacedName, &corev1.Secret); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tangent: Not that we need to tackle it here, but we could even create a multicluster client that (ab)used the unused ClusterName
field (maybe dangerous), or used a client option, context field, etc to avoid needing multiple clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a great idea (but I would prefer to keep that as an orthogonal work item)
} | ||
|
||
if err := r.mirrorClusterClient.Get(context.TODO(), r.NamespacedName, &corev1.Secret); err != nil { | ||
if !kerrors.IsNotFound(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: client.IgnoreNotFound(err)
exists ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how that is of use, we have to return on NotFound so its handled differently from no error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, the pattern just becomes if client.IgnoreNotFound(err) != nil
. Mainly just avoids remembering another k8s package to import.
panic(err) | ||
} | ||
|
||
mirrorClusterConnector, err := clusterconnector.New(cfg2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we leave the possibility for options here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO we should use functional opts once we have a need for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack, need to specify that in the signature so it's not breaking.
We should probably do a review before v1 and figure out spots where we should put functional options in (e.g. manager.New
probably wants functional options).
f4e21db
to
33f39b4
Compare
Everyone, I think I have responded to all feedback, PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very minor nits, otherwise
/approve
(will add the LGTM once nits are addressed, since it'll be removed anyway. Really would love "lgtm with nits" in k8s :-/)
panic(err) | ||
} | ||
|
||
mirrorClusterConnector, err := clusterconnector.New(cfg2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack, need to specify that in the signature so it's not breaking.
We should probably do a review before v1 and figure out spots where we should put functional options in (e.g. manager.New
probably wants functional options).
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alvaroaleman, DirectXMan12 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@DirectXMan12 updated to incorporate your suggestions |
/retest |
/lgtm |
test failure is #1171 |
/retest |
@alvaroaleman Is this proposal got implemented? If not do you have any tracking ticket for it. |
@ashishvya it is implemented |
@alvaroaleman could you kindly link the PR for implementing this proposal? I struggle to find where it's implemented. |
@taragu the implementation of this is in pkg/cluster, you can use the git history to find the relevant PRs |
A proposal based on #950
/assign @mengqiy @vincepri @estroz