
Proposal: use corev1.PodSpec for the PodTemplate field #172

Closed
lllamnyp opened this issue Apr 13, 2024 · 13 comments

@lllamnyp
Member

Proposal

I suggest we consider using the k8s core type corev1.PodSpec instead of our own implementation for overrides in the pod template generated by the operator:

 type PodTemplate struct {
     EmbeddedObjectMetadata
-    Spec PodSpec
+    Spec corev1.PodSpec
 }

Pros:

  • Users of the operator will be able to redefine all fields in the pod spec, even those not anticipated by us.
  • The merging logic of the generated template with the overrides will be greatly simplified (see below).
  • We will not have to maintain a pod spec in parallel with k8s core.
  • Users will not be confused by expecting certain fields in EtcdCluster.spec.podTemplate.spec to work when in fact they don't.

Merging logic

Right now we merge the pod template overrides into the final result in a manner like this:

Spec: corev1.PodSpec{
    Containers:                    generateContainers(cluster),
    ImagePullSecrets:              cluster.Spec.PodTemplate.Spec.ImagePullSecrets,
    // several fields omitted
    RuntimeClassName:              cluster.Spec.PodTemplate.Spec.RuntimeClassName,
    Volumes:                       volumes,
},

This can be reduced to

mergedJson, _ := strategicpatch.StrategicMergePatch(
        defaultPodSpecJSONBytes,
        podTemplateOverride,
        &corev1.PodSpec{},
)
json.Unmarshal(mergedJson, &finalPodSpec)

This has the advantage of correctly merging list values, letting users add or override fields under keys such as containers, env, volumes, and so on.
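
For illustration, here is a more complete sketch of such a merge helper with explicit error handling (the package name and function name are assumptions, not the operator's actual code):

package etcd

import (
    "encoding/json"

    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/util/strategicpatch"
)

// mergePodSpecs overlays the user-supplied override onto the operator's
// generated default spec using a strategic merge patch.
func mergePodSpecs(defaultSpec, override corev1.PodSpec) (corev1.PodSpec, error) {
    defaultJSON, err := json.Marshal(defaultSpec)
    if err != nil {
        return corev1.PodSpec{}, err
    }
    overrideJSON, err := json.Marshal(override)
    if err != nil {
        return corev1.PodSpec{}, err
    }
    // StrategicMergePatch honours the patchMergeKey/patchStrategy struct tags
    // on corev1.PodSpec, so lists such as containers, env and volumes are
    // merged by key instead of being replaced wholesale.
    mergedJSON, err := strategicpatch.StrategicMergePatch(defaultJSON, overrideJSON, corev1.PodSpec{})
    if err != nil {
        return corev1.PodSpec{}, err
    }
    var merged corev1.PodSpec
    if err := json.Unmarshal(mergedJSON, &merged); err != nil {
        return corev1.PodSpec{}, err
    }
    return merged, nil
}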

Cons:

  • Users of the operator will be able to redefine all fields in the pod spec, even those not anticipated by us.
  • This opens more possibilities for breaking changes. For example, a user might attach a sidecar/init-container whose name collides with one that we later introduce ourselves.
  • This opens more possibilities to generate a configuration that doesn't work/breaks the etcd cluster.
  • We will not be able to provide custom API documentation for fields not maintained by us.
  • StrategicMergePatch introduces extra JSON marshaling/unmarshaling.

Further remarks

  • The current design supports an allowlist of fields that may be modified. I would argue that such fields should be promoted to the top level of EtcdClusterSpec, since we already anticipate common use-cases for them; this would free most users from ever having to set the podTemplate field to a non-nil value (see the sketch after this list).
  • I do not think we should restrict power users from going deep under the hood if they find it necessary, especially since the restriction isn't imposed by any technical difficulty and lifting it actually simplifies the code.
  • Finally, issue [API] not using core/v1 pod spec #132 and PR Replace original PodSpec to corev1 PodSpec #167 are already requesting this feature.
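
A purely illustrative sketch of what such top-level fields could look like (the field set below is an assumption, not a concrete proposal for the real EtcdClusterSpec):

package v1alpha1

import corev1 "k8s.io/api/core/v1"

// EtcdClusterSpec (illustrative only): common overrides promoted to the top
// level, with podTemplate kept as an escape hatch for power users.
type EtcdClusterSpec struct {
    Replicas         *int32                        `json:"replicas,omitempty"`
    Resources        corev1.ResourceRequirements   `json:"resources,omitempty"`
    ImagePullSecrets []corev1.LocalObjectReference `json:"imagePullSecrets,omitempty"`

    // Full pod-level control; most users would never set this.
    PodTemplate *PodTemplate `json:"podTemplate,omitempty"`
}
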
@sircthulhu
Member

I believe it's a good idea, but for future releases we should keep in mind to think twice about anything that could be a breaking change.

I vote for corev1.PodSpec.

@gecube
Collaborator

gecube commented Apr 13, 2024

This opens more possibilities for breaking changes. For example, a user might attach a sidecar/init-container whose name collides with one that we later introduce ourselves.

I am waiting for some workaround or an idea of how to prevent this. Maybe an admission controller and validation of fields?

@lllamnyp
Member Author

I am waiting for some workaround or an idea of how to prevent this. Maybe an admission controller and validation of fields?

A scenario like this is fundamentally unpreventable, because there is no way of knowing whether the user accidentally used a container with the same name as ours or actually wanted to override some fields on our sidecar.

@gecube
Collaborator

gecube commented Apr 13, 2024

@lllamnyp it's not true: you can disallow applying an incorrect object to the k8s API with an admission controller. It's the same mechanism by which nginx ingress disallows multiple Ingresses with the same host/path pair.
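
For what it's worth, a minimal sketch of the kind of check such an admission controller could perform (the package, function, and reserved name below are assumptions for illustration only):

package validation

import (
    "fmt"

    corev1 "k8s.io/api/core/v1"
)

// reservedContainerNames lists container names the operator manages itself.
var reservedContainerNames = map[string]bool{
    "etcd": true, // hypothetical reserved name
}

// validatePodTemplateOverride rejects overrides that reuse a container name
// reserved by the operator, so an accidental collision fails loudly at
// admission time instead of silently changing an operator-managed container.
func validatePodTemplateOverride(spec corev1.PodSpec) error {
    for _, c := range spec.Containers {
        if reservedContainerNames[c.Name] {
            return fmt.Errorf("container name %q is reserved by the operator", c.Name)
        }
    }
    return nil
}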

@gecube
Collaborator

gecube commented Apr 13, 2024

Users will not be confused by expecting certain fields in EtcdCluster.spec.podTemplate.spec to work when in fact they don't.

What does that mean? Are you referring to fields not implemented in our version of PodSpec? They are not "not working", they are "not implemented" or "non-existent" in our spec, so that would be clearly understandable from kubectl explain, from the schema, and from a proper error message from the k8s API when applying the object.

@kvaps
Member

kvaps commented Apr 14, 2024

I want to support the idea that the podTemplate could include all possible parameters from the pod spec. The podTemplate spec should fully match the official Kubernetes specification, providing consistency and clarity for users. In my opinion, using the standard specification, which is already familiar to users and present in all libraries, is the best approach. This logic is consistent with controllers such as the StatefulSet controller with its volumeClaimTemplates and the CronJob controller with its jobTemplate.

An operator is primarily a similar controller that just knows more about the logic of the stateful application it operates. It knows how to properly deploy and scale it. It should assist the user, but not restrict them from potential mishaps. Moreover, if a user intends to break the deployment, they will find a way to do so regardless of whether we restrict certain parameters.

I am also convinced that the same common, predictable logic should be used for merging parameters across all such someTemplate fields, and in my view the strategic merge patch is the most expected way of doing that.

@sergeimonakhov
Contributor

I want to support the idea that the podTemplate could include all possible parameters from the pod spec. The podTemplate spec should fully match the official Kubernetes specification, providing consistency and clarity for users. In my opinion, using the standard specification, which is already familiar to users and present in all libraries, is the best approach. This logic is consistent with controllers such as the StatefulSet controller with its volumeClaimTemplates and the CronJob controller with its jobTemplate.

An operator is primarily a similar controller that just knows more about the logic of the stateful application it operates. It knows how to properly deploy and scale it. It should assist the user, but not restrict them from potential mishaps. Moreover, if a user intends to break the deployment, they will find a way to do so regardless of whether we restrict certain parameters.

I am also convinced that the same common, predictable logic should be used for merging parameters across all such someTemplate fields, and in my view the strategic merge patch is the most expected way of doing that.

The examples with StatefulSet and CronJob are not relevant: they have a single source of truth, and their specifications have strict validation of required fields.

Currently you have two states:

  1. maintained by the controller (hardcoding a part of the specification in the mutation webhook).
  2. user-defined state that arrives with the object.

This is a bad approach - it violates encapsulation, contains hidden execution logic, and can lead to many errors. I believe the idea of using podSpec is redundant.

The operator shouldn't allow the user to configure the entire pod specification because it is unnecessary for them. The operator's primary responsibility is to ensure the cluster's operation and restrict the user from attempting to disrupt this work, which is the opposite of the current situation.

As a result, with this approach the ability to use default values and validation in the CRD will be lost, which will require additional code to compensate.
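
For context, this is the kind of CRD defaulting and validation meant here: a hand-written spec type can carry kubebuilder markers, which is much harder to do for fields embedded wholesale from corev1.PodSpec (the type and field below are purely hypothetical):

package v1alpha1

// ExampleOverrides is a hypothetical hand-written override type.
type ExampleOverrides struct {
    // ImagePullPolicy gets a CRD-level enum and default via markers.
    // +kubebuilder:validation:Enum=IfNotPresent;Always;Never
    // +kubebuilder:default=IfNotPresent
    ImagePullPolicy string `json:"imagePullPolicy,omitempty"`
}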

@gecube
Collaborator

gecube commented Apr 15, 2024

@sergeimonakhov totally agree with you.
Also I want to mention that stable, first-class operators like Strimzi do not allow ALL fields from PodSpec. Proof: https://strimzi.io/docs/operators/latest/configuring.html#type-PodTemplate-schema-reference

The operator shouldn't allow the user to configure the entire pod specification because it is unnecessary for them. The operator's primary responsibility is to ensure the cluster's operation and restrict the user from attempting to disrupt this work, which is the opposite of the current situation.

If you (i.e. the user) need it, you can always write your own mutation controller (for instance, based on Kyverno).

@lllamnyp
Member Author

@lllamnyp it's not true: you can disallow applying an incorrect object to the k8s API with an admission controller. It's the same mechanism by which nginx ingress disallows multiple Ingresses with the same host/path pair.

@gecube Say, a user has

spec:
  containers:
  - name: our-future-sidecar
    # some fields with overrides

What was the user's intent? Did he want to add his own sidecar whose name just happens to conflict with a sidecar that we introduce later?

Or did he deliberately want to override our sidecar?

@lllamnyp
Member Author

@sergeimonakhov

Currently you have two states:

  1. maintained by the controller (hardcoding a part of the specification in the mutation webhook).
  2. user-defined state that arrives with the object.

This is a bad approach - it violates encapsulation, contains hidden execution logic, and can lead to many errors. I believe the idea of using podSpec is redundant.

I completely agree that the code as it stands isn't great. There's a lot of custom merging logic that is error-prone, and the podTemplate field is at one point even used as a container for storing intermediate computations. That's why I believe that using the native spec is the better way. Check out #175 and the amount of redundant code it removes. I'm sure we can agree that no code is stable code xD.

@sergeimonakhov
Contributor

@sergeimonakhov

Currently you have two states:

  1. maintained by the controller (hardcoding a part of the specification in the mutation webhook).
  2. user-defined state that arrives with the object.

This is a bad approach - it violates encapsulation, contains hidden execution logic, and can lead to many errors. I believe the idea of using podSpec is redundant.

I completely agree that the code as it stands isn't great. There's a lot of custom merging logic that is error-prone, and the podTemplate field is at one point even used as a container for storing intermediate computations. That's why I believe that using the native spec is the better way. Check out #175 and the amount of redundant code it removes. I'm sure we can agree that no code is stable code xD.

All deleted code will be brought back into the tests to verify that strategic patching works correctly when merging arrays within an array, overwriting fields, and in other corner cases...

@lllamnyp
Member Author

All deleted code will be brought back into the tests

Tests for the current custom merging logic already exist, so no code will be "brought back" per se, but you are right that some tests will be rewritten. Still, wouldn't you much rather have some basic tests that verify that you are correctly using an external library function, rather than a super-detailed suite of tests for a custom function that you rolled yourself?
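
As an example of the kind of "basic test" meant here, a minimal sketch (names are illustrative) that checks that strategic merge keys the containers list by name instead of replacing it:

package controllers

import (
    "encoding/json"
    "testing"

    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/util/strategicpatch"
)

func TestStrategicMergeKeepsDefaultContainer(t *testing.T) {
    defaultSpec := corev1.PodSpec{Containers: []corev1.Container{{Name: "etcd", Image: "etcd:3.5"}}}
    override := corev1.PodSpec{Containers: []corev1.Container{{Name: "sidecar", Image: "busybox"}}}

    defaultJSON, err := json.Marshal(defaultSpec)
    if err != nil {
        t.Fatal(err)
    }
    overrideJSON, err := json.Marshal(override)
    if err != nil {
        t.Fatal(err)
    }

    mergedJSON, err := strategicpatch.StrategicMergePatch(defaultJSON, overrideJSON, corev1.PodSpec{})
    if err != nil {
        t.Fatal(err)
    }

    var merged corev1.PodSpec
    if err := json.Unmarshal(mergedJSON, &merged); err != nil {
        t.Fatal(err)
    }
    // The default "etcd" container must survive alongside the added sidecar.
    if len(merged.Containers) != 2 {
        t.Fatalf("expected 2 containers, got %d", len(merged.Containers))
    }
}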

Now I do believe that not having podTemplate at all is also a good choice, but that's out of scope for this issue.

@lllamnyp
Member Author

I'm closing this issue as implemented in #175
