CAPT changes related to kubified tinkerbell stack #160
Conversation
Thanks for putting out what you've got so far! I have just a few suggestions
LGTM! Great work @panktishah26!
Thanks for updating the docs! Just some minor changes
LGTM!
Modified tink hardware crd example

Signed-off-by: panktishah26 <[email protected]>

## Description
Modified tink hardware crd example.

## Why is this needed
I have updated the tink hardware yaml file example, which will help customers create a cluster using CAPT.

Fixes: #

## How Has This Been Tested?
I have tested this functionality with the CAPT changes related to the kubified stack in this [PR](tinkerbell/cluster-api-provider-tinkerbell#160).

## Checklist:
I have:
- [ ] updated the documentation and/or roadmap (if required)
- [ ] added unit or e2e tests
- [ ] provided instructions on how to upgrade
Love seeing this work coming together, just a few questions and suggestions. Please excuse my ignorance if anything I brought up has already been discussed while I've been offline.
@@ -501,7 +507,7 @@ func (mrc *machineReconcileContext) createWorkflow() error {
 		},
 		Spec: tinkv1.WorkflowSpec{
 			TemplateRef: mrc.tinkerbellMachine.Name,
-			HardwareRef: mrc.tinkerbellMachine.Spec.HardwareName,
+			HardwareMap: map[string]string{"device_1": hardware.Spec.Metadata.Instance.ID},
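For reference, a Workflow rendered by this code might look roughly like the following. This is a sketch only: the apiVersion/kind and field names assume the tinkerbell.org/v1alpha1 Workflow CRD, and the machine name, namespace, and MAC value are placeholders.

```yaml
apiVersion: tinkerbell.org/v1alpha1
kind: Workflow
metadata:
  name: my-machine          # placeholder for mrc.tinkerbellMachine.Name
  namespace: tink-system    # placeholder namespace
spec:
  templateRef: my-machine
  hardwareMap:
    # "device_1" is the placeholder referenced by the template; the value is
    # hardware.Spec.Metadata.Instance.ID, which this PR assumes is the MAC
    device_1: "3c:ec:ef:aa:bb:cc"
```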
Still reading through the changes, but initial reaction here is: should this be a well defined data type rather than a map[string]string here?
Is the assumption here that hardware.Spec.Metadata.Instance.ID is going to match the MAC address?
I'm wondering if there is a better way to handle this that provides a clearer mapping between Hardware and Workflow.
Maybe something like this (simplified yaml):
hardwareRefs:
- placeholder: "device_1"
  hardwareName: mrc.tinkerbellMachine.Spec.HardwareName
  <some way to specify interface index and whether to use IP or MAC for matching>
This would potentially give the same flexibility as the current map without losing an easy way to match the hardware associated with a given workflow (especially for the purpose of displaying in `kubectl get`).
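To make the idea concrete, a hypothetical rendering might look like the sketch below. None of these fields exist today; `interfaceIndex` and `matchOn` are made-up names standing in for the interface-index and IP-vs-MAC detail.

```yaml
hardwareRefs:
- placeholder: device_1
  hardwareName: my-hardware   # would come from mrc.tinkerbellMachine.Spec.HardwareName
  interfaceIndex: 0           # made-up field: which interface on the Hardware to use
  matchOn: mac                # made-up field: match by MAC or IP
```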
> Still reading through the changes, but initial reaction here is: should this be a well defined data type rather than a map[string]string here?

I think this could use a better data type, but that would necessitate changes to the API now defined in Tinkerbell.
> Is the assumption here that hardware.Spec.Metadata.Instance.ID is going to match the MAC address?

Yes. We should come up with a better way to encode this, but it's probably outside the scope of this PR.
Cleaning this up in the future sounds good to me
@@ -345,7 +349,7 @@ func (mrc *machineReconcileContext) ensureHardware() (*tinkv1.Hardware, error) {
 	}

 	mrc.tinkerbellMachine.Spec.HardwareName = hardware.Name
-	mrc.tinkerbellMachine.Spec.ProviderID = fmt.Sprintf("tinkerbell://%s", hardware.Spec.ID)
+	mrc.tinkerbellMachine.Spec.ProviderID = fmt.Sprintf("tinkerbell://%s", hardware.UID)
Use of the UID here would cause issues if these resources are ever backed up and restored (or if resources are "pivoted" to a different Kubernetes cluster). This wasn't an issue with Tinkerbell when using the Postgres backend because users were specifying the UUID when creating the hardware.
Before recommending an alternative approach, I have a few questions to narrow down what supported permutations we expect to exist.
- Would we expect to be able to reference hardware defined in a different namespace for a given Tinkerbell Machine?
- Would we expect hardware defined on a different k8s cluster to be used?
- Would we expect to be able to reference hardware defined using Tinkerbell with the postgresql backend still?
> Would we expect to be able to reference hardware defined in a different namespace for a given Tinkerbell Machine?

I don't think so. Even though Tinkerbell types are namespaced, the Tink stack really only operates in a single namespace. Machines aren't aware of the namespace in how they communicate with the Tink components (tink server/worker over gRPC with a MAC address as the identifier, hardware with Boots over DHCP with a MAC address, tink-worker to Hegel over HTTP with an IP address as the identifier).
> Would we expect hardware defined on a different k8s cluster to be used?

Not at this point. We could certainly wire up CAPT to have two k8s clients to separate clusters, one for CAPI and one for Tink, but it doesn't operate this way today. For now, it would probably be unneeded complexity.
> Would we expect to be able to reference hardware defined using Tinkerbell with the postgresql backend still?

No, as this PR rips out all of the Tink API client code that uses the Postgres backend.

We should probably use `tinkerbell://{namespace}/{name}` for repeatability across clusters, as you brought up with the pivot point.
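For illustration, the difference in the resulting ProviderID (the values below are made up):

```yaml
# current form, derived from the Hardware object's UID
providerID: tinkerbell://0ff1ce00-aaaa-bbbb-cccc-123456789abc
# suggested form, stable across backup/restore and pivot
providerID: tinkerbell://tink-system/my-hardware
```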
Should we make this change in this PR, or should I create another PR?
@panktishah26 it's up to you
@micahhausler thanks for the clarification, it really helps.

I don't think we need to block the PR on updating the providerID, especially since it would require additional changes to the controllers to support the new pattern.

I would consider getting things updated to use `tinkerbell://{namespace}/{name}` a release blocker, though, to avoid users hitting issues related to backup/restore and pivot.
Thanks @micahhausler and @detiber. Based on our discussion, it would be good to address this in a different PR, since this change requires changes to the controllers.
CAPT changes related to kubified tinkerbell stack
CI related changes
Signed-off-by: panktishah26 <[email protected]>
I have reverted the go.mod file change. Kindly let me know if there are any other changes that need to be done. @micahhausler @detiber
LGTM, awesome work @panktishah26!
My team has been actively working on this project. My approver role will help expedite the PR review/approve process and help unblock any open issues or feature requests.

Org request issue: tinkerbell/org#24

Requirements:
* I have reviewed the [community membership guidelines](https://github.com/tinkerbell/proposals/blob/main/proposals/0024/GOVERNANCE.md)
* I have [enabled 2FA on my GitHub account](https://github.com/settings/security)
* I have subscribed to the [tinkerbell-contributors e-mail list](https://groups.google.com/g/tinkerbell-contributors)
* I am actively contributing to 1 or more Tinkerbell subprojects

PR contributions on CAPT:
- #130
- #176

PRs reviewed on CAPT:
- #160
- #184
- #182

Sponsors: @micahhausler @chrisdoherty4 @jacobweinstock
Description
This PR contains changes related to the kubified Tinkerbell stack. The Tink PostgreSQL dependency is removed from CAPT, and the CAPT controllers now only need to communicate with the underlying Kubernetes client.
Signed-off-by: Pankti Shah [email protected]
Why is this needed
We have kubified all the services of the Tinkerbell stack (Tink APIs, Boots, Hegel, Rufio, etc.). CAPT needed to change as well to make the whole kubified stack work for creating a bare metal cluster.
Fixes: #
How Has This Been Tested?
To test these changes, I used a sample yaml file for the new hardware CRDs.
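As a rough sketch, a Hardware object for the kubified stack might look like this (assuming the tinkerbell.org/v1alpha1 Hardware CRD field layout; the name, namespace, MAC, and IP values are placeholders):

```yaml
apiVersion: tinkerbell.org/v1alpha1
kind: Hardware
metadata:
  name: my-hardware
  namespace: tink-system
spec:
  disks:
  - device: /dev/sda
  metadata:
    instance:
      id: "3c:ec:ef:aa:bb:cc"   # as discussed above, the instance ID is set to the MAC
      hostname: my-hardware
  interfaces:
  - dhcp:
      mac: "3c:ec:ef:aa:bb:cc"
      hostname: my-hardware
      arch: x86_64
      uefi: true
      ip:
        address: 192.168.1.100
        netmask: 255.255.255.0
        gateway: 192.168.1.1
    netboot:
      allowPXE: true
      allowWorkflow: true
```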
How are existing users impacted? What migration steps/scripts do we need?
Existing users might have to take the latest version of each Tinkerbell service to create a cluster.
Checklist:
I have:
- [ ] updated the documentation and/or roadmap (if required)
- [ ] added unit or e2e tests
- [ ] provided instructions on how to upgrade