design-proposals: Cloud Hypervisor integration #184
Conversation
Hi @sboeuf. Thanks for your PR. I'm waiting for a kubevirt member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/cc
/cc
/cc
/ok-to-test
Submitting a design proposal to the KubeVirt community describing how we could integrate the Cloud Hypervisor VMM with KubeVirt. Signed-off-by: Sebastien Boeuf <[email protected]>
Force-pushed from 68ead09 to 7818080
Pinging
Thank you, great design proposal!
Maybe it has been discussed already, but why can't we have libvirt drive cloud-hypervisor? Last time I checked they had basic/experimental support for it, but we could work with them to implement whatever is missing. That would dramatically reduce the scope of this integration project, and therefore the amount of code we have to maintain long term in KubeVirt.
Also, the use of the word "hypervisor" here doesn't make much sense to me, but according to Wikipedia, "the contemporary usage" of the word covers VMMs... Oh well!
Thanks!
The problem with libvirt is that it introduces yet another layer that's not needed in this situation, since KubeVirt is already the management layer. With libvirt in the picture, we end up with the KubeVirt layer managing the libvirt layer managing the VMM. That's complex, and it requires more resources per Pod. One last thing about libvirt: it's very much QEMU-specific, and I don't think it's really meant for managing VMMs other than QEMU (even if that's what it claims).
Yes, I agree the naming of the
Your work here is impressive. Here are a few thoughts on how I think we could make this practical.
One issue we have with maintaining KubeVirt is how vast it has become. This is similar in concept to the limitations the Kubernetes core itself hit, which motivated the community to create the CRD abstraction and other well-defined "plugin"-like abstractions.
I think we're bumping into those same limitations here when it comes to expanding the hypervisor type into the core project.
Here's what I'd like to see investigated: what interfaces would KubeVirt need to provide in order for you to successfully integrate your custom virt-launcher and hypervisor outside of the KubeVirt core?
For example, is this as simple as allowing a custom virt-launcher binary and custom validation webhooks to ensure only supported features are used with the custom launcher?
Yes, it is as simple as that. The
If this is the case, is there anything that prevents the Cloud Hypervisor support from being developed externally to the kubevirt/kubevirt repo? For example, we could create a new repo in the kubevirt org (maybe something like
I would love to see a second implementation in kubevirt/kubevirt, at least initially, just to have a validation implementation which ensures that we don't break the interface too easily. I don't think we need to test it exhaustively (just the latest k8s should be good enough). Cloud Hypervisor, due to its single-static-binary delivery, may be a pretty nice and easy validation platform next to QEMU. In general I would agree that we don't want 10 implementations in the core repo.
Well, if we were going down this road, then the libvirt implementation should also be moved to an addon model.
In general, I'm not against it, but I don't see how we can get away with only making changes to virt-launcher and not virt-handler. What I'm missing in this proposal is a mechanism for exposing and negotiating each virt-launcher's capabilities. There are also implications for virt-handler that are not covered in the proposal.
There are probably many other cases that I'm missing.
This is my concern too. However, we cannot and should not expect contributors to have the knowledge to fix issues in both the libvirt/qemu and cloud-hypervisor based virt-launchers. I do believe that we need to provide a safe environment for the Cloud Hypervisor integration to develop and mature. Perhaps we need an incubation repo, as other projects have.
The mechanism of exposure is directly through the VMI spec, and the negotiation would be performed by rejecting anything that is not supported by the Cloud Hypervisor implementation. The creation of the Pod would fail on such a rejection.
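To make that rejection flow concrete, here is a minimal sketch of how a Cloud Hypervisor-specific validation webhook could fail fast on unsupported VMI features. The types are simplified stand-ins for the real ones in kubevirt.io/api/core/v1, and the specific features checked (TPM, SR-IOV) are placeholders for illustration, not a statement of what Cloud Hypervisor actually supports:

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the relevant parts of the VMI spec; the real
// KubeVirt types live in kubevirt.io/api/core/v1.
type Interface struct {
	Name  string
	SRIOV bool
}

type VMISpec struct {
	Interfaces []Interface
	TPM        bool
}

// validateForCloudHypervisor is a hypothetical admission check: anything the
// Cloud Hypervisor launcher cannot honor is rejected up front, so the Pod is
// never created with a spec the VMM would silently ignore.
func validateForCloudHypervisor(spec VMISpec) error {
	if spec.TPM {
		return errors.New("TPM devices are not supported by the cloud-hypervisor launcher")
	}
	for _, iface := range spec.Interfaces {
		if iface.SRIOV {
			return fmt.Errorf("interface %q: SR-IOV is not supported by the cloud-hypervisor launcher", iface.Name)
		}
	}
	return nil
}

func main() {
	spec := VMISpec{Interfaces: []Interface{{Name: "default", SRIOV: true}}}
	if err := validateForCloudHypervisor(spec); err != nil {
		// In practice this error would surface as a webhook denial,
		// failing the VMI admission before any Pod exists.
		fmt.Println("VMI rejected:", err)
	}
}
```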
Can you elaborate on that? I mean if
Why does it need to be different? And again, it is up to the
Do you mean it directly talks to the libvirt daemon? If that's the case, this is clearly an area for improvement.
Sure and that's fine, as I'm not planning on replacing what's already there.
Interesting, I actually disagree with that, and that's precisely why I think it's important to keep things in the core project: the CI can then catch a regression caused by a change or refactoring in the code shared by both implementations.
Sure, I'm not opposed to that, as long as the end goal is to promote the code into the core project, just like libvirt is.
I'm hesitant to say with any certainty this is the end goal.
This is actually my concern. Given the current critical mass of expertise within the community, I don't know if we can practically support Cloud Hypervisor in the core repo. Simply maintaining libvirt/qemu as that stack continues to evolve has been challenging. I agree that having Cloud Hypervisor in the core repo would be the best thing for Cloud Hypervisor support, but given our current CI battles and community expertise, I don't think we can handle this without negatively impacting the community as a whole. This is why I was trying to redirect the conversation into how we can maintain Cloud Hypervisor support in a separate repo.

What I think is more achievable (at least right now) is to begin versioning all communication between handler/launcher in order to create a consistent interface between the two components. The gRPC channels are already versioned, but there are other communication channels which are not, such as the method used between handler and launcher to mount hotplug/containerdisks, how handler helps set up network interfaces, and the locations of unix sockets, to name a few. If we want pluggable launchers, we'd need to map all the interactions between handler/launcher and ensure backwards compatibility. So, my thought process here is that if we tighten up our handler/launcher interactions, then perhaps it would be practical to support Cloud Hypervisor in another repo. A sketch of what that version negotiation could look like follows below.
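As a thought experiment on the versioning point, a handshake between virt-handler and a pluggable virt-launcher might look roughly like the sketch below. LauncherInfo and negotiate are invented names for illustration; nothing like this exists in the codebase today:

```go
package main

import "fmt"

// LauncherInfo is a hypothetical handshake message a pluggable virt-launcher
// would expose on its socket before any other call is accepted.
type LauncherInfo struct {
	Name              string   // e.g. "libvirt" or "cloud-hypervisor"
	SupportedVersions []int    // protocol versions this launcher implements
	Capabilities      []string // e.g. "hotplug-disk", "sriov"
}

// negotiate picks the highest protocol version both sides speak; the handler
// refuses to drive a launcher it shares no version with.
func negotiate(handlerVersions []int, info LauncherInfo) (int, error) {
	supported := map[int]bool{}
	for _, v := range info.SupportedVersions {
		supported[v] = true
	}
	best := -1
	for _, v := range handlerVersions {
		if supported[v] && v > best {
			best = v
		}
	}
	if best < 0 {
		return 0, fmt.Errorf("no common protocol version with launcher %q", info.Name)
	}
	return best, nil
}

func main() {
	launcher := LauncherInfo{Name: "cloud-hypervisor", SupportedVersions: []int{1, 2}}
	v, err := negotiate([]int{2, 3}, launcher)
	if err != nil {
		panic(err)
	}
	fmt.Println("speaking protocol version", v) // prints 2
}
```

The point of an explicit handshake like this is that it turns today's implicit assumptions (socket paths, mount protocols, network setup) into a checkable contract.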
The problem you're referring to is not related to Cloud Hypervisor. Of course maintainers can't have all the knowledge, and this is why they must be assigned specific domains of expertise. The review of a complex PR can be shared across multiple people; that is totally acceptable IMO.
Just to clarify, I would take full responsibility for reviewing PRs affecting Cloud Hypervisor, and I would provide the expertise related to this field. I'm very worried that we're making a choice here based on logistical issues.
Sure, but it feels like we should include libvirt/QEMU in the discussion then. It would be a bit unfair to have libvirt as part of KubeVirt core, and not Cloud Hypervisor.
Wow, this is a whole other level of work you're proposing here. I'm not sure I would have the time to dive into such work, and I don't think I have enough knowledge of KubeVirt to take care of such a task. A lot of what I'm hearing here is about existing issues in the KubeVirt community/CI, and it feels like there's a lot to solve before even considering integrating another VMM like Cloud Hypervisor. TBH, it's overwhelming, given that the integration of Cloud Hypervisor would already require a considerable amount of work...
I understand. My goal certainly isn't to discourage you. I want to state accurately that the bar is high for what it would take to transform KubeVirt into a project that interacts with anything other than qemu/kvm. Your PoC is impressive and has proven what is technically possible. The next step is to transform that into something that's supportable within the constraints of our community. I believe a path forward would be to begin discussing what the equivalent of the k8s CRI for container runtimes would look like for KubeVirt: a similar interface to CRI, but for hypervisor runtimes. Also, at the controller level we'd need some sort of plugin for constructing the VMI's pod that understands hypervisor nuances (like the memory overhead calculations for pod requests/limits); a rough sketch follows below. This is only my stance though. I'd like to hear more from @rmohr @vladikr @fabiand and others.
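To give the CRI analogy some shape, a "hypervisor runtime interface" might look roughly like the Go interface below. Every name here is speculative; it only illustrates the kind of contract such an interface would pin down, including the pod-construction hook for hypervisor-specific memory overhead mentioned above:

```go
// Package hypervisorruntime sketches a speculative, CRI-like contract between
// KubeVirt's core controllers and a pluggable hypervisor runtime. None of
// these names exist in KubeVirt today.
package hypervisorruntime

// VMISpec stands in for the real VirtualMachineInstance spec from
// kubevirt.io/api/core/v1.
type VMISpec struct {
	// fields elided for brevity
}

// VMIRuntime is the contract each hypervisor runtime would implement.
type VMIRuntime interface {
	// Name identifies the runtime, e.g. "qemu-kvm" or "cloud-hypervisor".
	Name() string

	// Validate rejects specs the runtime cannot honor, mirroring the
	// webhook-based negotiation discussed earlier in this thread.
	Validate(spec VMISpec) error

	// PodOverhead returns the extra memory, in bytes, the controller must
	// add to the virt-launcher pod's requests/limits for this runtime's
	// own processes.
	PodOverhead(spec VMISpec) (uint64, error)

	// Start and Stop drive the VMM inside the launcher pod over a
	// versioned API.
	Start(vmiUID string) error
	Stop(vmiUID string) error
}
```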
Is there any progress on this proposal? I'm very interested in KubeVirt supporting different hypervisors. This would give KubeVirt better performance and lower overhead in some scenarios.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. /lifecycle stale

Stale issues rot after 30d of inactivity. If this issue is safe to close now, please do so with /close. /lifecycle rotten

Rotten issues close after 30d of inactivity. /close
@kubevirt-bot: Closed this PR.
/reopen
@wavezhang: You can't reopen an issue/PR unless you authored it or you are a collaborator.
Hey @wavezhang! Are you interested in continuing this effort? Please make sure to read the comment above about the need to create an interface, similar to the k8s CRI, but between KubeVirt and hypervisor runtimes. If you want to push this subject forward, I'll be happy to help pave the path according to the discussion above.