Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support SecondaryNetwork of SR-IOV type for VM Nodes #6881

Merged
merged 1 commit into from
Jan 8, 2025

Conversation

tnqn
Copy link
Member

@tnqn tnqn commented Dec 20, 2024

VM Nodes don't have access to Physical Function, the patch removes the code that checks whether the Physical Function exists or not, which is not really necessary.

It also fixes two issues:

  1. The usage of grpc.NewClient was wrong.
  2. The kubelet socket was not accessible in the Pod.

TODO:

  • Confirm if there is any risk to always mount /var/lib/kubelet/pod-resources to antrea-agent Pod.
    The directory only contains a socket that provides read-only APIs for exposing resource allocation details of Pods. Multus thick mode also mounts the directory unconditionally.

@tnqn tnqn marked this pull request as draft December 20, 2024 11:33
Comment on lines +263 to +265
- name: host-pod-resources
mountPath: /var/lib/kubelet/pod-resources
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that Multus mounts all of /var/lib/kubelet, but they probably need more things than we do:

https://github.com/k8snetworkplumbingwg/multus-cni/blob/fba1fea81e0d4040244badf9d8c8e0eaddc717e8/deployments/multus-daemonset-thick.yml#L181-L183

Would you see any reason to also set mountPropagation here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multus mounts /var/lib/kubelet perhaps because it fallbacks to reading checkpoint file /var/lib/kubelet/device-plugins/kubelet_internal_checkpoint when the pod-resources socket doesn't exist: https://github.com/k8snetworkplumbingwg/multus-cni/blob/fba1fea81e0d4040244badf9d8c8e0eaddc717e8/pkg/kubeletclient/kubeletclient.go#L65-L73.
At the moment I'm not sure if it's necessary to mount the checkpoint file, but even if it's necessary, it doesn't seem reasonable to mount the whole directory as there are senstive data in it, like kubelet's TLS cert and key.
I'll check whether there could be a situation that the socket doesn't exist.

I think we shouldn't set mountPropagation like Multus, because the default one is None, meaning the volume in a container will not receive new mounts from the host or other containers, and filesystems mounted inside the container won't be propagated to the host or other containers, which is exactly what we want as we only need to access the socket. I don't know why Multus needs the DaemonSet Pod to receive new mounts from the host, the PR k8snetworkplumbingwg/multus-cni#1300 that added it wasn't very clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked kubelet code, the socket will always be opened, there should be no need to fallback to checkpoint file.

@luolanzone luolanzone added the area/secondary-network Issues or PRs related to support for secondary networks in Antrea label Dec 27, 2024
@luolanzone luolanzone added this to the Antrea v2.3 release milestone Dec 27, 2024
@tnqn tnqn force-pushed the sr-iov branch 2 times, most recently from 4412c05 to 9445466 Compare January 3, 2025 16:50
VM Nodes don't have access to Physical Function, the patch removes the
code that checks whether the Physical Function exists or not, which is
not really necessary.

It also fixes two issues:
1. The usage of grpc.NewClient was wrong.
2. The kubelet socket was not accessible in the Pod.

Signed-off-by: Quan Tian <[email protected]>
@tnqn tnqn force-pushed the sr-iov branch 3 times, most recently from 8c65744 to 57ba2b6 Compare January 6, 2025 12:31
@tnqn tnqn marked this pull request as ready for review January 6, 2025 13:45
@tnqn
Copy link
Member Author

tnqn commented Jan 6, 2025

/test-all

@tnqn tnqn requested a review from antoninbas January 7, 2025 06:28
@tnqn tnqn added kind/bug Categorizes issue or PR as related to a bug. action/release-note Indicates a PR that should be included in release notes. labels Jan 8, 2025
@tnqn tnqn merged commit bf41a62 into antrea-io:main Jan 8, 2025
107 of 112 checks passed
@tnqn tnqn deleted the sr-iov branch January 8, 2025 08:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes. area/secondary-network Issues or PRs related to support for secondary networks in Antrea kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants