Pod IP is unreachable when Calico VXLAN mode is set to Always #244
Comments
There are iptables rules set by Calico to drop VXLAN packets.
I manually added the pod VM IP address by executing the following command, but the added IP address was immediately deleted.
A possible workaround for this issue is to use another UDP port for VXLAN packets.
I confirmed that changing the UDP port fixes the connectivity issue.
Another workaround is to add an iptables rule that explicitly allows the VXLAN packets of peer pod tunnels. See Calico's configuration parameters: https://projectcalico.docs.tigera.io/reference/resources/felixconfig
@yoheiueda do you mean a separate VXLAN UDP port for peer pods? If it does not require a Calico config change, I think it's preferred.
I am working on making the VXLAN UDP port in cloud-api-adaptor configurable. Then we can use a different VXLAN port in cloud-api-adaptor while Calico keeps using its default VXLAN port.
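A configurable port could look roughly like the following Go sketch, which reads the port from a command-line flag and validates it. The flag name `-vxlan-port` and the function `parseVXLANPort` are hypothetical and are not taken from the cloud-api-adaptor source; the only external fact used is that 4789 is the IANA-assigned VXLAN port, which is also Calico's default.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// defaultVXLANPort is the IANA-assigned VXLAN UDP port, also Calico's default.
const defaultVXLANPort = 4789

// parseVXLANPort reads a hypothetical -vxlan-port flag and checks its range.
func parseVXLANPort(args []string) (int, error) {
	fs := flag.NewFlagSet("adaptor-sketch", flag.ContinueOnError)
	port := fs.Int("vxlan-port", defaultVXLANPort, "UDP port for peer pod VXLAN tunnels")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	if *port < 1 || *port > 65535 {
		return 0, fmt.Errorf("vxlan-port %d is out of range 1-65535", *port)
	}
	return *port, nil
}

func main() {
	port, err := parseVXLANPort(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if port == defaultVXLANPort {
		// With Calico VXLAN mode set to Always, this port collides with Calico's.
		fmt.Println("warning: using the default VXLAN port, which Calico also uses")
	}
	fmt.Println("using VXLAN UDP port", port)
}
```

Running the sketch with `-vxlan-port 4790` selects a port that does not collide with Calico's default, which matches the workaround confirmed earlier in this thread.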
Fixes confidential-containers#244 Signed-off-by: Yohei Ueda <[email protected]>
Fixes #244 Signed-off-by: Yohei Ueda <[email protected]>
I submitted an issue at Calico.
Thanks for chasing this down @yoheiueda, we hit the same issue when implementing Azure support. If Calico does not want to change their behavior, then it would be good to change the default VXLAN port so that we don't have to pass a custom port in every provider.
* proto: Add TTRPC proto for VM info This patch add TTRPC proto definition for querying VM ID of a pod VM. Each cloud provider may implement this service to provide VM information. Fixes #112 Signed-off-by: Yohei Ueda <[email protected]> * aws: Add support for AWS AMI generation using packer tool Fixes: #6 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: remove internal sandbox data at the end of StopVM method Fixes: #123 Signed-off-by: Da Li Liu <[email protected]> * deploy: Add artifacts to create pre-install container image Fixes: #121 Signed-off-by: Pradipta Banerjee <[email protected]> * deploy: Add artifacts to create runtime payload container image Fixes: #121 Signed-off-by: Pradipta Banerjee <[email protected]> * deploy: Add deployment manifests Fixes: #121 Signed-off-by: Pradipta Banerjee <[email protected]> * deploy: Add README for operator install and payload image creation Fixes: #121 Signed-off-by: Pradipta Banerjee <[email protected]> * deploy: Ignore built binaries Fixes: #121 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: create kubelet dir for CSI node plugin Fixes: #128 Signed-off-by: Lei Li <[email protected]> * libvirt: remove mac address generation libvirt can generate an unique mac address if the network section of the XML does not mention it. To simplify the code, remove the mac generation logic from the libvirt provider. 
Fixes: #117 Signed-off-by: Bandan Das <[email protected]> * proxy: fetch imageName from digest via cri grpc Fixes: #126 Signed-off-by: huoqifeng <[email protected]> * aws: Use caller's context Fixes: #131 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud|aws: Mask cloud configuration sensitive fields - Created tests to check approach works as designed - Added utility function to redact provided fields - Add redacting implementation for IBM cloud and AWS config type Partial Fixes: #83 (Doesn't solve `ps -ef | grep cloud-api-adaptor` exposure) Signed-Off-By: James Tumber <[email protected]> * go: Update go.sum Reflect updates in the kata-containers CCv0-peerpod branch. Fixes #141 Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: add cri_runtime_endpoint variable Fixes: #139 Signed-off-by: huoqifeng <[email protected]> * ibmcloud: Add --workdir option Add an option to change working directory to store temporary QCOW2 images. We can speed up image builds by storing temporary images on tmpfs. Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: Add SUDO variable We can build a pod VM image with a non-root user as follows. make SUDO=sudo build Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: Run update-grub Fixes #143 Signed-off-by: Yohei Ueda <[email protected]> * libvirt: Fix context handling Fixes: #131 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: Fix context handling Fixes: #131 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: switch to golang 1.18 for containerd Fixes: #146 Signed-off-by: Pradipta Banerjee <[email protected]> * golang: bump up golang version Fixes:#146 Signed-off-by: Pradipta Banerjee <[email protected]> * ci: bump up golang version Fixes:#146 Signed-off-by: Pradipta Banerjee <[email protected]> * pkg/util: Implement agent protocol redirector Implement a common library for redirecting agent protocol RPC calls. 
Fixes: #150 Signed-off-by: Yohei Ueda <[email protected]> * pkg/adaptor: Refactor proxy service with agentproto.Redirector Use the agent proto redirector library in the agent proxy of cloud-api-adaptor. Signed-off-by: Yohei Ueda <[email protected]> * pkg/forwarder: Rename pacakges for refactoring The packages in agent-protocol-forwarder have confusing names. This change refactors such package names. pkg/forwarder/daemon.go daemon.Daemon -> pkg/forwarder/forwarder.go forwarder.Daemon pkg/forwarder/agent/agent.go agent.Forwarder -> pkg/forwarder/interceptor/interceptor.go interceptor.Interceptor Signed-off-by: Yohei Ueda <[email protected]> * pkg/forwarder: Refactor to use redirector package This change add a agent proxy service in agent-protocol-forwarder. Fixes #152 Signed-off-by: Yohei Ueda <[email protected]> * pkg/forwarder: Add agent proto logging Add logging for the following methods. * CreateContainer * StartContainer * RemoveContainer * CreateSandbox Signed-off-by: Yohei Ueda <[email protected]> * pkg: Move DNS workaround to forwarder This patch moves the workaround for the DNS issue from cloud-api-adaptor to agent-protocol-forwarder. Signed-off-by: Yohei Ueda <[email protected]> * pkg/forwarder: Specify netns in container spec This patch inserts a network namespace path into the container spec, so that kata-agent creates containers in the specified network namespace. Fixes #109 Signed-off-by: Yohei Ueda <[email protected]> * image: Remove unused option The -host-interface option is not used in the VXLAN mode, so this patch removes it from systemd service definitions. Signed-off-by: Yohei Ueda <[email protected]> * image: Stop using nsenter This patch changes VM image build files to stop using the nsenter command to specify a network namespace for pod networking. Signed-off-by: Yohei Ueda <[email protected]> * build: Build cleanups Few steps were missing from the docs which resulted in failed builds for anyone starting with building the different components. 
Signed-off-by: Pradipta Banerjee <[email protected]> * aws: add build tags aws build tag was missing from the code files Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: add build tags ibmcloud build tag was missing from the code files Fixes #156 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: add skip_verify_console variable Allows skipping of console output during verify Fixes: #89 Signed-off-by: Georgina Kinge <[email protected]> * aws: redact additional config fields The access key id also needs to be redacted during logging. Further, one of the logging statement was not redacting the fields Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: Redact sensitive fields from service log One of the logger statements was not redacting the sensitive fields. Fixes: #160 Signed-off-by: Pradipta Banerjee <[email protected]> * aws: Make EC2 launch template name configurable Fixes: #158 Signed-off-by: Pradipta Banerjee <[email protected]> * libvirt: create image using packer Fixes:#148 Signed-off-by: Pradipta Banerjee <[email protected]> * libvirt: Update README to include image build instructions Fixes:#148 Signed-off-by: Pradipta Banerjee <[email protected]> * install: Update images based on latest changes Fixes: #162 Signed-off-by: Pradipta Banerjee <[email protected]> * install: run cloud-api-adaptor from within a pod Supports only libvirt and aws cloud providers - build the container image by running: $ podman build --build-arg CLOUD_PROVIDER==<aws|libvirt> . - deploy: 1. kubectl apply -f install/yamls/deploy.yaml 2. kustomize the cloud provided specific settings under install/overlays/<aws|libvirt>/kustomization.yaml 3. 
kubectl apply -k install/overlays/<aws|libvirt> - delete: $ kubectl delete -k install/overlays/<aws|libvirt> * from kustomize POV install/yamls/ is the base and overlays are under install/overlays/* Signed-off-by: Snir Sheriber <[email protected]> Fixes: #5 * Makefile: add image deploy and delete targets make image - build image using $engine and push it to $registry make deploy - deploy peer-pods to a confgiured cluster according to the pre-configured install/overlays/$(CLOUD_PROVIDER)/kustomization.yaml make delete - deletes peer-pods Signed-off-by: Snir Sheriber <[email protected]> * docs: cloud-api-adaptor in a pod installtion and building instructions. while here fix previous formatting Signed-off-by: Snir Sheriber <[email protected]> * ci: fix test issue due to sudo usage Fixes: #165 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: set skip_verify_console in terraform configs Fixes: #168 Set the skip_verify_console variable in the IBM Cloud Terraform configurations. Signed-off-by: Matthew Arnold <[email protected]> * docs: update aws doc mention installtion of packer's Amazon plugin Signed-off-by: Snir Sheriber <[email protected]> * install: fix caa pod deployment add missing kustomization.yaml file update runtime-payload tag modify Dockerfile to avoid shipping unnecessary files Fixes: #170 Signed-off-by: Snir Sheriber <[email protected]> * gitignore: Add binaries to ignore - Ignore cloud-api-adaptor and agent-protocol-forwarder binaries. - Ignore .vscode directory. Signed-off-by: Suraj Deshmukh <[email protected]> * azure: Add initial skeleton - Add skaffold code. - Add placeholder functions. - Add azure config structs. - Register azure driver. Signed-off-by: Pradipta Banerjee <[email protected]> * azure: Provide command line options Add azure specific command line options. Signed-off-by: Pradipta Banerjee <[email protected]> * go: Update dependencies for azure - Add azure SDK dependencies. 
- Upgrade go version to 1.18 Signed-off-by: Pradipta Banerjee <[email protected]> * azure: Add network interface to the machine Add code to create network network interface for the machine. Signed-off-by: Suraj Deshmukh <[email protected]> * azure: Add a flag for public SSH key Take the path to the public SSH key from the user. Signed-off-by: Suraj Deshmukh <[email protected]> * azure: Create pod VM image using packer - Add packer configs for Azure VM image creation. - Add docs on building the image and using CAA. Signed-off-by: Pradipta Banerjee <[email protected]> * azure: Add VM creation code Add code to create VM instance with all the necessary parameters. Signed-off-by: Suraj Deshmukh <[email protected]> * azure: Add VM deletion code - Delete instance. - Delete disk. - Delete NIC. Fixes: #120 Signed-off-by: Suraj Deshmukh <[email protected]> * install: disable suffix generation for configmap and secrets Fixes: #173 Signed-off-by: Pradipta Banerjee <[email protected]> * go: Remove unnecesary replace directive in go.mod go.mod has a replace directive for google.golang.org/genproto. This is a workaround for a problem related to the TTRPC package. The problem has been fixed, and we no longer need the workaround. Fixes #175 Signed-off-by: Yohei Ueda <[email protected]> * go: Reintroduce workaround for TTRPC issue This change reverts 540890f Fixes #180 Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: Enable cross-region cos endpoints - Allow the user to create and use a cross-region cos bucket - Made uploads on slower networks more reliable by doing multipart uploads Fixes #169 Signed-Off-By: James Tumber <[email protected]> * doc: COS region selection documentation Updated the README.md for the ibmcloud terraform end to end configuration. 
Fixes #169 Signed-Off-By: James Tumber <[email protected]> * aws: fix usage of aws-region command line param Fixes: #178 Signed-off-by: Pradipta Banerjee <[email protected]> * Dockerfile: use same image for building and executing as it's easy to miss dependencies during development (such as ca certificates or protobuf) Fixes: #182 Signed-off-by: Snir Sheriber <[email protected]> * install: enable installation when crio is used set its configuration files etc.. Signed-off-by: Snir Sheriber <[email protected]> Fixes: #184 * install: update runtime-payload image to include latest commits and a shim patch to return actual pid in GetPid Signed-off-by: Snir Sheriber <[email protected]> * docs: remove installation caa as service instructions as we use the caa in pod installation as default Signed-off-by: Snir Sheriber <[email protected]> * install: allow libvirt ssh key authorization by passing the ssh private key to the container Signed-off-by: Snir Sheriber <[email protected]> Fixes: #186 * webhook: update sdk and deps Fixes: #190 Signed-off-by: Pradipta Banerjee <[email protected]> * webhook: Update install and dev instructions Fixes: #190 Signed-off-by: Pradipta Banerjee <[email protected]> * install: add resource management webhook deployment instructions Fixes: #190 Signed-off-by: Pradipta Banerjee <[email protected]> * build: bump go deps to fix build issues Fixes: #194 Signed-off-by: Pradipta Banerjee <[email protected]> * install: add missing ssh_mount.yaml file which is needed in order to mount the ssh key with libvirt Signed-off-by: Snir Sheriber <[email protected]> Fixes: #192 * ibmcloud: Embed pause container image Fixes #196 Signed-off-by: Yohei Ueda <[email protected]> * doc: update readme 1. fix incorrect link to webhook doc 2. Add link to install guide in the main readme 3. 
Fix minor formatting issues Fixes: #200 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: Use normal Unix domain socket Fixes #198 Signed-off-by: Yohei Ueda <[email protected]> * pkg: Improve error handling at dialing socket This patch fixes incorrect error handling in the redirector, and also introduces retry logic at dialing kata-agent. Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: Build static libseccomp for kata-agent Fixes #206 Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: Define GOPATH for Ansible build Fixes #208 Signed-off-by: Dave Hay <[email protected]> * ibmcloud: Use variables for source code repos This patch remove hard-coded source code repository URLs and branch names, and introduce variables to specify them. The default values are the original hard-coded ones. Fixes #211 Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: refactor the way that GOPATH is set Fixes #213 Signed-off-by: Dave Hay <[email protected]> * go: Update go.sum to use the upstream CCv0 branch The PR of remote hypervisor support has been merged. We can switch from the CCv0-peerpod branch at https://github.com/yoheiueda/kata-containers to the CCv0 branch at https://github.com/kata-containers/kata-containers Fixes #215 Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: Update Terraform to use upstream CCv0 Update the Terraform variables to specify the upstream CCv0 branch. Signed-off-by: Yohei Ueda <[email protected]> * ci: Update workflows not to clone Kata repo Kata containers repo is no longer necessary to build cloud-api-adaptor. Signed-off-by: Yohei Ueda <[email protected]> * docker: Update Dockerfile to use the upstream CCv0 Checkout the upstream Kata Containers CCv0 branch to build container image for cloud-api-adaptor. Signed-off-by: Yohei Ueda <[email protected]> * doc: Update documentation to use upstream CCv0 Update the repository URL for the CCv0 branch to the upstream one. 
Signed-off-by: Yohei Ueda <[email protected]> * azure: add provider to documentation Fixes #220 Signed-off-by: Magnus Kulke <[email protected]> * azure: remove SubnetName and VnetName flags Fixes: #221 Signed-off-by: Magnus Kulke <[email protected]> * vsphere: Initial CAA implementation Fixes #135 Signed-off-by: Cathy Avery <[email protected]> * pkg/forwarder: create non-existing mount source dir Fixes #128 Signed-off-by: Lei Li <[email protected]> * libvirt: embed pause container image in the pod VM image Fixes: #202 Signed-off-by: Pradipta Banerjee <[email protected]> * aws: embed pause container image in the pod VM image Fixes: #201 Signed-off-by: Pradipta Banerjee <[email protected]> * azure: embed pause container image in the pod VM image Fixes: #203 Signed-off-by: Pradipta Banerjee <[email protected]> * install: Update runtime payload image with latest kata runtime changes Fixes: #231 Signed-off-by: Pradipta Banerjee <[email protected]> * aws: fix copy-files.sh script to copy pause image Fixes: #233 Signed-off-by: Pradipta Banerjee <[email protected]> * azure: fix copy-files.sh script to copy pause image Fixes: #233 Signed-off-by: Pradipta Banerjee <[email protected]> * libvirt: fix copy-files.sh script to copy pause image Fixes: #233 Signed-off-by: Pradipta Banerjee <[email protected]> * git: update top-level gitignore Don't track different binaries used in the POD VM image Signed-off-by: Pradipta Banerjee <[email protected]> * aws: Build static libseccomp for kata-agent for aws Fixes: #234 Signed-off-by: Pradipta Banerjee <[email protected]> * libvirt: Build static libseccomp for kata-agent Fixes: #234 Signed-off-by: Pradipta Banerjee <[email protected]> * azure: Build static libseccomp for kata-agent Fixes: #234 Signed-off-by: Pradipta Banerjee <[email protected]> * vsphere: Add container image support for deployment Fixes: #224 Signed-off-by: Pradipta Banerjee <[email protected]> * aws: allow pod VM creation without using a launch template Fixes: #122 
Signed-off-by: Pradipta Banerjee <[email protected]> * git: Update gitignore file to include only built binaries The existing entries were resulting in ignoring changes to few code files as well. Signed-off-by: Pradipta Banerjee <[email protected]> * aws: Disable EC2 launchtemplate usage by default Fixes: #122 Signed-off-by: Pradipta Banerjee <[email protected]> * aws: Update kustomization file Fixes: #122 Signed-off-by: Pradipta Banerjee <[email protected]> * aws: remove templating from kustomization.yaml as it's not used anymore Fixes: #239 Signed-off-by: Snir Sheriber <[email protected]> * docs: update development prerequisites with g++ Signed-off-by: Snir Sheriber <[email protected]> * build.sh: clean manifest cache after push in all image build scripts see: https://github.com/docker/cli/issues/954 Fixes: #242 Signed-off-by: Snir Sheriber <[email protected]> * aws: fix aws provider cloud image Fixes: #245 Signed-off-by: Pradipta Banerjee <[email protected]> * cmd: remove logging of all parameters without redaction Fixes: #247 Signed-off-by: Pradipta Banerjee <[email protected]> * vsphere: redact sensitive parameters from logging redact username and password Signed-off-by: Pradipta Banerjee <[email protected]> * podnetwork: Parameterize VXLAN port and ID Fixes #244 Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: deploy ibmcloud provider as a pod Fixes:#255 Signed-off-by: Pradipta Banerjee <[email protected]> * image: quote all the shell variables Quote all the shell variables to prevent globbing and word splitting Signed-off-by: Pradipta Banerjee <[email protected]> * install: Fix the volume spec in cri_runtime_endpoint.yaml file The 'Socket' type need to be part of hostPath instead of volumes. 
Signed-off-by: Pradipta Banerjee <[email protected]> * vsphere: Set data center before looking for template Fixes #249 Signed-off-by: Cathy Avery <[email protected]> * caa-pod: run in fedora container to avoid old pkgs security risks Fixes: #254 Signed-off-by: Snir Sheriber <[email protected]> * vsphere: Add option command line settings as defined by kustomization.yaml Fixes #250 Signed-off-by: Cathy Avery <[email protected]> * forwarder: log when processing pull image requests Add logging at processing pull image requests. Signed-off-by: Yohei Ueda <[email protected]> * adaptor: do not send CID in a pull image request Fixes #259 Signed-off-by: Yohei Ueda <[email protected]> * adaptor: Correct pod names reported by cri-o cri-o reports sandbox names in the different format than containerd. This patch corrects pod names reported by cri-o so that they are consistent with containerd. Fixes #261 Signed-off-by: Yohei Ueda <[email protected]> * install: fix optionals var expansion in entrypoint.sh Fixes: #266 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: Use image.rs instead of skopeo This patch removes the skopeo and umoci commands from the pod VM image. We can still install skopeo and umoci commands by setting the optional variable USE_SKOPEO. Fixes #256 Signed-off-by: Yohei Ueda <[email protected]> * aws: Use image.rs instead of skopeo This patch removes the skopeo and umoci commands from the pod VM image. We can still install skopeo and umoci commands by setting the optional variable USE_SKOPEO. Signed-off-by: Yohei Ueda <[email protected]> * libvirt: Use image.rs instead of skopeo This patch removes the skopeo and umoci commands from the pod VM image. We can still install skopeo and umoci commands by setting the optional variable USE_SKOPEO. Signed-off-by: Yohei Ueda <[email protected]> * azure: Use image.rs instead of skopeo This patch removes the skopeo and umoci commands from the pod VM image. 
We can still install skopeo and umoci commands by setting the optional variable USE_SKOPEO. Signed-off-by: Yohei Ueda <[email protected]> * proxy: parameterise pause image Fixes: #268 This change paremeterizes the pause image with default set to existing one. This change makes it possible to proide a different pause image as required by the K8s distribution, for example OpenShift Also rearranged the hypervisor options for better readability Signed-off-by: Pradipta Banerjee <[email protected]> * install: Add pause image option Fixes: #268 Include pause image option for the configmap and the entrypoint script Signed-off-by: Pradipta Banerjee <[email protected]> * Azure: update readme Some information about routing table and adding routing table to VNET subnet was missing. Fixes: #270 Signed-off-by: Kautilya Tripathi <[email protected]> * util: Add CreateInstanceName function This function generates a VM instance name from sanitized values of node name, pod namespace, pod name, and sandbox ID. Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: Sanitize strings when generating VM name hvutil.CreateInstanceName sanitizes input string values, and then generates a VM name. Fixes #265 Signed-off-by: Yohei Ueda <[email protected]> * libvirt: Sanitize strings when generating VM name hvutil.CreateInstanceName sanitizes input string values, and then generates a VM name. Signed-off-by: Yohei Ueda <[email protected]> * aws: Sanitize strings when generating VM name hvutil.CreateInstanceName sanitizes input string values, and then generates a VM name. Signed-off-by: Yohei Ueda <[email protected]> * azure: Sanitize strings when generating VM name hvutil.CreateInstanceName sanitizes input string values, and then generates a VM name. Signed-off-by: Yohei Ueda <[email protected]> * vsphere: Sanitize strings when generating VM name hvutil.CreateInstanceName sanitizes input string values, and then generates a VM name. 
Signed-off-by: Yohei Ueda <[email protected]> * webhook: allow configuration options via env variables Fixes: #272 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: cri-endpoint as optional Align ibmcloud cri-endpoint handling with other providers to avoid duplication in optionals. Fixes: #279 Signed-off-by: James Tumber <[email protected]> * azure: populate sandbox vm name earlier Fixes: #275 There might be a state in which the creation of a VM has been triggered but the respective api call hasn't returned yet. If a create-vm call is then cancelled at the call site due to a timeout, the VM resources will not be garbage collected, because the sandbox's vm name has not been populated yet. Signed-off-by: Magnus Kulke <[email protected]> * webhook: Add env variables to deployment manifest Fixes: #278 Signed-off-by: Pradipta Banerjee <[email protected]> * azure: mask sensitive fields in cloud config - Use redacted config in log output Partially Fixes #83 Signed-off-by: Magnus Kulke <[email protected]> * webhook: Remove duplicate golang imports Fixes: #282 Signed-off-by: Pradipta Banerjee <[email protected]> * cmd: Take sensitive defaults from environment variables Adds `os.Getenv` option to each sensitive field. Fixes: #83 Signed-off-by: James Tumber <[email protected]> * Azure: Create caa image This adds CAA image for azure provider Fixes: #226 Signed-off-by: Kautilya Tripathi <[email protected]> * Makefile: Add help target This enables to list all the make targets and their descriptions using the `make help` command. Fixes: #288 Signed-off-by: Kautilya Tripathi <[email protected]> * image: Fix dependency for skopeo and umoci Fixes #295 Signed-off-by: Yohei Ueda <[email protected]> * azure: Rename resource_group_name resource_group is the azure standard, so changed it. 
Fixes: #298 Signed-off-by: Kautilya Tripathi <[email protected]> * vsphere: Use session manager to issue keep alive pings to vcenter Fixes #253 Signed-off-by: Cathy Avery <[email protected]> * caa-peer-pods: Run as Daemonset Fixes: #293 Signed-off-by: Kautilya Tripathi <[email protected]> * azure: use IMAGE_NAME for images Final image that is built should have a generic name instead of something random like uuid. Signed-off-by: Kautilya Tripathi <[email protected]> * aws: use IMAGE_NAME for images Final image that is build should have a generic name instead of something random like uuid Fixes: #300 Signed-off-by: Kautilya Tripathi <[email protected]> * docs: Update README and architecture diagram Fixes: #303 Signed-off-by: Pradipta Banerjee <[email protected]> * webhook: update webhook manifest to avoid deadlock Fixes: #305 Signed-off-by: Pradipta Banerjee <[email protected]> * operator: Update kustomize to use DaemonSet Update the cri_runtime_endpoint kustomize patch to apply to a DaemonSet called cloud-api-adaptor-daemonset rather than the Deployment called cloud-api-adaptor-deployment to match the change in #297 Fixes: #308 Signed-off-by: stevenhorsman <[email protected]> * Actions: Add jobs for building caa images When the code is pushed to the staging branch, we need container images for different cloud providers to be built automatically. Fixes: #205 Signed-off-by: Kautilya Tripathi <[email protected]> * entrypoint: exec cloud-api-adaptor process also use exec form in Dockerfile Fixes: #289 Signed-off-by: Snir Sheriber <[email protected]> * install: fix optionals parameter handling in the entrypoint script Fixes: #316 Signed-off-by: Pradipta Banerjee <[email protected]> * all: remove secrets from entrypoint.sh These secrets can be passed in as environment variables. 
Fixes: #313 Signed-off-by: James Tumber <[email protected]> * libvirt: adapt ssh_mount to DaemonSet Fixes: #293 Signed-off-by: Snir Sheriber <[email protected]> * hvutil: Truncate too long instance name Fixes #323 Signed-off-by: Yohei Ueda <[email protected]> * ibmcloud: Increase remote hypervisor timeout Increase timeout to 10mins from default 1min to help not error when pulling bigger images Fixes: #334 Signed-off-by: stevenhorsman <[email protected]> * ibmcloud: Remove sig verification Add agent image section to stop signature verification being enabled Fixes: #331 Signed-off-by: stevenhorsman <[email protected]> * aws: Remove sig verification Add agent image section to stop signature verification being enabled Fixes: #331 Signed-off-by: stevenhorsman <[email protected]> * azure: Remove sig verification Add agent image section to stop signature verification being enabled Fixes: #331 Signed-off-by: stevenhorsman <[email protected]> * libvirt: Remove sig verification Add agent image section to stop signature verification being enabled Fixes: #331 Signed-off-by: stevenhorsman <[email protected]> * agent-config: Workaround for bug Temp workaround for bug kata-containers/kata-containers#5590 to allow endpoints to work in agnet-config.toml Signed-off-by: stevenhorsman <[email protected]> * vsphere: Reauthorize session when session is invalidated due to error Fixes #330 Signed-off-by: Cathy Avery <[email protected]> * azure: rename AZURE_SECRET to AZURE_CLIENT_SECRET Fixes: #340 The latter is the proper env name and it's also used in code. Signed-off-by: Magnus Kulke <[email protected]> * all: remove redundant cri runtime endpoint configuration Setting CRI_RUNTIME_ENDPOINT defines only the in-container side socket path, the pre-defined default fixed address should work for both containerd and crio. 
Fixes: #333 Signed-off-by: Snir Sheriber <[email protected]> * CI: enhance with go check and escapes detect Fixes: #320 Signed-off-by: Sam Yuan <[email protected]> * aws: retrieve instance metadata from IMDS if not explicitly set, retrieve subnet-id, region and key-name from AWS Instance Metadata Service Fixes: #315 Signed-off-by: Snir Sheriber <[email protected]> * aws: remove automatically retrieved variables and fix entrypoint.sh Signed-off-by: Snir Sheriber <[email protected]> * aws: retrieve security groups from IMDS if was not set NOTE: it allowed to retrieve multiple SGs Signed-off-by: Snir Sheriber <[email protected]> * webhook: add kind-delete target Running `make kind-delete` will delete the created kind cluster. Signed-off-by: Wainer dos Santos Moschetta <[email protected]> * webhook: give some time to Kind fully start Passing the --wait=120s argument to Kind so that it will be given some time to be ready. Signed-off-by: Wainer dos Santos Moschetta <[email protected]> * webhook: add automated tests and runner script This added three Bats tests for the webhook: - test it can mutate a pod - test it should not mutate non-peerpods - test default parameters can be changed Being the last one skipped because it is not passing. It is also introduced a runner script (run-local.sh) which will bootstrap the test environment with Kind and afterwards run those tests. At the end of the execution the cluster and created resources are deleted, however, you can retain them by running the script in debug mode: $ ./tests/e2e/run-local.sh -d However, the recommend way to run the e2e is with `make`: $ make test-e2e Signed-off-by: Wainer dos Santos Moschetta <[email protected]> * github: add CI workflow for the webhook component Added an github workflow that will run the end-to-end tests in case a pull request change the webhook. 
Fixes #291 Signed-off-by: Wainer dos Santos Moschetta <[email protected]> * webhook: pin the k8s version created on kind-cluster Ensure the k8s installed with kind is a known version than the latest. The same version is used on the other non-e2e tests. Signed-off-by: Wainer dos Santos Moschetta <[email protected]> * entrypoint: make optionals oneliner for shorter functions Fixes: #326 Signed-off-by: Snir Sheriber <[email protected]> * doc: add tips for network debugging Fixes: #347 Signed-off-by: huoqifeng <[email protected]> * docs: Update install instructions Instead of using the binary and manually updating command line flags when running CAA, users can make use of daemonset to run CAA now. Fixes: #294 Signed-off-by: Kautilya Tripathi <[email protected]> * doc: add collaborations information Fixes: #352 Signed-off-by: huoqifeng <[email protected]> * doc: format collaboration info Fixes: #352 Signed-off-by: huoqifeng <[email protected]> * vsphere: use packer to create the podvm template Mostly, derived from its libvirt counterpart, this script creates an esx guest from a standard ubuntu iso and then converts it into a template. Standard settings are in settings.auto.pkrvars.hcl. vsphere config is expected in vsphere.auto.pkrvals.hcl which will be created by the Makefile if not present; user-data.pkrtpl.hcl contains the autoinstall template. The automated input of characters at boot to start autoinstall is kind of flaky, the currently working sequence is defined in boot_command of the main script. Fixes: #337 Signed-off-by: Bandan Das <[email protected]> * podnetwork: Handle network interface with multiple addresses Fixes: #357 Signed-off-by: Pradipta Banerjee <[email protected]> * azure: remove manual route creation When using Calico CNI, it was unexpectedly dropping VXLAN packets unrelated to calico. To avoid manual route creation one has to configure VXLAN encapsulation on calico and a new VXLAN UDP port is used rather than the default one. 
Fixes: #359 Signed-off-by: Kautilya Tripathi <[email protected]> * aws: enable overriding of default vxlan port via configmap Fixes: #361 Signed-off-by: Pradipta Banerjee <[email protected]> * azure: enable overriding of default vxlan port via configmap Fixes: #361 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: enable overriding of default vxlan port via configmap Fixes: #361 Signed-off-by: Pradipta Banerjee <[email protected]> * libvirt: enable overriding of default vxlan port via configmap Fixes: #361 Signed-off-by: Pradipta Banerjee <[email protected]> * vsphere: enable overriding of default vxlan port via configmap Fixes: #361 Signed-off-by: Pradipta Banerjee <[email protected]> * install: enable overriding of default vxlan port via configmap Fixes: #361 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: Update image mount points Update the mount points of the directories that images are unpacked into, to avoid the tmpfs size restrictions Fixes: #338 Signed-off-by: stevenhorsman <[email protected]> * aws: Update image mount points Update the mount points of the directories that images are unpacked into, to avoid the tmpfs size restrictions Fixes: #338 Signed-off-by: stevenhorsman <[email protected]> * azure: Update image mount points Update the mount points of the directories that images are unpacked into, to avoid the tmpfs size restrictions Fixes: #338 Signed-off-by: stevenhorsman <[email protected]> * libvirt: Update image mount points Update the mount points of the directories that images are unpacked into, to avoid the tmpfs size restrictions Fixes: #338 Signed-off-by: stevenhorsman <[email protected]> * vsphere: Update image mount points Update the mount points of the directories that images are unpacked into, to avoid the tmpfs size restrictions Fixes: confidential-containers#338 Signed-off-by: stevenhorsman <[email protected]> * libvirt: update installation instructions and volume creation required sizes Fixes: #365 
Signed-off-by: Snir Sheriber <[email protected]> * ibmcloud: Make keygen.sh work on MacOS Use ssh-copy-id -f if the OS is MacOS (Darwin) Fixes: #34 Signed-off-by: Matthew Arnold <[email protected]> * aws: allow authentication against container image registries from within the podvm also, while here, update instructions Fixes: #367 Signed-off-by: Snir Sheriber <[email protected]> * libvirt: allow authentication against container image registries from within the podvm Fixes: #367 Signed-off-by: Snir Sheriber <[email protected]> * azure: allow authentication against container image registries from within the podvm Fixes: #367 Signed-off-by: Snir Sheriber <[email protected]> * vsphere: allow authentication against container image registries from within the podvm Fixes: #367 Signed-off-by: Snir Sheriber <[email protected]> * all: introduce common image directory - Adds Makefile which has the common code between providers - Common files, copy-files.sh, services, etc Each provider can implement their own targets as there is variation (build, push, etc) Fixes #314 Signed-off-by: James Tumber <[email protected]> * docs: Add network topology diagrams when using vxlan Fixes: #373 Signed-off-by: Pradipta Banerjee <[email protected]> * podnetwork: Fix networking when using OVS When using OVS (OpenvSwitch) based CNIs like openshift-sdn or ovn-kubernetes, POD IPs are unreachable from the cluster nodes (worker or controller). There are two issues at play here - OVS based CNIs uses flow rules specific to the mac address of pod. And the pod mac address is used from the CNI created namespace on the worker node. However the container process which runs in the Pod VM uses a different mac address and unless the flow rules are updated with the mac address from the Pod VM, Pod IP is not reachable from the cluster nodes - Certain CNIs (eg ovn-kubernetes) disables ARP broadcast and uses the pod mac address assigned by the CNI. 
However it doesn't matches with the mac address used in Pod VM and hence packets are received by the Pod VM with incorrect dst address over the vxlan tunnel To fix the issues, this PR uses the CNI assigned MAC address for the POD VM vxlan0 interface. Fixes: #369 Signed-off-by: Pradipta Banerjee <[email protected]> * podnetwork: update test cases to use pod mac address Fixes: #369 Signed-off-by: Pradipta Banerjee <[email protected]> * vsphere: Add redaction of private config info Fixes #377 Signed-off-by: Cathy Avery <[email protected]> * libvirt: Ensure pod VMs get unique DHCP ip Remove machine-id when creating base qcow2 image for Pod VM. This ensures that pod VMs created from the same image gets unique DHCP ips Fixes: #363 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: Use common Makefile for image build - Include podvm Makefile - Remove duplicate code Fixes: #382 Signed-off-by: James Tumber <[email protected]> * ibmcloud: remove image files, replaced by podvm/files Fixes: #382 Signed-off-by: James Tumber <[email protected]> * vsphere: Ensure pod VMs get unique DHCP ip Remove machine-id when creating base qcow2 image for Pod VM. This ensures that pod VMs created from the same image gets unique DHCP ips. Backported from the libvirt version for #376 Fixes: #363 Signed-off-by: Pradipta Banerjee <[email protected]> Signed-off-by: Bandan Das <[email protected]> * libvirt: separate out the ubuntu packer template into a separate dir Move ubuntu target to its own separate folder so that we can keep the same call to packer build when we introduce rhel. The IMAGE_FILE target in Makefile is also slightly changed so that it can be reused when we introduce the rhel image build. Packer limitations: Another reason for doing it this way is that packer does not give an easy way to have two separate build scripts in the same dir. We can either add all builders to the same script file or as we do here, separate out the builds into their own dirs. 
It does add some code duplication, specifically the variable definitions. No functional change. Fixes: #384 Signed-off-by: Bandan Das <[email protected]> * libvirt: add option to build a rhel podvm image Create a new qemu builder for handling rhel builds. The provisioners also need slight modifications, mainly to take care of selinux relabeling. Fixes: #384 Signed-off-by: Bandan Das <[email protected]> * ibmcloud: fix some strings for ibmcloud document fixes: #392 Signed-off-by: Da Li Liu <[email protected]> * podvm: restart agent-protocol-forwarder on failure RHEL 9 is encountering an issue where cloudconfig init hasn't completed before agent-protocol-forwarder starts and it fails because it can't find /peerpods/daemon.json. Restart the service on failure. Fixes: #388 Suggested-by: Pradipta Banerjee <[email protected]> Signed-off-by: Bandan Das <[email protected]> * aws: use common makefile for generating the AMI image Fixes: #399 Signed-off-by: Pradipta Banerjee <[email protected]> * podvm: unmount misc cgroup as its not handled by kata Ref issue: https://github.com/kata-containers/kata-containers/issues/4610 Signed-off-by: Pradipta Banerjee <[email protected]> * libvirt: remove ssh_mount.yaml it's not needed as the ssh key mount configuration is defined globaly Fixes: #401 Signed-off-by: Snir Sheriber <[email protected]> * vsphere: template: add a force flag option Makes it convenient to overwrite the existing template without having to manually delete it. Fixes:#397 Signed-off-by: Bandan Das <[email protected]> * vsphere: template: separate out ubuntu into its own dir A couple of renames and the introduction of PODVM_DISTRO as we have for libvirt. This makes sure that we can add other distros to be run by packer. Fixes: #397 Signed-off-by: Bandan Das <[email protected]> * vsphere: template: add rhel podvm image Creates a RHEL podvm packer template for vsphere from an installable iso. 
Fixes: #397 Signed-off-by: Bandan Das <[email protected]> * vsphere: Add error log statement to NewServer Fixes #409 Signed-off-by: Cathy Avery <[email protected]> * controller: add peer-pod-controller This adds a controller to handle the peer-pod components lifecycle as described in issue #328. It is a minimal implementation and needs further improvement in the future but it is enough to get started. I propose that we include this in a subdirectory of peer-pod-controller Fixes #328 Signed-off-by: Jens Freimann <[email protected]> * podvm: Add support for creating podvm qcow2 image within container Sample execution to build for libvirt provider: cd podvm docker build -t podvm_builder -f Dockerfile.podvm_builder . docker build -t podvm_libvirt --build-arg BUILDER_IMG=localhost/podvm_builder:latest \ --build-arg CLOUD_PROVIDER=libvirt -f Dockerfile.podvm . Sample execution to build for aws provider: cd podvm docker build -t podvm_builder -f Dockerfile.podvm_builder . docker build -t podvm_aws --build-arg BUILDER_IMG=localhost/podvm_builder:latest \ --build-arg CLOUD_PROVIDER=aws -f Dockerfile.podvm . Fixes: #391 Signed-off-by: Wainer dos Santos Moschetta <[email protected]> Signed-off-by: Pradipta Banerjee <[email protected]> * libvirt: Build pod vm image from within the 'podvm' dir Use the generic qcow2 generation method for libvirt Sample execution For non-container builds, cd podvm CLOUD_PROVIDER=libvirt make image For container builds, cd podvm docker build -t podvm_builder -f Dockerfile.podvm_builder . docker build -t podvm_libvirt --build-arg CLOUD_PROVIDER=libvirt \ --build-arg BUILDER_IMG=localhost/podvm_builder:latest -f Dockerfile.podvm . 
Fixes: #391 Signed-off-by: Pradipta Banerjee <[email protected]> * git: Ignore binary files Remove binary files from git tracking Signed-off-by: Pradipta Banerjee <[email protected]> * podvm: Force code download to specific directory Ensure the source code is explicitly downloaded to specific directories kata-containers: source code for kata containers cloud-api-adaptor: source code for remote hypervisor implementation Signed-off-by: Wainer dos Santos Moschetta <[email protected]> Signed-off-by: Pradipta Banerjee <[email protected]> * podvm: Allow overriding ubuntu image url via build arg Allow overriding ubuntu image url and checksum via build arguments Signed-off-by: Pradipta Banerjee <[email protected]> * podvm: Add Attestation agent to make file - Add attestation agent if AA_KBC has been set - Add LIBC to the makefile - Add steps to update the agent-config.toml - Add default aa_kbc_params & update config automatically Fixes: #390 Signed-off-by: Jordan Jackson <[email protected]> * docs: Add steps for setting up the attestation agent in ibmcloud - Update the readme to include the steps for setting up and running the agent - including the steps for setting up authenticated registry support Fixes: #390 Signed-off-by: Jordan Jackson <[email protected]> * docs: Update AWS readme to use attestation agent - Update readme to include how to setup the Attestation agent for authenticated registry Signed-off-by: Jordan Jackson <[email protected]> * install: avoid ccruntime configuration duplication by using kustomize Fixes: #410 Signed-off-by: Snir Sheriber <[email protected]> * vsphere: Make more inputs mandatory vcenter url, datastore, and datacenter are now mandatory Fixes #415 Signed-off-by: Cathy Avery <[email protected]> * podvm: Fix incorrect download folder for kata containers Fixes: #418 Signed-off-by: Pradipta Banerjee <[email protected]> * podvm: Add cleanup helper - Delete the ttrpc server socket when it stops This helps handle podvm restarts, including during 
podvm image creation. Fixes: #420 Signed-off-by: James Tumber <[email protected]> * controller: fix wrong go module import This should have been changed before I sent the pull request. Change it now to use code from this repo instead of my private github repo. Fixes #423 Signed-off-by: Jens Freimann <[email protected]> * ibmcloud: generate an IBM Secure Execution image - add document for IBM Secure Execution support - update build.sh script to generate an IBM Secure Execution image - support multiple different host keys - using luks encrypted root partition fixes: #406 Signed-off-by: Da Li Liu <[email protected]> Co-authored-by: leilibj <[email protected]> * vsphere: template: workaround ks user bug RHEL8 kickstarts do not seem to recognize the isencrypted flag to user command. Use plaintext instead that works universally. While at it, also make username/password configurable. Fixes: #429 Signed-off-by: Bandan Das <[email protected]> * controller: fix typo in environment variable for peerpods namespace Fix a simple type that led to the cloud-api-adaptor daemon set not being created. Fixes #424 error message: "Failed setting ControllerReference for cloud-api-adaptor DS" Signed-off-by: Jens Freimann <[email protected]> * vsphere: template: unmount misc cgroup as its not handled by kata Now that we are using legacy cgroups... https://github.com/kata-containers/kata-containers/issues/4610 Fixes: #431 Signed-off-by: Bandan Das <[email protected]> * ibmcloud: build: fix missing ndb devices Fix missing ndb devices the first time the cleanup subroutine in the build.sh script is called due to the ndb module not being loaded. 
Fixes: #433 Signed-off-by: Matthew Arnold <[email protected]> * all: inject passed auth.json credentials file to podvm to authenticate with image registries, requires skopeo support in the podvm Fixes: #380 Signed-off-by: Snir Sheriber <[email protected]> * all: support image-rs registries authentication by converting auth.json file to a resources file and pass it to the offline kbc expected file path requires AA_KBC="offline_fs_kbc" at image build Signed-off-by: Snir Sheriber <[email protected]> * image: support aa-offline_fs_kbc-resources.json placing so that it will be copied if provided at image build Signed-off-by: Snir Sheriber <[email protected]> * all: update authenticated registries instructions and merged common instructions Signed-off-by: Snir Sheriber <[email protected]> * podvm: Add support for building CentOS based podvm image Following are key changes to support both CentOS and Ubuntu based podvm images - Uses common cloud-init user specified via userdata - Makes it configurable to specify LIBC for kata-agent/rust builds - Separate distro specific Dockerfiles - Disables login for the ssh user (peerpod) Fixes: #434 Signed-off-by: Pradipta Banerjee <[email protected]> * ibmcloud: Bump version of go Instal version 1.19.3 of go in the Ansible playbook to match the kata-containers minimum version Fixes: #440 Signed-off-by: stevenhorsman <[email protected]> * ibmcloud: install: Add disable selinux to kata config Add disable_guest_selinux = true to the kata configuration on the k8s worker. 
Fixes: #437 Signed-off-by: Matthew Arnold <[email protected]> Signed-off-by: Yohei Ueda <[email protected]> Signed-off-by: Pradipta Banerjee <[email protected]> Signed-off-by: Da Li Liu <[email protected]> Signed-off-by: Lei Li <[email protected]> Signed-off-by: Bandan Das <[email protected]> Signed-off-by: huoqifeng <[email protected]> Signed-off-by: James Tumber <[email protected]> Signed-off-by: Georgina Kinge <[email protected]> Signed-off-by: Snir Sheriber <[email protected]> Signed-off-by: Matthew Arnold <[email protected]> Signed-off-by: Suraj Deshmukh <[email protected]> Signed-off-by: Dave Hay <[email protected]> Signed-off-by: Magnus Kulke <[email protected]> Signed-off-by: Cathy Avery <[email protected]> Signed-off-by: Kautilya Tripathi <[email protected]> Signed-off-by: stevenhorsman <[email protected]> Signed-off-by: Sam Yuan <[email protected]> Signed-off-by: Wainer dos Santos Moschetta <[email protected]> Signed-off-by: Jens Freimann <[email protected]> Signed-off-by: Jordan Jackson <[email protected]> Co-authored-by: Yohei Ueda <[email protected]> Co-authored-by: Pradipta Banerjee <[email protected]> Co-authored-by: Da Li Liu <[email protected]> Co-authored-by: Lei Li <[email protected]> Co-authored-by: Bandan Das <[email protected]> Co-authored-by: huoqifeng <[email protected]> Co-authored-by: James Tumber <[email protected]> Co-authored-by: Georgina Kinge <[email protected]> Co-authored-by: Snir Sheriber <[email protected]> Co-authored-by: Suraj Deshmukh <[email protected]> Co-authored-by: Dave Hay <[email protected]> Co-authored-by: Magnus Kulke <[email protected]> Co-authored-by: Cathy Avery <[email protected]> Co-authored-by: Kautilya Tripathi <[email protected]> Co-authored-by: Kautilya Tripathi <[email protected]> Co-authored-by: stevenhorsman <[email protected]> Co-authored-by: Sam Yuan <[email protected]> Co-authored-by: Wainer dos Santos Moschetta <[email protected]> Co-authored-by: snir911 <[email protected]> Co-authored-by: Pradipta 
Banerjee <[email protected]> Co-authored-by: Jens Freimann <[email protected]> Co-authored-by: Jordan Jackson <[email protected]>
When the pod network is set up with Calico and its inter-node communication is configured to use VXLAN, the pod IP on a pod VM is unreachable from the worker node.
This problem does not occur with Flannel, even though Flannel also uses VXLAN.
The VXLAN IDs (VNIs) in use are as follows, so there are no VNI conflicts.
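The VNI list itself is not reproduced here. As a rough illustration of why distinct VNIs alone may not be enough, here is a minimal Python sketch of a port-based filter. It assumes, based on the comments in this thread, that the drop rules match on the VXLAN UDP port and the sender address and never look at the VNI. The function name, the IP addresses, the peer pod VNI, and the alternative port are all hypothetical; 4789 and 4096 are, to my understanding, Calico's default VXLAN port and VNI.

```python
from dataclasses import dataclass

# Assumed Calico defaults; check your FelixConfiguration before relying on them.
CALICO_VXLAN_PORT = 4789
CALICO_VXLAN_VNI = 4096


@dataclass(frozen=True)
class VxlanPacket:
    src_ip: str    # outer source IP of the encapsulated packet
    dst_port: int  # outer UDP destination port
    vni: int       # VXLAN network identifier


def host_filter_accepts(packet: VxlanPacket, guarded_port: int, known_peers: set) -> bool:
    """Hypothetical model of a port-based drop rule.

    Packets on the guarded UDP port are accepted only from known peers.
    The VNI is deliberately ignored, which is the point of the sketch.
    """
    if packet.dst_port != guarded_port:
        return True  # the rule does not match, so this filter lets it through
    return packet.src_ip in known_peers


if __name__ == "__main__":
    calico_nodes = {"10.0.0.2"}  # hypothetical Calico peer node
    pod_vm_ip = "10.0.1.50"      # hypothetical pod VM, not a Calico node
    peerpod_vni = 555000         # hypothetical VNI, different from Calico's

    same_port = VxlanPacket(pod_vm_ip, CALICO_VXLAN_PORT, peerpod_vni)
    other_port = VxlanPacket(pod_vm_ip, 9999, peerpod_vni)

    print(host_filter_accepts(same_port, CALICO_VXLAN_PORT, calico_nodes))
    print(host_filter_accepts(other_port, CALICO_VXLAN_PORT, calico_nodes))
```

Under these assumptions, the pod VM's packet is rejected when it shares the guarded port and accepted when it uses a different one, which is consistent with the workaround of moving the peer pod tunnel to another UDP port. It does not claim to reproduce the real iptables rules.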