-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CNV-52722: Pass through extra VDDK configuration options to importer pod. #3572
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@mnecas Any concerns about the need to copy the ConfigMap to the importer namespace? I wasn't sure if that would make things awkward to use from the forklift side. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, I have added some a note and NP but nothing on my side
withHidden, err := os.ReadDir(common.VddkArgsDir) | ||
if err != nil { | ||
if os.IsNotExist(err) { | ||
return "", nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NP: Please add a comment that the user did not specify the vddk additional config
pkg/image/nbdkit.go
Outdated
@@ -228,6 +236,30 @@ func getVddkPluginPath() NbdkitPlugin { | |||
return NbdkitVddkPlugin | |||
} | |||
|
|||
// Extra VDDK configuration options are stored in a ConfigMap mounted to the | |||
// importer pod. Just look for the first file in the mounted directory, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just look for the first file
just note we need to be sure to document this so user does not chain the configs to separate configmaps.
@mnecas: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
I think it should be okay, but also think we should document/verify. |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
tests/datavolume_test.go
Outdated
@@ -1254,6 +1276,30 @@ var _ = Describe("[vendor:[email protected]][level:component]DataVolume tests", | |||
Message: "Import Complete", | |||
Reason: "Completed", | |||
}}), | |||
Entry("[test_id:5083]succeed importing VDDK data volume with extra arguments ConfigMap set", dataVolumeTestArguments{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to assert something in the final step of this test? like checking the config was applied?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this test I wanted to make sure that the contents of the ConfigMap are present in the file, so the assertion happens in the vddk-test-plugin (the fgets/strcmp). Is there a better way to check the result from the importer? Like can this test read the pod logs or termination message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh okay I see
if (strcmp(extras, "VixDiskLib.nfcAio.Session.BufSizeIn64KB=16") != 0) { // Must match datavolume_test |
You can either read pod logs or make this a part of the termination message struct
containerized-data-importer/pkg/importer/vddk-datasource_amd64.go
Lines 1011 to 1019 in 625a9e9
// GetTerminationMessage returns data to be serialized and used as the termination message of the importer. | |
func (vs *VDDKDataSource) GetTerminationMessage() *common.TerminationMessage { | |
return &common.TerminationMessage{ | |
VddkInfo: &common.VddkInfo{ | |
Version: vddkVersion, | |
Host: vddkHost, | |
}, | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an in-progress implementation to check the pod logs, but before I commit to that I am also going to try getting the result into the termination message to see if it's any cleaner. Either way it's definitely better to check the result from the test code than from the VDDK test plugin itself, that should get rid of a source of silent failures in CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The termination message approach should be cleaner, you would be asserting on a common.TerminationMessage struct that you got from the importer's API termination message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this and putting the result in the termination message is only cleaner from this angle, the final assert. The rest of the implementation needed to have the importer scanning for test-only output from the nbdkit log just to save an arbitrary string value into the termination struct, which is fairly limited space that could probably be put to better use.
I think it ended up being nicer to just scan the log from the test, so the actual importer doesn't need to do extra stuff to accommodate the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, thanks for trying 🙏
doc/datavolumes.md
Outdated
@@ -349,6 +349,13 @@ spec: | |||
[Get VDDK ConfigMap example](../manifests/example/vddk-configmap.yaml) | |||
[Ways to find thumbprint](https://libguestfs.org/nbdkit-vddk-plugin.1.html#THUMBPRINTS) | |||
|
|||
#### Extra VDDK Configuration Options | |||
|
|||
The VDDK library itself looks in a configuration file (such as `/etc/vmware/config`) for extra options to fine tune data transfers. To pass these options through to the VDDK, store the configuration file contents in a ConfigMap and add a `cdi.kubevirt.io/storage.pod.vddk.extraargs` annotation to the DataVolume specification. The ConfigMap will be mounted to the importer pod as a volume, and the first file in the mounted directory will be passed to the VDDK. This means that the ConfigMap must be placed in the same namespace as the DataVolume, and the ConfigMap should only have one file entry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you considered making this an API on the DataVolume, but, since you need a backport, you prefer the annotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it didn't seem worth changing the CRDs and all the generated stuff for just for this uncommon fine-tuning configuration option. I can certainly change the API if that would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mhenriks wdyt? since this is backporting I am leaning to the annotation as well, but I am not sure.. usually any annotation becomes an API that we forget about
Issue: The scale and perf team found a way how to improve the transfer speeds. Right now the only way to enable this feature is to set the v2v extra vars. The v2v extra vars pass the configuration to the virt-v2v and virt-v2v-in-place. The v2v extra vars configuration is general and not specific for VDDK. This causes the warm migration which uses the virt-v2v-in-place to fail as it does not use any VDDK parameters. Those parameters should be passed to the CNV CDI instead. Fix: Add a way to easily enable and configure the AIO. This feature is VDDK and provider-specific as it requires to have specific vSphere and VDDK versions. So we can't enable this feature globally nor by default. So this PR adds the configuration to the Provider spec settings and create a configmap with the necessary configuration and either mounts the configmap to the guest conversion pod for cold migration or passes the configmap name to the CDI DV annotation. Example: ``` apiVersion: forklift.konveyor.io/v1beta1 kind: Provider metadata: name: vsphere namespace: forklift spec: settings: sdkEndpoint: vcenter useVddkAioOptimization: 'true' vddkAioBufSize: 16 // optional defaults to 16 vddkAioBufCount: 4 // optional defaults to 4 vddkInitImage: 'quay.io/xiaodwan/vddk:8' type: vsphere ``` Ref: - https://issues.redhat.com/browse/MTV-1804 - kubevirt/containerized-data-importer#3572 - https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv Signed-off-by: Martin Necas <[email protected]>
Issue: The scale and perf team found a way how to improve the transfer speeds. Right now the only way to enable this feature is to set the v2v extra vars. The v2v extra vars pass the configuration to the virt-v2v and virt-v2v-in-place. The v2v extra vars configuration is general and not specific for VDDK. This causes the warm migration which uses the virt-v2v-in-place to fail as it does not use any VDDK parameters. Those parameters should be passed to the CNV CDI instead. Fix: Add a way to easily enable and configure the AIO. This feature is VDDK and provider-specific as it requires to have specific vSphere and VDDK versions. So we can't enable this feature globally nor by default. So this PR adds the configuration to the Provider spec settings and create a configmap with the necessary configuration and either mounts the configmap to the guest conversion pod for cold migration or passes the configmap name to the CDI DV annotation. Example: ``` apiVersion: forklift.konveyor.io/v1beta1 kind: Provider metadata: name: vsphere namespace: forklift spec: settings: sdkEndpoint: vcenter useVddkAioOptimization: 'true' vddkAioBufSize: 16 // optional defaults to 16 vddkAioBufCount: 4 // optional defaults to 4 vddkInitImage: 'quay.io/xiaodwan/vddk:8' type: vsphere ``` Ref: - https://issues.redhat.com/browse/MTV-1804 - kubevirt/containerized-data-importer#3572 - https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv Signed-off-by: Martin Necas <[email protected]>
/retest |
Issue: The scale and perf team found a way how to improve the transfer speeds. Right now the only way to enable this feature is to set the v2v extra vars. The v2v extra vars pass the configuration to the virt-v2v and virt-v2v-in-place. The v2v extra vars configuration is general and not specific for VDDK. This causes the warm migration which uses the virt-v2v-in-place to fail as it does not use any VDDK parameters. Those parameters should be passed to the CNV CDI instead. Fix: Add a way to easily enable and configure the AIO. This feature is VDDK and provider-specific as it requires to have specific vSphere and VDDK versions. So we can't enable this feature globally nor by default. So this PR adds the configuration to the Provider spec settings and create a configmap with the necessary configuration and either mounts the configmap to the guest conversion pod for cold migration or passes the configmap name to the CDI DV annotation. Example: ``` apiVersion: forklift.konveyor.io/v1beta1 kind: Provider metadata: name: vsphere namespace: forklift spec: settings: sdkEndpoint: vcenter useVddkAioOptimization: 'true' vddkAioBufSize: 16 // optional defaults to 16 vddkAioBufCount: 4 // optional defaults to 4 vddkInitImage: 'quay.io/xiaodwan/vddk:8' type: vsphere ``` Ref: - https://issues.redhat.com/browse/MTV-1804 - kubevirt/containerized-data-importer#3572 - https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv Signed-off-by: Martin Necas <[email protected]>
Issue: The scale and perf team found a way how to improve the transfer speeds. Right now the only way to enable this feature is to set the v2v extra vars. The v2v extra vars pass the configuration to the virt-v2v and virt-v2v-in-place. The v2v extra vars configuration is general and not specific for VDDK. This causes the warm migration which uses the virt-v2v-in-place to fail as it does not use any VDDK parameters. Those parameters should be passed to the CNV CDI instead. Fix: Add a way to easily enable and configure the AIO. This feature is VDDK and provider-specific as it requires to have specific vSphere and VDDK versions. So we can't enable this feature globally nor by default. So this PR adds the configuration to the Provider spec settings and create a configmap with the necessary configuration and either mounts the configmap to the guest conversion pod for cold migration or passes the configmap name to the CDI DV annotation. Example: ``` apiVersion: forklift.konveyor.io/v1beta1 kind: Provider metadata: name: vsphere namespace: forklift spec: settings: sdkEndpoint: vcenter useVddkAioOptimization: 'true' vddkAioBufSize: 16 // optional defaults to 16 vddkAioBufCount: 4 // optional defaults to 4 vddkInitImage: 'quay.io/xiaodwan/vddk:8' type: vsphere ``` Ref: - https://issues.redhat.com/browse/MTV-1804 - kubevirt/containerized-data-importer#3572 - https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv Signed-off-by: Martin Necas <[email protected]>
Issue: The scale and perf team found a way how to improve the transfer speeds. Right now the only way to enable this feature is to set the v2v extra vars. The v2v extra vars pass the configuration to the virt-v2v and virt-v2v-in-place. The v2v extra vars configuration is general and not specific for VDDK. This causes the warm migration which uses the virt-v2v-in-place to fail as it does not use any VDDK parameters. Those parameters should be passed to the CNV CDI instead. Fix: Add a way to easily enable and configure the AIO. This feature is VDDK and provider-specific as it requires to have specific vSphere and VDDK versions. So we can't enable this feature globally nor by default. So this PR adds the configuration to the Provider spec settings and create a configmap with the necessary configuration and either mounts the configmap to the guest conversion pod for cold migration or passes the configmap name to the CDI DV annotation. Example: ``` apiVersion: forklift.konveyor.io/v1beta1 kind: Provider metadata: name: vsphere namespace: forklift spec: settings: sdkEndpoint: vcenter useVddkAioOptimization: 'true' vddkAioBufSize: 16 // optional defaults to 16 vddkAioBufCount: 4 // optional defaults to 4 vddkInitImage: 'quay.io/xiaodwan/vddk:8' type: vsphere ``` Ref: - https://issues.redhat.com/browse/MTV-1804 - kubevirt/containerized-data-importer#3572 - https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv Signed-off-by: Martin Necas <[email protected]>
For interest only, here's an alternative to maintaining the fake VDDK plugin: In nbdkit itself we are able to test the real nbdkit-vddk-plugin without VDDK, using a fake VDDK: https://gitlab.com/nbdkit/nbdkit/-/blob/master/tests/dummy-vddk.c?ref_type=heads which is compiled to https://gitlab.com/nbdkit/nbdkit/-/blob/9ed65418c57128d4bf372f39b9f98bf6ecbe470a/tests/Makefile.am then you just have to point libdir= to the directory containing this file: |
This is cool, I will look into this. I think it would still require maintaining and updating an equivalent image to hold libvixDiskLib.so.6, but it would let me get rid of this test plugin check. |
The VDDK library itself accepts infrequently-used arguments in a configuration file, and some of these arguments have been tested to show a significant transfer speedup in some environments. This adds an annotation that references a ConfigMap holding the contents of this VDDK configuration file, and mounts it to the importer pod. The first file in the mounted directory is passed to the VDDK. Signed-off-by: Matthew Arnold <[email protected]>
Signed-off-by: Matthew Arnold <[email protected]>
Signed-off-by: Matthew Arnold <[email protected]>
Signed-off-by: Matthew Arnold <[email protected]>
Signed-off-by: Matthew Arnold <[email protected]>
Instead of listing the mounted VDDK arguments directory and filtering out hidden files, just hard-code the expected file name and ConfigMap key. Signed-off-by: Matthew Arnold <[email protected]>
Put this in import_test and assert the values there, instead of in the VDDK test plugin. The VDDK plugin logs the given values, and then the test scans the log for what it expects to see. Signed-off-by: Matthew Arnold <[email protected]>
Signed-off-by: Matthew Arnold <[email protected]>
/test pull-cdi-unit-test |
/test pull-cdi-goveralls |
tests/import_test.go
Outdated
@@ -189,6 +189,63 @@ var _ = Describe("[rfe_id:1115][crit:high][vendor:[email protected]][level:compo | |||
Expect(importer.DeletionTimestamp).To(BeNil()) | |||
} | |||
}) | |||
|
|||
It("[test_id:6689]succeed importing VDDK data volume with extra arguments ConfigMap set", Label("VDDK"), func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind putting this under a separate describe so it doesn't inherit the "Serial" part?
(I think this test can run in parallel to others just fine)
You can also change [test_id:6689]
to [test_id:XXXX]
; I don't think it works well if this entry is missing from the ID database
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done.
tests/framework/vddk.go
Outdated
@@ -12,6 +12,38 @@ import ( | |||
"kubevirt.io/containerized-data-importer/tests/utils" | |||
) | |||
|
|||
// CreateVddkDataVolume returns a VDDK data volume | |||
func (f *Framework) CreateVddkDataVolume(dataVolumeName, size, url string) *cdiv1.DataVolume { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come something like this didn't already exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does, but only in datavolume_test. I moved this test to import_test for some reason, I guess because I couldn't shoehorn the log scanning nicely into the big testDataVolume table. I moved it back to get rid of this change, it seems like it doesn't really belong in import_test any more than datavolume_test.
Signed-off-by: Matthew Arnold <[email protected]>
tests/datavolume_test.go
Outdated
Expect(err).ToNot(HaveOccurred()) | ||
for _, option := range vddkConfigOptions { | ||
By(fmt.Sprintf("Check for configuration value %s in nbdkit logs", option)) | ||
Expect(strings.Contains(logs, option)).To(BeTrue()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use the pod logs API instead of kubectl, check this out: #3184
And if you use .Should(ContainSubstring
I am pretty sure it'll print the logs so we can properly understand why CI fails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, the end of the log gets truncated though. I am going to try showing the whole log temporarily and see if it offers any insights.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, it passes locally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it passes locally. It also correctly fails if I add another ContainSubstring line, and I can see the target strings in the log output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, the config argument is not getting passed to nbdkit:
I0109 19:07:46.993477 1 nbdkit.go:311] Start nbdkit with: ['--foreground' '--readonly' '--exit-with-parent' '-U' '/tmp/nbd.sock' '--pidfile' '/tmp/nbd.pid' '--filter=retry' '--filter=cacheextents' '/opt/testing/libvddk-test-plugin.so' 'libdir=/opt/vmware-vix-disklib-distrib' 'server=vcenter.cdi-ukbxj:8989' 'user=user' 'password=+/tmp/password2397077821' 'thumbprint=testprint' 'vm=moref=vm-21' '--verbose' '-D' 'nbdkit.backend.datapath=0' '-D' 'vddk.datapath=0' '-D' 'vddk.stats=1' 'file=[teststore] testvm/testdisk.vmdk']
I must have a mistake further up somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I didn't get the annotations copied over correctly through the import populator stuff. The importer prime doesn't get the annotations, so I guess the pod doesn't get the ConfigMap mounted. I can reproduce this locally now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, yeah usually the hot path is CSI storage (and thus populators). I assume you were not setting any KUBEVIRT_STORAGE
env var, and thus falling back to legacy population with the local
storage class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was it. The tests are looking better now.
Signed-off-by: Matthew Arnold <[email protected]>
Signed-off-by: Matthew Arnold <[email protected]>
This reverts commit 488849f. Signed-off-by: Matthew Arnold <[email protected]>
Also add a related unit test. Signed-off-by: Matthew Arnold <[email protected]>
Signed-off-by: Matthew Arnold <[email protected]>
/test pull-containerized-data-importer-e2e-ceph |
What this PR does / why we need it:
This pull request adds a new annotation "cdi.kubevirt.io/storage.pod.vddk.extraargs", referencing a ConfigMap that contains extra parameters to pass directly to the VDDK library. The use case is to allow tuning of asynchronous buffer counts for MTV as requested in CNV-52722. Testing has shown good results for cold migrations with:
These parameters are stored in a file whose path is passed to the VDDK via the nbdkit "config=" option. The file contents come from the referenced ConfigMap, and the ConfigMap is mounted to the importer pod as a volume.
Which issue(s) this PR fixes:
Fixes CNV-52722
Special notes for your reviewer:
As far as I can tell, a ConfigMap volume mount must be in the same namespace as the importer pod. So MTV will need to create or duplicate the ConfigMap to the same namespace as the DataVolume it creates.
Release note: