Upgrading to any version beyond 1.12 we are getting error of expired token for backing up data using datamover after 1 hr with IRSA #8173
Comments
Looks like the token to access object store has expired. |
Does it expire every hour? Data uploads that take less than an hour run and complete; the ones that take longer get cancelled. In the node-agent logs we see this error at that time. |
The expiration time of the token is not set by Velero, so you need to check how the token was created. |
but we were not getting this issue in 1.12 |
We use IRSA and I see iam token valid for 24h.
Looks like this commit seems to be relevant: https://github.com/vmware-tanzu/velero/pull/7374/files ?? |
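One way to check which lifetime actually applies: with IRSA the pod holds a projected web identity token (a JWT), and the temporary credentials exchanged for it have their own, separate lifetime (1 hour by default, as noted later in this thread). A minimal sketch, assuming the token is a standard three-segment JWT; the helper names `jwt_expiry` and `seconds_remaining` are hypothetical and the signature is not verified:

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> int:
    """Return the `exp` claim (Unix seconds) of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(payload["exp"])


def seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Seconds until the token expires (negative if it already has)."""
    current = time.time() if now is None else now
    return jwt_expiry(token) - current
```

Inside the pod, the token would typically be read from the file named by the `AWS_WEB_IDENTITY_TOKEN_FILE` environment variable (an assumption about the environment, not something shown in this thread). A 24h result here would not rule out the session credentials themselves expiring after one hour.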
Why is that commit related? Have you specified BSL->credentialFile? |
Oops, sorry. No, we don't use credentialFile. Or maybe the Kopia version changed with the Velero upgrade? |
Neither Velero nor Kopia could change the token being used; I guess another token might be specified. We also have test cases for IRSA, but we haven't seen this problem there. |
issue happens with velero 1.13.2 with datamover. |
@Lyndon-Li this was working fine up to 1.12 and started happening after upgrading to 1.13, and also with 1.14. Do we know what has changed since 1.12? This is currently blocking us from upgrading to 1.14. |
As mentioned above I'm getting the same error for restores which are longer than 1h. The restore will fail based on the I'm using below images with IAM role and IRSA:
On the restore-wait init container this message shows up in a loop
In the node-agent this message shows up
The restore worked without any issues when downgraded to below versions
Hope this will help a bit. |
Thanks @catalinpan . Is there any workaround to make this work in 1.14? |
we use velero 1.14.1 |
This may be the expected behavior for now, multiple DUs may be created at the same time but processed one by one. If the 1st DU takes more than 1 hour, the second one's token will timeout. |
Those are two different backups. The first finishes without issue. The second starts, but its DU was created earlier than the last run, so it gets the old key, which expires soon, and the DU gets cancelled. If Velero tried to get a new key before exiting with an error, this problem would not come up. Tuning the key duration only hides the issue: once a DU needs more time than I set, I have to set a higher duration. |
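The "get a new key before failing" idea can be sketched as a small cache that refetches credentials shortly before they expire. This is only an illustration of the technique, not how Velero or Kopia handle credentials; `Credentials`, `RefreshingCredentials`, and the `fetch` callback are hypothetical names:

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Credentials:
    token: str
    expires_at: float  # Unix seconds


class RefreshingCredentials:
    """Hand out cached credentials, refetching shortly before they expire."""

    def __init__(
        self,
        fetch: Callable[[], Credentials],
        margin: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch      # hypothetical: obtains fresh temporary credentials
        self._margin = margin    # refresh when this many seconds or fewer remain
        self._clock = clock      # injectable so the logic can be tested
        self._current: Optional[Credentials] = None

    def get(self) -> Credentials:
        cur = self._current
        if cur is None or cur.expires_at - self._clock() <= self._margin:
            self._current = self._fetch()
        return self._current
```

A long-running upload would call `get()` before each request instead of capturing the credentials once at start, so a DU that waits in the queue or runs past the first hour would pick up a fresh key.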
The default duration for an IAM role is 1 hour. We use that one. |
Is increasing the default duration helping in this case? @SCLogo |
In our case, yes and no. Yes, because we saw fewer issues related to expired keys, but not in all cases. The best solution would be for Kopia or Velero to try to get a new temporary key before exiting with an expiration error.
|
Hi @SCLogo / @Lyndon-Li, can you help me override DurationSeconds when Velero performs assumeRole? I am using IRSA. Updating maxSessionDuration on the role does not help, because the default duration when assuming a role is 1 hour. According to aws/aws-cli#9021, there is currently no environment variable for that. |
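For reference, the relationship between the two settings can be sketched at the STS level. `DurationSeconds` is a parameter of the `AssumeRoleWithWebIdentity` request and must fall between 900 seconds and the role's maximum session duration (itself capped at 12 hours); raising the role's maximum does nothing unless the caller also asks for a longer session. The helper below only builds and validates the request parameters; its name and the `role_max_session` argument are hypothetical, and it does not show any Velero setting, since none is identified in this thread:

```python
MIN_DURATION = 900     # STS lower bound for DurationSeconds
MAX_DURATION = 43200   # STS upper bound (12 hours); a role's maximum may be lower


def web_identity_params(role_arn, token, session_name, duration_seconds,
                        role_max_session=3600):
    """Build AssumeRoleWithWebIdentity parameters with an explicit duration.

    role_max_session is the role's configured maximum session duration,
    supplied by the caller (assumption: it is known out of band).
    """
    upper = min(role_max_session, MAX_DURATION)
    if not MIN_DURATION <= duration_seconds <= upper:
        raise ValueError(
            "DurationSeconds must be between %d and %d" % (MIN_DURATION, upper)
        )
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": token,
        "DurationSeconds": duration_seconds,
    }
```

With boto3 installed, the resulting dict could be passed as `boto3.client("sts").assume_role_with_web_identity(**params)`. When the SDK's default credential chain performs the exchange on its own, the request is made without this parameter being set by the user, which matches the 1-hour behaviour described above.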
@dharanui, I am using kube2iam. The default max duration is 1 hour, but kube2iam requests 30-minute temporary role sessions. |
Hi @Lyndon-Li / @SCLogo, any idea when this will be fixed so that we can make it work with IRSA? |
velero version: 1.14.1
error: async write error: "unable to write content chunk 96 of FILE:000002: mutable parameters: unable to read format blob: error getting kopia.repository blob: The provided token has expired: mutable parameters: unable to read format blob: error getting kopia.repository blob: The provided token has expired"
The datauploads are failing after almost one hour of running.
Also tried increasing the repo maintenance frequency, but no luck.