-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(backend): Upgrade Argo server image version to v2.11.6 #4693
feat(backend): Upgrade Argo server image version to v2.11.6 #4693
Conversation
Hi @xinbinhuang. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
3b702c4
to
94b2947
Compare
@@ -15,7 +15,7 @@ COPY backend/requirements.txt . | |||
RUN python3 -m pip install -r requirements.txt | |||
|
|||
# Downloading Argo CLI so that the samples are validated | |||
#ADD https://github.com/argoproj/argo/releases/download/v2.7.5/argo-linux-amd64 /usr/local/bin/argo | |||
#ADD https://github.com/argoproj/argo/releases/download/v2.11.6/argo-linux-amd64 /usr/local/bin/argo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replace all v2.7.5
-> v2.11.6
in the repo, but I am not sure if I need to update this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should leave this one as well since it is the Kubeflow backend which will be running 2.7.7. Argo is in this case used to validate the workflows to check if they are valid according to argo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we have several places that use Argo but they can be somehow independent of each other. Can you tell me how each of them is used so that I can get a clearer picture of how they work together?
@@ -25,7 +25,7 @@ ACCOUNT=$(gcloud info --format='value(config.account)') | |||
kubectl create clusterrolebinding PROW_BINDING --clusterrole=cluster-admin --user=$ACCOUNT --dry-run -o yaml | kubectl apply -f - | |||
kubectl create clusterrolebinding DEFAULT_BINDING --clusterrole=cluster-admin --serviceaccount=default:default --dry-run -o yaml | kubectl apply -f - | |||
|
|||
ARGO_VERSION=v2.7.5 | |||
ARGO_VERSION=v2.11.6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I need to update this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will be needed for all the test infrastructure to run with argo 2.11.6
@@ -27,7 +27,8 @@ spec: | |||
- --configmap | |||
- workflow-controller-configmap | |||
- --executor-image | |||
- gcr.io/ml-pipeline/argoexec:v2.7.5-license-compliance | |||
- gcr.io/ml-pipeline/argoexec:v2.11.6-license-compliance | |||
- --namespaced |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the extra argument that I need to specifically add for my deployment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a corresponding documentation? Does omitting the argument make it cluster mode? We need to make sure it is customizable from an overlay to continue support multi-user mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cluster vs namespace install seems to start since v2.5 - this can also be found in the v2.5 release branch.
In my case, it did become a cluster install if I simply replace the image version to argoproj/argoexec:v2.11.6
- I got permission issues when trying to launch new pipelines.
I am not familiar with the multi-user
mode. Does it install the workflow-controller in cluster scope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error log from when trying to launch pipelines without --namespaced
E1107 01:14:06.943364 1 reflector.go:153] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:105: Failed to list *unstructured.Unstructured: workflowtemplates.argoproj.io is forbidden: User "system:serviceaccount:kubeflow:argo" cannot list resource "workflowtemplates" in API group "argoproj.io" at the cluster scope
#RUN ARGO_VERSION=v2.11.6 && curl -sSL -o /usr/local/bin/argo \ | ||
RUN ARGO_VERSION=v2.4.3 && curl -sSL -o /usr/local/bin/argo \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this one either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used to kick of the ML pipeline sample e2e test, since the changes you do is only for the server side I think we should not update this to 2.11.6. But I think we should go with 2.7.7 since that is the version that is used by Kubeflow today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a bug from argo that prevented using the argo cli from a container, that's why we were stuck on 2.4.3. I'm not sure about latest status, if the latest version works, we can upgrade too.
/cc @Ark-kun can you provide context about that issue?
BTW, I was following the instructions from here: https://github.com/kubeflow/pipelines/blob/master/third_party/README.md I skipped step 6. |
I am also using 2.11.6 on my cluster. You should remove the executor image from the configmap. As far as I know it is deprecated and should only be in the workflow-controller deployment |
I realised that your changes are only for the argo server and not for Kubeflow it self, thus this PR will not completely solve: #4553 as I understand it. I am also not sure we we should update the argo server on its own since this will result in that Kubeflow pipelines use 2.7 and the argo server uses 2.11 of argo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd also need to update licenses, I created an issue to track updating documentation for these instructions: #4715.
Please keep your eyes on it.
@@ -27,7 +27,8 @@ spec: | |||
- --configmap | |||
- workflow-controller-configmap | |||
- --executor-image | |||
- gcr.io/ml-pipeline/argoexec:v2.7.5-license-compliance | |||
- gcr.io/ml-pipeline/argoexec:v2.11.6-license-compliance | |||
- --namespaced |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a corresponding documentation? Does omitting the argument make it cluster mode? We need to make sure it is customizable from an overlay to continue support multi-user mode.
#RUN ARGO_VERSION=v2.11.6 && curl -sSL -o /usr/local/bin/argo \ | ||
RUN ARGO_VERSION=v2.4.3 && curl -sSL -o /usr/local/bin/argo \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a bug from argo that prevented using the argo cli from a container, that's why we were stuck on 2.4.3. I'm not sure about latest status, if the latest version works, we can upgrade too.
/cc @Ark-kun can you provide context about that issue?
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: xinbinhuang The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
You are correct - this PR only upgrades argo server only. I can also bake in the changes for upgrading Kubeflow, but I wonder if we should separate it into a different PR? or should they go into the same one? @Bobgy @NikeNano |
@xinbinhuang I think we can go with separate, previously we've upgraded both at different pace because they had different complexity. |
/ok-to-test |
/test kubeflow-pipeline-sample-test |
/retest |
@xinbinhuang: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Hi @xinbinhuang, you'll need to update https://github.com/kubeflow/pipelines/tree/master/third_party/argo with instructions that should be documented in #4715. I'd suggest creating a separate PR for it. I'll publish images like gcr.io/ml-pipeline/workflow-controller:v2.11.6-license-compliance once your PR is merged. |
This comment has been minimized.
This comment has been minimized.
@NikeNano @Bobgy @capri-xiyue I noticed some conversation on a more robust upgrading strategy to Argo3 #4999. I didn't quite have much time to finish this PR in the last year ending. I wonder if I should just close this PR so that whoever are upgrading it can start from scratch? |
You might want to go to v2.12 which is LTS. |
@alexec will we have the kubernetes client update backported to argo 2.12? It was one of the big reasons we wanted to upgrade. |
To @xinbinhuang, agree we can close this one if you do not have bandwidth to finish it. /close |
@Bobgy: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Description of your changes:
Part of #4553
Checklist:
The title for your pull request (PR) should follow our title convention. Learn more about the pull request title convention used in this repository.
PR titles examples:
fix(frontend): fixes empty page. Fixes #1234
Use
fix
to indicate that this PR fixes a bug.feat(backend): configurable service account. Fixes #1234, fixes #1235
Use
feat
to indicate that this PR adds a new feature.chore: set up changelog generation tools
Use
chore
to indicate that this PR makes some changes that users don't need to know.test: fix CI failure. Part of #1234
Use
part of
to indicate that a PR is working on an issue, but shouldn't close the issue when merged.Do you want this pull request (PR) cherry-picked into the current release branch?
If yes, use one of the following options:
cherrypick-approved
label to this PR. The release manager adds this PR to the release branch in a batch update.