TAS: API to support rank-based ordering for custom CRDs #3663

Closed
3 tasks
Tracked by #3599
mimowo opened this issue Nov 27, 2024 · 3 comments · Fixed by #3704

mimowo commented Nov 27, 2024

What would you like to be added:

An API that lets custom CRD jobs declare their own pod index labels, so that in-house Jobs have no
incentive to reuse labels reserved for Kubernetes.

Why is this needed:

Completion requirements:

  • KEP update
  • API change
  • Docs update
mimowo added the kind/feature label (categorizes issue or PR as related to a new feature) on Nov 27, 2024
mimowo commented Nov 27, 2024

The proposal is to extend the workload PodSetTopologyRequest API with the following fields:

// PodIndexLabel indicates the name of the label indexing the pods.
// For example, in the context of
// - Kubernetes Job this is: kubernetes.io/job-completion-index
// - JobSet: kubernetes.io/job-completion-index (inherited from Job)
// - Kubeflow: training.kubeflow.org/replica-index
PodIndexLabel *string

// SubGroupIndexLabel indicates the name of the label indexing the instances of replicated Jobs (groups)
// within a PodSet. For example, in the context of JobSet this is jobset.sigs.k8s.io/job-index.
SubGroupIndexLabel *string

// SubGroupCount indicates the count of replicated Jobs (groups) within a PodSet.
// For example, in the context of JobSet this value is read from jobset.sigs.k8s.io/replicatedjob-replicas.
SubGroupCount *int32

The values could then be set when implementing the PodSets() function of the GenericJob interface, via the
PodSetTopologyRequest helper function, like here.
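
As a rough illustration of how an integration might fill in these fields, here is a minimal sketch. It does not use the real Kueue types: the PodSetTopologyRequest struct below is a local stand-in that contains only the three proposed fields, and ptr, jobSetTopologyRequest, and kubeflowTopologyRequest are hypothetical helpers invented for this example. The label names are copied from the field comments above.

```go
package main

import "fmt"

// PodSetTopologyRequest is a local stand-in holding only the three proposed
// fields. It is NOT the real Kueue type.
type PodSetTopologyRequest struct {
	PodIndexLabel      *string
	SubGroupIndexLabel *string
	SubGroupCount      *int32
}

// ptr is a hypothetical helper returning a pointer to any value.
func ptr[T any](v T) *T { return &v }

// jobSetTopologyRequest sketches what a JobSet integration could return for
// a PodSet that maps to a ReplicatedJob with the given number of replicas.
// Label names are taken from the proposal text.
func jobSetTopologyRequest(replicas int32) *PodSetTopologyRequest {
	return &PodSetTopologyRequest{
		PodIndexLabel:      ptr("kubernetes.io/job-completion-index"),
		SubGroupIndexLabel: ptr("jobset.sigs.k8s.io/job-index"),
		SubGroupCount:      ptr(replicas),
	}
}

// kubeflowTopologyRequest sketches the Kubeflow case, where pods are indexed
// by a single label and there are no sub-groups.
func kubeflowTopologyRequest() *PodSetTopologyRequest {
	return &PodSetTopologyRequest{
		PodIndexLabel: ptr("training.kubeflow.org/replica-index"),
	}
}

func main() {
	js := jobSetTopologyRequest(3)
	fmt.Println(*js.PodIndexLabel, *js.SubGroupIndexLabel, *js.SubGroupCount)

	kf := kubeflowTopologyRequest()
	fmt.Println(*kf.PodIndexLabel, kf.SubGroupIndexLabel == nil)
}
```

A custom CRD integration would follow the same pattern with its own label names, leaving the sub-group fields nil when its pods are not split into replicated groups.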

The TopologyUngater could then read these values from the API instead of relying on the lookups.
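
To make the rank-based ordering concrete, here is a sketch of how a pod's rank might be derived from these fields. The formula is an assumption made for illustration, not something stated in this issue: it supposes that every sub-group holds the same number of pods and that ranks are laid out sub-group by sub-group. The names rankConfig, readIndex, and podRank are hypothetical and do not come from the Kueue code base.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// rankConfig mirrors the proposed fields (hypothetical type for this sketch).
type rankConfig struct {
	PodIndexLabel      string
	SubGroupIndexLabel *string
	SubGroupCount      *int32
}

// readIndex parses a non-negative integer index from the named label.
func readIndex(labels map[string]string, name string) (int, error) {
	raw, ok := labels[name]
	if !ok {
		return 0, fmt.Errorf("label %q not found", name)
	}
	idx, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("label %q: %w", name, err)
	}
	if idx < 0 {
		return 0, fmt.Errorf("label %q: negative index %d", name, idx)
	}
	return idx, nil
}

// podRank returns the rank of a pod within a PodSet of podSetSize pods.
// Assumption: all sub-groups are equally sized, so
// rank = subGroupIndex * (podSetSize / subGroupCount) + podIndex.
func podRank(labels map[string]string, cfg rankConfig, podSetSize int32) (int, error) {
	podIndex, err := readIndex(labels, cfg.PodIndexLabel)
	if err != nil {
		return 0, err
	}
	// Without sub-groups the pod index is already the rank.
	if cfg.SubGroupIndexLabel == nil || cfg.SubGroupCount == nil {
		return podIndex, nil
	}
	count := *cfg.SubGroupCount
	if count <= 0 || podSetSize%count != 0 {
		return 0, errors.New("pod set size is not divisible by sub-group count")
	}
	groupIndex, err := readIndex(labels, *cfg.SubGroupIndexLabel)
	if err != nil {
		return 0, err
	}
	podsPerGroup := int(podSetSize / count)
	if podIndex >= podsPerGroup || groupIndex >= int(count) {
		return 0, errors.New("index out of range")
	}
	return groupIndex*podsPerGroup + podIndex, nil
}

func main() {
	subGroupLabel := "jobset.sigs.k8s.io/job-index"
	subGroupCount := int32(2)
	cfg := rankConfig{
		PodIndexLabel:      "kubernetes.io/job-completion-index",
		SubGroupIndexLabel: &subGroupLabel,
		SubGroupCount:      &subGroupCount,
	}
	labels := map[string]string{
		"jobset.sigs.k8s.io/job-index":       "1",
		"kubernetes.io/job-completion-index": "2",
	}
	rank, err := podRank(labels, cfg, 6)
	fmt.Println(rank, err)
}
```

Because the label names arrive through the config rather than being hard-coded per framework, the same function would work for any CRD that reports them.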

mimowo commented Nov 27, 2024

PBundyra commented Dec 2, 2024

/assign
