Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use local copy of JobStatus by mpi-operator #514

Merged
merged 2 commits into from
Feb 3, 2023

Conversation

tenzen-y
Copy link
Member

@tenzen-y tenzen-y commented Feb 2, 2023

Signed-off-by: Yuki Iwai [email protected]

I copied the JobStatus api to this repository.

Currently, we can not use auto-generated CRDs in our unit and E2E tests since the common.JobStatus requires JobStatus.Condition and JobStatus.ReplicaStatuses by default in the following:

type JobStatus struct {
	// Conditions is an array of current observed job conditions.
	Conditions []JobCondition `json:"conditions"`

	// ReplicaStatuses is map of ReplicaType and ReplicaStatus,
	// specifies the status of each replica.
	ReplicaStatuses map[ReplicaType]*ReplicaStatus `json:"replicaStatuses"`
...

https://github.com/kubeflow/common/blob/9ec55d141f90faaf52fd6df271e987e5a6781945/pkg/apis/common/v1/types.go#L25-L31

So, if we use auto-generated CRDs, we face the following errors in tests:

--- FAIL: TestMPIJobSuccess (2.13s)
mpi_job_controller_test.go:49: Using namespace test-gbp22
mpi_job_controller_test.go:94: Failed sending job to apiserver: MPIJob.kubeflow.org "job" is invalid: [status.conditions: Required value, status.replicaStatuses: Required value]

--- FAIL: TestMPIJobFailure (0.01s)
mpi_job_controller_test.go:175: Using namespace test-9fpm8
mpi_job_controller_test.go:221: Failed sending job to apiserver: MPIJob.kubeflow.org "job" is invalid: [status.conditions: Required value, status.replicaStatuses: Required value]

https://github.com/kubeflow/mpi-operator/actions/runs/4071386113/jobs/7013139204#step:8:39

Blocking: #510

@google-oss-prow google-oss-prow bot requested review from gaocegege and zw0610 February 2, 2023 10:42
@tenzen-y tenzen-y changed the title Use local copy of JobStatus by mpi-operator [WIP] Use local copy of JobStatus by mpi-operator Feb 2, 2023
@tenzen-y tenzen-y mentioned this pull request Feb 2, 2023
@alculquicondor
Copy link
Collaborator

What is missing in this PR?

@tenzen-y tenzen-y changed the title [WIP] Use local copy of JobStatus by mpi-operator Use local copy of JobStatus by mpi-operator Feb 2, 2023
@tenzen-y
Copy link
Member Author

tenzen-y commented Feb 2, 2023

What is missing in this PR?

@alculquicondor There are no missing. PTAL.

// ReplicaStatuses is map of ReplicaType and ReplicaStatus,
// specifies the status of each replica.
// +options
ReplicaStatuses map[MPIReplicaType]*ReplicaStatus `json:"replicaStatuses,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another violation of API conventions T_T
But we can only change it in a new API version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right :(
For v2beta1, we need to keep using this API for compatibility.

type JobCondition struct {
// Type of job condition.
// +options
Type JobConditionType `json:"type,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be optional. Also, in general we don't restrict this to enums for Job or Pod, with the assumption that external controllers can add their own.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be optional.

Makes sense.

Also, in general we don't restrict this to enums for Job or Pod, with the assumption that external controllers can add their own.

@alculquicondor Does that mean we want to change the type of Type to string?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be a type, but we shouldn't add an enum annotation (which you don't currently have)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see.
SGTM.

Signed-off-by: Yuki Iwai <[email protected]>
Copy link
Collaborator

@alculquicondor alculquicondor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@alculquicondor
Copy link
Collaborator

Let's see which one merges first 😂 #511

/assign @terrytangyuan

@tenzen-y
Copy link
Member Author

tenzen-y commented Feb 2, 2023

Let's see which one merges first 😂 #511

/assign @terrytangyuan

Haha. I trust @terrytangyuan :)

Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

/lgtm

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, terrytangyuan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 4c8b4fc into kubeflow:master Feb 3, 2023
@tenzen-y tenzen-y deleted the copy-job-conditions branch February 3, 2023 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants