[Bug] [Operator] [Controller] [Service] | Ports without name are ignored #463

Closed
2 tasks done
ulfox opened this issue Aug 11, 2022 · 3 comments · Fixed by #891
Assignees
Labels
bug Something isn't working good first issue Good for newcomers P2 Important issue, but not time critical

Comments

@ulfox
Contributor

ulfox commented Aug 11, 2022

Search before asking

  • I searched the issues and found no similar issues.

KubeRay Component

ray-operator

What happened + What you expected to happen

Issue

If we create a new rayclusters.ray.io resource and set the following ports on the head node:

  ports:
  - containerPort: 10001
    protocol: TCP
  - containerPort: 8265
    protocol: TCP
  - containerPort: 8000
    protocol: TCP
  - containerPort: 6379
    protocol: TCP
  - containerPort: 9001
    protocol: TCP

The Ray operator will create the deployment, but it will not include the above ports in the head node's Kubernetes Service. As a result, workers cannot communicate with the head node and time out trying to connect to a port that the Service does not expose.

If, however, we give the ports names, the Service is updated with the right ports.
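
For comparison, here is a minimal sketch of the same port list with names added. The names are illustrative assumptions, not taken from the original configuration. Any valid Kubernetes port name should work, as long as it uses only lowercase alphanumerics and hyphens, is at most 15 characters long, and is unique within the pod.

```yaml
  ports:
  # Port names below are hypothetical placeholders chosen for this sketch.
  - containerPort: 10001
    name: client
    protocol: TCP
  - containerPort: 8265
    name: dashboard
    protocol: TCP
  - containerPort: 8000
    name: serve
    protocol: TCP
  - containerPort: 6379
    name: gcs-server
    protocol: TCP
  - containerPort: 9001
    name: port-9001
    protocol: TCP
```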

What should happen

The CRD should validate that the container ports have a valid name, or the operator should reject the custom resource and throw an error instead of spawning the pods.

Reproduction script

Create or edit a rayclusters.ray.io resource and add some ports without a name to the head node, e.g.

  ports:
  - containerPort: 10001
    protocol: TCP
  - containerPort: 8265
    protocol: TCP
  - containerPort: 8000
    protocol: TCP
  - containerPort: 6379
    protocol: TCP
  - containerPort: 9001
    protocol: TCP

The cluster will appear to be OK, but in practice the workers start and then terminate after timing out.

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
@ulfox ulfox added the bug Something isn't working label Aug 11, 2022
@Jeffwan
Collaborator

Jeffwan commented Aug 11, 2022

Thanks for reporting the issue. I think we originally did reference ports by name. This should definitely be improved.

@Jeffwan Jeffwan added this to the v0.4.0 release milestone Aug 11, 2022
@DmitriGekhtman DmitriGekhtman added P2 Important issue, but not time critical good first issue Good for newcomers labels Nov 4, 2022
@kevin85421
Member

@Yicheng-Lu-llll Are you interested in this issue?

@Yicheng-Lu-llll
Contributor

Sure.
