
[Feature] Generic data abstraction on top of CRD #53

Closed
2 tasks done
Jeffwan opened this issue Oct 3, 2021 · 7 comments
Assignees
Labels
apiserver enhancement New feature or request

Comments

@Jeffwan
Collaborator

Jeffwan commented Oct 3, 2021

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

In our system, not everyone uses kubectl to operate clusters directly. There are a few major reasons.

  1. The current Ray operator is friendly to users who are already familiar with the Kubernetes operator pattern. For most data scientists, this approach actually steepens the learning curve.

  2. Using kubectl requires a sophisticated permission system. Some Kubernetes clusters don't enable user-level authentication; in my company, we use loose RBAC management and our corporate SSO system is not integrated with Kubernetes OIDC at all.

Due to the above reasons, I think it is worth building a generic abstraction on top of the RayCluster CRD. With core API support, we can easily build backend services, a CLI, etc., to bridge users. Underneath, it still uses Kubernetes to manage the real data.

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

/cc @chenk008 @akanso @chaomengyuan

@Jeffwan
Collaborator Author

Jeffwan commented Oct 3, 2021

API Definition

In order to better manage resources at the API level, a few proto files will be defined to describe resources. Technically, we could reuse the Kubernetes resources directly. However, the RayCluster CRD is probably not the best data structure to describe a cluster. At the same time, we want to leave some flexibility to use a database to store historical data in the near future (for pagination, for example).

message Cluster {
  // Output. Unique Cluster ID. Generated by API server.
  string id = 1;

  // Required input field. Unique Cluster name provided by user.
  string name = 2;

  // Required input field. Cluster's namespace provided by user
  string namespace = 3;

  // Required field. This field indicates the user who owns the cluster.
  string user = 4;

  // Optional input field. Ray cluster version
  string version = 5;

  // Optional field.
  enum Environment {
    DEV = 0;
    TESTING = 1;
    STAGING = 2;
    PRODUCTION = 3;
  }
  Environment environment = 6;

  // Required field. This field will be used to retrieve the right Ray container
  string cluster_runtime = 7;
  
  // Required field. This field references the ComputeRuntime that defines the head and worker group specs
  string compute_runtime = 8;

  // Output. The time that the cluster was created.
  google.protobuf.Timestamp created_at = 9;

  // Output. The time that the cluster was deleted.
  google.protobuf.Timestamp deleted_at = 10;
}

ComputeRuntime is equivalent to our head and worker pod template specs. Currently we only define some basic information; richer features like node affinity, tolerations, etc. have not been included yet.

message ComputeRuntime {
  string id = 1;
  string name = 2;
  enum Cloud {
      ALIBABA = 0;
      AWS = 1;
      AZURE = 2;
      GCP = 3;
      ON_PREM = 4;
  }
  Cloud cloud = 3;
  string region = 4;
  string availability_zone = 5;
  HeadGroupSpec head_group_spec = 6;
  repeated WorkerGroupSpec worker_group_spec = 7;
}

message HeadGroupSpec {
  // Optional
  Resource resource = 1;
  // Optional
  map<string, string> ray_start_params = 2;
  // Optional
  string service_type = 3;
  // Output: internal/external service endpoint
  string service_address = 4;
}

message WorkerGroupSpec {
  // Optional input field.
  string group_name = 1;
  // Required input field
  int32 replicas = 2;
  // Optional
  int32 min_replicas = 3;
  // Optional
  int32 max_replicas = 4;
  // Optional
  Resource resource = 5;
  // Optional
  map<string, string> ray_start_params = 6;
}

ClusterRuntime is used to build the node image. This is inspired by Anyscale. It is optional for some clusters; people can use a base image plus a job-level runtime as well.

message ClusterRuntime {
  string id = 1;
  string name = 2;
  string base_image = 3;
  repeated string pip_packages = 4;
  repeated string system_packages = 5;
  map<string, string> environment_variables = 6;
  string custom_commands = 7;
  // Output
  string image = 8;
}

Tech stack

The .proto files define the core API and the gRPC and gateway services. A go_client and swagger files can be generated easily for further usage.

[architecture diagram]
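
To make the stack concrete, here is a minimal sketch (in Go) of how the gRPC server and the grpc-gateway HTTP proxy could be wired together. The generated package path and the ClusterService identifiers (RegisterClusterServiceServer, UnimplementedClusterServiceServer, RegisterClusterServiceHandlerFromEndpoint) are assumptions about what protoc would emit for a hypothetical ClusterService; the real names depend on the final .proto definitions.

package main

import (
    "context"
    "log"
    "net"
    "net/http"

    "github.com/grpc-ecosystem/grpc-gateway/v2/runtime"
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"

    pb "example.com/kuberay/proto/go_client" // hypothetical generated package
)

func main() {
    ctx := context.Background()

    // 1. Serve the core gRPC API on :8887.
    lis, err := net.Listen("tcp", ":8887")
    if err != nil {
        log.Fatal(err)
    }
    grpcServer := grpc.NewServer()
    // Placeholder implementation; a real server would back this with the
    // Kubernetes client that creates/updates RayCluster CRs.
    pb.RegisterClusterServiceServer(grpcServer, &pb.UnimplementedClusterServiceServer{})
    go func() {
        if err := grpcServer.Serve(lis); err != nil {
            log.Fatal(err)
        }
    }()

    // 2. Expose the same API as REST/JSON on :8888 through grpc-gateway.
    mux := runtime.NewServeMux()
    opts := []grpc.DialOption{grpc.WithTransportCredentials(insecure.NewCredentials())}
    if err := pb.RegisterClusterServiceHandlerFromEndpoint(ctx, mux, "localhost:8887", opts); err != nil {
        log.Fatal(err)
    }
    log.Fatal(http.ListenAndServe(":8888", mux))
}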

Jeffwan mentioned this issue Oct 3, 2021
@chaomengyuan
Contributor

Just a small comment: the names "ClusterRuntime" and "ComputeRuntime" are a little bit confusing. For me, the actual definition of "ComputeRuntime" is more like a "ClusterRuntime".

@Jeffwan
Collaborator Author

Jeffwan commented Oct 8, 2021

@chaomengyuan I think we can come up with other ideas for images and try not to confuse users.

@chenk008
Contributor

chenk008 commented Oct 12, 2021

Just a small comment: the names "ClusterRuntime" and "ComputeRuntime" are a little bit confusing. For me, the actual definition of "ComputeRuntime" is more like a "ClusterRuntime".

I think so, too. ComputeRuntime is a little confusing. ClusterRuntime is similar to the runtime env in Ray.

@Jeffwan
Collaborator Author

Jeffwan commented Oct 17, 2021

I made some changes to the API definition. /cc @chenk008 @chaomengyuan

  1. Removed the confusing ClusterRuntime; use a simple string image instead.
  • Provide a separate Image message to build images. In the future, once we have a workspace concept, we can reuse the image concept.
  2. Changed ComputeRuntime to ClusterSpec and created a reusable concept, ComputeTemplate, to describe resources (it doesn't carry any Ray information, so it's reusable).

Please take a look. I am also wondering whether we want to use references (like a foreign key) or embed objects here. Since we don't use a database, we need to translate each object into a ConfigMap and then link everything together at the cluster level; a rough sketch of that translation follows the proto definitions below.


message Cluster {
  // Output. Unique Cluster ID. Generated by API server.
  string id = 1;

  // Required input field. Unique Cluster name provided by user.
  string name = 2;

  // Required input field. Cluster's namespace provided by user
  string namespace = 3;

  // Required field. This field indicates the user who owns the cluster.
  string user = 4;

  // Optional input field. Ray cluster version
  string version = 5;

  // Optional field.
  enum Environment {
    DEV = 0;
    TESTING = 1;
    STAGING = 2;
    PRODUCTION = 3;
  }
  Environment environment = 6;

  // Required field. This field will be used to retrieve the right Ray container
  string image = 7;
  
  // Required field. This field describes the Ray cluster spec
  ClusterSpec cluster_spec = 8;

  // Output. The time that the cluster was created.
  google.protobuf.Timestamp created_at = 9;

  // Output. The time that the cluster was deleted.
  google.protobuf.Timestamp deleted_at = 10;
}

message ClusterSpec {
  // The ID of the compute template
  string id = 1;
  // The name of the compute template
  string name = 2;
  // The head group configuration
  HeadGroupSpec head_group_spec = 3;
  // The worker group configurations
  repeated WorkerGroupSpec worker_group_spec = 4;
}

message HeadGroupSpec {
  // Optional
  ComputeTemplate compute_template = 1;
  // Optional
  map<string, string> ray_start_params = 2;
  // Optional
  string service_type = 3;
  // Output: internal/external service endpoint
  string service_address = 4;
}

message WorkerGroupSpec {
  // Optional input field.
  string group_name = 1;
  // Required input field
  int32 replicas = 2;
  // Optional
  int32 min_replicas = 3;
  // Optional
  int32 max_replicas = 4;
  // Optional
  ComputeTemplate compute_template = 5;
  // Optional
  map<string, string> ray_start_params = 6;
}

message ComputeTemplate {
  // The ID of the compute template
  string id = 1;
  // The name of the compute template
  string name = 2;
  // Number of cpus
  uint32 cpu = 3;
  // Amount of memory
  uint32 memory = 4;
  // Number of gpus
  uint32 gpu = 5;
  // The detail gpu accelerator type
  string gpu_accelerator = 6;
}

message Image {
  string id = 1;
  string name = 2;
  string base_image = 3;
  repeated string pip_packages = 4;
  repeated string conda_packages = 5;
  repeated string system_packages = 6;
  map<string, string> environment_variables = 7;
  string custom_commands = 8;
  // Output
  string image = 9;
}
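
As noted above, with no database behind the apiserver, objects such as ComputeTemplate would be persisted as ConfigMaps and linked together at the cluster level. Below is a rough sketch of that idea, assuming client-go is used; the struct, the label key, and the data layout are illustrative, not the final implementation.

package store

import (
    "context"
    "strconv"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// ComputeTemplate mirrors the proto message above (simplified).
type ComputeTemplate struct {
    Name           string
    CPU            uint32
    Memory         uint32
    GPU            uint32
    GPUAccelerator string
}

// SaveComputeTemplate writes the template into a ConfigMap so that a Cluster
// can later reference it by name, similar to a foreign key.
func SaveComputeTemplate(ctx context.Context, client kubernetes.Interface, namespace string, t ComputeTemplate) error {
    cm := &corev1.ConfigMap{
        ObjectMeta: metav1.ObjectMeta{
            Name:      t.Name,
            Namespace: namespace,
            Labels:    map[string]string{"ray.io/config-type": "compute-template"}, // illustrative label key
        },
        Data: map[string]string{
            "cpu":             strconv.FormatUint(uint64(t.CPU), 10),
            "memory":          strconv.FormatUint(uint64(t.Memory), 10),
            "gpu":             strconv.FormatUint(uint64(t.GPU), 10),
            "gpu_accelerator": t.GPUAccelerator,
        },
    }
    _, err := client.CoreV1().ConfigMaps(namespace).Create(ctx, cm, metav1.CreateOptions{})
    return err
}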

Jeffwan self-assigned this Oct 17, 2021
@Jeffwan
Collaborator Author

Jeffwan commented Oct 27, 2021

Let's split this story into separate sub-issues:

  1. Core Cluster and Image messages
  2. gRPC and gRPC gateway services
  3. Scripts to generate Go clients and swagger files (see the client usage sketch below)
  4. Code generation
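
For item 3, here is a hedged usage sketch of the generated go_client from a CLI or backend service. The ClusterService, NewClusterServiceClient, CreateCluster, and CreateClusterRequest identifiers are assumptions about what the generated client might look like; only the Cluster fields come from the messages above.

package main

import (
    "context"
    "log"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"

    pb "example.com/kuberay/proto/go_client" // hypothetical generated package
)

func main() {
    conn, err := grpc.Dial("localhost:8887", grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // Create a cluster through the API server instead of kubectl.
    client := pb.NewClusterServiceClient(conn)
    _, err = client.CreateCluster(context.Background(), &pb.CreateClusterRequest{
        Cluster: &pb.Cluster{
            Name:      "demo",
            Namespace: "ray-system",
            User:      "jeffwan",
            Image:     "rayproject/ray:1.9.0", // illustrative image tag
        },
    })
    if err != nil {
        log.Fatal(err)
    }
}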

@DmitriGekhtman
Collaborator

@Jeffwan
The KubeRay API server has been implemented, so can we close this?
