Add end to end tests to apiserver (ray-project#1460)
z103cb authored Nov 15, 2023
1 parent 1c5f3e8 commit ff45923
Showing 23 changed files with 4,507 additions and 148 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test-job.yaml
@@ -144,7 +144,7 @@ jobs:
working-directory: ${{env.working-directory}}

- name: Test
run: go test ./...
run: go test ./pkg/... ./cmd/... -race -parallel 4
working-directory: ${{env.working-directory}}

- name: Set up Docker
3 changes: 3 additions & 0 deletions .gitignore
@@ -47,3 +47,6 @@

# Any file with a .backup extension
**/*.backup

# Any file with a .log extension
**/*.log
44 changes: 40 additions & 4 deletions apiserver/DEVELOPMENT.md
@@ -61,6 +61,35 @@ make build
make test
```

#### End to End Testing

Two `make` targets are provided to execute the end to end tests (integration between the Kuberay API server and the Kuberay operator):

* `make e2e-test` executes all the tests defined in the [test/e2e package](./test/e2e/). It uses the cluster defined in `~/.kube/config` to submit the workloads.
* `make local-e2e-test` creates a local kind cluster, builds the Kuberay operator and API server images from the current branch, and deploys both into the kind cluster. It shuts down the kind cluster after the end to end tests succeed. If the tests fail, the cluster is left running and has to be shut down manually by executing `make clean-cluster`.

The `e2e` test targets use two variables to control which Ray image and which API server endpoint the end to end tests use:

* `E2E_API_SERVER_RAY_IMAGE` -- for the ray docker image. Currently set to `rayproject/ray:2.7.0-py310`. On Apple silicon or arm64 development machines the `-aarch64` suffix is added to the image.
* `E2E_API_SERVER_URL` -- for the base URL of the deployed Kuberay API server. The default value is: `http://localhost:31888`
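
The architecture rule for `E2E_API_SERVER_RAY_IMAGE` can be sketched in shell as follows. This is only an illustration of the selection logic: `ray_image_for_arch` is a hypothetical helper that is not part of the repository, and it accepts both `arm64` and `aarch64` as an assumption so that it works with `uname -m` on Linux and macOS.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): picks the Ray image
# for a CPU architecture, mirroring the rule described above.
# Assumption: arm64 (macOS uname, go env GOARCH) and aarch64 (Linux uname -m)
# are both treated as ARM machines.
ray_image_for_arch() {
  local arch="$1"
  local image="rayproject/ray:2.7.0-py310"
  case "$arch" in
    arm64|aarch64) echo "${image}-aarch64" ;;
    *)             echo "${image}" ;;
  esac
}

# An explicit E2E_API_SERVER_RAY_IMAGE value wins over the computed default.
echo "${E2E_API_SERVER_RAY_IMAGE:-$(ray_image_for_arch "$(uname -m)")}"
```

Passing the architecture as an argument keeps the rule easy to check on any machine, independent of the host it runs on.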

The unit and end to end test targets both honor the `GO_TEST_FLAGS` make variable. Setting it to `-v` makes the tests print their output and debug messages. By default, those messages are shown only when a test fails.

The default values of the variables can be overridden using the `-e` make command line arguments.

Examples:

```bash
# To run end to end test using default cluster
make e2e-test

# To run end to end test in fresh cluster.
# Please note that:
# * the cluster created for this test is the same as the cluster created by make cluster.
# * if the end to end tests fail the cluster will still be up and will have to be explicitly shutdown by executing make clean-cluster
make local-e2e-test
```
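
The variable overrides described above can be combined on a single command line. The sketch below only prints the `make` invocation instead of running it, so it can be tried without a cluster. `e2e_make_command` is a hypothetical helper that is not part of the repository, and the fallback URL simply repeats the documented default.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): prints the make command
# line for an end to end run, using environment values when they are set.
e2e_make_command() {
  local target="$1"   # e2e-test or local-e2e-test
  local url="${E2E_API_SERVER_URL:-http://localhost:31888}"
  local flags="${GO_TEST_FLAGS:-}"

  printf 'make %s -e E2E_API_SERVER_URL=%s' "$target" "$url"
  # Only add GO_TEST_FLAGS when the caller supplied some.
  if [ -n "$flags" ]; then
    printf ' -e GO_TEST_FLAGS="%s"' "$flags"
  fi
  printf '\n'
}

# Print the command for a run against the default cluster.
e2e_make_command e2e-test
```

Printing the command first makes it easy to review which overrides are in effect before starting a run that can take a long time.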

#### Swagger UI updates

To update the swagger ui files deployed with the Kuberay API server, you'll need to:
@@ -117,7 +146,7 @@ make run

#### Access

Access the service at `localhost:8888` for http, and `locahost:8887` for the RPC port.
Access the service at `localhost:8888` for http, and `localhost:8887` for the RPC port.

### Kubernetes Deployment

@@ -160,10 +189,11 @@ As a convenience for local development the following `make` targets are provided
* `make cluster` -- creates a local kind cluster, using the configuration from `hack/kind-cluster-config.yaml`. It creates a port mapping allowing for the service running in the kind cluster to be accessed on `localhost:31888` for HTTP and `localhost:31887` for RPC.
* `make clean-cluster` -- deletes the local kind cluster created with `make cluster`
* `load-image` -- loads the docker image defined by the `IMG` make variable into the kind cluster. The default value for variable is: `kuberay/apiserver:latest`. The name of the image can be changed by using `make load-image -e IMG=<your image name and tag>`
* `operator-image` -- Build the operator image to be loaded in your kind cluster. The tag for the operator image is `kuberay/operator:latest`. This step is optional.
* `load-operator-image` -- Load the operator image to the kind cluster created with `create-kind-cluster`. The tag for the operator image is `kuberay/operator:latest`, and the tag can be overridden using `make load-operator-image -E OPERATOR_IMAGE_TAG=<operator tag>`. To use the nightly operator tag, set `OPERATOR_IMAGE_TAG` to `nightly`.
* `operator-image` -- Build the operator image to be loaded in your kind cluster. The image is built as `kuberay/operator:latest` by default. The image tag can be overridden from the command line (example: `make operator-image -e OPERATOR_IMAGE_TAG=foo`).
* `load-operator-image` -- Load the operator image to the kind cluster created with `make cluster`. It should be used in conjunction with the `deploy-operator` target.
* `deploy-operator` -- Deploy operator into your cluster. The tag for the operator image is `kuberay/operator:latest`.
* `undeploy-operator` -- Undeploy operator from your cluster
* `load-ray-test-image` -- Load the ray test images into the cluster.

When developing and testing with kind you might want to execute these targets together:

@@ -173,8 +203,14 @@ make docker-image cluster load-image deploy
#To create a new API server image, operator image and deploy them on a new cluster
make docker-image operator-image cluster load-image load-operator-image deploy deploy-operator
#To execute end to end tests with a locally built operator and verbose output
make local-e2e-test -e GO_TEST_FLAGS="-v"
#To execute end to end tests with the nightly build of the operator
make local-e2e-test -e OPERATOR_IMAGE_TAG=nightly
```

#### Access API Server in the Cluster

Access the service at `localhost:31888` for http and `locahost:31887` for the RPC port.
Access the service at `localhost:31888` for http and `localhost:31887` for the RPC port.
134 changes: 85 additions & 49 deletions apiserver/Makefile
@@ -7,6 +7,18 @@ REPO_ROOT_BIN := $(REPO_ROOT)/bin
IMG_TAG ?=latest
IMG ?= kuberay/apiserver:$(IMG_TAG)

# Allow for additional test flags (-v, etc)
GO_TEST_FLAGS ?=
# Ray docker images to use for end to end tests based upon the architecture
# for arm64 environments (Apple silicon included) pull the architecture specific image
ifeq (arm64,$(shell go env GOARCH))
E2E_API_SERVER_RAY_IMAGE ?=rayproject/ray:2.7.0-py310-aarch64
else
E2E_API_SERVER_RAY_IMAGE ?=rayproject/ray:2.7.0-py310
endif
# Kuberay API Server base URL to use in end to end tests
E2E_API_SERVER_URL ?=http://localhost:31888

# Get the currently used golang install path (in GOPATH/bin, unless GOBIN is set)
ifeq (,$(shell go env GOBIN))
GOBIN=$(shell go env GOPATH)/bin
@@ -43,42 +55,102 @@ help: ## Display this help.

##@ Development

.PHONY:
fmt: ## Run go fmt against code.
go fmt ./...

.PHONY: vet
vet: ## Run go vet against code.
go vet ./...

.PHONY: fumpt
fumpt: gofumpt ## Run gofmtumpt against code.
$(GOFUMPT) -l -w .

.PHONY: imports
imports: goimports ## Run goimports against code.
$(GOIMPORTS) -l -w .

test: fmt vet fumpt imports lint ## Run unit tests.
go test ./... -race -coverprofile ray-kube-api-server-coverage.out

.PHONY: lint
lint: golangci-lint fmt vet fumpt imports ## Run the linter.
$(GOLANGCI_LINT) run --timeout=3m

##@ Build

build: fmt vet fumpt imports lint ## Build api server binary.
go build -o ${REPO_ROOT_BIN}/kuberay-apiserver cmd/main.go

run: fmt vet fumpt imports lint ## Run the api server from your host.
go run -race cmd/main.go -localSwaggerPath ${REPO_ROOT}/proto/swagger

docker-image: test ## Build image with the api server.
${ENGINE} build -t ${IMG} -f Dockerfile ..

docker-push: ## Push image with the api server.
${ENGINE} push ${IMG}

.PHONY: build-swagger
build-swagger: go-bindata
cd $(REPO_ROOT) && $(GOBINDATA) --nocompress --pkg swagger -o apiserver/pkg/swagger/datafile.go third_party/swagger-ui/...

##@ Testing

.PHONY: test
test: fmt vet fumpt imports lint ## Run all unit tests.
go test ./pkg/... ./cmd/... $(GO_TEST_FLAGS) -race -coverprofile ray-kube-api-server-coverage.out -parallel 4

.PHONY: e2e-test
e2e-test: ## Run end to end tests using a pre-existing cluster.
go test ./test/e2e/... $(GO_TEST_FLAGS) -timeout 60m -race -count=1 -parallel 4

.PHONY: local-e2e-test ## Run end to end tests on newly created cluster.
local-e2e-test: docker-image operator-image cluster load-image load-operator-image deploy-operator deploy load-ray-test-image e2e-test clean-cluster ## Run end to end tests, create a fresh kind cluster with all components deployed.

##@ Testing Setup
KIND_CONFIG ?= hack/kind-cluster-config.yaml
KIND_CLUSTER_NAME ?= ray-api-server-cluster
OPERATOR_IMAGE_TAG ?= latest
.PHONY: cluster
cluster: kind ## Start kind development cluster.
$(KIND) create cluster -n $(KIND_CLUSTER_NAME) --config $(KIND_CONFIG)

.PHONY: clean-cluster
clean-cluster: kind ## Delete kind development cluster.
$(KIND) delete cluster -n $(KIND_CLUSTER_NAME)

.PHONY: load-image
load-image: ## Load the api server image to the kind cluster created with make cluster.
$(KIND) load docker-image $(IMG) -n $(KIND_CLUSTER_NAME)

.PHONY: operator-image
operator-image: ## Build the operator image to be loaded in your kind cluster.
cd ../ray-operator && $(MAKE) docker-image -e IMG=kuberay/operator:$(OPERATOR_IMAGE_TAG)

.PHONY: deploy-operator
deploy-operator: ## Deploy operator via helm into the K8s cluster specified in ~/.kube/config.
# Note that you should make your operator image available by either pushing it to an image registry, such as DockerHub or Quay, or by loading the image into the Kubernetes cluster.
# If you are using a Kind cluster for development, you can run `make load-operator-image` to load the newly built image into the Kind cluster.
helm upgrade --install raycluster ../helm-chart/kuberay-operator --wait \
--set image.tag=${OPERATOR_IMAGE_TAG} --set image.pullPolicy=IfNotPresent

.PHONY: undeploy-operator
undeploy-operator: ## Undeploy operator via helm from the K8s cluster specified in ~/.kube/config.
helm uninstall raycluster --wait

.PHONY: load-operator-image
load-operator-image: ## Load the operator image to the kind cluster created with make cluster.
ifneq (${OPERATOR_IMAGE_TAG}, latest)
$(ENGINE) pull kuberay/operator:$(OPERATOR_IMAGE_TAG)
endif
$(KIND) load docker-image kuberay/operator:$(OPERATOR_IMAGE_TAG) -n $(KIND_CLUSTER_NAME)

.PHONY: load-ray-test-image
load-ray-test-image: ## Load the ray test images
$(ENGINE) pull $(E2E_API_SERVER_RAY_IMAGE)
$(KIND) load docker-image $(E2E_API_SERVER_RAY_IMAGE) -n $(KIND_CLUSTER_NAME)
$(ENGINE) pull rayproject/ray:latest
$(KIND) load docker-image rayproject/ray:latest -n $(KIND_CLUSTER_NAME)

##@ Docker Build

docker-image: test ## Build image with the api server.
$(ENGINE) build -t ${IMG} -f Dockerfile ..

docker-push: ## Push image with the api server.
$(ENGINE) push ${IMG}

##@ Deployment
.PHONY: install
install: kustomize ## Install the kuberay api server to the K8s cluster specified in ~/.kube/config.
@@ -100,7 +172,7 @@ deploy: ## Deploy via helm the kuberay api server to the K8s cluster specified i
undeploy: ## Undeploy via helm the kuberay api server to the K8s cluster specified in ~/.kube/config.
helm uninstall kuberay-apiserver --wait

##@ Development Tools
##@ Development Tools Setup

## Location to install dependencies to
$(REPO_ROOT_BIN):
@@ -118,7 +190,7 @@ GOBINDATA ?= $(REPO_ROOT_BIN)/go-bindata
## Tool Versions
KUSTOMIZE_VERSION ?= v3.8.7
GOFUMPT_VERSION ?= v0.3.1
GOIMPORTS_VERSION ?= latest
GOIMPORTS_VERSION ?= v0.14.0
GOLANGCI_LINT_VERSION ?= v1.54.1
KIND_VERSION ?= v0.19.0
GOBINDATA_VERSION ?= v4.0.2
@@ -165,39 +237,3 @@ clean-dev-tools: ## Remove all development tools
rm -f $(REPO_ROOT_BIN)/goimports
rm -f $(REPO_ROOT_BIN)/kind
rm -f $(REPO_ROOT_BIN)/go-bindata


##@ Testing Setup and Tools
KIND_CONFIG ?= hack/kind-cluster-config.yaml
KIND_CLUSTER_NAME ?= ray-api-server-cluster
OPERATOR_IMAGE_TAG ?= latest
.PHONY: cluster
cluster: kind ## Start kind development cluster.
$(KIND) create cluster -n $(KIND_CLUSTER_NAME) --config $(KIND_CONFIG)

.PHONY: clean-cluster
clean-cluster: kind ## Delete kind development cluster.
$(KIND) delete cluster -n $(KIND_CLUSTER_NAME)

.PHONY: load-image
load-image: ## Load the api server image to the kind cluster created with create-kind-cluster.
$(KIND) load docker-image $(IMG) -n $(KIND_CLUSTER_NAME)

.PHONY: operator-image
operator-image: ## Build the operator image to be loaded in your kind cluster.
cd ../ray-operator && $(MAKE) docker-image -e IMG=kuberay/operator:$(OPERATOR_IMAGE_TAG)

.PHONY: deploy-operator
deploy-operator: ## Deploy operator via helm into the K8s cluster specified in ~/.kube/config.
# Note that you should make your operatorimage available by either pushing it to an image registry, such as DockerHub or Quay, or by loading the image into the Kubernetes cluster.
# If you are using a Kind cluster for development, you can run `make load-image` to load the newly built image into the Kind cluster.
helm upgrade --install raycluster ../helm-chart/kuberay-operator --wait \
--set image.tag=${OPERATOR_IMAGE_TAG} --set image.pullPolicy=IfNotPresent

.PHONY: undeploy-operator
undeploy-operator: ## Undeploy operator via helm from the K8s cluster specified in ~/.kube/config.
helm uninstall raycluster --wait

.PHONY: load-operator-image
load-operator-image: ## Load the operator image to the kind cluster created with create-kind-cluster.
$(KIND) load docker-image kuberay/operator:$(OPERATOR_IMAGE_TAG) -n $(KIND_CLUSTER_NAME)
5 changes: 2 additions & 3 deletions apiserver/Volumes.md
@@ -38,8 +38,7 @@ The code below gives an example of hostPath volume definition:
A Persistent Volume Claim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request
specific size and access modes (e.g., they can be mounted `ReadWriteOnce`, `ReadOnlyMany` or `ReadWriteMany`).

The caveat of using PVC volumes is that the same PVC is mounted to all nodes. As a result only PVCs with access
mode `ReadOnlyMany` can be used in this case.
The caveat of using PVC volumes is that the same PVC is mounted to all nodes. As a result only PVCs with access mode `ReadOnlyMany` can be used in this case.

The code below gives an example of PVC volume definition:

@@ -121,7 +120,7 @@ The code below gives an example of secret volume definition:

An emptyDir volume is first created when a Pod is assigned to a node, and exists as long as that Pod is running on that node. As the name says, the emptyDir volume is initially empty. All containers in the Pod can read and write the same files in the emptyDir volume, though that volume can be mounted at the same or different paths in each container. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.

The code below gives an example of empydir volume definition:
The code below gives an example of empty directory volume definition:

```json
{
6 changes: 4 additions & 2 deletions apiserver/go.mod
@@ -20,12 +20,16 @@ require (
)

require (
github.com/dustinkirkland/golang-petname v0.0.0-20230626224747-e794b9370d49
github.com/elazarl/go-bindata-assetfs v1.0.1
github.com/grpc-ecosystem/go-grpc-middleware v1.3.0
github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0
github.com/grpc-ecosystem/grpc-gateway/v2 v2.6.0
google.golang.org/genproto v0.0.0-20210909211513-a8c4777a87af
)

require github.com/pmezard/go-difflib v1.0.0 // indirect

require (
github.com/asaskevich/govalidator v0.0.0-20200428143746-21a406dcc535 // indirect
github.com/beorn7/perks v1.0.1 // indirect
@@ -48,7 +52,6 @@ require (
github.com/mitchellh/mapstructure v1.4.1 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/prometheus/client_model v0.2.0 // indirect
github.com/prometheus/common v0.28.0 // indirect
github.com/prometheus/procfs v0.6.0 // indirect
@@ -61,7 +64,6 @@ require (
golang.org/x/text v0.13.0 // indirect
golang.org/x/time v0.0.0-20210723032227-1f47c861a9ac // indirect
google.golang.org/appengine v1.6.7 // indirect
google.golang.org/genproto v0.0.0-20210909211513-a8c4777a87af // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
2 changes: 2 additions & 0 deletions apiserver/go.sum
@@ -85,6 +85,8 @@ github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSs
github.com/docker/go-units v0.3.3/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/docker/go-units v0.4.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/docopt/docopt-go v0.0.0-20180111231733-ee0de3bc6815/go.mod h1:WwZ+bS3ebgob9U8Nd0kOddGdZWjyMGR8Wziv+TBNwSE=
github.com/dustinkirkland/golang-petname v0.0.0-20230626224747-e794b9370d49 h1:6SNWi8VxQeCSwmLuTbEvJd7xvPmdS//zvMBWweZLgck=
github.com/dustinkirkland/golang-petname v0.0.0-20230626224747-e794b9370d49/go.mod h1:V+Qd57rJe8gd4eiGzZyg4h54VLHmYVVw54iMnlAMrF8=
github.com/elazarl/go-bindata-assetfs v1.0.1 h1:m0kkaHRKEu7tUIUFVwhGGGYClXvyl4RE03qmvRTNfbw=
github.com/elazarl/go-bindata-assetfs v1.0.1/go.mod h1:v+YaWX3bdea5J/mo8dSETolEo7R71Vk1u8bnjau5yw4=
github.com/elazarl/goproxy v0.0.0-20180725130230-947c36da3153/go.mod h1:/Zj4wYkgs4iZTTu3o/KG3Itv/qCCa8VVMlb3i9OVuzc=
13 changes: 2 additions & 11 deletions apiserver/hack/kind-cluster-config.yaml
@@ -10,17 +10,6 @@ nodes:
kubeletExtraArgs:
node-labels: "ingress-ready=true"
extraPortMappings:
- containerPort: 30265
hostPort: 8265
listenAddress: "0.0.0.0"
protocol: tcp
- containerPort: 30001
hostPort: 10001
listenAddress: "0.0.0.0"
protocol: tcp
- containerPort: 8000
hostPort: 8000
listenAddress: "0.0.0.0"
- containerPort: 31888
hostPort: 31888
listenAddress: "0.0.0.0"
@@ -31,3 +20,5 @@ nodes:
image: kindest/node:v1.23.17@sha256:59c989ff8a517a93127d4a536e7014d28e235fb3529d9fba91b3951d461edfdb
- role: worker
image: kindest/node:v1.23.17@sha256:59c989ff8a517a93127d4a536e7014d28e235fb3529d9fba91b3951d461edfdb
- role: worker
image: kindest/node:v1.23.17@sha256:59c989ff8a517a93127d4a536e7014d28e235fb3529d9fba91b3951d461edfdb