Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #1633 from cliveseldon/1592_scale_subresource
Add /scale subresource to CRD and replicas to various parts of CRD.
Showing 28 changed files with 1,010 additions and 476 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{ | ||
"path": "../../../notebooks/scale.ipynb" | ||
} |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,154 @@ | ||
# Scaling Replicas | ||
|
||
## Replica Settings | ||
|
||
Replica settings can be provided at several levels, with the most specific taking precedence. From most general to most specific, the levels are: | ||
|
||
* `.spec.replicas` | ||
* `.spec.predictors[].replicas` | ||
* `.spec.predictors[].componentSpecs[].replicas` | ||
|
||
If you use the annotation `seldon.io/engine-separate-pod` you can also set the number of replicas for the service orchestrator in: | ||
|
||
* `.spec.predictors[].svcOrchSpec.replicas` | ||
|
||
As an illustration, a contrived example showing the various options is given below: | ||
|
||
```yaml | ||
apiVersion: machinelearning.seldon.io/v1 | ||
kind: SeldonDeployment | ||
metadata: | ||
  name: test-replicas | ||
spec: | ||
  replicas: 1 | ||
  predictors: | ||
  - componentSpecs: | ||
    - spec: | ||
        containers: | ||
        - image: seldonio/mock_classifier_rest:1.3 | ||
          name: classifier | ||
    - spec: | ||
        containers: | ||
        - image: seldonio/mock_classifier_rest:1.3 | ||
          name: classifier2 | ||
      replicas: 3 | ||
    graph: | ||
      endpoint: | ||
        type: REST | ||
      name: classifier | ||
      type: MODEL | ||
      children: | ||
      - name: classifier2 | ||
        type: MODEL | ||
        endpoint: | ||
          type: REST | ||
    name: example | ||
    replicas: 2 | ||
    traffic: 50 | ||
  - componentSpecs: | ||
    - spec: | ||
        containers: | ||
        - image: seldonio/mock_classifier_rest:1.3 | ||
          name: classifier3 | ||
    graph: | ||
      children: [] | ||
      endpoint: | ||
        type: REST | ||
      name: classifier3 | ||
      type: MODEL | ||
    name: example2 | ||
    traffic: 50 | ||
``` | ||
|
||
* classifier will have a deployment with 2 replicas, as specified by the predictor it is defined within | ||
* classifier2 will have a deployment with 3 replicas, as specified in its componentSpec | ||
* classifier3 will have 1 replica, as it takes its value from `.spec.replicas` | ||
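The precedence rule above can be sketched in a few lines of Python. This is only an illustration of the lookup order described in this document, not the operator's actual implementation: the function name `effective_replicas` and the trimmed-down dictionary are hypothetical, and what the operator does when no level sets `replicas` is not covered here, so the sketch returns `None` in that case.

```python
from typing import Any, Dict, Optional


def effective_replicas(
    deployment_spec: Dict[str, Any],
    predictor: Dict[str, Any],
    component_spec: Dict[str, Any],
) -> Optional[int]:
    """Return the most specific `replicas` value that is set, or None."""
    # Check the most specific level first, then fall back to more general ones.
    for level in (component_spec, predictor, deployment_spec):
        if level.get("replicas") is not None:
            return level["replicas"]
    return None


# Trimmed-down copy of `.spec` from the example above (hypothetical subset:
# only the fields needed to resolve replicas are kept).
spec = {
    "replicas": 1,
    "predictors": [
        {
            "name": "example",
            "replicas": 2,
            "componentSpecs": [
                {"spec": {"containers": [{"name": "classifier"}]}},
                {"spec": {"containers": [{"name": "classifier2"}]}, "replicas": 3},
            ],
        },
        {
            "name": "example2",
            "componentSpecs": [
                {"spec": {"containers": [{"name": "classifier3"}]}},
            ],
        },
    ],
}

resolved = {}
for predictor in spec["predictors"]:
    for component_spec in predictor["componentSpecs"]:
        name = component_spec["spec"]["containers"][0]["name"]
        resolved[name] = effective_replicas(spec, predictor, component_spec)

print(resolved)
# {'classifier': 2, 'classifier2': 3, 'classifier3': 1}
```

The resolved values match the three bullet points: the componentSpec value wins for classifier2, the predictor value for classifier, and the top-level `.spec.replicas` for classifier3.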
|
||
For more details see [a worked example for the above replica settings](../examples/scale.html). | ||
|
||
## Scale replicas | ||
|
||
It is possible to use the `kubectl scale` command to set the `replicas` value of the SeldonDeployment. For simple inference graphs this can be an easy way to scale them up and down. For example: | ||
|
||
```yaml | ||
apiVersion: machinelearning.seldon.io/v1 | ||
kind: SeldonDeployment | ||
metadata: | ||
  name: seldon-scale | ||
spec: | ||
  replicas: 1 | ||
  predictors: | ||
  - componentSpecs: | ||
    - spec: | ||
        containers: | ||
        - image: seldonio/mock_classifier_rest:1.3 | ||
          name: classifier | ||
    graph: | ||
      children: [] | ||
      endpoint: | ||
        type: REST | ||
      name: classifier | ||
      type: MODEL | ||
    name: example | ||
``` | ||
|
||
One can scale this Seldon Deployment up using the command: | ||
|
||
``` | ||
kubectl scale --replicas=2 sdep/seldon-scale | ||
``` | ||
|
||
For more details you can follow [a worked example of scaling](../examples/scale.html). | ||
|
||
## Autoscaling Seldon Deployments | ||
|
||
To autoscale your Seldon Deployment resources you can add Horizontal Pod Autoscaler (HPA) specifications to the Pod Template Specifications you create. There are two steps: | ||
|
||
1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as CPU or memory. | ||
1. Add an HPA spec referring to this Deployment. (We presently support the v1beta1 version of the k8s HPA Metrics spec.) | ||
|
||
To illustrate this we have an example Seldon Deployment below: | ||
|
||
```yaml | ||
apiVersion: machinelearning.seldon.io/v1 | ||
kind: SeldonDeployment | ||
metadata: | ||
  name: seldon-model | ||
spec: | ||
  name: test-deployment | ||
  predictors: | ||
  - componentSpecs: | ||
    - hpaSpec: | ||
        maxReplicas: 3 | ||
        metrics: | ||
        - resource: | ||
            name: cpu | ||
            targetAverageUtilization: 70 | ||
          type: Resource | ||
        minReplicas: 1 | ||
      spec: | ||
        containers: | ||
        - image: seldonio/mock_classifier_rest:1.3 | ||
          imagePullPolicy: IfNotPresent | ||
          name: classifier | ||
          resources: | ||
            requests: | ||
              cpu: '0.5' | ||
        terminationGracePeriodSeconds: 1 | ||
    graph: | ||
      children: [] | ||
      endpoint: | ||
        type: REST | ||
      name: classifier | ||
      type: MODEL | ||
    name: example | ||
``` | ||
The key points here are: | ||
* We define a CPU request for our container. This is required to allow us to utilize CPU autoscaling in Kubernetes. | ||
* We define an HPA associated with our componentSpec which scales on CPU when average CPU utilization exceeds 70%, up to a maximum of 3 replicas. | ||
For a worked example see [this notebook](../examples/autoscaling_example.html). |
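
To make the effect of `targetAverageUtilization`, `minReplicas` and `maxReplicas` more concrete, here is a minimal sketch of the replica calculation the Kubernetes Horizontal Pod Autoscaler documentation describes: the desired count is the current count scaled by the ratio of observed to target utilization, rounded up, then clamped to the configured bounds. Utilization is measured as a percentage of the container's CPU request, which is why the request is required. The function name `desired_replicas` is hypothetical, and the sketch deliberately ignores the controller's tolerance band and stabilization behaviour.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,  # observed average CPU, as % of the request
    target_utilization: float,   # e.g. targetAverageUtilization: 70
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Simplified HPA calculation: scale by the utilization ratio, then clamp."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# Values taken from the hpaSpec above: target 70%, minReplicas 1, maxReplicas 3.
print(desired_replicas(1, 140.0, 70.0, 1, 3))  # 2: load is twice the target
print(desired_replicas(2, 140.0, 70.0, 1, 3))  # 3: raw value 4 is capped at maxReplicas
print(desired_replicas(3, 20.0, 70.0, 1, 3))   # 1: load is well below the target
```

With a CPU request of `0.5`, the 70% target corresponds to an average of 0.35 CPU cores per pod.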