Merge pull request #1633 from cliveseldon/1592_scale_subresource
Add /scale subresource to CRD and replicas to various parts of CRD.
ukclivecox authored Apr 6, 2020
2 parents c2a324a + c172969 commit 5a5e5d9
Showing 28 changed files with 1,010 additions and 476 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -209,4 +209,5 @@ examples/ambassador/headers/ambassador_headers.py
examples/ambassador/shadow/ambassador_shadow.py
examples/models/metrics/metrics.py
examples/models/custom_metrics/customMetrics.py
examples/models/tracing/tracing.py
examples/models/tracing/tracing.py
examples/models/autoscaling/autoscaling_example.py
6 changes: 4 additions & 2 deletions doc/source/examples/notebooks.rst
@@ -79,20 +79,22 @@ MLOps: Scaling and Monitoring and Observability

.. toctree::
:titlesonly:


Autoscaling Example <autoscaling_example>
Request Payload Logging with ELK <payload_logging>
Custom Metrics with Grafana & Prometheus <metrics>
Distributed Tracing with Jaeger <tracing>
CI / CD with Jenkins Classic <jenkins_classic>
CI / CD with Jenkins X <jenkins_x>
Replica control <scale>


Production Configurations and Integrations
-----

.. toctree::
:titlesonly:

Autoscaling Example <autoscaling_example>
Custom Endpoints <custom_endpoints>
Example Helm Deployments <helm_examples>
Max gRPC Message Size <max_grpc_msg_size>
3 changes: 3 additions & 0 deletions doc/source/examples/scale.nblink
@@ -0,0 +1,3 @@
{
"path": "../../../notebooks/scale.ipynb"
}
97 changes: 0 additions & 97 deletions doc/source/graph/autoscaling.md

This file was deleted.

154 changes: 154 additions & 0 deletions doc/source/graph/scaling.md
@@ -0,0 +1,154 @@
# Scaling Replicas

## Replica Settings

Replica settings can be provided at several levels, with the most specific taking precedence. From most general to most specific, the levels are:

* `.spec.replicas`
* `.spec.predictors[].replicas`
* `.spec.predictors[].componentSpecs[].replicas`

If you use the annotation `seldon.io/engine-separate-pod` you can also set the number of replicas for the service orchestrator in:

* `.spec.predictors[].svcOrchSpec.replicas`

As an illustration, a contrived example showing the various options is given below:

```
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: test-replicas
spec:
  replicas: 1
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier2
      replicas: 3
    graph:
      endpoint:
        type: REST
      name: classifier
      type: MODEL
      children:
      - name: classifier2
        type: MODEL
        endpoint:
          type: REST
    name: example
    replicas: 2
    traffic: 50
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier3
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier3
      type: MODEL
    name: example2
    traffic: 50
```

* classifier will have a deployment with 2 replicas as specified by the predictor it is defined within
* classifier2 will have a deployment with 3 replicas as that is specified in its componentSpec
* classifier3 will have 1 replica as it takes its value from `.spec.replicas`

For more details see [a worked example for the above replica settings](../examples/scale.html).
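
The precedence rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described in this section, not the operator's actual code: the function name `resolve_replicas` and the fallback of 1 replica when nothing is set are assumptions made for the example.

```python
from typing import Optional


def resolve_replicas(
    spec_replicas: Optional[int],
    predictor_replicas: Optional[int],
    component_replicas: Optional[int],
    default: int = 1,
) -> int:
    """Return the most specific replica count that is set (hypothetical helper)."""
    # Check from most specific to most general; the first value that is set wins.
    for value in (component_replicas, predictor_replicas, spec_replicas):
        if value is not None:
            return value
    # Assumption: fall back to a single replica when no level sets a value.
    return default


# Values taken from the example manifest above.
print(resolve_replicas(1, 2, None))     # classifier: predictor-level value
print(resolve_replicas(1, 2, 3))        # classifier2: componentSpec-level value
print(resolve_replicas(1, None, None))  # classifier3: .spec.replicas
```

The three calls mirror the three bullets above and yield 2, 3 and 1 replicas respectively.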

## Scale replicas

It is possible to use the `kubectl scale` command to set the `replicas` value of the SeldonDeployment. For simple inference graphs this can be an easy way to scale them up and down. For example:

```
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-scale
spec:
  replicas: 1
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier
      type: MODEL
    name: example
```

One can scale this Seldon Deployment up using the command:

```
kubectl scale --replicas=2 sdep/seldon-scale
```

For more details you can follow [a worked example of scaling](../examples/scale.html).
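
If scaling needs to be driven from a script, the same command can be wrapped in Python. The sketch below is a hypothetical convenience wrapper, not part of Seldon Core: the helper names `scale_command` and `scale` are made up for this example, and it assumes `kubectl` is on the `PATH` and configured for the target cluster.

```python
import subprocess
from typing import List, Optional


def scale_command(name: str, replicas: int, namespace: Optional[str] = None) -> List[str]:
    """Build the kubectl argument list for scaling a SeldonDeployment (hypothetical helper)."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    # `sdep` is the resource short name used in the command shown above.
    cmd = ["kubectl", "scale", f"--replicas={replicas}", f"sdep/{name}"]
    if namespace is not None:
        cmd += ["--namespace", namespace]
    return cmd


def scale(name: str, replicas: int, namespace: Optional[str] = None) -> None:
    """Run the scale command; raises CalledProcessError if kubectl fails."""
    subprocess.run(scale_command(name, replicas, namespace), check=True)


# Only builds and prints the command, so it can be inspected without a cluster.
print(" ".join(scale_command("seldon-scale", 2)))
```

Keeping command construction separate from execution makes it easy to log or review the exact command before it touches the cluster.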

## Autoscaling Seldon Deployments

To autoscale your Seldon Deployment resources you can add Horizontal Pod Autoscaler (HPA) specifications to the Pod Template Specifications you create. There are two steps:

1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as CPU or memory.
1. Add an HPA spec referring to this Deployment. (We presently support the v1beta1 version of the k8s HPA Metrics spec.)

To illustrate this we have an example Seldon Deployment below:

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 70
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier
      type: MODEL
    name: example
```

The key points here are:

* We define a CPU request for our container. This is required for CPU-based autoscaling in Kubernetes.
* We define an HPA associated with our componentSpec that scales on CPU when average CPU utilization rises above 70%, up to a maximum of 3 replicas.

For a worked example see [this notebook](../examples/autoscaling_example.html).
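
To get a feel for how the 70% target and the replica bounds interact, the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler can be sketched in Python. This is a simplified illustration only: it leaves out the controller's tolerance band, pod readiness handling and stabilization behaviour, and the function name `desired_replicas` is made up for this example.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Simplified HPA rule: ceil(current * currentMetric / targetMetric), then clamp."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Keep the result within the minReplicas/maxReplicas bounds from the hpaSpec.
    return max(min_replicas, min(max_replicas, raw))


# Settings from the example above: target 70%, minReplicas 1, maxReplicas 3.
print(desired_replicas(1, 140, 70, 1, 3))  # one pod at 140% average utilization
print(desired_replicas(3, 200, 70, 1, 3))  # heavy load, capped by maxReplicas
print(desired_replicas(2, 35, 70, 1, 3))   # load has dropped, scale back down
```

With these inputs the sketch gives 2, 3 and 1 replicas: utilization is measured relative to the container's CPU request, which is why the request must be set.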
2 changes: 1 addition & 1 deletion doc/source/index.rst
@@ -94,7 +94,7 @@ Documentation Index
Metrics with Prometheus <analytics/analytics.md>
Payload Logging with ELK <analytics/logging.md>
Distributed Tracing with Jaeger <graph/distributed-tracing.md>
Autoscaling in Kubernetes <graph/autoscaling.md>
Replica scaling <graph/scaling.md>

.. toctree::
:maxdepth: 1
19 changes: 10 additions & 9 deletions doc/source/python/api/seldon_core.proto.rst
@@ -2,9 +2,9 @@ seldon\_core.proto package
==========================

.. automodule:: seldon_core.proto
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:

Submodules
----------
@@ -13,15 +13,16 @@ seldon\_core.proto.prediction\_pb2 module
-----------------------------------------

.. automodule:: seldon_core.proto.prediction_pb2
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:

seldon\_core.proto.prediction\_pb2\_grpc module
-----------------------------------------------

.. automodule:: seldon_core.proto.prediction_pb2_grpc
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:

