Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch benchmarking with Argo Workflows #2248

Merged
merged 3 commits into from
Aug 7, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/examples/notebooks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -149,4 +149,5 @@ Benchmarking and Load Tests

Service Orchestrator <bench_svcOrch>
Tensorflow <bench_tensorflow>
Argo Workflows Benchmarking <vegeta_bench_argo_workflows>

3 changes: 3 additions & 0 deletions doc/source/examples/vegeta_bench_argo_workflows.nblink
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
{
"path": "../../../examples/batch/benchmarking-argo-workflows-vegeta/README.ipynb"
}
7 changes: 7 additions & 0 deletions doc/source/reference/benchmarking.md
Original file line number Diff line number Diff line change
Expand Up @@ -51,3 +51,10 @@ mean: 259.990239 ms
Throughput: 23.997572337989126/s
Errors: False
```

## Flexible Benchmarking with Argo Workflows

We have also an example that shows how to leverage the batch processing workflow that we showcase in the examples, but to perform benchmarking with Seldon Core models.

* [Notebook](../examples/vegeta_bench_argo_workflows.html)

120 changes: 69 additions & 51 deletions examples/batch/argo-workflows-batch/README.ipynb

Large diffs are not rendered by default.

78 changes: 43 additions & 35 deletions examples/batch/argo-workflows-batch/README.md

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,23 @@ spec:
spec:
name: "{{ .Values.seldonDeployment.name }}"
predictors:
- graph:
- componentSpecs:
- spec:
containers:
- name: classifier
env:
- name: GUNICORN_THREADS
value: {{ .Values.seldonDeployment.serverThreads }}
- name: GUNICORN_WORKERS
value: {{ .Values.seldonDeployment.serverWorkers }}
resources:
requests:
cpu: {{ .Values.seldonDeployment.requests.cpu }}
memory: {{ .Values.seldonDeployment.requests.memory }}
limits:
cpu: {{ .Values.seldonDeployment.limits.cpu }}
memory: {{ .Values.seldonDeployment.limits.memory }}
graph:
children: []
implementation: {{ .Values.seldonDeployment.server }}
modelUri: {{ .Values.seldonDeployment.modelUri }}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,15 +15,31 @@ seldonDeployment:
# Name to use for the seldon deployment which by default appends generated workflow ID
name: seldon-{{workflow.uid}}
# Image to use for the batch client
image: seldonio/seldon-core-s2i-python37:1.1.1-rc
image: seldonio/seldon-core-s2i-python37:1.3.0-dev
# Prepackaged model server to use [see https://docs.seldon.io/projects/seldon-core/en/latest/servers/overview.html]
server: SKLEARN_SERVER
# The URL for the model that is to be used
modelUri: gs://seldon-models/sklearn/iris
# The number of seldon deployment replicas to launch
replicas: 10
replicas: 2
# Waiting time before checks for deployment to ensure kubernetes cluster registers create
waitTime: 5
# The number of threads spawned by Python Gunicorn Flask server
serverThreads: 10
# The number of workers spawned by Python Gunicorn Flask server
serverWorkers: 1
# Whether to enable resources
enableResources: false
requests:
# Requests for CPU (only added if enableResources is enabled)
cpu: 50m
# Requests for memory (only added if enableResources is enabled)
memory: 100Mi
limits:
# Limits for CPU (only added if enableResources is enabled)
cpu: 50m
# Limits for memory (only added if enableResources is enabled)
memory: 1000Mi
# The batch worker is the component that will send the requests from the files
batchWorker:
# Endpoint of for the batch client to contact the seldon deployment
Expand Down
2 changes: 2 additions & 0 deletions examples/batch/benchmarking-argo-workflows-vegeta/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
assets/input-data.txt
assets/output-data.txt
267 changes: 267 additions & 0 deletions examples/batch/benchmarking-argo-workflows-vegeta/README.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,267 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Benchmarking with Argo Worfklows & Vegeta\n",
"\n",
"In this notebook we will dive into how you can run bench marking with batch processing with Argo Workflows, Seldon Core and Vegeta.\n",
"\n",
"Dependencies:\n",
"\n",
"* Seldon core installed as per the docs with Istio as an ingress \n",
"* Argo Workfklows installed in cluster (and argo CLI for commands)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"### Install Seldon Core\n",
"Use the notebook to [set-up Seldon Core with Ambassador or Istio Ingress](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html).\n",
"\n",
"Note: If running with KIND you need to make sure do follow [these steps](https://github.com/argoproj/argo/issues/2376#issuecomment-595593237) as workaround to the `/.../docker.sock` known issue.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install Argo Workflows\n",
"You can follow the instructions from the official [Argo Workflows Documentation](https://github.com/argoproj/argo#quickstart).\n",
"\n",
"You also need to make sure that argo has permissions to create seldon deployments - for this you can just create a default-admin rolebinding as follows:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rolebinding.rbac.authorization.k8s.io/default-admin created\r\n"
]
}
],
"source": [
"!kubectl create rolebinding default-admin --clusterrole=admin --serviceaccount=default:default"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Benchmark Argo Workflow\n",
"\n",
"In order to create a benchmark, we created a simple argo workflow template so you can leverage the power of the helm charts.\n",
"\n",
"Before we dive into the contents of the full helm chart, let's first give it a try with some of the settings.\n",
"\n",
"We will run a batch job that will set up a Seldon Deployment with 1 replicas and 4 cpus (with 100 max workers) to send requests."
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Name: seldon-batch-process\r\n",
"Namespace: default\r\n",
"ServiceAccount: default\r\n",
"Status: Pending\r\n",
"Created: Thu Aug 06 14:54:09 +0100 (now)\r\n"
]
}
],
"source": [
"!helm template seldon-batch-workflow helm-charts/seldon-batch-workflow/ \\\n",
" --set workflow.name=seldon-batch-process \\\n",
" --set seldonDeployment.name=sklearn \\\n",
" --set seldonDeployment.replicas=1 \\\n",
" --set seldonDeployment.serverWorkers=1 \\\n",
" --set seldonDeployment.serverThreads=10 \\\n",
" --set benchmark.cpus=4 \\\n",
" | argo submit -"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"NAME STATUS AGE DURATION PRIORITY\r\n",
"seldon-batch-process Succeeded 1m 1m 0\r\n"
]
}
],
"source": [
"!argo list"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Name: seldon-batch-process\r\n",
"Namespace: default\r\n",
"ServiceAccount: default\r\n",
"Status: Succeeded\r\n",
"Created: Thu Aug 06 14:40:51 +0100 (1 minute ago)\r\n",
"Started: Thu Aug 06 14:40:51 +0100 (1 minute ago)\r\n",
"Finished: Thu Aug 06 14:41:57 +0100 (7 seconds ago)\r\n",
"Duration: 1 minute 6 seconds\r\n",
"\r\n",
"\u001b[39mSTEP\u001b[0m PODNAME DURATION MESSAGE\r\n",
" \u001b[32m✔\u001b[0m seldon-batch-process (seldon-batch-process) \r\n",
" ├---\u001b[32m✔\u001b[0m create-seldon-resource (create-seldon-resource-template) seldon-batch-process-3626514072 1s \r\n",
" ├---\u001b[32m✔\u001b[0m wait-seldon-resource (wait-seldon-resource-template) seldon-batch-process-2052519094 27s \r\n",
" └---\u001b[32m✔\u001b[0m run-benchmark (run-benchmark-template) seldon-batch-process-244800534 33s \r\n"
]
}
],
"source": [
"!argo get seldon-batch-process"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[35mcreate-seldon-resource\u001b[0m:\ttime=\"2020-08-06T13:54:10.329Z\" level=info msg=\"Starting Workflow Executor\" version=v2.9.3\r\n",
"\u001b[35mcreate-seldon-resource\u001b[0m:\ttime=\"2020-08-06T13:54:10.333Z\" level=info msg=\"Creating a docker executor\"\r\n",
"\u001b[35mcreate-seldon-resource\u001b[0m:\ttime=\"2020-08-06T13:54:10.333Z\" level=info msg=\"Executor (version: v2.9.3, build_date: 2020-07-18T19:11:19Z) initialized (pod: default/seldon-batch-process-3626514072) with template:\\n{\\\"name\\\":\\\"create-seldon-resource-template\\\",\\\"arguments\\\":{},\\\"inputs\\\":{},\\\"outputs\\\":{},\\\"metadata\\\":{},\\\"resource\\\":{\\\"action\\\":\\\"create\\\",\\\"manifest\\\":\\\"apiVersion: machinelearning.seldon.io/v1\\\\nkind: SeldonDeployment\\\\nmetadata:\\\\n name: \\\\\\\"sklearn\\\\\\\"\\\\n namespace: default\\\\n ownerReferences:\\\\n - apiVersion: argoproj.io/v1alpha1\\\\n blockOwnerDeletion: true\\\\n kind: Workflow\\\\n name: \\\\\\\"seldon-batch-process\\\\\\\"\\\\n uid: \\\\\\\"853b463e-cd3f-42f8-b99a-0f82a83cf6f4\\\\\\\"\\\\nspec:\\\\n name: \\\\\\\"sklearn\\\\\\\"\\\\n predictors:\\\\n - componentSpecs:\\\\n - spec:\\\\n containers:\\\\n - name: classifier\\\\n env:\\\\n - name: GUNICORN_THREADS\\\\n value: 10\\\\n - name: GUNICORN_WORKERS\\\\n value: 1\\\\n resources:\\\\n requests:\\\\n cpu: 50m\\\\n memory: 100Mi\\\\n limits:\\\\n cpu: 50m\\\\n memory: 1000Mi\\\\n graph:\\\\n children: []\\\\n implementation: SKLEARN_SERVER\\\\n modelUri: gs://seldon-models/sklearn/iris\\\\n name: classifier\\\\n name: default\\\\n replicas: 1\\\\n\\\"}}\"\r\n",
"\u001b[35mcreate-seldon-resource\u001b[0m:\ttime=\"2020-08-06T13:54:10.333Z\" level=info msg=\"Loading manifest to /tmp/manifest.yaml\"\r\n",
"\u001b[35mcreate-seldon-resource\u001b[0m:\ttime=\"2020-08-06T13:54:10.333Z\" level=info msg=\"kubectl create -f /tmp/manifest.yaml -o json\"\r\n",
"\u001b[35mcreate-seldon-resource\u001b[0m:\ttime=\"2020-08-06T13:54:10.945Z\" level=info msg=default/SeldonDeployment.machinelearning.seldon.io/sklearn\r\n",
"\u001b[35mcreate-seldon-resource\u001b[0m:\ttime=\"2020-08-06T13:54:10.945Z\" level=info msg=\"No output parameters\"\r\n",
"\u001b[32mwait-seldon-resource\u001b[0m:\tWaiting for deployment \"sklearn-default-0-classifier\" rollout to finish: 0 of 1 updated replicas are available...\r\n",
"\u001b[32mwait-seldon-resource\u001b[0m:\tdeployment \"sklearn-default-0-classifier\" successfully rolled out\r\n",
"\u001b[35mrun-benchmark\u001b[0m:\t{\"latencies\":{\"total\":3013824928755,\"mean\":263055331,\"50th\":261546418,\"90th\":289796292,\"95th\":297608972,\"99th\":324027882,\"max\":1364625689,\"min\":22613200},\"bytes_in\":{\"total\":1420668,\"mean\":124},\"bytes_out\":{\"total\":538479,\"mean\":47},\"earliest\":\"2020-08-06T13:54:42.340084846Z\",\"latest\":\"2020-08-06T13:55:12.339980392Z\",\"end\":\"2020-08-06T13:55:12.624864691Z\",\"duration\":29999895546,\"wait\":284884299,\"requests\":11457,\"rate\":381.9013297040498,\"throughput\":378.3088422183642,\"success\":1,\"status_codes\":{\"200\":11457},\"errors\":[]}\r\n"
]
}
],
"source": [
"!argo logs -w seldon-batch-process "
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [],
"source": [
"def print_vegeta_results(results):\n",
" print(\"Latencies:\")\n",
" print(\"\\tmean:\", results[\"latencies\"][\"mean\"] / 1e6, \"ms\")\n",
" print(\"\\t50th:\", results[\"latencies\"][\"50th\"] / 1e6, \"ms\")\n",
" print(\"\\t90th:\", results[\"latencies\"][\"90th\"] / 1e6, \"ms\")\n",
" print(\"\\t95th:\", results[\"latencies\"][\"95th\"] / 1e6, \"ms\")\n",
" print(\"\\t99th:\", results[\"latencies\"][\"99th\"] / 1e6, \"ms\")\n",
" print(\"\")\n",
" print(\"Throughput:\", str(results[\"throughput\"]) + \"/s\")\n",
" print(\"Errors:\", len(results[\"errors\"]) > 0)"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Latencies:\n",
"\tmean: 263.055331 ms\n",
"\t50th: 261.546418 ms\n",
"\t90th: 289.796292 ms\n",
"\t95th: 297.608972 ms\n",
"\t99th: 324.027882 ms\n",
"\n",
"Throughput: 378.3088422183642/s\n",
"Errors: False\n"
]
}
],
"source": [
"import json\n",
"wf_logs = !argo logs -w seldon-batch-process \n",
"wf_bench = wf_logs[-1]\n",
"wf_json_str = wf_bench[24:]\n",
"wf_json = json.loads(wf_json_str)\n",
"print_vegeta_results(wf_json)"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Workflow 'seldon-batch-process' deleted\r\n"
]
}
],
"source": [
"!argo delete seldon-batch-process"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading