Multi model deployment #208

Draft · wants to merge 74 commits into base: main

Commits (74)
4eac006
Removing load balancing config
TosinSeg Jun 19, 2023
c68e999
Reformatting tests
TosinSeg Jun 20, 2023
5ce1a92
Fixed the formatting
TosinSeg Jun 20, 2023
fa10e19
Removed print statement
TosinSeg Jun 20, 2023
f9cbd74
Merging main
TosinSeg Jun 26, 2023
8970f4e
Removing unused import
TosinSeg Jun 26, 2023
517bea8
Fixing tests
TosinSeg Jun 26, 2023
58dd2b2
Fixing merge issue
TosinSeg Jun 26, 2023
bb0d551
Creating hostfile when one is not provided
TosinSeg Jun 26, 2023
e2bb9d5
Merge branch 'main' into Always_enable_load_balancing
TosinSeg Jun 26, 2023
3823534
Fixing import statements removed by merge
TosinSeg Jun 26, 2023
6f9b4ad
Removing load_balancing check
TosinSeg Jun 26, 2023
499b9ad
Removing redundant definitions
TosinSeg Jun 26, 2023
5419ef6
Removing hostfile from test
TosinSeg Jun 26, 2023
a70b6de
Removing hostfile from non-persistent test
TosinSeg Jun 26, 2023
eea658b
initial changes
TosinSeg Jun 27, 2023
20f0878
Merge branch 'main' into multi-model-deployment
TosinSeg Jun 27, 2023
c21c31b
Maintaining current behavior
TosinSeg Jun 28, 2023
f525329
Reading from score file
TosinSeg Jun 28, 2023
3c0937f
fixing syntax errors
TosinSeg Jun 28, 2023
156ac83
Fixing more syntax errors
TosinSeg Jun 28, 2023
38e270e
Fixing more syntax issues
TosinSeg Jun 29, 2023
4d4e0d8
initial lb changes
TosinSeg Jun 29, 2023
01c8e59
Merge branch 'main' into multi-model-deployment
TosinSeg Jun 29, 2023
f801b36
More load balancing changes
TosinSeg Jun 29, 2023
fd4e2ed
LB changes and syntax
TosinSeg Jun 30, 2023
0a3b7e5
Refactor client, and unpack request in load balancer
TosinSeg Jun 30, 2023
6523c04
First working queries
TosinSeg Jul 3, 2023
06b40f5
Fixing conversational and q&a args
TosinSeg Jul 3, 2023
96d0dcb
Updates to _allocate_processes and fixing example
TosinSeg Jul 5, 2023
ab41d24
Adding host map for allocating processes and formatting
TosinSeg Jul 5, 2023
8673a9a
Fixing terminate functionality
TosinSeg Jul 5, 2023
8d09b37
Refactored client
TosinSeg Jul 6, 2023
7a136d6
More Refactoring and q/a example
TosinSeg Jul 6, 2023
2c6ec08
Reformatting to maintain previous syntax
TosinSeg Jul 6, 2023
0cb88a9
Removing print/debug statements
TosinSeg Jul 6, 2023
7c0ee12
Fixing non-persistent deployments
TosinSeg Jul 6, 2023
7a956d5
Refactoring Load balancer launch
TosinSeg Jul 7, 2023
f8cfe28
Fixing restful gateway client
TosinSeg Jul 10, 2023
079807d
Fixing replica issue
TosinSeg Jul 10, 2023
ea1e47e
Fixing non persistent client
TosinSeg Jul 10, 2023
98b6129
Adding trust_remote_code support (#203)
msinha251 Jul 11, 2023
daab5e6
Refactoring
TosinSeg Jul 12, 2023
84073f9
Update mii/models/score/generate.py
TosinSeg Jul 12, 2023
3ee3410
Merge branch 'multi-model-deployment' of github.com:TosinSeg/DeepSpee…
Jul 13, 2023
b4edc2b
Refactoring Load Balancer and request_proto
Jul 13, 2023
6346194
Formatting
Jul 13, 2023
94b6699
Fixing the client
Jul 14, 2023
710c20b
Initial partial deployment commit
Jul 21, 2023
c2636b7
More partial deploy updates
Jul 21, 2023
189e75c
Partial deploy started
Jul 21, 2023
adee843
fixing add deploy api queries
Jul 24, 2023
a145be5
Support for empty deployment 'group'
Jul 24, 2023
082c05e
Support for empty deployment 'group'
Jul 24, 2023
3ce77d2
Partial Termination
Jul 25, 2023
b40ecbd
Refactoring
Jul 25, 2023
72dd95c
formatting
Jul 25, 2023
a4e3d56
fixing bug for partial termination
Jul 25, 2023
4b5bb47
Removing comments
Jul 25, 2023
30d2b03
Including GPU index map in score file
Jul 26, 2023
c5d5996
Refactoring deployment
Jul 26, 2023
3ae1781
Refactoring and formatting
Jul 26, 2023
4b8f02f
Refactoring
Jul 28, 2023
c51ce37
Fixing Readme
Jul 28, 2023
43479db
Refactoring GRPC
Jul 28, 2023
e1b6d23
Fixing LB process not terminating
Jul 28, 2023
1675bd8
Adding multi_deployment and partial deploy/terminate unit tests
Jul 31, 2023
8684a61
Removing comments
Jul 31, 2023
56a7fce
Fixing spelling issues
Aug 1, 2023
fb70c3d
Update mii/client.py
TosinSeg Aug 1, 2023
e2cfe8a
Update mii/client.py
TosinSeg Aug 1, 2023
1312738
Removing AML from addDeploy
Aug 1, 2023
b0f0da4
Refactoring MIIConfig and DeploymentConfig
Aug 2, 2023
b78068e
Partial deploy/termination example
Aug 11, 2023
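Several commits above ("Updates to _allocate_processes", "Adding host map for allocating processes", "Including GPU index map in score file") revolve around assigning deployments to free GPU slots per host. A minimal sketch of that idea, where the function name, signature, and greedy strategy are illustrative assumptions rather than the actual MII implementation:

```python
# Hypothetical sketch of GPU-slot allocation across hosts, in the spirit of
# the `_allocate_processes` / host-map commits above. The name, signature,
# and greedy strategy are assumptions, not MII's actual code.

def allocate_processes(host_gpus, deployments):
    """Greedily assign each deployment the number of GPUs it needs.

    host_gpus:   {hostname: [free GPU indices]}
    deployments: [(deployment_name, num_gpus_needed)]
    Returns a GPU index map per deployment: {name: {hostname: [indices]}}.
    """
    allocation = {}
    for name, needed in deployments:
        # Find the first host with enough free GPUs and claim them.
        for host, free in host_gpus.items():
            if len(free) >= needed:
                allocation[name] = {host: free[:needed]}
                host_gpus[host] = free[needed:]
                break
        else:
            raise RuntimeError(f"Not enough free GPUs for {name}")
    return allocation
```

For example, with a single `master` host exposing GPUs `[0, 1, 2, 3]`, a deployment needing two GPUs would claim `[0, 1]` and the next single-GPU deployment would get `[2]`, mirroring the `gpu_index_map` dictionaries used in the example below.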
47 changes: 47 additions & 0 deletions examples/multi_model/deploy.py
@@ -0,0 +1,47 @@
# Copyright (c) Microsoft Corporation.
# SPDX-License-Identifier: Apache-2.0

# DeepSpeed Team
import mii

gpu_index_map1 = {'master': [0]}
gpu_index_map2 = {'master': [1]}
gpu_index_map3 = {'master': [0, 1]}

deployments = []

mii_configs1 = {"tensor_parallel": 2, "dtype": "fp16"}
mii_configs2 = {"tensor_parallel": 1}

name = "bigscience/bloom-560m"
deployments.append(
    mii.DeploymentConfig(task='text-generation',
                         model=name,
                         deployment_name=name + "_deployment",
                         GPU_index_map=gpu_index_map3,
                         mii_config=mii.config.MIIConfig(**mii_configs1)))

# gpt2
name = "microsoft/DialogRPT-human-vs-rand"
deployments.append(
    mii.DeploymentConfig(task='text-classification',
                         model=name,
                         deployment_name=name + "_deployment",
                         GPU_index_map=gpu_index_map2))

name = "microsoft/DialoGPT-large"
deployments.append(
    mii.DeploymentConfig(task='conversational',
                         model=name,
                         deployment_name=name + "_deployment",
                         GPU_index_map=gpu_index_map1,
                         mii_config=mii.config.MIIConfig(**mii_configs2)))

name = "deepset/roberta-large-squad2"
deployments.append(
    mii.DeploymentConfig(task="question-answering",
                         model=name,
                         deployment_name=name + "-qa-deployment",
                         GPU_index_map=gpu_index_map2))

mii.deploy(deployment_tag="multi_models", deployments=deployments)
46 changes: 46 additions & 0 deletions examples/multi_model/query.py
@@ -0,0 +1,46 @@
# Copyright (c) Microsoft Corporation.
# SPDX-License-Identifier: Apache-2.0

# DeepSpeed Team

import mii

results = []
generator = mii.mii_query_handle("multi_models")
result = generator.query(
    {
        "query": ["DeepSpeed is", "Seattle is"],
        "deployment_name": "bigscience/bloom-560m_deployment"
    },
    do_sample=True,
    max_new_tokens=30,
)
results.append(result)

result = generator.query({
    'query': "DeepSpeed is the greatest",
    "deployment_name": "microsoft/DialogRPT-human-vs-rand_deployment"
})
results.append(result)

result = generator.query({
    'text': "DeepSpeed is the greatest",
    'conversation_id': 3,
    'past_user_inputs': [],
    'generated_responses': [],
    "deployment_name": "microsoft/DialoGPT-large_deployment"
})
results.append(result)

result = generator.query({
    'question': "What is the greatest?",
    'context': "DeepSpeed is the greatest",
    "deployment_name": "deepset/roberta-large-squad2" + "-qa-deployment"
})
results.append(result)
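Each query above carries a `"deployment_name"` key that the multi-model handle uses to pick the target model (see the "unpack request in load balancer" commit). A minimal sketch of that dispatch pattern, where `route_query` and the handler table are illustrative assumptions, not MII's actual load balancer:

```python
# Hypothetical sketch of routing a request to the right model by its
# "deployment_name" key, mirroring the query.py example above. The function
# and handler table are assumptions, not MII's actual load balancer.

def route_query(request, handlers):
    """Pop 'deployment_name' from the request and dispatch to its handler."""
    payload = dict(request)  # copy so the caller's dict is not mutated
    name = payload.pop("deployment_name")
    if name not in handlers:
        raise KeyError(f"Unknown deployment: {name}")
    return handlers[name](payload)


# Usage with a stand-in handler instead of a real model:
handlers = {"echo_deployment": lambda p: p["query"].upper()}
result = route_query(
    {"query": "hi", "deployment_name": "echo_deployment"}, handlers)
```

The remaining keys (`query`, `text`, `question`/`context`, and so on) are passed through untouched, which is why each task in query.py can keep its usual input format.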
7 changes: 7 additions & 0 deletions examples/multi_model/shutdown.py
@@ -0,0 +1,7 @@
# Copyright (c) Microsoft Corporation.
# SPDX-License-Identifier: Apache-2.0

# DeepSpeed Team
import mii

mii.terminate("multi_models")
2 changes: 1 addition & 1 deletion mii/__init__.py
@@ -10,7 +10,7 @@
from .constants import DeploymentType, Tasks
from .aml_related.utils import aml_output_path

-from .config import MIIConfig, LoadBalancerConfig
+from .config import MIIConfig, LoadBalancerConfig, DeploymentConfig
from .grpc_related.proto import modelresponse_pb2_grpc

__version__ = "0.0.0"