
Error messages spilled from persistent deployment for every request #349

Closed
weiqisun opened this issue Dec 5, 2023 · 6 comments · Fixed by #350

Comments

@weiqisun (Contributor) commented Dec 5, 2023

Hi, in the latest release (0.1.2), the persistent server returns responses with more details, including prompt_length, generated_length, and finish_reason. This is awesome and super useful, thanks for the update! However, the persistent server now prints an error message for every request:

Traceback (most recent call last):
  File "/home/dyheal1/mambaforge/envs/test/lib/python3.10/site-packages/grpc/_server.py", line 552, in _call_behavior
    response_or_iterator = behavior(argument, context)
  File "/home/dyheal1/mambaforge/envs/test/lib/python3.10/site-packages/mii/grpc_related/modelresponse_server.py", line 96, in GeneratorReply
    return task_methods.pack_response_to_proto(responses)
  File "/home/dyheal1/mambaforge/envs/test/lib/python3.10/site-packages/mii/grpc_related/task_methods.py", line 85, in pack_response_to_proto
    finish_reason=str(r.finish_reason.value),
AttributeError: 'NoneType' object has no attribute 'value'

I still get valid responses back with the correct generated_text, finish_reason, etc., but it would be great to eliminate this error message on the server side. As a hacky workaround for now, I changed line 85 of grpc_related/task_methods.py to finish_reason=str(r.finish_reason.value) if r.finish_reason is not None else "none".
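
In context, the change looks like this (the original line is the one shown in the traceback above; the edit just guards against a None finish_reason):

# mii/grpc_related/task_methods.py, line 85 -- original line, from the traceback:
finish_reason=str(r.finish_reason.value),
# workaround:
finish_reason=str(r.finish_reason.value) if r.finish_reason is not None else "none",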

Here is my setup. I started the persistent server by:

mii.serve(
    "meta-llama/Llama-2-70b-hf",
    deployment_name="mii-endpoint",
    max_length=4096,
    tensor_parallel=2,
)

And I call the server using:

client = mii.client("mii-endpoint")
response = client.generate("DeepSpeed is", max_new_tokens=50)
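
For reference, I read the new fields off the response roughly like this (assuming each returned response object exposes the fields mentioned above; I haven't checked the exact return type):

# Assumption: client.generate returns a list of response objects exposing
# generated_text, prompt_length, generated_length, and finish_reason.
for r in response:
    print(r.generated_text, r.prompt_length, r.generated_length, r.finish_reason)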

Each such request triggers the error message on the server side. I'm running this on 2 A100-80GB GPUs, and the relevant package versions are:

mii: 0.1.2
deepspeed: 0.12.4
torch: 2.0.1
CUDA: 11.8

Any help would be appreciated, and thanks again for the awesome tool you have built!

@thelongestusernameofall commented:

I'm running into the same issue.

@mrwyattii (Contributor) commented:

@weiqisun I have a fix in #350. If you would like to try that branch before we merge: pip install git+https://github.com/Microsoft/DeepSpeed-MII@mrwyattii/fix-return-error

@weiqisun (Contributor, Author) commented Dec 7, 2023

Thanks @mrwyattii! However, I'm still seeing this error message. I can confirm I installed the package from your branch, since the installed library file now contains the updated _invoke_async function.

@mrwyattii (Contributor) commented Dec 7, 2023

Hmm, I'm not able to reproduce this with the fix I have in #350. Could you try adding a print statement that shows the contents of responses? Please add

print(f"RANK {self.inference_pipeline.local_rank} RESPONSE:", [r.to_msg_dict() for r in responses])

just before the return statement here:
https://github.com/microsoft/DeepSpeed-MII/blob/5e12a38520b4cd9eb63a2790c3611c837f723885/mii/grpc_related/modelresponse_server.py#L96

You will want to modify this file on your local system: /home/dyheal1/mambaforge/envs/test/lib/python3.10/site-packages/mii/grpc_related/modelresponse_server.py
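
For orientation, the print lands roughly here (only the print and the return line come from this thread; the surrounding method is approximated from the traceback, so treat it as a sketch rather than the actual source):

# Sketch of GeneratorReply in mii/grpc_related/modelresponse_server.py,
# around line 96; `responses` is whatever the inference pipeline produced.
def GeneratorReply(self, request, context):
    ...
    print(f"RANK {self.inference_pipeline.local_rank} RESPONSE:",
          [r.to_msg_dict() for r in responses])  # <-- add this line
    return task_methods.pack_response_to_proto(responses)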

Share the output of that print statement. Thanks!

@weiqisun (Contributor, Author) commented Dec 7, 2023

Actually, never mind. With a clean setup from scratch, the error message is now gone! I'm not sure whether it was caused by a process that wasn't properly terminated. I previously had a server running overnight; I stopped it this morning before updating the mii package, and I still saw the same error message after the update. But then I realized there were two leftover mii processes after I terminated the server with client.terminate_server():

dyheal1   291962  0.9  0.0 41352944 474320 pts/8 Sl   11:26   0:10 /home/dyheal1/mambaforge/envs/test/bin/python -m mii.launch.multi_gpu_server --deployment-name mii-endpoint --load-balancer-port 50050 --restful-gateway-port 51080 --restful-gateway-procs 32 --load-balancer --model-config eyJtb2RlbF9uYW1lX29yX3BhdGgiOiAibWV0YS1sbGFtYS9MbGFtYS0yLTdiLWhmIiwgInRva2VuaXplciI6ICJtZXRhLWxsYW1hL0xsYW1hLTItN2ItaGYiLCAidGFzayI6ICJ0ZXh0LWdlbmVyYXRpb24iLCAidGVuc29yX3BhcmFsbGVsIjogMiwgImluZmVyZW5jZV9lbmdpbmVfY29uZmlnIjogeyJ0ZW5zb3JfcGFyYWxsZWwiOiB7InRwX3NpemUiOiAyfSwgInN0YXRlX21hbmFnZXIiOiB7Im1heF90cmFja2VkX3NlcXVlbmNlcyI6IDIwNDgsICJtYXhfcmFnZ2VkX2JhdGNoX3NpemUiOiA3NjgsICJtYXhfcmFnZ2VkX3NlcXVlbmNlX2NvdW50IjogNTEyLCAibWF4X2NvbnRleHQiOiA4MTkyLCAibWVtb3J5X2NvbmZpZyI6IHsibW9kZSI6ICJyZXNlcnZlIiwgInNpemUiOiAxMDAwMDAwMDAwfSwgIm9mZmxvYWQiOiBmYWxzZX19LCAidG9yY2hfZGlzdF9wb3J0IjogMjk1MDAsICJ6bXFfcG9ydF9udW1iZXIiOiAyNTU1NSwgInJlcGxpY2FfbnVtIjogMSwgInJlcGxpY2FfY29uZmlncyI6IFt7Imhvc3RuYW1lIjogImxvY2FsaG9zdCIsICJ0ZW5zb3JfcGFyYWxsZWxfcG9ydHMiOiBbNTAwNTEsIDUwMDUyXSwgInRvcmNoX2Rpc3RfcG9ydCI6IDI5NTAwLCAiZ3B1X2luZGljZXMiOiBbMCwgMV0sICJ6bXFfcG9ydCI6IDI1NTU1fV0sICJtYXhfbGVuZ3RoIjogNDA5NiwgImFsbF9yYW5rX291dHB1dCI6IGZhbHNlLCAic3luY19kZWJ1ZyI6IGZhbHNlLCAicHJvZmlsZV9tb2RlbF90aW1lIjogZmFsc2V9
dyheal1   292647  1.1  0.0 41353024 474248 pts/8 Sl   11:30   0:10 /home/dyheal1/mambaforge/envs/test/bin/python -m mii.launch.multi_gpu_server --deployment-name mii-endpoint --load-balancer-port 50050 --restful-gateway-port 51080 --restful-gateway-procs 32 --load-balancer --model-config eyJtb2RlbF9uYW1lX29yX3BhdGgiOiAibWV0YS1sbGFtYS9MbGFtYS0yLTdiLWhmIiwgInRva2VuaXplciI6ICJtZXRhLWxsYW1hL0xsYW1hLTItN2ItaGYiLCAidGFzayI6ICJ0ZXh0LWdlbmVyYXRpb24iLCAidGVuc29yX3BhcmFsbGVsIjogMiwgImluZmVyZW5jZV9lbmdpbmVfY29uZmlnIjogeyJ0ZW5zb3JfcGFyYWxsZWwiOiB7InRwX3NpemUiOiAyfSwgInN0YXRlX21hbmFnZXIiOiB7Im1heF90cmFja2VkX3NlcXVlbmNlcyI6IDIwNDgsICJtYXhfcmFnZ2VkX2JhdGNoX3NpemUiOiA3NjgsICJtYXhfcmFnZ2VkX3NlcXVlbmNlX2NvdW50IjogNTEyLCAibWF4X2NvbnRleHQiOiA4MTkyLCAibWVtb3J5X2NvbmZpZyI6IHsibW9kZSI6ICJyZXNlcnZlIiwgInNpemUiOiAxMDAwMDAwMDAwfSwgIm9mZmxvYWQiOiBmYWxzZX19LCAidG9yY2hfZGlzdF9wb3J0IjogMjk1MDAsICJ6bXFfcG9ydF9udW1iZXIiOiAyNTU1NSwgInJlcGxpY2FfbnVtIjogMSwgInJlcGxpY2FfY29uZmlncyI6IFt7Imhvc3RuYW1lIjogImxvY2FsaG9zdCIsICJ0ZW5zb3JfcGFyYWxsZWxfcG9ydHMiOiBbNTAwNTEsIDUwMDUyXSwgInRvcmNoX2Rpc3RfcG9ydCI6IDI5NTAwLCAiZ3B1X2luZGljZXMiOiBbMCwgMV0sICJ6bXFfcG9ydCI6IDI1NTU1fV0sICJtYXhfbGVuZ3RoIjogNDA5NiwgImFsbF9yYW5rX291dHB1dCI6IGZhbHNlLCAic3luY19kZWJ1ZyI6IGZhbHNlLCAicHJvZmlsZV9tb2RlbF90aW1lIjogZmFsc2V9

After manually killing these two processes, I started the server again and the error message was gone. Thanks for the fix!
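
For anyone else who hits this, here is a rough cleanup sketch for leftover launcher processes (it assumes a Linux host with pgrep available; this is plain process management, not an MII API):

# Find and terminate leftover MII launcher processes after
# client.terminate_server(). Standard library only; assumes Linux with pgrep.
import os
import signal
import subprocess

result = subprocess.run(
    ["pgrep", "-f", "mii.launch.multi_gpu_server"],
    capture_output=True, text=True,
)
for pid in result.stdout.split():
    os.kill(int(pid), signal.SIGTERM)  # polite terminate; escalate to SIGKILL if needed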

@mrwyattii (Contributor) commented:

Great to hear. Closing the issue, but please reopen (or create a new issue) if you see this behavior return. I will get this merged into the main branch and it will be part of the next MII release.
