Error messages spilled from persistent deployment for every request #349
Comments
I ran into the same problem.
Thanks @mrwyattii! However, I'm still seeing this error message. I confirm I installed the module from your branch since I have the updated
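When verifying that a fix branch is actually installed, one quick sanity check is to query the installed distribution metadata. A minimal sketch using the standard library (the distribution name `deepspeed-mii` is an assumption here; adjust it to whatever `pip` reports for your install):

```python
from importlib import metadata

# "deepspeed-mii" is an assumed distribution name for illustration;
# replace it with the name your environment actually uses.
try:
    print(metadata.version("deepspeed-mii"))
except metadata.PackageNotFoundError:
    print("deepspeed-mii is not installed")
```

Note that for an editable install from a branch, the reported version may not change between commits, so checking for the updated file contents (as done above) is still the more reliable confirmation.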
Hmm, I'm not able to reproduce this with the fix I have in #350. Could you try adding a print statement that shows the contents of `responses`? Please add `print(f"RANK {self.inference_pipeline.local_rank} RESPONSE:", [r.to_msg_dict() for r in responses])` just before the You will want to modify this file on your local system: Share the output of that print statement. Thanks!
Actually, never mind. With a clean setup from scratch, the error message is now gone! I'm not sure if it was due to a not-properly-terminated process. I previously had a server running overnight. I stopped the server this morning before updating the mii module and still observed the same error message after the update. But then I realized there were two leftover mii processes after I terminated the server by
After manually killing these two processes, I started the server again and the error message is gone. Thanks for the fix!
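The cleanup step above can be sketched generically: find any leftover processes by matching their command line, then kill them by PID. The snippet below uses `sleep 300` as a stand-in for a leftover worker so it is self-contained; for a real MII server you would search for a pattern like `mii` instead.

```shell
# Start a dummy long-running process to stand in for a leftover worker.
sleep 300 &
leftover_pid=$!

# Inspect the process (for MII you might use: pgrep -af mii)
ps -p "$leftover_pid" -o pid,command

# Terminate the leftover process by PID.
kill "$leftover_pid"
```

Stale processes like these can keep gRPC ports or GPU memory allocated, which is consistent with the behavior described above where the error persisted across a module update until the processes were killed.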
Great to hear. Closing the issue, but please reopen (or create a new issue) if you see this behavior return. I will get this merged into the
Hi, in the latest release with version 0.1.2, the persistent server returns responses with more details, including `prompt_length`, `generated_length`, and `finished_reason`. This is awesome and super useful. Thanks for the updates! However, now the persistent server throws error messages on every request:

I can still get valid responses back, with correct `generated_text`, `finish_reason`, etc. But it would be great to eliminate this error message on the server side. For now, as a hacky workaround, I modified line 85 in `grpc_related/task_methods.py` to `finish_reason=str(r.finish_reason.value) if r.finish_reason is not None else "none"`.

Here is my setup. I started the persistent server by:
And I call the server using:
Then I will get the error message on the server side. I'm running this on 2 A100-80GB GPUs and the related module versions are:
Any help will be appreciated and thanks again for the awesome tool you have built!
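The workaround quoted above guards an enum attribute access against `None`: while a sequence is still generating, its finish reason is not set yet, so calling `.value` on it raises `AttributeError`. A minimal self-contained sketch of the pattern (the `FinishReason` enum and `serialize_finish_reason` helper here are illustrative, not MII's actual classes):

```python
from enum import Enum


class FinishReason(Enum):
    # Illustrative stand-in for MII's finish-reason enum.
    STOP = "stop"
    LENGTH = "length"


def serialize_finish_reason(finish_reason):
    # Mirror the workaround: only access .value when the enum is present,
    # otherwise fall back to a sentinel string instead of crashing.
    return str(finish_reason.value) if finish_reason is not None else "none"


print(serialize_finish_reason(FinishReason.STOP))  # -> "stop"
print(serialize_finish_reason(None))               # -> "none"
```

The fix referenced in #350 addresses this on the server side, so the hand-edit to `task_methods.py` should no longer be needed once that change is installed.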