I am trying to use two GPUs with tensor_parallel=2. It seems memory is released on only one GPU, and some process is still running. client.terminate_server doesn't seem to kill all processes. I can kill the leftover process manually, but how can I do this properly in Python code?
This was a bug that has been fixed in #262. Please update to the latest main (we will also do a patch release with this and other bug fixes later this week).
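Until you can update, a generic workaround is to clean up at the OS level. This is only a minimal sketch, not part of the library's API: it assumes you start the server yourself as a subprocess on a POSIX system, and `launch_in_own_group` and `terminate_group` are hypothetical helper names. The idea is to put the server and every worker it spawns in one process group, then signal the whole group.

```python
import os
import signal
import subprocess


def launch_in_own_group(cmd):
    """Start cmd as the leader of a new session, and so a new process group.

    Child processes it spawns (for example one worker per GPU) inherit the
    group unless they deliberately leave it. POSIX only.
    """
    return subprocess.Popen(cmd, start_new_session=True)


def terminate_group(proc, timeout=10.0):
    """SIGTERM the whole group, wait for the leader, then SIGKILL stragglers."""
    pgid = proc.pid  # with start_new_session=True the leader's pid is the pgid
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return  # nothing left in the group
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        pass  # leader ignored SIGTERM; the SIGKILL below handles it
    try:
        # Force-kill anything still in the group (e.g. a worker that hung).
        os.killpg(pgid, signal.SIGKILL)
    except (ProcessLookupError, PermissionError):
        pass  # group already empty (some platforms report this as EPERM)
    proc.wait()
```

Usage would look like `proc = launch_in_own_group([...your server command...])` followed later by `terminate_group(proc)`. This does not help if the server is started for you inside the library; in that case updating to a build that includes the fix is the proper solution.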