Handle Non-Zero Return Codes in Process Execution in start_pytorch_task.py #226

Closed
clemente0731 opened this issue Aug 30, 2023 · 2 comments

Comments

@clemente0731
Contributor

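Description:

While reviewing start_pytorch_task.py in the training/run_benchmarks/pytorch directory (line 193), I found that it may be necessary to check the return code of each process being executed. The code currently calls proc.wait() to wait for each process to finish, but it does not handle non-zero return codes.

Suggested change:

Modify the code block starting at line 193 of start_pytorch_task.py to check the return code of each process after it finishes.
1. If a process's return code is non-zero:
2. Raise an exception with an informative error message.
3. Include the process ID in the error message to identify the failing process.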
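
The suggested change could be sketched roughly as follows. This is a minimal illustration, not the actual code in start_pytorch_task.py: the helper name `wait_and_check` and the assumption that the launched processes are held in a list of `subprocess.Popen` objects are both hypothetical.

```python
import subprocess
from typing import Iterable


def wait_and_check(procs: Iterable[subprocess.Popen]) -> None:
    """Wait for every process, then raise if any exited with a non-zero code.

    Hypothetical helper: assumes the launcher keeps its child processes
    as subprocess.Popen objects.
    """
    failures = []
    for proc in procs:
        # Popen.wait() blocks until the child exits and returns its exit code.
        returncode = proc.wait()
        if returncode != 0:
            # On POSIX, a negative code means the child was killed by a signal.
            failures.append((proc.pid, returncode))

    if failures:
        details = ", ".join(
            f"pid {pid} exited with code {code}" for pid, code in failures
        )
        raise RuntimeError(f"{len(failures)} process(es) failed: {details}")
```

Waiting on every process before raising avoids leaving the remaining children unreaped when an early one fails. The caller can let the exception propagate so that the launcher itself also exits with a non-zero status.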

@clemente0731
Contributor Author

The current problem is that the process return value is always 0, regardless of whether the model launch fails or succeeds.

@shh2000
Collaborator

shh2000 commented Aug 31, 2023

#225

@shh2000 shh2000 closed this as completed Aug 31, 2023