How to confirm nccl environment variables #305

Open
sunyq1995 opened this issue Mar 18, 2020 · 5 comments

Comments

@sunyq1995

Hi all,
I want to confirm which NCCL environment variables I am currently using, but I can't find a way to get them. What should I do? Thanks a lot for your help~

@kwen2501
Contributor

Which environment variable do you want to check? If you set NCCL_DEBUG=INFO, you would be able to see most environment variables you set.
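As a complement to the NCCL_DEBUG=INFO output, here is a minimal sketch for listing the NCCL_* variables that are set in the current process environment. The helper name `nccl_env_vars` is made up for illustration and is not part of NCCL or PyTorch. It only shows what the process would hand to NCCL, not what NCCL actually read.

```python
import os


def nccl_env_vars(environ=None):
    """Return the NCCL_* variables found in `environ`, sorted by name.

    `environ` defaults to the current process environment. This helper is
    hypothetical (not an NCCL API); it only inspects environment variables.
    """
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in sorted(environ) if name.startswith("NCCL_")}


if __name__ == "__main__":
    found = nccl_env_vars()
    if not found:
        print("No NCCL_* variables are set in this process.")
    for name, value in found.items():
        print(f"{name}={value}")
```

Running this in the same shell or job script that launches training shows which NCCL_* variables the training process would inherit.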

@sunyq1995
Author

NCCL_DEBUG=INFO
Thanks for your quick reply~
I want to confirm NCCL_P2P_LEVEL and some IB information, but I cannot see this information when I set NCCL_DEBUG=INFO.

@kwen2501
Contributor

Which NCCL version are you using? If you are using the v2.6 branch, you would indeed need to set both NCCL_DEBUG=INFO and NCCL_DEBUG_SUBSYS=GRAPH to see NCCL_P2P_LEVEL. On the master branch as of now, setting NCCL_DEBUG=INFO alone will do the job.

Which IB environment variable are you looking for? We can perhaps provide a short patch for you to see it.

Also please note that if you have not set the environment variable, NCCL will not print it out.
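A minimal sketch of the points above: it builds an environment with NCCL_DEBUG=INFO and NCCL_DEBUG_SUBSYS=GRAPH for a child process, and reports which variables of interest are not set at all (and so would not be printed). The function names `build_debug_env` and `unset_variables` are hypothetical helpers, not NCCL APIs, and whether the GRAPH subsystem is needed depends on the NCCL version as described above.

```python
import os


def build_debug_env(base=None, subsys="GRAPH"):
    """Copy `base` (default: os.environ) and enable NCCL debug output.

    Pass subsys=None to leave NCCL_DEBUG_SUBSYS untouched, for versions
    where NCCL_DEBUG=INFO alone is enough.
    """
    env = dict(os.environ if base is None else base)
    env["NCCL_DEBUG"] = "INFO"
    if subsys:
        env["NCCL_DEBUG_SUBSYS"] = subsys
    return env


def unset_variables(names, env):
    """Return the names from `names` that are absent from `env`.

    NCCL does not print variables that were never set, so these are the
    ones that will not show up in the debug log.
    """
    return [name for name in names if name not in env]
```

The result can be passed to a launcher, for example `subprocess.run([sys.executable, "train.py"], env=build_debug_env())`, where `train.py` stands in for your own training script.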

@sunyq1995
Author

NCCL version 2.4.8+cuda10.0
It didn't print anything when I typed NCCL_DEBUG=INFO.

I'm training a PyTorch BERT model on multiple GPUs, but it doesn't speed up as expected and the IB traffic is very low. I want to check whether anything is wrong with the NCCL settings, so I would need all of the IB environment variables, if you can provide them.

Thanks again for your help~

@kwen2501
Contributor

NCCL 2.4.8 should print NCCL_P2P_LEVEL too (if it is set).

If you are debugging performance issues, you are welcome to post your NCCL_DEBUG=INFO logs here and/or run the NCCL performance tests (https://github.com/NVIDIA/nccl-tests).
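Before posting logs, it can help to pull out the lines where NCCL reports a variable it picked up. The sketch below assumes those lines contain the text "NCCL INFO <NAME> set by environment to <value>"; that wording is an assumption and may differ between NCCL versions, so adjust the pattern to match your own log. `env_settings_from_log` is a hypothetical helper name.

```python
import re

# Assumed log wording; check it against your own NCCL_DEBUG=INFO output.
_ENV_LINE = re.compile(r"NCCL INFO (NCCL_[A-Z0-9_]+) set by environment to (\S+)")


def env_settings_from_log(log_text):
    """Return {variable: value} for every matching line in `log_text`.

    If a variable is reported more than once (for example once per rank),
    the last reported value wins.
    """
    settings = {}
    for match in _ENV_LINE.finditer(log_text):
        # Some versions end the line with a period; drop it from the value.
        settings[match.group(1)] = match.group(2).rstrip(".")
    return settings
```

If the returned dictionary is empty even though the variables are set, either the debug output never reached the log or the line format differs from the assumed one.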
