Move the tuner's selecting debug print to INFO level #373

Merged 1 commit on Apr 4, 2024

@rajachan rajachan commented Apr 3, 2024

This print is useful for debugging algorithm selection issues at scale without having to rebuild the plugin. The calculated costs are more developer-focused and can stay under TRACE, but the final choice should be available at INFO when the TUNING debug subsystem is enabled.

NCCL 2.20 moved its algo/proto selection print from TRACE to INFO for this reason as well.
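For readers unfamiliar with level and subsystem gating, here is a minimal sketch of the idea in C. It is not the plugin's actual logging code: every name in it (`log_level`, `log_subsys`, `log_config`, `should_emit`, `log_selection`) is hypothetical, and the message format is made up. It only illustrates why a message logged at INFO under a TUNING subsystem bit becomes visible without a TRACE-enabled build, which in NCCL terms corresponds to running with `NCCL_DEBUG=INFO` and `NCCL_DEBUG_SUBSYS=TUNING`.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical severity levels, ordered from least to most verbose. */
enum log_level { LOG_WARN = 0, LOG_INFO = 1, LOG_TRACE = 2 };

/* Hypothetical subsystem bits, one per debug subsystem. */
enum log_subsys {
    SUBSYS_INIT   = 1u << 0,
    SUBSYS_NET    = 1u << 1,
    SUBSYS_TUNING = 1u << 2
};

/* What the user asked for at run time (assumed to come from the environment). */
struct log_config {
    enum log_level max_level;  /* most verbose level to print */
    unsigned subsys_mask;      /* OR of enabled subsystem bits */
};

/* A message is printed only if its level is enabled AND its subsystem is selected. */
static bool should_emit(const struct log_config *cfg,
                        enum log_level level, unsigned subsys)
{
    return level <= cfg->max_level && (cfg->subsys_mask & subsys) != 0;
}

/* The final selection is logged at INFO, so it shows up without TRACE.
 * Returns true if the line was printed. */
static bool log_selection(const struct log_config *cfg,
                          const char *algo, const char *proto)
{
    if (!should_emit(cfg, LOG_INFO, SUBSYS_TUNING))
        return false;
    printf("tuner: selected algo=%s proto=%s\n", algo, proto);
    return true;
}
```

With a configuration of `{ LOG_INFO, SUBSYS_TUNING }`, the selection line is printed, while anything still logged at `LOG_TRACE` (such as the per-algorithm costs) stays hidden.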

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Signed-off-by: Raghu Raja <[email protected]>
@rajachan rajachan requested a review from AmedeoSapio April 3, 2024 23:37

@AmedeoSapio AmedeoSapio left a comment

Looks good to me.

@rajachan rajachan merged commit b00760e into aws:master Apr 4, 2024
9 of 13 checks passed
@rajachan rajachan deleted the tune-print branch April 4, 2024 03:28