
trtllm0.9 changes #149

Merged

Conversation

jiemingz (Collaborator)

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Changelog

  • Please update the CHANGELOG.md under next version with high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

Checklist when contributing a new algorithm

  • Does the trainer resume and restore all model states?
  • Does the trainer support all parallelism techniques (PP, TP, DP)?
  • Does the trainer support max_steps=-1 and validation?
  • Does the trainer only call APIs defined in alignable_interface.py?
  • Does the trainer have proper logging?

Additional Information

  • Related to # (issue)

Signed-off-by: jiemingz <=>
@github-actions github-actions bot added the Utils label Apr 10, 2024
Comment on lines 89 to 92
if parallel_state.get_pipeline_model_parallel_world_size() > 1 and not self.reshard_model:
    group = parallel_state.get_pipeline_model_parallel_group()
    src = parallel_state.get_pipeline_model_parallel_first_rank()
    output_ids = broadcast_2d_tensor(output_ids, src, group, dtype=output_ids.dtype)
Collaborator

https://github.com/NVIDIA/NeMo-Aligner/blob/geshen/main_trt/nemo_aligner/utils/distributed.py#L102

I think you can use this function instead. The groups are unchanged when we don't have resharding, and when there is resharding this should be fairly trivial (though let me know if you think there's a perf hit).

Collaborator Author

Wouldn't this also broadcast to GPUs in the same TP group? TRT-LLM already has those values equal across TP.

Collaborator

yeah, do you think that's too much of a perf hit, or does it not matter?

Collaborator Author

I don't think it would affect perf, but it does make the code less clear.
This broadcast only happens without resharding; do we need to guard it?

Collaborator

okay, let's guard on no reshard and use that function then?
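
A minimal sketch of the agreed change, assuming the helper linked above (nemo_aligner/utils/distributed.py#L102) broadcasts a 2D tensor from the first pipeline rank across the PP group; the helper name broadcast_2d_tensor_within_pp and its signature are assumptions for illustration, not code from this PR:

    from nemo_aligner.utils.distributed import broadcast_2d_tensor_within_pp  # assumed name for the linked helper

    # Still guarded on "no reshard", per the discussion above. The helper may also
    # broadcast to ranks in the same TP group, which is redundant (TRT-LLM output_ids
    # are already identical across TP) but, per the thread, not expected to hurt perf.
    if parallel_state.get_pipeline_model_parallel_world_size() > 1 and not self.reshard_model:
        output_ids = broadcast_2d_tensor_within_pp(output_ids, dtype=output_ids.dtype)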

@@ -38,6 +38,12 @@ trainer:
  trt_llm:
    enable: False
    reshard: False # if True then reshard the model into TP only for inference
    max_context_len: ${int_div:${model.encoder_seq_length}, 2}
    model_type: "LLaMAForCausalLM"
@gshennvm (Collaborator), Apr 10, 2024

oh... forgot about this. we need to change this for mixtral etc?

@JimmyZhang12, Apr 10, 2024

Yeah, with the TRT-LLM unified builder we can only build models that TRT-LLM supports, and we have to specify that model type.
NeMo export had a way to automatically detect the model type, but it was prone to failure, so I think it's best if the user explicitly defines the TRT-LLM model type.
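
A hypothetical sketch (not code from this PR) of the approach described above: require the user to set the TRT-LLM model type explicitly in the config and fail fast when it is not one the unified builder supports, rather than auto-detecting it. The function name and the set of supported types below are placeholders:

    # Hypothetical illustration: validate an explicitly configured TRT-LLM model type.
    SUPPORTED_TRTLLM_MODEL_TYPES = {"LLaMAForCausalLM"}  # placeholder; extend with other builder-supported classes

    def resolve_trtllm_model_type(cfg_model_type: str) -> str:
        if cfg_model_type not in SUPPORTED_TRTLLM_MODEL_TYPES:
            raise ValueError(
                f"model_type '{cfg_model_type}' is not supported by the TRT-LLM unified builder; "
                "set trainer.trt_llm.model_type explicitly to a supported model class."
            )
        return cfg_model_type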

Collaborator

okay it'll be good to just list which ones people can specify

jiemingz added 3 commits April 10, 2024 15:03
Signed-off-by: jiemingz <=>
Signed-off-by: jiemingz <=>
Signed-off-by: jiemingz <=>
jiemingz added 4 commits April 11, 2024 12:43
fix
Signed-off-by: jiemingz <=>
Signed-off-by: jiemingz <=>
Signed-off-by: jiemingz <=>
Signed-off-by: jiemingz <=>
@gshennvm (Collaborator) left a comment

LGTM once you resolve the 2 small comments

Signed-off-by: jiemingz <=>
@gshennvm gshennvm merged commit 0983164 into NVIDIA:geshen/main_trt Apr 17, 2024
1 of 3 checks passed