-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deepseek v2 16b enablement with IFU #51
base: rocm_dev
Are you sure you want to change the base?
Conversation
@lcskrishna @wenchenvincent updated PR after IFU merge. |
@@ -136,7 +136,7 @@ def forward( | |||
position_ids=None, | |||
): | |||
"""Forward pass for multi-latent attention""" | |||
assert rotary_pos_emb is None, "Rotary position embeddings should not be passed into MLA." | |||
#assert rotary_pos_emb is None, "Rotary position embeddings should not be passed into MLA." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure this is Okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be Okay, it is not used in the method.
Line 165 , self._adjust_key_value_for_inference uses rotary_pos_emb=None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If rotary_pos_emb is not really used in this forward function, could we just pass None to rotary_pos_emb when this forward function is called?
@@ -1,5 +1,4 @@ | |||
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved. | |||
# Copyright (c) 2024, Advanced Micro Devices, Inc. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we made some minor changes on this file? So it seems that we still need to put the AMD Copyright statement here...
@@ -0,0 +1,243 @@ | |||
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this file from NV upstream or PAI?
@hakankiymaz-amd Could you run through the files again to check if the copyright statements are fine? |
@wenchenvincent sure. and here are the test results. |
Deepseek v2 16b enablement
MI300X-DeepSeek-V2-Lite-bf16-seq2048-tp1pp1ep8-mbsgbs-ac_sel-do_true-fa_true-sp_true-20250127_141425.log
throughput per GPU: 601.616