Deepseek v2 16b enablement with IFU #51

hakankiymaz-amd · 2025-01-27T16:05:02Z

Deepseek v2 16b enablement
MI300X-DeepSeek-V2-Lite-bf16-seq2048-tp1pp1ep8-mbsgbs-ac_sel-do_true-fa_true-sp_true-20250127_141425.log
throughput per GPU: 601.616
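
The log file name above appears to encode the run configuration. The following is a minimal sketch of pulling those fields out of such a name. `parse_log_name` is a hypothetical helper, not part of this PR, and reading `tp`/`pp`/`ep` as parallelism sizes is an assumption based on common naming rather than anything stated here. Segments with no value attached (such as `mbsgbs`) are left unparsed.

```python
import re

# File name copied from the PR description.
LOG_NAME = (
    "MI300X-DeepSeek-V2-Lite-bf16-seq2048-tp1pp1ep8-mbsgbs-"
    "ac_sel-do_true-fa_true-sp_true-20250127_141425.log"
)


def parse_log_name(name: str) -> dict:
    """Hypothetical helper: extract numeric and boolean fields from a log name."""
    fields = {}

    # Sequence length, e.g. "seq2048".
    match = re.search(r"-seq(\d+)-", name)
    if match:
        fields["seq"] = int(match.group(1))

    # Combined segment such as "tp1pp1ep8" (assumed to be parallelism sizes).
    match = re.search(r"-tp(\d+)pp(\d+)ep(\d+)-", name)
    if match:
        fields["tp"], fields["pp"], fields["ep"] = (int(g) for g in match.groups())

    # Boolean switches such as "do_true" or "fa_true"; keys are kept as written.
    for key, value in re.findall(r"-([a-z]+)_(true|false)(?=-)", name):
        fields[key] = value == "true"

    return fields


config = parse_log_name(LOG_NAME)
```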

hakankiymaz-amd · 2025-01-27T16:07:09Z

@lcskrishna @wenchenvincent updated PR after IFU merge.

wenchenvincent · 2025-01-28T02:34:04Z

megatron/core/transformer/multi_latent_attention.py

@@ -136,7 +136,7 @@ def forward(
        position_ids=None,
    ):
        """Forward pass for multi-latent attention"""
-        assert rotary_pos_emb is None, "Rotary position embeddings should not be passed into MLA."
+        #assert rotary_pos_emb is None, "Rotary position embeddings should not be passed into MLA."


Are we sure this is okay?

hakankiymaz-amd · 2025-01-28

This should be okay; rotary_pos_emb is not used in this method.
On line 165, self._adjust_key_value_for_inference uses rotary_pos_emb=None.

wenchenvincent · 2025-01-28

If rotary_pos_emb is not actually used in this forward function, could we just pass None for rotary_pos_emb where the forward function is called?
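
A minimal sketch of the alternative suggested in this thread: keep the assertion in `forward` and have the call site pass `None` instead. `ToyMLA` and `call_attention` are hypothetical stand-ins for illustration only, not the Megatron classes or the actual call path in this PR.

```python
from typing import Optional


class ToyMLA:
    """Hypothetical stand-in for an MLA-style attention module (not Megatron code)."""

    def forward(self, hidden_states, rotary_pos_emb: Optional[object] = None):
        # Same guard as the assertion shown in the diff above.
        assert rotary_pos_emb is None, (
            "Rotary position embeddings should not be passed into MLA."
        )
        # A real module would compute attention here; the toy returns its input.
        return hidden_states


def call_attention(attention, hidden_states, rotary_pos_emb=None):
    """Hypothetical call site: drop rotary_pos_emb for MLA, forward it otherwise."""
    if isinstance(attention, ToyMLA):
        rotary_pos_emb = None
    return attention.forward(hidden_states, rotary_pos_emb=rotary_pos_emb)


# The caller may hold a rotary embedding, but MLA never receives it.
output = call_attention(ToyMLA(), [1.0, 2.0], rotary_pos_emb="rope")
```

With this shape the guard stays active, so any other caller that passes a rotary embedding into the MLA module by mistake still fails loudly.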

megatron/core/transformer/dot_product_attention.py

megatron/core/transformer/mlp.py

megatron/training/utils.py

wenchenvincent · 2025-01-29T20:18:00Z

megatron/core/transformer/dot_product_attention.py

@@ -1,5 +1,4 @@
 # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
-# Copyright (c) 2024, Advanced Micro Devices, Inc. All rights reserved.


It seems that we made some minor changes to this file, so we still need to keep the AMD copyright statement here.

wenchenvincent · 2025-01-29T20:21:37Z

megatron/core/models/deepseekv2/model.py

@@ -0,0 +1,243 @@
+# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.


Is this file from NV upstream or PAI?

wenchenvincent · 2025-01-29T20:25:22Z

@hakankiymaz-amd Could you go through the files again to check that the copyright statements are correct?
The guideline is that if we port a file from NV or PAI as is, we don't add an AMD copyright statement to it. But if we make changes, we need to add the AMD copyright statement.
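
The guideline above can be sketched as a small check. This is an illustration only: `copyright_issue` is a hypothetical helper, and whether a file was modified relative to upstream is passed in as a flag, because how that is determined is not specified in this thread.

```python
from typing import Optional

# Marker taken from the header line shown in the diffs above.
AMD_MARK = "Advanced Micro Devices, Inc."


def copyright_issue(header: str, modified_by_amd: bool) -> Optional[str]:
    """Hypothetical check: return a problem description, or None if the header is fine."""
    has_amd = AMD_MARK in header
    if modified_by_amd and not has_amd:
        return "modified file is missing the AMD copyright statement"
    if not modified_by_amd and has_amd:
        return "file ported as is should not carry an AMD copyright statement"
    return None


nv_only = "# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.\n"
nv_and_amd = nv_only + (
    "# Copyright (c) 2024, Advanced Micro Devices, Inc. All rights reserved.\n"
)

# A modified file with only the NVIDIA header is flagged.
problem = copyright_issue(nv_only, modified_by_amd=True)
```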

hakankiymaz-amd · 2025-01-29T21:51:23Z

@wenchenvincent Sure, and here are the test results.
test_report_dsv2_ifu.csv

hakankiymaz-amd added 30 commits January 22, 2025 15:19

refactor deepseekv2

86e35f3

edit args file

1c2a2ce

transformer rms_norm fix

433aea9

attention layer, transformer config

5819960

layer specs

3638ad3

layer norm fix

a9c9ab6

attention args

fdefee4

layer specs and attention fix

6528af8

add get qkv tensors

58f070f

forward method position id fix

7076c31

import path modify

906c44e

fix

12023f7

fix transformer config param

f7ebad2

update on transformers

3a59ced

transformer block update

8c45703

rename submodules

eb025b6

rms norm

9176adf

attention submodules

15ee1c0

MLLA self attn for deepseekv2

c37155a

modify dsv2 config

3c454f9

add copyright statements

ce9e38f

copyright

e02b242

add TE for MLA

eac6edf

MLA config fix

bb3ac15

add markers for MLA unit test

dba8aa1

update TE layers

fd1759d

te correction in layer specs

61f98fd

rebase ifu and deepseek

0968e7c

updates sync with ifu

65047d2

fix args for tokenizer

e2e2682

hakankiymaz-amd requested review from lcskrishna and wenchenvincent January 27, 2025 16:06

wenchenvincent reviewed Jan 28, 2025

lcskrishna requested changes Jan 28, 2025

megatron/core/transformer/dot_product_attention.py (outdated, resolved)

megatron/core/transformer/mlp.py (outdated, resolved)

megatron/training/utils.py (outdated, resolved)

hakankiymaz-amd added 2 commits January 28, 2025 02:30

remove copyright and unused functions

5a092fd

rotary emb and mla fix

e2404ad

wenchenvincent reviewed Jan 29, 2025
