Fixed configuration of both llama2 and llama3 #50

Merged: 6 commits merged into ROCm:rocm_dev on Jan 28, 2025

Conversation

indianspeedster

Corrected the configuration for llama2-7B in train_llama2.sh.

The old configuration differed from what is specified in the original llama2 release and on Hugging Face.

Changes made (a sketch of the corrected block follows the list):

FFN_HIDDEN_SIZE = 11008
NUM_KV_HEADS = 32

Collaborator

@lcskrishna lcskrishna left a comment


@indianspeedster Thanks for fixing the 7B values in our scripts. Can you also review train_llama3.sh, just to avoid such errors in the future?

@indianspeedster
Author

> @indianspeedster Thanks for fixing the 7B values in our scripts. Can you also review train_llama3.sh, just to avoid such errors in the future?

@lcskrishna I checked all the configurations. max_position_embeddings needed to be changed to match the original llama2 and llama3 configurations. I have changed those, and the rest of the configuration looks fine to me.

For reviewers:

Official llama2 70B config: https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/main/config.json
Official llama2 7B config: https://huggingface.co/meta-llama/Llama-2-7b-hf/blob/main/config.json

Official llama3 8B config: https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/config.json
Official llama3 70B config: https://huggingface.co/meta-llama/Llama-3.1-70B/blob/main/config.json

@@ -114,7 +114,8 @@ if [[ $MODEL_SIZE -eq 8 ]]; then #llama2-7B
 NUM_LAYERS=32 # e.g. llama-13b: 40
 NUM_HEADS=32 # e.g. llama-13b: 40
 SEQ_LENGTH=$SEQ_LENGTH
-NUM_KV_HEADS=8 # llama2 70B uses GQA
+NUM_KV_HEADS=8
+MAX_POSITION_EMBEDDINGS=$MAX_POSITION_EMBEDDINGS
gurpreet-dhami (Collaborator)


We can omit this MAX_POSITION_EMBEDDINGS line if it uses the same value as the one defined at the top, can't we?

indianspeedster (Author)


Hi @gurpreet-dhami,

Yes, MAX_POSITION_EMBEDDINGS is repeated. I have removed it, and I also realized that SEQ_LENGTH is repeated as well; I will remove that too.

@wenchenvincent
Collaborator

@indianspeedster Could you write a clear description in each commit message explaining what that commit does, instead of the generic message "Update train_llama2.sh"?

@indianspeedster indianspeedster changed the title Update train_llama2.sh Removed Repetitive MAX_POSITION_EMBEDDINGS Jan 24, 2025
@indianspeedster indianspeedster changed the title Removed Repetitive MAX_POSITION_EMBEDDINGS Fixed configuration of both llam2 and llama3 Jan 24, 2025
@indianspeedster indianspeedster changed the title Fixed configuration of both llam2 and llama3 Fixed configuration of both llama2 and llama3 Jan 24, 2025
@indianspeedster indianspeedster force-pushed the rocm_dev branch 2 times, most recently from c86af52 to 5503fa4 Compare January 24, 2025 18:29
@indianspeedster
Author

indianspeedster commented Jan 24, 2025

@wenchenvincent Modified the commit message of all the commits.

@gurpreet-dhami gurpreet-dhami merged commit fe353fd into ROCm:rocm_dev Jan 28, 2025