Fixed configuration of both llama2 and llama3 #50

Merged: 6 commits merged into ROCm:rocm_dev on Jan 28, 2025

Conversation

indianspeedster

Corrected the configuration for llama2-7B in train_llama2.sh.

The old configuration differed from what is specified in the original llama2 release and on Hugging Face.

Changes made (a sketch of the corrected block follows the list):

FFN_HIDDEN_SIZE = 11008
NUM_KV_HEADS = 32

Collaborator

@lcskrishna lcskrishna left a comment


@indianspeedster Thanks for fixing the 7B values in our scripts. Can you also review train_llama3.sh, just to avoid such errors in the future?

@indianspeedster
Author

> @indianspeedster Thanks for fixing the 7B values in our scripts. Can you also review train_llama3.sh, just to avoid such errors in the future?

@lcskrishna I checked all the configurations. max_position_embeddings needed to be changed to match the original llama2 and llama3 configurations. I have changed those, and the rest of the configuration looks fine to me.

For reviewers:

Official llama2 70B config: https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/main/config.json
Official llama2 7B config: https://huggingface.co/meta-llama/Llama-2-7b-hf/blob/main/config.json

Official llama3 8B config: https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/config.json
Official llama3 70B config: https://huggingface.co/meta-llama/Llama-3.1-70B/blob/main/config.json

@@ -114,7 +114,8 @@ if [[ $MODEL_SIZE -eq 8 ]]; then #llama2-7B
 NUM_LAYERS=32 # e.g. llama-13b: 40
 NUM_HEADS=32 # e.g. llama-13b: 40
 SEQ_LENGTH=$SEQ_LENGTH
-NUM_KV_HEADS=8 # llama2 70B uses GQA
+NUM_KV_HEADS=8
+MAX_POSITION_EMBEDDINGS=$MAX_POSITION_EMBEDDINGS
gurpreet-dhami (Collaborator)


We can omit this MAX_POSITION_EMBEDDINGS line if it uses the same value as the one defined at the top, can't we?

indianspeedster (Author)


Hi @gurpreet-dhami,

Yes, MAX_POSITION_EMBEDDINGS is repeated. I have removed it, and I also realized that SEQ_LENGTH is repeated as well; I will remove that too.

@wenchenvincent
Collaborator

@indianspeedster Could you write a clear description in each commit message explaining what that commit does, instead of the generic message "Update train_llama2.sh"?

@indianspeedster indianspeedster changed the title Update train_llama2.sh Removed Repetitive MAX_POSITION_EMBEDDINGS Jan 24, 2025
@indianspeedster indianspeedster changed the title Removed Repetitive MAX_POSITION_EMBEDDINGS Fixed configuration of both llam2 and llama3 Jan 24, 2025
@indianspeedster indianspeedster changed the title Fixed configuration of both llam2 and llama3 Fixed configuration of both llama2 and llama3 Jan 24, 2025
@indianspeedster indianspeedster force-pushed the rocm_dev branch 2 times, most recently from c86af52 to 5503fa4 Compare January 24, 2025 18:29
@indianspeedster
Author

indianspeedster commented Jan 24, 2025

@wenchenvincent Modified the commit message of all the commits.

@gurpreet-dhami gurpreet-dhami merged commit fe353fd into ROCm:rocm_dev Jan 28, 2025