Fixed configuration of both llama2 and llama3 #50
Conversation
@indianspeedster Thanks for fixing the 7B values inside our scripts. Could you review train_llama3.sh as well, just to avoid such errors in the future?
@lcskrishna I checked all the configurations. max_position_embeddings needed to be changed to match the original llama2 and llama3 configurations; I have changed those, and all the other values look fine to me.
For reviewers:
Official llama2 70B config: https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/main/config.json
Official llama3 8B config: https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/config.json
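As a quick cross-check of the script values against the official configs, something like the following could be used (a minimal sketch; it assumes curl and jq are installed, that the raw-URL pattern is correct, and that HF_TOKEN holds a Hugging Face token since the meta-llama repos are gated):

```sh
# Sketch: print the fields from the official Llama-2-70b config that the
# training scripts need to mirror. Adjust the URL/token for your environment.
curl -s -H "Authorization: Bearer $HF_TOKEN" \
  https://huggingface.co/meta-llama/Llama-2-70b-hf/raw/main/config.json \
  | jq '{hidden_size, intermediate_size, num_hidden_layers,
         num_attention_heads, num_key_value_heads, max_position_embeddings}'
```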
examples/llama/train_llama3.sh (Outdated)

```diff
@@ -114,7 +114,8 @@ if [[ $MODEL_SIZE -eq 8 ]]; then #llama2-7B
     NUM_LAYERS=32 # e.g. llama-13b: 40
     NUM_HEADS=32 # e.g. llama-13b: 40
     SEQ_LENGTH=$SEQ_LENGTH
-    NUM_KV_HEADS=8 # llama2 70B uses GQA
+    NUM_KV_HEADS=8
+    MAX_POSITION_EMBEDDINGS=$MAX_POSITION_EMBEDDINGS
```
We can omit this MAX_POSITION_EMBEDDINGS line if it uses the same value as defined initially at the top, can't we?
Hi @gurpreet-dhami,
Yes, MAX_POSITION_EMBEDDINGS is repeated. I have removed it, and I also realized that SEQ_LENGTH is redundant in the same way; I will remove that too.
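For illustration, a minimal sketch of the pattern being discussed (the surrounding values and defaults here are assumed, not taken from the actual script):

```sh
# Assumed defaults defined once near the top of the script.
SEQ_LENGTH=8192
MAX_POSITION_EMBEDDINGS=8192

if [[ $MODEL_SIZE -eq 8 ]]; then
    NUM_LAYERS=32
    NUM_HEADS=32
    NUM_KV_HEADS=8
    # SEQ_LENGTH=$SEQ_LENGTH                            # self-assignment, a no-op: drop it
    # MAX_POSITION_EMBEDDINGS=$MAX_POSITION_EMBEDDINGS  # likewise redundant
fi
```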
@indianspeedster Could you write a clear description in the commit message telling what each commit does, instead of the generic message "Update train_llama2.sh"?
Branch updated from c86af52 to 5503fa4 (Compare): …original config of llama2-7b
Branch updated from 5503fa4 to 56720ee (Compare)
@wenchenvincent I have modified the commit messages of all the commits.
Corrected the configuration for llama2-7b in train_llama2.sh.
The old configuration differed from what is specified in the original llama2 release and on Hugging Face.
Modified values:
FFN_HIDDEN_SIZE = 11008
NUM_KV_HEADS = 32
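For reference, a sketch of what the corrected llama2-7B branch might look like, using the values from the official Hugging Face config (hidden_size 4096, intermediate_size 11008, 32 layers, 32 attention heads, 32 KV heads, max_position_embeddings 4096). The variable names mirror the diff excerpt above, but the exact surrounding structure of train_llama2.sh is assumed:

```sh
if [[ $MODEL_SIZE -eq 7 ]]; then          # llama2-7B
    HIDDEN_SIZE=4096
    FFN_HIDDEN_SIZE=11008                 # intermediate_size in the HF config
    NUM_LAYERS=32
    NUM_HEADS=32
    NUM_KV_HEADS=32                       # llama2-7B uses standard MHA, not GQA
    MAX_POSITION_EMBEDDINGS=4096          # matches the official llama2 config
fi
```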