You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on. #722
Hi @bosmart
Thanks @younesbelkada, that makes a lot of sense now. However, I'm now getting a new error:
@lewtun I'm not even using … Is disabling checkpointing just a workaround, or is there a reason why PEFT + DDP + 4-bit can't work with checkpointing enabled?
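(For context on the checkpointing question: a workaround that comes up in similar threads is switching to the non-reentrant gradient-checkpointing implementation, which is generally more DDP-friendly. A minimal sketch, assuming a transformers version recent enough to accept `gradient_checkpointing_kwargs`:)

```python
from transformers import TrainingArguments

# Sketch only: non-reentrant checkpointing sidesteps the incompatibility
# between reentrant checkpointing and DDP's unused-parameter handling
# that PEFT setups often hit. Assumes transformers >= 4.35, where
# `gradient_checkpointing_kwargs` was introduced.
training_args = TrainingArguments(
    output_dir="sft",
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```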
Hi @bosmart
For anyone getting the "You can't train" error in dpo_llama2.py, you can fix it by adding the following to the configs for the model and model-ref:
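(The snippet itself was lost in extraction, but the fix usually cited for this error is to pin each DDP process's model load to its own GPU via `device_map`. A sketch under that assumption; the checkpoint and variable names below are illustrative, not taken from dpo_llama2.py:)

```python
from accelerate import Accelerator
from transformers import AutoModelForCausalLM

# Map the entire model onto this process's local GPU. A quantized model
# must sit on the same device the process trains on, which is exactly
# what the "You can't train..." check enforces.
device_map = {"": Accelerator().local_process_index}

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # hypothetical checkpoint for illustration
    load_in_8bit=True,           # or a BitsAndBytesConfig, depending on version
    device_map=device_map,
)
```

The same `device_map` would be applied when loading the reference model.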
Sorry, what do you mean by the configs for the model and model-ref? I know I can include …
Getting the above error (`You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on.`) when trying to run the Llama2 SFT example:

```shell
accelerate launch sft_llama2.py --output_dir="sft"
```
My accelerate config file:
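(The file's contents didn't survive extraction. For a single-machine, two-GPU DDP run like the one described below, a representative config, offered as an assumption rather than the reporter's actual file, looks like:)

```yaml
# Assumed two-GPU DDP config; the reporter's real file was not captured.
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 2
use_cpu: false
```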
Library versions:
I have a dual 3090 machine.