Various errors in orpo_finetuning_example.ipynb notebook #63

Open
fairydreaming opened this issue Dec 7, 2024 · 3 comments

Comments

@fairydreaming

In orpo_finetuning_example.ipynb, the dataset currently used is "trl-lib/ultrafeedback_binarized", loaded with split="train". Previously it was "mlabonne/orpo-dpo-mix-40k" with split="all". The current dataset causes a failure during trainer creation:

orpo-error

Shouldn't it be split=None when loading the dataset?
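For reference, here is a minimal sketch of what loading with split=None would look like. It assumes the failure comes from the notebook indexing the loaded object by split name ("train" / "test"), which the screenshot would have to confirm, and that the dataset ships both of those splits. The helper names pick_splits and load_preference_splits are hypothetical and not part of the notebook.

```python
def pick_splits(splits, train_name="train", eval_name="test"):
    """Return (train, eval) from a mapping of split name -> dataset.

    Works with a datasets.DatasetDict (a dict subclass) or a plain dict.
    """
    missing = [name for name in (train_name, eval_name) if name not in splits]
    if missing:
        raise KeyError(f"missing splits {missing}; available: {sorted(splits)}")
    return splits[train_name], splits[eval_name]


def load_preference_splits(name="trl-lib/ultrafeedback_binarized"):
    """Load every split of the dataset and return (train, eval).

    Assumption: the dataset has 'train' and 'test' splits.
    Needs the `datasets` package and network access, so it is imported lazily.
    """
    from datasets import load_dataset

    # split=None returns a DatasetDict keyed by split name,
    # while split="train" returns a single Dataset.
    splits = load_dataset(name, split=None)
    return pick_splits(splits)
```

The two results could then be passed to the trainer as train_dataset and eval_dataset, if that is what the notebook expects.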

Also the title in the Exercise section of the notebook is:

Exercise: Aligning SmolLM2 with DPOTrainer

Did you mean ORPOTrainer?

Finally, I noticed that you changed the optimizer to optim="paged_adamw_8bit". This optimizer requires the bitsandbytes library, but that library is not listed in requirements.txt.
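Besides adding bitsandbytes to requirements.txt, one possible workaround is to pick the optimizer name based on whether the library is installed. This is only a sketch: choose_optim is a hypothetical helper, and falling back to "adamw_torch" is my assumption about a reasonable default, not something the notebook does.

```python
from importlib.util import find_spec


def choose_optim(preferred="paged_adamw_8bit",
                 fallback="adamw_torch",
                 required_module="bitsandbytes"):
    """Return `preferred` if `required_module` is importable, else `fallback`.

    find_spec only looks the module up; it does not import it.
    """
    return preferred if find_spec(required_module) is not None else fallback


# The result can be passed as the `optim` value in the training config.
optim = choose_optim()
print(optim)
```

This keeps the notebook runnable on machines without bitsandbytes, at the cost of higher optimizer memory use when the fallback is chosen.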

@fairydreaming
Author

After fixing these problems and finishing the training run I examined the metrics:

orpo-metrics

Accuracy stays at 0.5 the whole time, and the margins barely budged. Does this indicate that the training run failed (at least regarding the preference alignment part)?
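To make the question concrete, this is how I understand the two metrics. It is a sketch under assumptions: the reward of a completion is taken to be beta times its log-probability, accuracy is the fraction of pairs where the chosen reward exceeds the rejected one, and margin is the mean difference. The function name, the beta parameter, and the sample numbers are all made up for illustration.

```python
def preference_metrics(chosen_logps, rejected_logps, beta=1.0):
    """Return (accuracy, mean_margin) for paired log-probabilities.

    Assumed definitions (not taken from the notebook):
      reward   = beta * log-probability
      accuracy = share of pairs with chosen reward > rejected reward
      margin   = mean of (chosen reward - rejected reward)
    """
    if not chosen_logps or len(chosen_logps) != len(rejected_logps):
        raise ValueError("need two non-empty lists of equal length")
    chosen = [beta * c for c in chosen_logps]
    rejected = [beta * r for r in rejected_logps]
    n = len(chosen)
    accuracy = sum(c > r for c, r in zip(chosen, rejected)) / n
    margin = sum(c - r for c, r in zip(chosen, rejected)) / n
    return accuracy, margin


# Made-up values: the chosen answer wins in 2 of 4 pairs.
acc, margin = preference_metrics(
    [-1.0, -2.0, -3.0, -0.5],
    [-1.5, -1.0, -3.5, -0.25],
)
print(acc, margin)  # 0.5 -0.0625
```

Under these definitions, an accuracy of 0.5 means the model prefers the chosen answer in only half of the pairs, which is chance level for a binary preference.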

@fairydreaming
Author

@mmeendez8 Did you manage to fix the orpo_finetuning_example.ipynb notebook so that, judging by the reported metrics, the preference alignment training actually works? I see that you fixed most of the issues I reported here. Did you get the same metric values as me?

@mmeendez8
Contributor

No, sorry. I just went through the notebook quickly to check it and found a couple of issues. I need to spend some time checking that.
