Exposing GenerationConfig in the GRPO Trainer #2702

Open
Superskyyy opened this issue Jan 30, 2025 · 4 comments
Labels
✨ enhancement New feature or request 🏋 GRPO Related to GRPO

Comments

@Superskyyy
Contributor

Feature request

People often need to customize the generation config, but right now it is embedded in the training loop. It should be easy to extract it and expose it to users.

Motivation

Customization

Your contribution

I can help to contribute.

@github-actions github-actions bot added ✨ enhancement New feature or request 🏋 GRPO Related to GRPO labels Jan 30, 2025
@qgallouedec
Member

More control on the generation does make sense. A reasonable way to allow for more control is probably to add more generation args in the GRPOConfig.
Are you willing to contribute?
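
To illustrate the suggestion above, here is a minimal sketch of what extra generation arguments on a config object could look like. It is plain Python built around a hypothetical `SketchGRPOConfig` dataclass, not TRL's actual `GRPOConfig`; the field names (`temperature`, `top_p`, `top_k`, `repetition_penalty`, `max_completion_length`) and the `generation_kwargs` helper are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SketchGRPOConfig:
    """Hypothetical stand-in for a trainer config; not TRL's real GRPOConfig."""

    max_completion_length: int = 256
    temperature: float = 1.0
    top_p: Optional[float] = None
    top_k: Optional[int] = None
    repetition_penalty: Optional[float] = None

    def generation_kwargs(self) -> Dict[str, Any]:
        """Collect the sampling settings, skipping any left unset (None)."""
        kwargs: Dict[str, Any] = {
            "max_new_tokens": self.max_completion_length,
            "do_sample": True,
            "temperature": self.temperature,
        }
        optional = {
            "top_p": self.top_p,
            "top_k": self.top_k,
            "repetition_penalty": self.repetition_penalty,
        }
        kwargs.update({k: v for k, v in optional.items() if v is not None})
        return kwargs


# Only the fields the user sets end up in the generation arguments.
config = SketchGRPOConfig(temperature=0.7, top_p=0.9)
print(config.generation_kwargs())
```

The trade-off with this design is that every new generation option needs its own config field, which is the concern raised later in the thread.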

@Superskyyy
Contributor Author

> More control on the generation does make sense. A reasonable way to allow for more control is probably to add more generation args in the GRPOConfig. Are you willing to contribute?

Yes I will contribute this feature.

@Benjoyo

Benjoyo commented Jan 31, 2025

> > More control on the generation does make sense. A reasonable way to allow for more control is probably to add more generation args in the GRPOConfig. Are you willing to contribute?
>
> Yes I will contribute this feature.

Please add stop_strings or stopping criteria :)

Although I don’t see why we wouldn’t just expose the full generation config, to avoid the next issue of this type in a few weeks.

@Superskyyy
Contributor Author

> > > More control on the generation does make sense. A reasonable way to allow for more control is probably to add more generation args in the GRPOConfig. Are you willing to contribute?
> >
> > Yes I will contribute this feature.
>
> Please add stop_strings or stopping criteria :)
>
> Although I don’t see why we wouldn’t just expose the full generation config, to avoid the next issue of this type in a few weeks.

@qgallouedec wdyt? Maybe we should just expose the entire generation config directly, because there are all kinds of tricks that people might want to tune there.
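
A rough sketch of the pass-through alternative raised in this thread: the trainer keeps its own defaults and merges a user-supplied dictionary on top, so options such as `stop_strings` need no dedicated config field. The function `build_generation_kwargs` and the values in `DEFAULT_GENERATION_KWARGS` are hypothetical and not part of TRL; the keys are assumed to correspond to attributes of `transformers.GenerationConfig`.

```python
from typing import Any, Dict, Optional

# Hypothetical trainer defaults; the real values used by TRL may differ.
DEFAULT_GENERATION_KWARGS: Dict[str, Any] = {
    "max_new_tokens": 256,
    "do_sample": True,
    "temperature": 1.0,
}


def build_generation_kwargs(
    user_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the defaults overridden by whatever the user passed in."""
    merged = dict(DEFAULT_GENERATION_KWARGS)  # copy so the defaults stay untouched
    merged.update(user_kwargs or {})
    return merged


# User-supplied options win over defaults; unknown-to-the-trainer keys pass through.
kwargs = build_generation_kwargs(
    {"temperature": 0.7, "stop_strings": ["</answer>"]}
)
print(kwargs)
```

With `transformers` installed, a dictionary like this could presumably be turned into a config via `GenerationConfig(**kwargs)`. Note that, as far as I know, `stop_strings` also requires passing the tokenizer to `generate`, so a real implementation would need to handle that case.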
