How to build a dataset for finetuning CogVideoX I2V 1.5 #169

Open
royvelich opened this issue Dec 31, 2024 · 1 comment

Comments

@royvelich

Hi,
I want to fine-tune the CogVideoX I2V 1.5 (5B) model. I have a set of videos I want to use, but first I need to preprocess them so they meet the model's requirements. Do my fine-tuning videos have to match the generation properties of the model? That is, in the case of CogVideoX 1.5, should the videos satisfy:

  • Min(W, H) = 768
  • 768 ≤ Max(W, H) ≤ 1360
  • Max(W, H) % 16 = 0
  • Video Length: 5 seconds or 10 seconds
  • Frame Rate: 16 frames / second

Do I need to make sure that all my fine-tuning videos follow those guidelines?
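For reference, here is a minimal validation sketch (my own, not an official script from this repo; it assumes `opencv-python` is installed, and the ~0.25 s duration tolerance is an arbitrary choice) that flags clips which don't already match the constraints above:

```python
# Minimal sketch: check whether a clip satisfies the CogVideoX 1.5 constraints
# listed above, using OpenCV. Not an official preprocessing script.
import cv2


def check_cogvideox15_clip(path: str) -> list[str]:
    """Return a list of violated constraints (an empty list means the clip passes)."""
    cap = cv2.VideoCapture(path)
    if not cap.isOpened():
        return [f"could not open {path}"]

    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.release()

    duration = n_frames / fps if fps else 0.0
    problems = []
    if min(w, h) != 768:
        problems.append(f"min(W, H) is {min(w, h)}, expected 768")
    if not (768 <= max(w, h) <= 1360):
        problems.append(f"max(W, H) is {max(w, h)}, expected 768..1360")
    if max(w, h) % 16 != 0:
        problems.append(f"max(W, H) = {max(w, h)} is not divisible by 16")
    if round(fps) != 16:
        problems.append(f"frame rate is {fps:.2f}, expected 16 fps")
    if not any(abs(duration - t) < 0.25 for t in (5.0, 10.0)):
        problems.append(f"duration is {duration:.2f}s, expected ~5s or ~10s")
    return problems
```

Clips that fail these checks would presumably need to be resized/cropped and resampled to 16 fps (e.g. with ffmpeg) before being added to the dataset.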

@sayakpaul (Collaborator)

These are the recommended settings; they ensure the results don't deviate because of discrepancies between your fine-tuning data and the model's original training setup. But I think we can still fine-tune it hard enough to adapt to particular settings. Some models are easier to adapt than others (LTX, for example, is quite adaptable).
