How to build a dataset for finetuning CogVideoX I2V 1.5 #169

Open
royvelich opened this issue Dec 31, 2024 · 1 comment

Comments

@royvelich

Hi,
I want to fine-tune the CogVideoX I2V 1.5 (5B) model. I have a set of videos I want to use, but first I need to preprocess them so they meet the model's requirements. Do my fine-tuning videos have to match the generation properties of the model? That is, in the case of CogVideoX 1.5, should the videos satisfy:

  • Min(W, H) = 768
  • 768 ≤ Max(W, H) ≤ 1360
  • Max(W, H) % 16 = 0
  • Video Length: 5 seconds or 10 seconds
  • Frame Rate: 16 frames / second

Do I need to make sure that all my fine-tuning videos follow those guidelines?
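For reference, here is a minimal validation sketch (my own, not an official script from this repo; it assumes `opencv-python` is installed, and the ~0.25 s duration tolerance is an arbitrary choice) that flags clips which don't already match the constraints above:

```python
# Minimal sketch: check whether a clip satisfies the CogVideoX 1.5 constraints
# listed above, using OpenCV. Not an official preprocessing script.
import cv2


def check_cogvideox15_clip(path: str) -> list[str]:
    """Return a list of violated constraints (an empty list means the clip passes)."""
    cap = cv2.VideoCapture(path)
    if not cap.isOpened():
        return [f"could not open {path}"]

    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.release()

    duration = n_frames / fps if fps else 0.0
    problems = []
    if min(w, h) != 768:
        problems.append(f"min(W, H) is {min(w, h)}, expected 768")
    if not (768 <= max(w, h) <= 1360):
        problems.append(f"max(W, H) is {max(w, h)}, expected 768..1360")
    if max(w, h) % 16 != 0:
        problems.append(f"max(W, H) = {max(w, h)} is not divisible by 16")
    if round(fps) != 16:
        problems.append(f"frame rate is {fps:.2f}, expected 16 fps")
    if not any(abs(duration - t) < 0.25 for t in (5.0, 10.0)):
        problems.append(f"duration is {duration:.2f}s, expected ~5s or ~10s")
    return problems
```

Clips that fail these checks would presumably need to be resized/cropped and resampled to 16 fps (e.g. with ffmpeg) before being added to the dataset.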

@sayakpaul (Collaborator)

These are the recommended settings; they ensure the results don't deviate because of discrepancies between your fine-tuning data and the model's original training setup. But I think we can still fine-tune it hard enough to adapt to particular settings. Some models are easier to adapt than others (LTX, for example, is quite adaptable).
