Feature request / 功能建议
I'm struggling to manage a dataset of only tens of files, and I can only imagine what it would be like with hundreds of samples. Keeping the path and the prompt as separate, unnumbered lists makes it hard to verify that each sample has the right prompt, or to tell whether one is missing (and if so, which).
I propose changing the JSON dataset format to something like:
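A minimal sketch of what such a format could look like — one object per sample, with "path" and "prompt" kept together. The extra fields are illustrative examples from the motivation, not a fixed schema:

```python
import json

# Hypothetical sketch of the proposed dataset format: one object per sample,
# so a missing or mismatched prompt is immediately visible next to its path.
dataset = [
    {
        "path": "videos/0001.mp4",  # relative to the dataset root
        "prompt": "a cat playing with a ball of yarn",
        # optional house-keeping fields could be added per sample later:
        "fps": 24,
        "frames": 49,
        "width": 720,
    },
    {
        "path": "videos/0002.mp4",
        "prompt": "a timelapse of clouds over mountains",
    },
]

# Round-trip through JSON to confirm the structure serializes cleanly.
serialized = json.dumps(dataset, indent=2)
restored = json.loads(serialized)
print(len(restored))  # number of samples
```

Because each sample is one self-contained object, optional fields can vary per entry without breaking the rest of the file.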
Motivation / 动机
This also makes it easy to add extra per-sample information later, even if only for housekeeping: "fps", "frames", "width", and so on.
The array could even be wrapped in an additional layer, if a dataset needs to carry other top-level information too.
Your contribution / 您的贡献
If the proposal is accepted, I can open a PR for this.
You need to pass --dataset_root as the directory where the videos are located, --dataset_file as the JSON file, --video_column as the name of the attribute that contains the path (the path must be relative to --dataset_root), and --caption_column as the attribute containing the prompt. Let me know if something doesn't work, and please feel free to modify the code and submit PRs to make it better suited for general use.
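The flags above can be sketched as a small validator — a hedged example, not part of the actual codebase. It checks that every entry in the JSON file carries the video and caption columns, and that each relative path resolves under the dataset root:

```python
import json
import tempfile
from pathlib import Path


def validate_dataset(dataset_root, dataset_file,
                     video_column="path", caption_column="prompt"):
    """Return a list of problems found in the dataset file (empty = OK)."""
    root = Path(dataset_root)
    problems = []
    samples = json.loads(Path(dataset_file).read_text())
    for i, sample in enumerate(samples):
        if video_column not in sample:
            problems.append(f"sample {i}: missing column {video_column!r}")
        elif not (root / sample[video_column]).exists():
            problems.append(f"sample {i}: file not found: {sample[video_column]}")
        if caption_column not in sample:
            problems.append(f"sample {i}: missing column {caption_column!r}")
    return problems


# Quick self-check with a throwaway dataset: one valid sample,
# one with a missing file and no prompt.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "clip.mp4").touch()
    (root / "dataset.json").write_text(json.dumps([
        {"path": "clip.mp4", "prompt": "a sample caption"},
        {"path": "missing.mp4"},
    ]))
    problems = validate_dataset(root, root / "dataset.json")
    print(problems)
```

Running a check like this before training makes a misnumbered or missing prompt show up as a named sample index rather than a silent mismatch.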