The --target_class_idx flag is for parallelizing generation across different GPUs: the process runs separately for each class in the cls-wise version of training the DataDream weights, and for both cls- and dset-wise generation. That is, if you want to split N classes among M GPUs, this is how you assign individual classes to GPUs.
e.g. to put class 0 on GPU 1, you could use
CUDA_VISIBLE_DEVICES=1 accelerate launch datadream.py \
    --target_class_idx=0 \
...
To generate the full dataset, you would need to execute this command once per class, for target_class_idx = 0 to N-1 (see the loop sketch below).
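For example, a minimal loop over all classes on a single GPU could look like the sketch below. Here N = 100 and the single flag shown are assumptions; pass whatever other arguments your config needs:

```bash
# Sketch: run the per-class DataDream step for every class on one GPU.
# N (the number of classes) is an assumption; set it to your dataset's class count.
N=100
for ((i=0; i<N; i++)); do
  CUDA_VISIBLE_DEVICES=0 accelerate launch datadream.py \
    --target_class_idx=$i
done
```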
To use CLASS_IDX, you would specify each class individually (this could be convenient if you are using slurm).
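For instance, a Slurm array job could map one array index to one class. This is only a sketch under the assumption of 100 classes and one GPU per task, not the repo's actual launcher:

```bash
#!/bin/bash
#SBATCH --array=0-99     # one array task per class (assumes 100 classes)
#SBATCH --gres=gpu:1     # one GPU per task

# Each array task handles exactly one class index.
CLASS_IDX=$SLURM_ARRAY_TASK_ID
accelerate launch datadream.py --target_class_idx=$CLASS_IDX
```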
On the other hand, SPLIT_IDX provides a way to split the classes evenly among M available GPUs. In bash_run.sh, SET_SPLIT defines M (currently set to M = 5), so each run handles 1/5th of the classes on a given GPU.
e.g. if you have 100 classes and 5 GPUs, then
bash bash_run.sh 2 0
would submit classes 0-19 to GPU 2. To generate the full dataset, you would need to run this for SPLIT_IDX = 0 to 4, assigning each split to the desired GPU (see the sketch below).
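Putting it together, a full run could be dispatched with something like the following sketch, assuming bash_run.sh takes the GPU index first and SPLIT_IDX second (as in the example above) and that GPUs 0-4 are free:

```bash
# Sketch: launch all 5 splits in parallel, one split per GPU.
for SPLIT_IDX in 0 1 2 3 4; do
  GPU=$SPLIT_IDX                      # here split i simply goes to GPU i
  bash bash_run.sh $GPU $SPLIT_IDX &
done
wait   # block until all splits have finished
```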
I don't know how to set CLASS_IDX and SPLIT_IDX. Can you give an example, please?