Please follow the installation instructions in INSTALL.
You can find the dataset instructions in DATASET. We have provided all the metadata files of our data.
- All the config.yaml files in our exp are NOT the training configs actually used, since some hyperparameters are changed in the run.sh or test.sh.
- For more config details, you can read the comments in slowfast/config/defaults.py.
- We adopt sparse sampling for all the datasets.
- For those scene-related datasets (e.g., Kinetics), we ONLY add global UniBlocks.
- For those temporal-related datasets (e.g., Sth-Sth), we adopt ALL the designs, including local UniBlocks, global UniBlocks and temporal downsampling.
- If you run into problems during the backward pass, please see issue #4.
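To illustrate the sparse sampling mentioned above, here is a minimal sketch of segment-based frame selection: the video is split into equal segments and one frame is taken from each. The function name `sparse_sample_indices` and its parameters are hypothetical and chosen for illustration; the repository's own data loader may differ in details such as offsets and rounding.

```python
import random


def sparse_sample_indices(num_frames, num_segments, train=False, seed=None):
    """Pick one frame index from each of `num_segments` equal-length segments.

    Hypothetical helper for illustration only, not the repository's loader.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    rng = random.Random(seed)
    seg_len = num_frames / num_segments  # may be fractional
    indices = []
    for i in range(num_segments):
        start = seg_len * i
        # Training: random position inside the segment; testing: its center.
        offset = rng.random() * seg_len if train else seg_len / 2
        # Clamp so the index never runs past the last frame.
        indices.append(min(int(start + offset), num_frames - 1))
    return indices


print(sparse_sample_indices(64, 8))  # [4, 12, 20, 28, 36, 44, 52, 60]
```

With 8 sampled frames the indices span the whole video regardless of its length, which is the main difference from dense sampling of a short contiguous clip.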
N_LAYERS: 4 # number of global UniBlocks
MLP_DROPOUT: [0.5, 0.5, 0.5, 0.5] # dropout for each global UniBlock
CLS_DROPOUT: 0.5 # dropout for the final classification layer
RETURN_LIST: [8, 9, 10, 11] # layer indices for inserting global UniBlocks
NO_LMHRA: True # whether to remove local MHRA from the local UniBlocks (True disables it)
TEMPORAL_DOWNSAMPLE: False # whether to use temporal downsampling in the patch embedding
FROZEN: False # whether to freeze the backbone
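The options above are related to each other, so a quick consistency check before launching a run can help. The sketch below assumes that there should be one RETURN_LIST entry and one MLP_DROPOUT value per global UniBlock, and that the indices must fall inside the backbone (12 layers is assumed here for a ViT-B/16). The function `check_uniformerv2_cfg` is hypothetical and not part of the repository.

```python
def check_uniformerv2_cfg(cfg, backbone_depth=12):
    """Return a list of problems found in a UniFormerV2-style config dict.

    Hypothetical helper; backbone_depth=12 is an assumption for ViT-B/16.
    """
    n_layers = cfg["N_LAYERS"]
    return_list = cfg["RETURN_LIST"]
    mlp_dropout = cfg["MLP_DROPOUT"]
    problems = []
    if len(return_list) != n_layers:
        problems.append("RETURN_LIST needs one layer index per global UniBlock")
    if len(mlp_dropout) != n_layers:
        problems.append("MLP_DROPOUT needs one value per global UniBlock")
    if any(not 0 <= i < backbone_depth for i in return_list):
        problems.append("RETURN_LIST index outside the backbone")
    return problems


cfg = {
    "N_LAYERS": 4,
    "MLP_DROPOUT": [0.5, 0.5, 0.5, 0.5],
    "RETURN_LIST": [8, 9, 10, 11],
}
print(check_uniformerv2_cfg(cfg))  # []
```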
Our models are based on pretrained ViTs, and we use CLIP pretrained models by default:
- Follow extract_clip to extract the visual encoder from CLIP.
- Change MODEL_PATH in slowfast/models/uniformerv2_model.py.
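As a rough idea of what extracting the visual encoder involves, the sketch below keeps only the weights whose names start with a `visual.` prefix and strips that prefix. It works on a plain dictionary so it runs without PyTorch or CLIP installed; the function name `extract_visual_state_dict` and the toy keys are illustrative assumptions, and the extract_clip instructions remain the reference for the actual procedure.

```python
def extract_visual_state_dict(state_dict, prefix="visual."):
    """Keep entries under `prefix` and drop the prefix from their names.

    Hypothetical helper; assumes the visual encoder's weights are stored
    under a "visual." prefix in the full model's state dict.
    """
    return {
        name[len(prefix):]: value
        for name, value in state_dict.items()
        if name.startswith(prefix)
    }


# Toy stand-in for a full checkpoint (values would normally be tensors).
full = {
    "visual.conv1.weight": "w0",
    "visual.class_embedding": "w1",
    "token_embedding.weight": "w2",
    "logit_scale": "w3",
}
print(sorted(extract_visual_state_dict(full)))
# ['class_embedding', 'conv1.weight']
```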
For training, you can simply run the training scripts in exp as follows:
bash ./exp/k400/k400_b16_f8x224/run.sh
For testing, you can simply run the testing scripts in exp as follows:
bash ./exp/k400/k400_b16_f8x224/test.sh
Make sure TRAIN.ENABLE=False. You can set the number of clips and crops (in test.sh) as follows:
TEST.NUM_ENSEMBLE_VIEWS 4
TEST.NUM_SPATIAL_CROPS 3
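With these settings each video is evaluated on 4 temporal clips times 3 spatial crops, for 12 views in total. The sketch below shows that count and one common way to combine per-view scores (a plain average); the function names are hypothetical, and the test code may aggregate differently, for example by summing.

```python
def num_test_views(num_ensemble_views, num_spatial_crops):
    """Total views per video: temporal clips times spatial crops."""
    return num_ensemble_views * num_spatial_crops


def average_view_scores(view_scores):
    """Average per-class scores over all views of one video.

    `view_scores` is a list of equal-length score lists, one per view.
    """
    if not view_scores:
        raise ValueError("need at least one view")
    n = len(view_scores)
    num_classes = len(view_scores[0])
    return [sum(view[c] for view in view_scores) / n for c in range(num_classes)]


print(num_test_views(4, 3))  # 12
print(average_view_scores([[0.25, 0.75], [0.75, 0.25]]))  # [0.5, 0.5]
```

More views usually cost proportionally more inference time, so the two settings trade accuracy against test speed.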
You can also set the checkpoint path as follows:
TEST.CHECKPOINT_FILE_PATH your_model_path
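The `KEY VALUE` pairs above override values loaded from config.yaml, which is also why the YAML files alone do not reflect the settings actually used. The sketch below mimics that behavior on a nested dictionary. It is a plain-Python stand-in with a hypothetical `apply_overrides` function, not the repository's config code, and it does no type coercion.

```python
import copy


def apply_overrides(cfg, opts):
    """Apply a flat [KEY, VALUE, KEY, VALUE, ...] list to a nested dict.

    Keys use dots to address nested sections, e.g. "TEST.NUM_SPATIAL_CROPS".
    Hypothetical helper; returns a new dict and leaves `cfg` untouched.
    """
    if len(opts) % 2 != 0:
        raise ValueError("opts must contain KEY VALUE pairs")
    merged = copy.deepcopy(cfg)
    for key, value in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = merged
        for name in parents:
            node = node[name]
        if leaf not in node:
            raise KeyError(f"unknown config key: {key}")
        node[leaf] = value
    return merged


base = {
    "TRAIN": {"ENABLE": True},
    "TEST": {"NUM_ENSEMBLE_VIEWS": 1, "NUM_SPATIAL_CROPS": 1,
             "CHECKPOINT_FILE_PATH": ""},
}
opts = ["TRAIN.ENABLE", False,
        "TEST.NUM_ENSEMBLE_VIEWS", 4,
        "TEST.NUM_SPATIAL_CROPS", 3,
        "TEST.CHECKPOINT_FILE_PATH", "your_model_path"]
final = apply_overrides(base, opts)
print(final["TEST"]["NUM_ENSEMBLE_VIEWS"], final["TRAIN"]["ENABLE"])  # 4 False
```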