CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/finetune_imagenet_vicmae_base.sh vit_base_patch16 vicmae_pretrain_vit_base_in1k_k400.pth imagenet/
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/finetune_imagenet_vicmae_large.sh vit_large_patch16 vicmae_pretrain_vit_large_in1k_k400.pth imagenet/
Coming soon.
The pre-trained models we provide are trained with normalized pixels (--norm_pix_loss) for 800 epochs.
The fine-tuning hyper-parameters differ slightly from those of the default baseline, which uses unnormalized pixels.
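For readers unfamiliar with the normalized-pixel objective, the sketch below illustrates the general idea behind a flag like --norm_pix_loss: each target patch is standardized to zero mean and unit variance before the reconstruction error is computed. It is a minimal illustration in pure Python, not the code from this repository. The function names normalize_patch and norm_pix_mse, the eps value, and the use of population variance are all assumptions made for the example.

```python
from statistics import fmean, pvariance


def normalize_patch(pixels, eps=1e-6):
    """Standardize one flattened patch to zero mean and (near) unit variance.

    Hypothetical helper: `eps` and the population variance are assumptions.
    """
    mean = fmean(pixels)
    var = pvariance(pixels, mu=mean)
    return [(p - mean) / (var + eps) ** 0.5 for p in pixels]


def norm_pix_mse(pred, target_pixels):
    """Mean squared error between a prediction and the normalized target patch."""
    target = normalize_patch(target_pixels)
    return fmean([(p - t) ** 2 for p, t in zip(pred, target)])


# Example: a prediction that matches the normalized target gives zero loss.
patch = [0.0, 2.0, 4.0, 6.0]
perfect_prediction = normalize_patch(patch)
loss = norm_pix_mse(perfect_prediction, patch)
```

Because the target is rescaled per patch, the loss values are on a different scale than with raw pixels, which is one plausible reason the fine-tuning hyper-parameters differ between the two settings.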