VMamba v2 Classification checkpoints

@MzeroMiko MzeroMiko released this 16 Mar 08:47
· 65 commits to main since this release

Classification on ImageNet-1K

| name | pretrain | resolution | acc@1 | #params | FLOPs | TP. | Train TP. | configs/logs/ckpts |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| VMamba-T[s2l5] | ImageNet-1K | 224x224 | 82.5 | 31M | 4.9G | 1340 | 464 | config/log/ckpt |
| VMamba-S[s2l15] | ImageNet-1K | 224x224 | 83.6 | 50M | 8.7G | 877 | 314 | config/log/ckpt |
| VMamba-B[s2l15] | ImageNet-1K | 224x224 | 83.9 | 89M | 15.4G | 646 | 247 | config/log/ckpt |
| VMamba-T[s1l8] | ImageNet-1K | 224x224 | 82.6 | 30M | 4.9G | 1686 | 571 | config/log/ckpt |
| VMamba-S[s1l20] | ImageNet-1K | 224x224 | 83.3 | 49M | 8.6G | 1106 | 390 | config/log/ckpt |
| VMamba-B[s1l20] | ImageNet-1K | 224x224 | 83.8 | 87M | 15.2G | 827 | 313 | config/log/ckpt |
  • Models in this subsection are trained from scratch with random or manual initialization. The hyper-parameters are inherited from Swin, except for drop_path_rate and EMA. All models are trained with EMA except for Vanilla-VMamba-T.
  • TP. (throughput) and Train TP. (train throughput) are measured in images/s on an A100 GPU paired with an AMD EPYC 7542 CPU, with batch size 128. Train TP. is tested with mixed resolution, excluding the time consumed by the optimizer.
  • FLOPs and parameter counts now include the classification head (previous versions excluded it, so the numbers are slightly higher than before).
  • We calculate FLOPs with the algorithm @albertgu provides, which yields larger numbers than the previous calculation (which was based on the selective_scan_ref function and ignores the hardware-aware algorithm).
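As a rough illustration of the throughput protocol described above (timed forward passes at a fixed batch size, with warm-up iterations excluded), the measurement loop can be sketched as follows. The function name and parameters here are hypothetical, not the repo's actual benchmark script; a real GPU measurement would also synchronize the device before reading the clock.

```python
import time

def measure_throughput(run_batch, batch_size=128, warmup=5, iters=20):
    """Estimate throughput in images/s for a callable that processes one batch.

    run_batch  -- zero-argument callable running one forward pass (hypothetical)
    batch_size -- images per call (the release notes use 128)
    warmup     -- untimed calls to stabilize caches/clocks before measuring
    iters      -- timed calls averaged into the final number
    """
    for _ in range(warmup):          # warm-up passes, excluded from timing
        run_batch()
    start = time.perf_counter()
    for _ in range(iters):           # timed passes
        run_batch()
    elapsed = time.perf_counter() - start
    return iters * batch_size / elapsed
```

On a real model this would wrap the forward pass (e.g. `lambda: model(images)`), with `torch.cuda.synchronize()` before each clock read so asynchronous kernels are fully counted.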
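For intuition on why counting the hardware-aware selective scan raises the FLOPs numbers, here is a back-of-the-envelope sketch. It assumes the commonly cited 9·B·L·D·N estimate for one selective-scan call (B batch, L sequence length, D channels, N state size); the function name and the optional-term bookkeeping are illustrative, not the repo's exact counting code.

```python
def selective_scan_flops(B, L, D, N, with_D=True, with_Z=False):
    """Approximate FLOPs of one selective-scan call (hypothetical helper).

    Core recurrence: ~9 FLOPs per (batch, position, channel, state) element,
    covering discretization and the scan itself under the hardware-aware
    counting. Optional terms add the skip connection and output gating.
    """
    flops = 9 * B * L * D * N
    if with_D:
        flops += B * D * L   # skip connection: y += D * u
    if with_Z:
        flops += B * D * L   # gating: y *= activation(z)
    return flops
```

A reference-style count that ignores the hardware-aware algorithm attributes fewer operations to the same call, which is why the table's FLOPs are larger than in earlier releases.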