This is the implementation of "Scaled-YOLOv4: Scaling Cross Stage Partial Network" using the PyTorch framework.
# create the docker container; you can increase the shared memory size if you have more memory available.
nvidia-docker run --name yolov4_csp -it -v your_coco_path/:/coco/ -v your_code_path/:/yolo --shm-size=64g nvcr.io/nvidia/pytorch:20.06-py3
# install mish-cuda; if you use a different PyTorch version, you could try https://github.com/JunnYu/mish-cuda
cd /
git clone https://github.com/thomasbrandon/mish-cuda
cd mish-cuda
python setup.py build install
# go to code folder
cd /yolo
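For reference, mish-cuda is a CUDA implementation of the Mish activation, which is defined as x * tanh(softplus(x)). The following is a minimal pure-Python sketch of that formula, meant only for checking values by hand. The function names `softplus` and `mish` are illustrative and are not part of the mish-cuda package.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    # max(x, 0) + log1p(exp(-|x|)) avoids overflow for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-2.0, 0.0, 1.0, 5.0):
        print(value, mish(value))
```

Mish is close to the identity for large positive inputs and close to zero for large negative inputs, which is a quick sanity check for any implementation.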
To use the pre-trained models:
cd /yolo
git clone https://github.com/thomasbrandon/mish-cuda mc
cd mc
# rename every occurrence of mish_cuda to mish_mish, then build.
# 1. mc/src/mish_cuda -> mc/src/mish_mish
# 2. mc/csrc/mish_cuda.cpp -> mc/csrc/mish_mish.cpp
# 3. in mc/setup.py
# 3.1 line 5 -> 'csrc/mish_mish.cpp'
# 3.2 line 11 -> name='mish_mish'
# 3.3 line 20 -> 'mish_mish._C'
python setup.py build
# rename mc/build/lib.xxx folder to mc/build/lib
# modify import in models/common.py
# line 7 ->
from mc.build.lib.mish_mish import MishCuda as Mish
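The manual renaming steps above can be sketched as a small Python script. This is only an illustration under stated assumptions: the helper names `rename_mish_sources` and `normalize_build_dir` are hypothetical, and the script replaces every occurrence of `mish_cuda` in `setup.py` rather than editing the three listed lines one by one, which is assumed to give the same result.

```python
from pathlib import Path

OLD, NEW = "mish_cuda", "mish_mish"


def rename_mish_sources(mc: Path) -> None:
    """Steps 1-3: rename the package folder and C++ source, then patch setup.py."""
    (mc / "src" / OLD).rename(mc / "src" / NEW)
    (mc / "csrc" / f"{OLD}.cpp").rename(mc / "csrc" / f"{NEW}.cpp")
    setup_py = mc / "setup.py"
    # Assumption: a global replace is equivalent to editing the listed lines.
    setup_py.write_text(setup_py.read_text().replace(OLD, NEW))


def normalize_build_dir(mc: Path) -> Path:
    """After `python setup.py build`: rename mc/build/lib.xxx to mc/build/lib."""
    candidates = sorted((mc / "build").glob("lib.*"))
    if len(candidates) != 1:
        raise RuntimeError(
            f"expected exactly one build/lib.* folder, found {len(candidates)}"
        )
    target = mc / "build" / "lib"
    candidates[0].rename(target)
    return target
```

Call `rename_mish_sources(Path("mc"))` before `python setup.py build` and `normalize_build_dir(Path("mc"))` after it. The import in models/common.py still has to be changed by hand.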
Pre-trained weights: yolov4-p5.pt, yolov4-p6.pt, yolov4-p7.pt
# download {yolov4-p5.pt, yolov4-p6.pt, yolov4-p7.pt} and put them in the /yolo/weights/ folder.
python test.py --img 896 --conf 0.001 --batch 8 --device 0 --data coco.yaml --weights weights/yolov4-p5.pt
python test.py --img 1280 --conf 0.001 --batch 8 --device 0 --data coco.yaml --weights weights/yolov4-p6.pt
python test.py --img 1536 --conf 0.001 --batch 8 --device 0 --data coco.yaml --weights weights/yolov4-p7.pt
You will get the following results:
# yolov4-p5
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.51244
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.69771
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.56180
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.35021
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.56247
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.63983
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.38530
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.64048
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.69801
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.55487
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.74368
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.82826
# yolov4-p6
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.53857
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.72015
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.59025
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.39285
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.58283
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.66580
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.39552
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.66504
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.72141
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.59193
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.75844
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.83981
# yolov4-p7
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.55046
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.72925
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.60224
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.39836
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.59854
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.68405
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.40256
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.66929
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.72943
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.59943
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.76873
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.84460
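To compare your own run against the numbers above, the summary lines can be parsed into a dictionary. This is a minimal sketch that assumes the evaluation output was saved as text in the format shown above; `parse_coco_summary` is a hypothetical helper, not part of this repository.

```python
import re
from typing import Dict, Tuple

# Matches one summary line, tolerating variable spacing inside the brackets.
_LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)"
    r"\s*\|\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)

Key = Tuple[str, str, str, int]  # (metric, IoU range, area, maxDets)


def parse_coco_summary(text: str) -> Dict[Key, float]:
    """Return {(metric, iou, area, max_dets): value} for every summary line."""
    results: Dict[Key, float] = {}
    for match in _LINE.finditer(text):
        metric, iou, area, max_dets, value = match.groups()
        results[(metric, iou, area, int(max_dets))] = float(value)
    return results
```

For example, `parse_coco_summary(log)[("AP", "0.50:0.95", "all", 100)]` gives the headline AP of a run, which can then be compared with the value listed for the corresponding model.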
We use multiple GPUs for training. {YOLOv4-P5, YOLOv4-P6, YOLOv4-P7} are trained at input resolutions {896, 1280, 1536}, respectively.
# yolov4-p5
python -m torch.distributed.launch --nproc_per_node 4 train.py --batch-size 64 --img 896 896 --data coco.yaml --cfg yolov4-p5.yaml --weights '' --sync-bn --device 0,1,2,3 --name yolov4-p5
python -m torch.distributed.launch --nproc_per_node 4 train.py --batch-size 64 --img 896 896 --data coco.yaml --cfg yolov4-p5.yaml --weights 'runs/exp0_yolov4-p5/weights/last_298.pt' --sync-bn --device 0,1,2,3 --name yolov4-p5-tune --hyp 'data/hyp.finetune.yaml' --epochs 450 --resume
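Only the yolov4-p5 commands are listed here. A from-scratch training command for the other models could be assembled as sketched below. This is a sketch under assumptions: `build_train_command` and `as_shell` are hypothetical helpers, the config files for P6 and P7 are assumed to follow the `yolov4-p5.yaml` naming pattern, and the batch size and GPU count are left as required arguments because this README does not state them for P6 and P7.

```python
import shlex
from typing import List

# Training resolution per model, as stated in this README.
TRAIN_IMG_SIZE = {"yolov4-p5": 896, "yolov4-p6": 1280, "yolov4-p7": 1536}


def build_train_command(model: str, batch_size: int, gpus: int) -> List[str]:
    """Build the from-scratch training command as an argument list."""
    img = str(TRAIN_IMG_SIZE[model])
    devices = ",".join(str(i) for i in range(gpus))
    return [
        "python", "-m", "torch.distributed.launch", "--nproc_per_node", str(gpus),
        "train.py", "--batch-size", str(batch_size), "--img", img, img,
        "--data", "coco.yaml",
        "--cfg", f"{model}.yaml",  # assumed naming pattern for P6 and P7
        "--weights", "", "--sync-bn", "--device", devices, "--name", model,
    ]


def as_shell(cmd: List[str]) -> str:
    """Render an argument list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    print(as_shell(build_train_command("yolov4-p5", batch_size=64, gpus=4)))
```

With `batch_size=64` and `gpus=4`, the yolov4-p5 output matches the first training command above. Pick a batch size that fits your GPU memory for the larger resolutions.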
If your training process gets stuck, this is due to bugs in Python.
Press Ctrl+C to stop training, then resume training with:
# yolov4-p5
python -m torch.distributed.launch --nproc_per_node 4 train.py --batch-size 64 --img 896 896 --data coco.yaml --cfg yolov4-p5.yaml --weights 'runs/exp0_yolov4-p5/weights/last.pt' --sync-bn --device 0,1,2,3 --name yolov4-p5 --resume
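When several runs exist, the checkpoint path for `--weights` can be looked up instead of typed by hand. The sketch below assumes run folders are named `runs/exp<N>_<name>`, generalizing from the single `runs/exp0_yolov4-p5` path shown above. `latest_checkpoint` is a hypothetical helper, not part of this repository.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(name: str, runs_dir: Path = Path("runs")) -> Optional[Path]:
    """Return exp<N>_<name>/weights/last.pt for the highest N that has one."""
    # Anchored with $ so that "yolov4-p5" does not match "yolov4-p5-tune".
    pattern = re.compile(rf"exp(\d+)_{re.escape(name)}$")
    best = None  # (run index, checkpoint path)
    for run in runs_dir.glob("exp*_*"):
        match = pattern.match(run.name)
        ckpt = run / "weights" / "last.pt"
        if match and ckpt.is_file():
            index = int(match.group(1))
            if best is None or index > best[0]:
                best = (index, ckpt)
    return best[1] if best else None


if __name__ == "__main__":
    print(latest_checkpoint("yolov4-p5"))
```

The function returns `None` when no matching run has a `last.pt`, in which case there is nothing to resume from.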
@article{wang2020scaled,
title={{Scaled-YOLOv4}: Scaling Cross Stage Partial Network},
author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
journal={arXiv preprint arXiv:2011.08036},
year={2020}
}