
args to reproduce tiny-yolo metrics on img_size=320 #1111

Closed
akshaychawla opened this issue Apr 28, 2020 · 13 comments
Labels: Stale

@akshaychawla

Hello,

I'm trying to reproduce the following tiny-yolo mAP metric:
[Screenshot: tiny-yolo mAP metric table]

Are the following arguments correct?

python train.py --data data/coco2014.data --weights '' --img-size 320 --epochs 300 --cfg cfg/yolov3-tiny.cfg --batch-size 64 --accumulate 1 

Note that img-size is set to 320 instead of 416, since I will only be testing on images of size 320x320.

@glenn-jocher (Member)

@akshaychawla to test pretrained weights:

$ python test.py --cfg yolov3-tiny.cfg --weights yolov3-tiny.pt --img 320

To train from scratch see https://github.com/ultralytics/yolov3#reproduce-our-results

$ python train.py --weights '' --cfg yolov3-tiny.cfg --epochs 300 --batch-size 64 --img 320 640

Reduce --batch-size if you get CUDA out-of-memory errors.

@akshaychawla (Author)

Thanks @glenn-jocher , I'll test out these arguments and post the results here.

@glenn-jocher (Member)

FYI the training is multi-scale 320-640, which produces better results at 320 (actually at all resolutions) than simply training at 320.
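For reference, multi-scale training here means each batch is resized to a random resolution between the two bounds, rounded to the network stride. A minimal PyTorch sketch of the idea, not this repo's exact implementation (the imgs tensor and the bounds are assumptions):

import random
import torch.nn.functional as F

def multi_scale_resize(imgs, min_size=320, max_size=640, stride=32):
    # Pick a random training resolution that is a multiple of the network stride.
    size = random.randrange(min_size, max_size + 1, stride)
    # Resize the whole (N, C, H, W) batch to the chosen square resolution.
    return F.interpolate(imgs, size=(size, size), mode='bilinear', align_corners=False)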

@akshaychawla (Author)

akshaychawla commented May 1, 2020

NOTE: The results shown in this comment are misleading, please look at the latest comments below which contain correct information.


Update on training:

commit f1d73a29e549654c99674bf07dd8f7a2f5c19d18
python train.py --weights '' --cfg yolov3-tiny.cfg --epochs 300 --batch-size 64 --img-size 320 640 --data 'data/coco2017.data' --device '0,1' --name 'baseline' 

(this will be updated when the training finishes)

[Screenshot: training curves for the run so far]

In comparison to #696 , the differences are:

  1. They reach ~0.20 mAP at 100 epochs, while this run is at ~0.15 mAP at 100 epochs. But I'll wait until the training finishes, since there seems to be a large jump coming up at 200 epochs.
  2. I don't set the --multi-scale flag, but it seems to be doing that anyway, since the image sizes range from 320 to 640.
  3. The --prebias flag isn't available in the current version of train.py. Is it the same as burn-in? (See the warmup sketch at the end of this comment.)

Other things:
Running without mixed precision on 2x Tesla V100 GPUs.
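In case it helps with the --prebias question above: burn-in (warmup) usually means ramping the learning rate from near zero to its nominal value over the first batches. A sketch of that common pattern (the function and argument names are assumptions, not this repo's exact code):

def warmup_lr(optimizer, ni, n_burn=1000, base_lr=0.01):
    # Linearly ramp the learning rate over the first n_burn integrated batches (ni).
    if ni <= n_burn:
        for g in optimizer.param_groups:
            g['lr'] = base_lr * ni / n_burn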

@glenn-jocher (Member)

@akshaychawla yes, there have been a lot of updates since #696, which train to higher mAPs. Your command looks fine, but your results do not look right. This is the most recent training of the two models we have:
[Plot: most recent training results for yolov3-spp and yolov3-tiny]

@glenn-jocher (Member)

@akshaychawla if you have two GPUs at your disposal I would simply install apex on your system, and train one model on each. I think you'll find that yolov3-spp works much better and is still extremely fast.
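For anyone following along, the usual Apex mixed-precision wiring looks roughly like this; a minimal sketch using the standard amp API (the model, optimizer, and loss names are placeholders):

from apex import amp  # NVIDIA Apex: https://github.com/NVIDIA/apex

# Wrap an existing FP32 model and optimizer for mixed-precision training.
model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

# In the training loop, scale the loss before calling backward().
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()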

About your discrepancy: there must be some difference in your code. I would cancel your training, git clone a fresh copy, and start from scratch. I've attached the results.txt files; you should be seeing similar results.
results_yolov3-spp-ultralytics131.txt
results_yolov3-tiny_ultralytics132.txt
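To compare those results.txt files against your own run, the repo's plotting helper can overlay them; usage from memory, so treat the exact import path as an assumption:

from utils.utils import plot_results

# Overlays loss and mAP curves from all results*.txt files in the current directory.
plot_results()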

@akshaychawla (Author)

akshaychawla commented May 1, 2020

Sure, I'll pull a fresh copy of the repository and train yolov3-tiny from scratch with mixed precision. However, we're a little bit inflexible on the architecture right now because the code is being modified to support distillation from model A (yolov3) to model B (yolov3-tiny). Will post an update as soon as training finishes.

Also, just to confirm: we're training on COCO2017, right? The one from get_coco2017.sh?

Thanks!

@glenn-jocher (Member)

glenn-jocher commented May 1, 2020

@akshaychawla 2014 and 2017 use the same images, just a different breakdown between train/val. The above plots are for 2014 (to compare to the original YOLOv3 paper results), but you will see identical results when training 2017.

Just make sure that later on you test with the same dataset you trained on. So train with 2017; then, if you want to use test.py later, specify python test.py --data coco2017.data
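For reference, the --data file is a darknet-style config pointing at the class count, image lists, and names file; the paths below are illustrative, so check your own data/coco2017.data:

classes=80
train=../coco/train2017.txt
valid=../coco/val2017.txt
names=data/coco.names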

@akshaychawla (Author)

Dev box: 1x Tesla V100, Ubuntu 18.04

Training & testing on COCO2017 (from data/get_coco2017.sh)

python train.py --weights '' --cfg yolov3-tiny.cfg --epochs 300 --batch-size 64 --data 'data/coco2017.data' --device '0' --img 320 640

[Plot: training curves for the full 300-epoch yolov3-tiny run]

Last epoch's mAP@0.5: 0.339

Link to results and weights: https://drive.google.com/drive/folders/1Q_lUdBiLnh7VNWm8heXf3BJuBFIcAIRM?usp=sharing

@glenn-jocher (Member)

@akshaychawla ah, great! Yes, this all looks correct. When you test this model directly with test.py, pycocotools will give the official COCO mAP, which tends to be a bit higher than our locally produced mAP, i.e.:

python3 test.py --data data/coco2017.data --cfg yolov3-tiny.cfg --weights weights/last.pt
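Under the hood, the official number comes from pycocotools' COCOeval; a minimal standalone sketch (both JSON paths are placeholders):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Ground-truth annotations and detections in COCO JSON format (paths are placeholders).
coco_gt = COCO('annotations/instances_val2017.json')
coco_dt = coco_gt.loadRes('detections.json')

coco_eval = COCOeval(coco_gt, coco_dt, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints AP@[0.50:0.95], AP@0.50, etc.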

@akshaychawla (Author)

Thanks Glenn! We were hoping to get the official COCO mAP at the end of training, but it seems that the version of pycocotools installed in our environment has an issue (cocodataset/cocoapi#356) with the installed version of numpy.

I'll be sure to post the cocotools metrics later with the corrected environment, for now we're moving on to test knowledge distillation.

Again, thank you so much for your work!

@glenn-jocher (Member)

glenn-jocher commented May 5, 2020

@akshaychawla ah yes. I believe you need to enforce numpy==1.17 in order for pycocotools to function properly. pycocotools mAP is typically about 1% higher than ours (for unknown reasons), so your result is just about in line with the README. The only other 'catch' is that mAP@0.5 is highest at --conf 0.5, while mAP@0.5:0.95 is highest at --conf 0.7, so the training results show mAP at the middle ground, --conf 0.6. If you really want to maximize one or the other (as in the README table), you should set --conf accordingly (but we are talking about fractions of a percent here, so not a big difference).
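If you hit that numpy incompatibility, pinning numpy before reinstalling pycocotools is the usual workaround (the exact pin follows the comment above):

$ pip install numpy==1.17
$ pip install -U pycocotools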

@github-actions (bot)

github-actions bot commented Jun 5, 2020

This issue is stale because it has been open 30 days with no activity. Remove Stale label or comment or this will be closed in 5 days.
