Different mAP from cocoAPI #171
Comments
Sorry, I think you mentioned in #161 that aligning the results is on the TODO list, right? |
@xiao1228 to compute COCOAPI mAP properly you need to set You have yours set to 0.30, which is good for real world results but produces lower mAP. |
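To illustrate why a higher confidence threshold lowers mAP, here is a minimal sketch for a single class on toy data. The function name `average_precision` and the sample detections are hypothetical, and the all-point interpolation used here is a simplification, not the exact pycocotools procedure. Dropping low-confidence detections cuts off the high-recall tail of the precision-recall curve, so less area is accumulated.

```python
def average_precision(scores, is_tp, n_gt, conf_thres):
    """All-point interpolated AP for one class (illustrative, not pycocotools).

    Only detections with score >= conf_thres are kept.
    """
    dets = sorted(
        ((s, tp) for s, tp in zip(scores, is_tp) if s >= conf_thres),
        key=lambda d: d[0],
        reverse=True,
    )
    tp_cum, fp_cum = 0, 0
    recalls, precisions = [], []
    for _, tp in dets:
        if tp:
            tp_cum += 1
        else:
            fp_cum += 1
        recalls.append(tp_cum / n_gt)
        precisions.append(tp_cum / (tp_cum + fp_cum))
    # Make precision monotonically non-increasing from left to right.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the stepwise precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Toy detections for one class with 5 ground-truth boxes (made-up values).
scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1, 0.05]
is_tp = [True, True, False, True, True, False, True]

ap_low_thres = average_precision(scores, is_tp, n_gt=5, conf_thres=0.001)
ap_high_thres = average_precision(scores, is_tp, n_gt=5, conf_thres=0.30)
print(ap_low_thres, ap_high_thres)
```

On this toy data the 0.001 threshold gives the higher AP, because the detections scored below 0.30 still add recall.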
@xiao1228 also note the mAP computation in the current repo is not properly aligned to the COCO metric since it averages over the image dimension rather than the class dimension. We have a branch with modifications to align these more closely, you may want to use it instead, or wait a few days until we merge it with the master branch: https://github.com/ultralytics/yolov3/tree/map_update |
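A rough sketch of the difference between the two averaging orders, on made-up data. This is one possible reading of "averaging over images" (AP computed per image, then averaged), and is an assumption rather than the repo's actual code; all names here are hypothetical.

```python
from collections import defaultdict


def ap(dets, n_gt):
    """All-point interpolated AP from a list of (score, is_tp) pairs."""
    dets = sorted(dets, key=lambda d: d[0], reverse=True)
    tp, fp, recalls, precisions = 0, 0, [], []
    for _, is_tp in dets:
        tp += is_tp
        fp += not is_tp
        recalls.append(tp / n_gt)
        precisions.append(tp / (tp + fp))
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    area, prev = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        area += (r - prev) * p
        prev = r
    return area


def mean_ap(detections, gt_counts, key_index):
    """Group by image (key_index=0) or class (key_index=1), then average AP."""
    dets_by_key, gt_by_key = defaultdict(list), defaultdict(int)
    for image_id, class_id, score, is_tp in detections:
        dets_by_key[(image_id, class_id)[key_index]].append((score, is_tp))
    for key, n in gt_counts.items():
        gt_by_key[key[key_index]] += n
    aps = [ap(dets_by_key[k], n) for k, n in gt_by_key.items()]
    return sum(aps) / len(aps)


# (image_id, class_id, score, is_tp); toy values only.
detections = [
    (0, 0, 0.9, True),
    (1, 0, 0.8, True),
    (1, 1, 0.7, False),
    (1, 1, 0.6, True),
]
# (image_id, class_id) -> number of ground-truth boxes.
gt_counts = {(0, 0): 1, (1, 0): 1, (1, 1): 2}

map_over_images = mean_ap(detections, gt_counts, key_index=0)
map_over_classes = mean_ap(detections, gt_counts, key_index=1)
print(map_over_images, map_over_classes)
```

The same detections give two different numbers depending on the grouping, which is why a per-image average cannot be compared directly with the class-averaged COCO metric.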
From #7: UPDATE: difference narrowed down to 0.531 (repo calculation) vs 0.551 (pycocotools). The rm -rf yolov3 && git clone -b map_update --depth 1 https://github.com/ultralytics/yolov3 yolov3
python3 test.py --conf-thres 0.001 --save-json
Namespace(batch_size=32, cfg='cfg/yolov3.cfg', conf_thres=0.001, data_cfg='cfg/coco.data', img_size=416, iou_thres=0.5, nms_thres=0.5, save_json=True, weights='weights/yolov3.weights')
Using cuda _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', major=7, minor=0, total_memory=16130MB, multi_processor_count=80)
Image Total P R mAP
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 157/157 [07:00<00:00, 2.09s/it]
5000 5000 0.0865 0.727 0.531
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.308
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.551
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.308
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.143
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.334
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.455
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.267
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.407
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.432
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.240
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.470
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.590 |
Hi @glenn-jocher
|
@xiao1228 did you train to 270 epochs? If not then yes of course your mAP will be lower. |
ok. this is after 110 epochs. I will update you then. |
Oh ok. Still that seems strangely low. We were getting about 0.45 mAP on pycocotools after 70 epochs before, which was the longest we managed to train. It takes so long to train to 270 that we haven't had time to try. 0.162 at epoch 110 seems too low to me. Are you just running default training from the darknet53 backbone (i.e. all settings default)? |
I did not change anything except that I lowered the lr at the 50th epoch, like your previous code. |
If I don't lower the lr at the 50th epoch, the results at the 72nd epoch from COCOAPI with
|
@xiao1228 oh boy, you're going down the rabbit hole now. The question of self-trained mAP is always a big one, especially since we are not completely sure of the optimal loss function to use (if we use the darknet default loss, we see worse results in our tests of the first 10 epochs). I'll see if I can find one of our old checkpoints. This is from release v1.0 (https://github.com/ultralytics/yolov3/releases), which is a bit old now. I think this checkpoint was around epoch 65. I tested it using our map_update branch (https://github.com/ultralytics/yolov3/branches), which we will merge with the master soon. There have been changes of course since v1.0, but these should not change the mAP as much as you are seeing. You've likely altered some other setting that is causing your mAP drop, or perhaps not initialized with the darknet53 backbone. In this result, test.py natively returns 0.41 mAP, and pycocotools returns 0.425 mAP on best_v1_0.pt around epoch 65.
python3 test.py --save-json --weights weights/best_v1_0.pt
Namespace(batch_size=32, cfg='cfg/yolov3.cfg', conf_thres=0.001, data_cfg='cfg/coco.data', img_size=416, iou_thres=0.5, nms_thres=0.5, save_json=True, weights='weights/best_v1_0.pt')
Using cuda _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', major=7, minor=0, total_memory=16130MB, multi_processor_count=80)
Image Total P R mAP
100%|████████████████████████████████████████████████████████████████████████████| 157/157 [07:29<00:00, 2.24s/it]
5000 5000 0.0653 0.654 0.41
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.213
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.425
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.193
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.090
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.220
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.307
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.216
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.339
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.364
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.197
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.365
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.504 |
First, you don't need to worry now. Something is wrong on my side, because I just ran COCOAPI with your best_v1_0.pt with
I did not change any settings, and I also initialized with the darknet53 backbone. I can produce around 50 mAP from your test.py for image size 416. However, it does not seem to be aligned with COCOAPI. |
@xiao1228 ah I see. You may want to clone the repo again and try training without modifications. Though I would use the |
@xiao1228 we've made great efforts to align the repo mAP with the COCO mAP. It's not perfect, but the current result seems to steadily track about 2% lower than the COCO mAP as the epochs trend higher. This plot shows v4.0 training with the repo mAP (blue) overlaid with the pycocotools mAP (orange), using |
Hi @glenn-jocher Thank you for the help.
I also tested the model trained from scratch (best.pt) on this branch after 135 epochs; results are shown below:
I have also tested the model I trained in February (cloned around 11 Feb). Results at the 70th epoch are shown below:
No modification was made to the code that I cloned from your repo. Training used the darknet53 backbone for weight initialization. |
Final results are in, and PR #176 complete. Repo mAP now aligns with COCO mAP under most circumstances to within 1%. Also mAP output now exceeds yolov3 darknet published results.
sudo rm -rf yolov3 && git clone https://github.com/ultralytics/yolov3
# bash yolov3/data/get_coco_dataset.sh
sudo rm -rf cocoapi && git clone https://github.com/cocodataset/cocoapi && cd cocoapi/PythonAPI && make && cd ../.. && cp -r cocoapi/PythonAPI/pycocotools yolov3
cd yolov3
python3 test.py --save-json --conf-thres 0.001 --img-size 416
Namespace(batch_size=32, cfg='cfg/yolov3.cfg', conf_thres=0.001, data_cfg='cfg/coco.data', img_size=416, iou_thres=0.5, nms_thres=0.5, save_json=True, weights='weights/yolov3.weights')
Using cuda _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', major=7, minor=0, total_memory=16130MB, multi_processor_count=80)
Image Total P R mAP
Calculating mAP: 100%|█████████████████████████████████| 157/157 [08:34<00:00, 2.53s/it]
5000 5000 0.0896 0.756 0.555
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.312
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.554
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.317
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.145
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.343
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.452
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.268
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.411
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.435
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.244
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.477
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.587
python3 test.py --save-json --conf-thres 0.001 --img-size 608 --batch-size 16
Namespace(batch_size=16, cfg='cfg/yolov3.cfg', conf_thres=0.001, data_cfg='cfg/coco.data', img_size=608, iou_thres=0.5, nms_thres=0.5, save_json=True, weights='weights/yolov3.weights')
Using cuda _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', major=7, minor=0, total_memory=16130MB, multi_processor_count=80)
Image Total P R mAP
Calculating mAP: 100%|█████████████████████████████████| 313/313 [08:54<00:00, 1.55s/it]
5000 5000 0.0966 0.786 0.579
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.331
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.582
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.344
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.198
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.362
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.427
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.281
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.437
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.463
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.309
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.494
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.577 |
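For readers comparing the two numbers, pycocotools scores a JSON list of detections in the COCO results format: one entry per box with `image_id`, `category_id`, `bbox` as `[x, y, width, height]` in pixels, and `score`. The sketch below shows a conversion from corner-format boxes; the function name `to_coco_results` and the `class_to_category_id` mapping are hypothetical and are not taken from test.py. COCO category ids are not contiguous, so the model's class indices generally need to be mapped to dataset category ids before evaluation.

```python
import json


def to_coco_results(image_id, boxes_xyxy, scores, class_indices, class_to_category_id):
    """Convert (x1, y1, x2, y2) boxes into COCO detection-results entries."""
    results = []
    for (x1, y1, x2, y2), score, cls in zip(boxes_xyxy, scores, class_indices):
        results.append({
            "image_id": image_id,
            # Map the model's class index to the dataset's category id.
            "category_id": class_to_category_id[cls],
            # COCO expects top-left corner plus width and height.
            "bbox": [round(x1, 3), round(y1, 3), round(x2 - x1, 3), round(y2 - y1, 3)],
            "score": round(score, 5),
        })
    return results


# Toy example: one detection; the mapping {0: 1} is illustrative only.
example = to_coco_results(
    image_id=42,
    boxes_xyxy=[(10.0, 20.0, 50.0, 80.0)],
    scores=[0.9],
    class_indices=[0],
    class_to_category_id={0: 1},
)
print(json.dumps(example))
```

If either the box format or the category ids differ from what the ground-truth annotations use, pycocotools can report a much lower mAP than an in-repo metric that never leaves the model's own coordinate and class conventions.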
Hi @glenn-jocher, thank you very much for the mAP update. I wonder whether the next step will be to improve training from scratch? I am training from scratch at the moment as well, and will update you with the results. |
@xiao1228 yes, you are correct. Training from scratch (well, from darknet53 backbone) is the final frontier, not just for this repo but for all object detection. It's also the source of greatest confusion for me, because if I implement the darknet loss function as I understand it, then I get very poor results. The current loss function is the result of a hyperparameter search I did a few months ago #2 (comment). I run full COCO training for epoch 0, and tweak the parameters to get the highest mAP at the end of epoch 0. BUT this used the old mAP code, so it's possible I was optimizing for the wrong metric. The main area I was changing was the loss function, though the image augmentation could also be a source of investigation. The current loss function has two very odd weightings: k/4 on CE and k*64 on BCE. They are there because the results improved significantly with this change. I believe original darknet essentially uses no weights anymore on the loss components, and also uses BCE for class_conf loss, but I can't get good results this way. If you have access to GPU time, it might be useful to test out various loss function changes on the first few epochs, or on a subset of the COCO dataset, and we could implement the results of the best changes. I was thinking of creating a subset using the first 1000 images, and training and testing on those to more rapidly prototype changes than using the full dataset, which takes a lot of time. Lines 264 to 277 in 09b02d2
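The 1000-image subset idea could be sketched as follows, assuming the dataset is described by a plain-text list with one image path per line. The function name `write_subset` and the file names are hypothetical; the demo writes to a temporary directory so it does not touch any real dataset files.

```python
import os
import tempfile
from itertools import islice


def write_subset(src_list, dst_list, n=1000):
    """Copy the first n lines (image paths) of src_list into dst_list."""
    with open(src_list) as src, open(dst_list, "w") as dst:
        lines = list(islice(src, n))
        dst.writelines(lines)
    return len(lines)


# Demo on a throwaway list of 5 fake image paths.
tmp_dir = tempfile.mkdtemp()
full_list = os.path.join(tmp_dir, "all_images.txt")
subset_list = os.path.join(tmp_dir, "first_images.txt")
with open(full_list, "w") as f:
    f.writelines(f"images/{i:06d}.jpg\n" for i in range(5))

n_written = write_subset(full_list, subset_list, n=3)
with open(subset_list) as f:
    subset_paths = f.read().splitlines()
print(n_written, subset_paths)
```

Pointing both the training and the test list at such a subset would make each experiment much faster, at the cost of a noisier mAP estimate.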
|
@glenn-jocher At least with the current loss function and default settings, after 196 epochs the mAP is only 0.414. What tweak should I apply to the loss function then? |
@xiao1228 yes, it's probably a good idea to stop training then, and to instead do a hyperparameter search. Or you could try with darknet defaults, though as I said those produce worse results for the first 3 epochs. You can see what I was trying before in #2 (comment). But just quickly off the top of my head the possible parameters to vary are here. I'm investigating the first few currently, but you can do this all on your own as well.
|
Hi @glenn-jocher
|
@xiao1228 ah excellent! This is the first time I've seen results from full training. A few items:
|
I am training the code from scratch and the results from best.pt are shown below
However, the results from COCOAPI are shown below
So test.py gives an mAP of 50+, but COCOAPI gives only around 10.
Am I missing something?
Thank you very much @glenn-jocher, and sorry for the long results.