Gaussian YOLOv3 (+3.1% mAP on COCO), (+3.0% mAP on KITTI), (+3.5% mAP on BDD) #4147
What do you mean?
This is object detection with a new method.
It seems Gaussian YOLOv3 is better than YOLOv3 on the COCO dataset, especially on the stricter metrics.
So see also the GIoU paper: https://arxiv.org/pdf/1902.09630v2.pdf
GIoU: #3249
@AlexeyAB Hi, have you tried to combine Gaussian and GIoU? If yes, could you share the result? Thanks.
Where can I get the Gaussian YOLOv3 config file?
cfg files: https://github.com/jwchoi384/Gaussian_YOLOv3/tree/master/cfg
I added the yolo_v3_tiny_pan3_matrix_gaussian_aa_ae_mixup.cfg.txt model. More: #3114 (comment)
@AlexeyAB Hi, I trained
I think maybe in the test. Here are some video results, extract code:
Just set
You should use lower
It is necessary if you want to have a high mAP at strict IoU thresholds.
@zpmmehrdad I added GIoU to the [Gaussian_yolo] layer. So now you can use it for training:
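For example, a rough sketch of how the relevant part of a [Gaussian_yolo] section might be configured (parameter names follow this repo's cfg conventions; the anchors and normalizer values are illustrative assumptions, not tested settings):

```
[Gaussian_yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes = 80
num = 9
# use GIoU instead of the default MSE for the box regression loss
iou_loss=giou
iou_normalizer=0.5
cls_normalizer=1.0
```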
@AlexeyAB
@yrc08
@AlexeyAB
Also try then to add the params to each [Gaussian_yolo] layer and train. It requires the latest version of Darknet.
So compare both models.
So add the same to every gaussian_yolo layer? As opposed to scale_x_y=1.05, 1.1, 1.2 in the different layers?
What is the iou_thresh?
Or you can add different values for different yolo-layers: scale_x_y=1.05 (for the 17x17 grid), 1.1 (34x34), 1.2 (68x68).
Read about iou_thresh=0.213.
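A hedged sketch of what that could look like in the cfg (only the added lines are shown; the pairing of scale_x_y values with grid sizes follows the suggestion above, and using the same iou_thresh in every layer is an assumption):

```
# [Gaussian_yolo] layer predicting on the coarsest grid (e.g. 17x17)
scale_x_y = 1.05
iou_thresh = 0.213

# middle [Gaussian_yolo] layer (34x34)
scale_x_y = 1.1
iou_thresh = 0.213

# finest [Gaussian_yolo] layer (68x68)
scale_x_y = 1.2
iou_thresh = 0.213
```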
Just train again. Also show chart.png of both models.
I fixed it. Now |
It seems that we should use
I won't be surprised if the main effect of improving the AP75 (and decreasing the AP50) is not from the good GIoU algorithm, but just from higher values of iou_loss, i.e. the same effect we could achieve by using the default MSE loss with a higher iou_normalizer.
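In cfg terms, the comparison being suggested is roughly between these two setups (a hedged sketch; the iou_normalizer values are illustrative assumptions, not tested recommendations):

```
# variant A: GIoU box loss
iou_loss=giou
iou_normalizer=0.5

# variant B: default MSE box loss, but with the box-loss weight increased
iou_loss=mse
iou_normalizer=1.0
```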
So just multiplying the error is what makes the network learn faster? But I think the way the loss is calculated results in a better AP75.
@HagegeR C/D/GIoU increases AP75, but decreases AP50.
Yes, normalizers have a tremendous effect on performance. In our repo we've spent thousands of GPU hours evolving these normalizers on COCO along with all of our other hyperparameters. See ultralytics/yolov3#392. Our 3 main balancing hyperparameters are here, though they are changing every few weeks as new evolution results come in. They must change depending on image size, as well as class count and occupancy (how many objects per image).

hyp = {'giou': 3.31,  # giou loss gain
       'cls': 42.4,   # cls loss gain
       'obj': 52.0}   # obj loss gain (*=img_size/416 if img_size != 416)
I have updated the mAPs now at https://github.com/ultralytics/yolov3#map using the default hyperparameters and default training settings on COCO2014, starting yolov3-spp.cfg from scratch.
Is the 416 the network size? So good hyper-parameters for GIoU would be a giou gain roughly 10x smaller than the cls and obj gains?
@AlexeyAB 416 is the network size. So for example if the network is trained at 320, the objectness hyperparameter would be multiplied by 320/416. I found this helped keep the balance when the network size changed.
Yes, roughly speaking we found that an optimal GIoU gain is about 10 times smaller than cls and obj, and that obj and cls seem to optimally be of similar magnitude.
I average the elements in each yolo layer, and sum the 3 values for the 3 yolo layers. So for example obj loss = mean(obj_loss_yolo_layer1) + mean(obj_loss_yolo_layer2) + mean(obj_loss_yolo_layer3). The chart below shows the loss after multiplying by the hyps.
I think the fundamental concept is that if each of the 3 loss components is equally important to the solution (and I believe they are), then they should each be represented roughly equally in the total loss (i.e. 1/3, 1/3, 1/3). This is probably the best place to start when in doubt, and then fine-tune from there if you can.
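As a rough illustration of the averaging, summing, and gain weighting described above (a generic PyTorch sketch, not the actual ultralytics code; all function and variable names are made up for the example):

```python
import torch

def combine_losses(giou_l, obj_l, cls_l, hyp, img_size=416):
    # giou_l, obj_l, cls_l: lists with one raw loss tensor per yolo layer
    # average within each yolo layer, then sum across the 3 layers
    lgiou = sum(t.mean() for t in giou_l)
    lobj = sum(t.mean() for t in obj_l)
    lcls = sum(t.mean() for t in cls_l)
    # apply the balancing gains; the obj gain scales with image size as noted above
    obj_gain = hyp['obj'] * (img_size / 416 if img_size != 416 else 1.0)
    return hyp['giou'] * lgiou + obj_gain * lobj + hyp['cls'] * lcls

# toy usage with random per-layer losses
hyp = {'giou': 3.31, 'cls': 42.4, 'obj': 52.0}
losses = [[torch.rand(100) for _ in range(3)] for _ in range(3)]
total = combine_losses(*losses, hyp)
```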
Are these hyperparameters good for C/D/GIoU only, or are the same values also good for the default MSE loss?
@AlexeyAB these values are only for C/D/GIoU. I abandoned the MSE loss a while ago, as I had problems balancing the x-y losses with the width-height losses, since their loss functions are quite different. Also the MSE w-h loss tended to be unstable, which is solved by GIoU. So the ultralytics/yolov3 repo does not have MSE loss capability anymore; GIoU is the new default for box losses.
@AlexeyAB Hello,
Do you mean:
@glenn-jocher Hello, do you have an automatic or systematic method to find the optimal values of the hyper-parameters? #4147 (comment)
@WongKinYiu yes, we have an automatic hyperparameter search method. It's based on a genetic algorithm, which evolves the hyperparameters from an initial point to an optimum. It does not compute a gradient; it simply mutates successive generations based on combinations of the most successful parents. Hyperparameter evolution is run the same as the training command, except with the --evolve flag, in a loop:

while true
do
    python3 train.py --data data/coco.data --epochs 27 --weights '' --evolve
done

See ultralytics/yolov3#392 for full details.
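The mutate-and-select loop behind --evolve is conceptually something like the following (a simplified generic sketch, not the ultralytics implementation; the fitness function here is a toy stand-in for a short training run that would report a weighted mAP/F1 score):

```python
import random

def fitness(hyp):
    # stand-in for "train briefly and score the run"; in the real workflow this
    # is the expensive part (roughly 1 GPU-day per point on COCO, see below)
    return -((hyp['giou'] - 3.3) ** 2 + (hyp['cls'] - 42.0) ** 2 + (hyp['obj'] - 52.0) ** 2)

def mutate(parent, sigma=0.2):
    # multiply each hyperparameter by a random factor centred on 1.0
    return {k: max(v * random.gauss(1.0, sigma), 1e-6) for k, v in parent.items()}

best = {'giou': 3.31, 'cls': 42.4, 'obj': 52.0}
best_fit = fitness(best)
for generation in range(100):
    child = mutate(best)        # each mutated child is one "orange point"
    fit = fitness(child)
    if fit > best_fit:          # keep the highest-fitness offspring (the "blue" point)
        best, best_fit = child, fit
```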
An example snapshot of the results is here. Mutation fitness (a weighted mAP and F1 combination) is on the y axis (higher is better), and the hyperparameter value is on the x axis. Each orange point is a random genetic mutation. The highest-fitness offspring is in blue. These are dynamically changing results, with better fitness evolving week to week, etc.
BTW, all of this does take a long time on large datasets. If you assume you can train COCO on 1 GPU in 10 days, then training to 10% of the full epochs (each orange point above) requires 1 GPU-day. To get good results you want to train for at least 100 generations, which on COCO would be 100 GPU-days, though on smaller datasets it would obviously go much faster. I would say 100 generations is a minimum evolution time (starting from a good, manually selected starting point). 200 to 300 generations is more ideal, if you have the hardware available.
I have only a few GPUs; I think I cannot afford the computation for hyper-parameter searching.
@WongKinYiu yes, unfortunately it requires significant resources. If you leave something like this (https://lambdalabs.com/deep-learning/workstations/4-gpu/basic/customize) running 24/7 you can produce about 50 orange points a week on COCO. We tried using shortcuts, like evolving from the result after just 1% of the epochs: python3 train.py --data data/coco.data --epochs 3 --weights '' --evolve, but it did not work well. It evolved hyps for the best results after 3 epochs, but they did not generalize well to 273 epochs.
Yes, but @glenn-jocher already found the best hyper-parameters.
Use: #4430
@AlexeyAB
@nyj-ocean Yes, you can. Since it has no effect during training, you can leave or comment out these 2 lines after training and check the mAP.
Have you tried the Gaussian object detection method?