Pretrained Convolutional Weights from darknet53 #6
Comments
There are two training modes:
In both cases it uses yolov3.cfg to initialize darknet. It uses all 788 lines, though, so why do you say up to line 549? |
The author mentioned in section 3 of YOLO9000 that they trained Darknet-19 for classification on the ImageNet 1000-class classification dataset with 224x224 images for 160 epochs. Then the same network was fine-tuned with 448x448 images for 10 epochs. For the detection task, the last CONV layer of Darknet-19 is removed and some extra layers are added to create the YOLO9000 detection architecture. The extra layers are probably initialized with random weights, as mentioned in section 2.2 of You Only Look Once: Unified, Real-Time Object Detection. YOLOV3 uses Darknet-53 instead of Darknet-19 (section 2.4 of YOLOv3). I have assumed that the last layers of Darknet-53 are discarded and the remaining weights are used to initialize YOLOV3 (up to line 549 in yolov3.cfg). Then some extra layers (randomly initialized) are added to create YOLOV3. As you have mentioned, |
@okanlv, hi. I have a similar query. My dataset is small (1.3K) and significantly different from the COCO dataset. I wanted to use the pretrained darknet53. @glenn-jocher The pretrained darknet53 has weights up to conv_73. I did the following:
To train, I trained all the layers. Is that incorrect? |
@LalitPradhan these lines show how to do your option b), transfer learning from the pretrained weights. If you uncomment them, all the layers except the 3 YOLO layers are frozen, so only the 3 YOLO layers (which have 650 rows each) will change. You can modify this section according to your needs. Lines 59 to 62 in 0058431
I don't understand your option a). What's the difference between the 2 pretrained weights? How many layers does each have? |
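The freezing rule described in the comment above (keep only the parameters whose first dimension is 650 trainable, freeze everything else) can be sketched without any deep learning framework. This is a minimal illustration under stated assumptions, not the repository's code: `Param` and `freeze_except_rows` are hypothetical stand-ins, the parameter names and shapes are made up, and the 650-row value is simply taken from the comment. With a real PyTorch model the same loop would run over `model.named_parameters()` and set `p.requires_grad`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Param:
    """Hypothetical stand-in for a framework parameter tensor."""
    name: str
    shape: Tuple[int, ...]
    requires_grad: bool = True


def freeze_except_rows(params: List[Param], rows: int) -> List[str]:
    """Freeze every parameter whose first dimension is not `rows`.

    Returns the names of the parameters that remain trainable.
    """
    trainable = []
    for p in params:
        p.requires_grad = (p.shape[0] == rows)
        if p.requires_grad:
            trainable.append(p.name)
    return trainable


# Toy parameter list; names and shapes are invented for illustration only.
params = [
    Param("backbone.conv0.weight", (32, 3, 3, 3)),
    Param("backbone.conv1.weight", (64, 32, 3, 3)),
    Param("head.yolo_conv.weight", (650, 1024, 1, 1)),
    Param("head.yolo_conv.bias", (650,)),
]
trainable = freeze_except_rows(params, rows=650)
print(trainable)
```

Selecting layers by row count is a shortcut: any other layer that happens to have the same first dimension would stay trainable too, so selecting by parameter name is the safer choice when the names are known.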
@glenn-jocher What I meant is that https://pjreddie.com/media/files/darknet53.conv.74 has weights that cover the yolov3.cfg file up to line 549 (excluding the YOLO layers), while the yolov3 weights file has weights for all the layers, including the 3 YOLO layers. And thanks for the transfer learning answer. Do I have to comment out any other part of the code if I uncomment the 3 lines under the transfer learning comment in your code? |
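To relate a cfg line number (such as "line 549") to a layer count, one can walk the cfg file and count section headers. The sketch below assumes the usual darknet cfg layout, where the first `[net]` section holds hyperparameters and every later `[section]` is one layer. The function names and the toy cfg are hypothetical; the toy cfg is not an excerpt of yolov3.cfg.

```python
from typing import List, Tuple


def layer_sections(cfg_text: str) -> List[Tuple[int, int, str]]:
    """Return (layer_index, line_number, section_type) for each layer section.

    Line numbers are 1-based. The leading [net] section is skipped because
    it describes training options, not a layer.
    """
    sections = []
    index = 0
    for line_no, raw in enumerate(cfg_text.splitlines(), start=1):
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            kind = line[1:-1]
            if kind == "net":
                continue
            sections.append((index, line_no, kind))
            index += 1
    return sections


def layers_up_to_line(cfg_text: str, last_line: int) -> int:
    """Count the layer sections that start at or before `last_line`."""
    return sum(1 for _, line_no, _ in layer_sections(cfg_text) if line_no <= last_line)


# Toy cfg, invented for illustration.
TOY_CFG = """[net]
batch=64

[convolutional]
filters=32

[convolutional]
filters=64

[shortcut]
from=-2

[convolutional]
filters=18

[yolo]
classes=1
"""

print(layers_up_to_line(TOY_CFG, 11))  # layers starting at or before line 11
```

Running `layers_up_to_line` on the real yolov3.cfg with the line number in question would show how many layers a partial weights file needs to cover.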
I did as you mentioned.
I'm guessing there is a mismatch between default COCO classes (80) and my custom classes (1). Can you help me resolve this?
I couldn't figure anything out from this. Can you see what the problem might be? Update: I know the mistake now. In the cfg file I didn't change the number of classes in the YOLO layers or the number of filters in the conv layer before each YOLO layer. But now, since I have to train with a different number of classes, I think I will have to initialize some of the weights myself. |
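The cfg change mentioned above follows a simple rule: the conv layer feeding each YOLO layer predicts, for every anchor assigned to that layer, 4 box coordinates, 1 objectness score, and one score per class. A minimal sketch of the arithmetic, assuming 3 anchors per YOLO layer as in the standard yolov3.cfg (the function name is hypothetical):

```python
def yolo_conv_filters(num_classes: int, anchors_per_layer: int = 3) -> int:
    """Filters needed in the conv layer directly before a YOLO layer.

    Each anchor predicts 4 box coordinates + 1 objectness + num_classes scores.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return anchors_per_layer * (num_classes + 5)


print(yolo_conv_filters(80))  # COCO, 80 classes
print(yolo_conv_filters(1))   # custom dataset with a single class
```

For 80 COCO classes this gives 255 filters, and for a single custom class it gives 18, so both `classes` in each YOLO section and `filters` in the preceding conv section have to change together.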
@LalitPradhan You could use the following steps as a guide to train yolov3 on your dataset:
|
@okanlv Thanks for the advice. It sorted my issue out. |
Guys, can you please guide me on how to do transfer learning in YOLOv3? |
@BaijuMishra if you uncomment these lines and resume training from the official yolov3 weights then only the 3 yolo layers will train: Lines 66 to 69 in ab9ee6a
|
Hi Glenn, thank you for the response :) I have one point of confusion: do we need to delete or change the last layers of the yolov3.config file? Regards, |
@BaijuMishra No, no need to change yolov3.cfg. |
Hello |
@alvin-p I think we had a misunderstanding of the darknet batch count, so we've corrected it down by a factor of 4. 67 epochs would be the nominal training time on COCO. |
Hi, thanks a lot for the quick reply! So the tiny model needs only 68 epochs on full COCO, without using pretrained weights and multiscale training? Do you then use 64 as the batch_size? |
@alvin-p darknet training is multiscale. I would not advise training without a backbone. |
@glenn-jocher thanks! Are the weights of the backbone also adapted during gradient descent or are they frozen? |
@alvin-p all the parameters in the model are modified by the optimizer when training under default settings, including those making up the backbone layers. |
@glenn-jocher Thank you for your time and advice, I really appreciate it :) |
@sanazss ah that's interesting. You can read more about backbones here: Their utility is debatable. Can you demonstrate repeatable results on an open source dataset? |
@glenn-jocher Sorry, I have a question about the transfer learning. Why is yolo.shape[0]=650? I do not understand why it is 650. How is it calculated? Thanks |
@duyao-art your question seems to lack the minimum requirements for a proper response, or is insufficiently detailed for us to help you. Please note that most technical problems are due to:
sudo rm -rf yolov3 # remove existing
git clone https://github.com/ultralytics/yolov3 && cd yolov3 # clone latest
python3 detect.py # verify detection
python3 train.py # verify training (a few batches only)
# CODE TO REPRODUCE YOUR ISSUE HERE
If none of these apply to you, we suggest you close this issue and raise a new one using the Bug Report template, providing screenshots and minimum viable code to reproduce your issue. Thank you! |
Thanks for sharing your work.
yolov3 initializes model weights (up to line 549 in yolov3.cfg) from the darknet53 classifier, if I am not mistaken. Your model might not converge at epoch 160 if that is the case. Have you tried initializing yolov3 with darknet53?