No detection is found after training #294
Did you try -thresh 0 to see if you get any detections? A threshold of 10% might not be low enough after only 500 iterations (I assume you mean iterations and not epochs; if you have e.g. 1000 images and a batch size of 500, it will take 2 iterations to train one epoch). You can check the console output while training to see how well it trains on your current data, e.g. look at avg loss, or avg recall, which tells you how many objects YOLO detected out of the positive samples in this iteration. It might get 8 objects as input but only detect 1 or maybe even 0, which means it needs more training. To learn more about darknet output you can check https://timebutt.github.io/static/understanding-yolov2-training-output/. This is a nice guide which helped me train YOLO on a custom dataset: https://timebutt.github.io/static/how-to-train-yolov2-to-detect-custom-objects/
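The -thresh flag is just a confidence cutoff applied to the raw detections, which is why lowering it can reveal boxes an undertrained network is too unsure about. A minimal Python sketch of that filtering step (`filter_detections` and the `detections` list are made-up illustrations, not darknet's actual code):

```python
# Sketch of the post-processing step that -thresh controls: boxes whose
# confidence falls below the threshold are discarded. The detections list
# is made-up example data; filter_detections is a hypothetical helper.

def filter_detections(detections, thresh):
    """Keep (label, confidence, box) tuples with confidence >= thresh."""
    return [d for d in detections if d[1] >= thresh]

detections = [
    ("5", 0.04, (120, 80, 40, 60)),  # weak detection, hidden at thresh 0.1
    ("E", 0.35, (200, 90, 38, 55)),
]

print(filter_detections(detections, 0.1))  # only the 0.35 detection survives
print(filter_detections(detections, 0.0))  # everything, as with -thresh 0
```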
Thank you for your response. Yes, I mean iterations.
Seems odd that you don't get any detections at all. Have you tried running an example to see if everything works as it should?
Yes, if I use the pretrained weights to detect, it detects correctly. (I use the VOC dataset.)
Okay, I've just looked at the dataset, and it seems like the images are really small compared to what YOLO expects. I'm not sure what input resolution you have in your cfg, but you should have images with a resolution greater than the network resolution (i.e. greater than width=416 height=416). From what I can see, the dataset has images with a resolution of 156x195 (I didn't look through all of them, just took one sample). YOLO might not be the best framework for detecting numbers and symbols; you might have better luck with something like Tesseract OCR, but if you want to continue with YOLO you should consider another dataset.
Thank you for your response. Actually, our goal is to recognise printed numbers shot by a fixed camera. I don't believe YOLO can't handle such a simple task, as I actually observed the average loss dropping from 28 to 0.4, and the average recall rising from 0.5 to nearly 1.
If you want to do that, you should train YOLO on data like that. You could e.g. generate your own small dataset, annotate it yourself, and apply image augmentations to increase the size of the dataset. A good rule is to have at least 2000 images per class to train with, but the more the better, since it will be harder to overfit the data.
Well... perhaps I'm too optimistic about training. After 500 batches, the average loss is still around 1.0-1.2. The ideal loss is slightly above 0.06. I don't know, but I think I need much more training.
I tried Tesseract OCR, but it doesn't fit our needs for now. I'll keep it as an alternative solution anyway.
On my dataset, after 50k iterations I had an avg loss of ~1.0 and my test works really well, getting pretty good detections. I sometimes get false negatives though, but I believe it is my training set that lacks those specific cases.

I think you should stay at 288x288, but you can try to go lower and see if it works out. The problem is the grid size of YOLO: it will be hard to detect more than one object per cell. YOLO looks at the entire image in context, not at image patches like Faster R-CNN does. You should modify the config.cfg you are loading: at the very top there are width and height definitions, change those to e.g. 288x288. At the very bottom of the config there is the final layer's filters= setting, which must match your class count.

An idea is to take the letters you want to train on and make a script that puts them on a random background image (pulled from Google etc. or a database) and then saves the location of the letter. Apply random crops, rotations and skewing to the letter before inserting it on a random background. That way you get a bigger dataset and also images the size of what a digital photo would be.

Remember, the smaller the object you want to detect, the bigger the network you want. A 288x288 network is much worse at detecting small objects than a 608x608 one. I recommend reading the papers if you haven't already; they explain a lot about how YOLO works and its pros and cons: YOLO and YOLOv2.
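The dataset-generation idea above can be sketched in Python. `random_placement` and `yolo_line` are hypothetical helper names; only the annotation math is shown, and the actual pixel pasting and background choice are omitted:

```python
import random

# Hypothetical sketch: choose a random position for a glyph on a
# fixed-size canvas and emit the matching YOLO annotation line
# ("class cx cy w h", all normalized to the canvas size).

def random_placement(canvas_w, canvas_h, glyph_w, glyph_h, rng):
    """Pick a top-left corner that keeps the glyph fully on the canvas."""
    return (rng.randint(0, canvas_w - glyph_w),
            rng.randint(0, canvas_h - glyph_h))

def yolo_line(cls, x, y, w, h, canvas_w, canvas_h):
    """Convert a pixel box (top-left x, y, width, height) to a YOLO line."""
    cx = (x + w / 2) / canvas_w
    cy = (y + h / 2) / canvas_h
    return f"{cls} {cx:.6f} {cy:.6f} {w / canvas_w:.6f} {h / canvas_h:.6f}"

rng = random.Random(0)
x, y = random_placement(416, 416, 64, 64, rng)
print(yolo_line(5, x, y, 64, 64, 416, 416))
```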
First, thank you for your kind response. |
The problems I found in my previous experiments are:
@AurusHuang For the anchor values, you don't have to recalculate them; from my understanding, YOLO automatically decides on new anchors based on the training data (I've experimented with calculating my own anchors myself, but it resulted in worse detections than YOLO's own). I might be wrong though, since this is from my own experience and how I interpret the paper. About the last filter count (the output size based on classes), you are definitely right: the last filter count should be 80, and in the default tiny-yolo.cfg it is line 119 that needs to be changed to 80.
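The 80 above follows from how YOLOv2 lays out its output: each of `num` anchor boxes predicts 4 box coordinates plus 1 objectness score plus one score per class, so filters = num * (classes + 5). A one-liner to compute it (the helper name is made up):

```python
# filters = num_anchors * (classes + 4 box coords + 1 objectness score)

def yolo_v2_filters(classes, num_anchors=5):
    return num_anchors * (classes + 5)

print(yolo_v2_filters(11))  # 80 -> 11 classes (digits 0-9 plus "E")
print(yolo_v2_filters(20))  # 125 -> the default 20-class VOC value
```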
I would expect your classes.names file to look like this. I'm not sure how you did it, and the order doesn't matter, but it is important that a detection's class number matches the line number of the class in the classes.names file to get correct labeling:
Your annotation file should be formatted as:
This is an example of one frame in my dataset, frame 2321.txt. In this frame there are two different classes: class 0, which is person, and class 4, which is the aeroplane class. The rest is, as listed above, the information on where the objects are located in frame 2321.jpg.
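For reference, an annotation line of that form ("class cx cy w h", all normalized to the image size) can be read back into pixel coordinates for sanity-checking a dataset. `parse_yolo_line` is a hypothetical sketch, not part of darknet:

```python
# Hypothetical helper: convert one normalized YOLO annotation line back
# into an absolute pixel box (left, top, width, height).

def parse_yolo_line(line, img_w, img_h):
    cls, cx, cy, w, h = line.split()
    bw, bh = float(w) * img_w, float(h) * img_h
    left = float(cx) * img_w - bw / 2
    top = float(cy) * img_h - bh / 2
    return int(cls), (left, top, bw, bh)

# A box centered in a 416x416 frame, a quarter of the frame in each dimension:
print(parse_yolo_line("0 0.5 0.5 0.25 0.25", 416, 416))
# (0, (156.0, 156.0, 104.0, 104.0))
```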
Hope this provides some more insight; if you need more help, feel free to ask.
I think I'm doing the above things right. I'm now trying to run a much simpler demo to find out what is wrong.
Well, I must admit that I'm too naive... @TheMikeyR
I'm happy that you managed to get some results now! Yeah, the training time can be long. I'm using a K40 on AWS and I usually let it train for 2-3 days before I get results satisfying enough to compare with my older trainings.
I'm now fighting with a possible overfitting problem...
It depends on your dataset. You can let it run, take out the .weights file at each backup step, and test it.
What do you mean by "image augmentation"?
@AurusHuang Random crops, rotations, and skewing applied to your dataset to make the neural network generalise better; there have been many articles on gaining better performance by doing so. Some augmentations help, others don't. I've had quite some success using a small dataset and then rotating every image 90, 180 and 270 degrees.
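For those 90-degree rotations, the YOLO annotations have to be rotated with the images. With normalized coordinates this is pure arithmetic; `rotate90_cw` below is a sketch of the box transform only, and rotating the actual pixels is left to whatever image library you use:

```python
# Sketch of the bounding-box side of a 90-degree rotation augmentation.
# Boxes are in YOLO's normalized (cx, cy, w, h) form. Rotating the image
# 90 degrees clockwise maps a point (x, y) to (1 - y, x), so the box
# center moves the same way and the width/height swap.

def rotate90_cw(cx, cy, w, h):
    """Return the box after rotating the image 90 degrees clockwise."""
    return 1.0 - cy, cx, h, w

box = (0.25, 0.10, 0.2, 0.1)
for _ in range(4):
    box = rotate90_cw(*box)
print(box)  # four rotations land back on (approximately) the original box
```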
Speaking of character recognition, what if I
The bigger the dataset the better, as a general rule of thumb. I can't really help you much with which approach is better; it's all a trial-and-error process, so try it out and see what you get. You can read papers on the subject to get an idea of what other people did and had success with.
Well, I think adjusting the learning strategy is really important!
It's indeed a good idea.
Try PlotYoloLog.py
It seems that I must generate a more "arbitrary" dataset if I want to detect text at arbitrary positions.
If the original image is 128 * 128, would it be faster to fix the network to 128 * 128? Probably in [net]
Maybe, but I'd better modify the dataset to provide a context similar to the actual detection.
I think @wakanawakana has inspired me about Obj and No Obj.
I think the training logs are captured with a redirect (>) etc.
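A plotting script like PlotYoloLog.py presumably just parses that redirected log. A hedged sketch of the parsing, where the regex and the sample log line are illustrative and assume the usual darknet line shape `N: loss, avg_loss avg, ...`:

```python
import re

# Extract (iteration, average loss) pairs from a darknet training log.
# The sample log below is made-up data.

LINE = re.compile(r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+)\s+avg")

def parse_log(lines):
    points = []
    for line in lines:
        m = LINE.match(line)
        if m:
            # group 3 is the running average loss, the value worth plotting
            points.append((int(m.group(1)), float(m.group(3))))
    return points

log = [
    "Loading weights from backup...",
    "500: 1.153400, 1.201893 avg, 0.001000 rate, 3.2 seconds, 24000 images",
]
print(parse_log(log))  # [(500, 1.201893)]
```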
@AurusHuang So Obj is YOLO's confidence that the detected object is a target, and No Obj is YOLO's confidence that an object is not a target? And the number represents the subdivision, e.g. 8 images? Hmm, I still think it is easier to look at count and avg. recall, but the confidence might tell more about whether the training is overfitting.
Sorry for closing it accidentally.
Are you training over the whole 416 x 416 space using the center of the number (the position where YOLO generates predictions)?
For the implementation, you should refer to
Well... a Japanese document...
To generate predictions at random coordinates within 416 x 416, the automatic generation of training images and their coordinates is controlled by a program.
Well... the above picture of a number 5 is only one of thousands of samples. The position and size of the numbers are randomised by me using a Python script. The sizes of the numbers vary from 64 * 64 to 256 * 256, but the canvas size is a fixed 416 * 416.
Unfortunately, I cannot understand what you want to do.
Does YOLO calculate the new anchors based on our own annotated training data? Aren't the precalculated anchors supposed to be entered manually in the cfg file for the network to read while training?
It's working for me when I set thresh to 0.1; my training loss is 0.28.
A basic question regarding anchor boxes: what are these values relative to? I mean, what are their units?
@bicepjai the units of "anchors" are width and height for every anchor box, in percentage:
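As a side note, anchor units differ between darknet versions: YOLOv2-style cfgs list anchors in multiples of an output grid cell, while YOLOv3-style cfgs list them in pixels. A small conversion sketch, assuming a 416x416 network with a 13x13 output grid (so one cell is 32 pixels):

```python
# Convert YOLOv2-style anchors (grid-cell units) to pixel sizes,
# assuming a 416x416 network with a 13x13 output grid.

GRID = 13
NET = 416
CELL = NET / GRID  # 32.0 pixels per grid cell

def cells_to_pixels(w_cells, h_cells):
    """Convert a YOLOv2-style anchor (grid-cell units) to pixels."""
    return w_cells * CELL, h_cells * CELL

print(cells_to_pixels(1.08, 1.19))  # e.g. a small ~35x38 pixel prior box
```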
Try to use batch=1 for test/valid?
Please use this link for single-object training in YOLO
Well, I have the same problem when trying to recognize digits. I generate my training dataset by randomly creating numbers with different backgrounds.
Hi
Hello! I trained my model on a custom dataset, but my trained model does not predict anything whatsoever. What might have gone wrong?
Try setting the thresh value to 0.10, then check the predictions.
My computer: i5 4210M, GTX850M, Windows 10, CUDA 8, Visual Studio 2017 (with 2015 toolset installed)
Training with a dataset called Chars74K, selecting a subset of the numbers 0-9 and the letter E, 11176 pictures in total, divided into two roughly equal parts for training and testing respectively.
Since training is slow, I'd like to perform an intermediate check. After 500 iterations, I ran the following command (note: I train with GPU but detect with CPU):
.\darknet_no_gpu detector test cfg\chars74k.data tiny-yolo-chars74k-test.cfg backup\tiny-yolo-chars74k_500.weights -thresh 0.1 img001-00002.png
But it returns no bounding boxes.
I'm sure that chars74k.data is correct, and the batch and subdivision counts are set to 1 in the tiny-yolo-chars74k-test.cfg file (for training, I'm using a slightly modified cfg file where they're 48 and 8 respectively). There is a similar issue #257, but it doesn't cover my case. Is it true that even for character detection (a much simpler problem compared to VOC or COCO), it's imperative to run 10,000 iterations before you can see a result (even if the result is not quite correct)? Or am I making a mistake while training or detecting?
P.S. Chars74K can be found here:
http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
I'll post more details if any of you ask.