get_sliced_prediction giving different results with videoframes vs images #1102
Replies: 1 comment
-
Problem is located and solved.
-
Hi!
I have a small sports-detection project in which I am trying to detect a small ball on a floorball pitch.
I have a custom model that contains only a ball category. The dataset has approximately 2500 images, and the model was trained for 300 epochs.
To work around the missing detections I have added images and trained for more epochs, but brute force has not solved the problem.
The problem is that I am getting a very low hit rate on predictions when using video as input, and I notice some differences between using get_sliced_prediction on video frames and on still images (imgs).
The same problem was raised by a different user in a previous thread in the ultralytics library (https://github.com/orgs/ultralytics/discussions/8121#discussioncomment-8871506). The reply to that thread suggests that the problem is caused by not using the same prediction code for images and video.
To verify that this is not the problem, I am using the same code for both predictions (frames and imgs), based on the examples/YOLOv8-SAHI-Inference-Video/yolov8_sahi.py code
and
If I split a video into images using (200 frames)
When I run sliced prediction on these images, I get a hit with confidence in the high 80s or above for 80% of the frames. (The ball is occluded in some frames, so this hit rate is fine.)
When I run sliced prediction on the video, I get a hit for 2% of the frames, with confidence in the low 40s.
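The hit rates quoted above can be computed the same way for both runs so that the comparison is like for like. The following is a minimal sketch, assuming the per-frame confidences have already been collected into plain lists; the function name `hit_rate` and the sample scores are hypothetical and are not taken from the original code or results.

```python
from typing import Sequence


def hit_rate(per_frame_scores: Sequence[Sequence[float]], threshold: float) -> float:
    """Fraction of frames with at least one detection scoring >= threshold."""
    if not per_frame_scores:
        return 0.0
    hits = sum(
        1 for scores in per_frame_scores
        if any(score >= threshold for score in scores)
    )
    return hits / len(per_frame_scores)


# Hypothetical scores for five frames; an empty list means no detection.
image_scores = [[0.91], [0.88], [], [0.86], [0.93]]
video_scores = [[], [0.42], [], [], []]

print(hit_rate(image_scores, 0.3))  # 4 of the 5 frames count as hits
print(hit_rate(video_scores, 0.3))  # 1 of the 5 frames counts as a hit
```

Using one threshold and one definition of a hit for both inputs removes bookkeeping differences as a possible explanation for the gap.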
But if I drop the confidence_threshold to 0.001 for the model and run the prediction on the video again, I get a very interesting result when I compare the predictions on images (confidence threshold 0.3) with those on frames (confidence threshold 0.001).
Filtering out all other objects that produce false positives at the low confidence threshold
(random frame selection)
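One way to do this kind of filtering is to keep, for each frame, only the video detection that overlaps the box found in the corresponding image. This is a sketch under assumptions: boxes are `(x_min, y_min, x_max, y_max)` tuples, the helper names `iou` and `match_score` are hypothetical, the 0.5 overlap cutoff is arbitrary, and the sample values are made up for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[Box, float]            # (box, confidence)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_score(reference: Box, candidates: List[Detection],
                min_iou: float = 0.5) -> Optional[float]:
    """Confidence of the candidate that overlaps `reference` the most.

    Returns None when no candidate reaches `min_iou`, which is how the
    low-threshold false positives elsewhere in the frame get dropped.
    """
    best_score, best_iou = None, min_iou
    for box, score in candidates:
        overlap = iou(reference, box)
        if overlap >= best_iou:
            best_score, best_iou = score, overlap
    return best_score


# Hypothetical data: the ball box from the image run, and two low-confidence
# detections from the video run on the same frame.
ball_in_image = (100.0, 100.0, 110.0, 110.0)
video_detections = [
    ((101.0, 100.0, 111.0, 110.0), 0.012),  # overlaps the ball
    ((400.0, 300.0, 420.0, 320.0), 0.05),   # false positive elsewhere
]
print(match_score(ball_in_image, video_detections))
```

Pairing the matched video confidence with the image confidence for each frame gives the side-by-side values needed to look for a consistent relationship between the two runs.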
The results here may be coincidental, but to me it looks like there might be a basic math problem in get_sliced_prediction when video frames are used as input.
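Before attributing the gap to get_sliced_prediction itself, it may help to confirm that both paths hand the model the same pixel data. The thread does not state the eventual cause, so the following is only an assumption worth ruling out: OpenCV decodes video frames in BGR channel order, and an image loaded through a different route may arrive as RGB. The sketch below compares two packed 8-bit, 3-channel buffers directly and with the channel order reversed; the function names are hypothetical, and the buffers could come from, for example, `ndarray.tobytes()` on two same-sized uint8 arrays.

```python
from typing import Tuple


def mean_abs_diff(a: bytes, b: bytes) -> float:
    """Mean absolute per-byte difference between two equal-length buffers."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def reverse_channels(pixels: bytes) -> bytes:
    """Swap the first and third channel of packed 3-channel pixels (BGR <-> RGB)."""
    if len(pixels) % 3:
        raise ValueError("expected 3 bytes per pixel")
    out = bytearray(pixels)
    out[0::3], out[2::3] = pixels[2::3], pixels[0::3]
    return bytes(out)


def compare_inputs(frame: bytes, image: bytes) -> Tuple[float, float]:
    """Return (difference as-is, difference with the frame's channels reversed)."""
    return mean_abs_diff(frame, image), mean_abs_diff(reverse_channels(frame), image)


# Two hypothetical pixels: the frame in BGR order, the image in RGB order.
frame = bytes([10, 20, 200, 30, 40, 250])
image = bytes([200, 20, 10, 250, 40, 30])
direct, reversed_order = compare_inputs(frame, image)
print(direct, reversed_order)
```

If the second value is much smaller than the first, the two paths disagree on channel order. With lossy formats such as JPEG, a small nonzero difference is expected even when the order matches.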
I have tested on both Windows and Linux. The same result occurs on both platforms.
My environment (relevant packages on Windows):
On Windows I am using Python 3.9.13 and CUDA 12.2.
On Linux I am using Python 3.10.11 and CUDA 12.4 in an Anaconda environment.