Adding chapter for object detection. #202

Merged
merged 6 commits into from
Feb 12, 2024
Changes from 2 commits
63 changes: 63 additions & 0 deletions chapters/en/Unit 6 - Basic CV Tasks/object_detection.mdx
@@ -0,0 +1,63 @@
# Object Detection:

In this guide, we'll explore the fascinating world of object detection—a vital component in modern computer vision systems. We'll demystify essential concepts, discuss popular methods, examine applications, and touch upon evaluation metrics. By the end, you'll have a solid foundation and be ready to venture further into advanced topics.

Collaborator (@michaelshekasta, Feb 5, 2024):

I think that we shouldn't use "guide"

Contributor Author:

Cool, I will change that.


Collaborator:

Suggested change
In this guide, we'll explore the fascinating world of object detection, a vital component in modern computer vision systems. We'll demystify essential concepts, discuss popular methods, examine applications, and touch upon evaluation metrics. By the end, you'll have a solid foundation and be ready to venture further into advanced topics.
In this guide, we will explore the fascinating world of object detection, a vital component in modern computer vision systems. We will break down essential concepts, discuss popular methods, examine applications, and touch upon evaluation metrics. By the end, you will have a solid foundation and be ready to venture further into advanced topics.

|![Object Detection](https://huggingface.co/datasets/hf-vision/course-assets/resolve/main/Object_Detection.png)|
|:--:|
| *Image displaying the bounding boxes around multiple objects in the frame along with the confidence score of their classification* |

Collaborator:

you can write this in alt text instead of making this in a table format

Contributor Author:

@merveenoyan , I wanted it to be like a caption to the image, alt text won't be visible I believe. What do you think?

Collaborator:

I think when people scroll on it they'll see so it should be ok

## 1. Object Detection Overview

### 1.1 Introduction

Object detection empowers computers to identify and locate specific objects within digital images or video frames. It has far-reaching implications across diverse sectors, including self-driving cars, facial recognition systems, and medical diagnosis tools.

### 1.2 Classification vs Localization

Classification distinguishes objects based on their unique attributes, while localization determines an object's location within an image. Object detection combines both approaches, encapsulating identified entities with bounding boxes and assigning corresponding class labels. Imagine recognizing different fruit types and pinpointing their exact locations in a single image. That's object detection at play!

Collaborator:

Not all object detection involves bounding boxes. It is not unusual (at least in the biomedical field) to predict points within regions and centroids.


Collaborator:

Can we have a pipeline snippet here just like the page for image segmentation?
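
A pipeline snippet along the following lines could illustrate how detection combines classification (a label and score) with localization (a box). This is only a minimal sketch: it assumes the `transformers` library (and `timm`, which this checkpoint needs) is installed, uses `facebook/detr-resnet-50` as an example checkpoint, and introduces the hypothetical helper names `detect_objects` and `describe`. The sample detection at the bottom is made-up placeholder data showing the shape of the pipeline output, not a real result.

```python
def detect_objects(image, threshold=0.9):
    """Run an object-detection pipeline on an image path, URL, or PIL image."""
    # Imported lazily so `describe` below can be used without transformers installed.
    from transformers import pipeline

    # Assumption: example checkpoint; the first call downloads the model weights.
    detector = pipeline("object-detection", model="facebook/detr-resnet-50")
    return detector(image, threshold=threshold)


def describe(detections):
    """Format pipeline-style results as 'label: score at (xmin, ymin, xmax, ymax)'."""
    lines = []
    for det in detections:
        box = det["box"]
        lines.append(
            f"{det['label']}: {det['score']:.2f} at "
            f"({box['xmin']}, {box['ymin']}, {box['xmax']}, {box['ymax']})"
        )
    return lines


# Placeholder detection in the pipeline's output format (illustrative values only).
sample = [
    {"score": 0.987, "label": "apple",
     "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 140}},
]
for line in describe(sample):
    print(line)
```

To run on a real image, call `describe(detect_objects("path/to/image.jpg"))`, where the path is whatever image you have at hand.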

## 2. DETR Model

### 2.1 DETR's Approach

The DEtection TRansformer [(DETR)](https://huggingface.co/docs/transformers/model_doc/detr) model reframed object detection as a direct set prediction problem, removing hand-designed components such as anchor boxes and non-maximum suppression. Using a Transformer encoder-decoder, DETR reasons about the whole image at once and predicts all objects in a single pass, which simplifies the pipeline while remaining competitive with a well-tuned Faster R-CNN baseline on COCO. Envision scanning a cluttered desk filled with random objects. Traditionally, we would inspect small regions meticulously. With DETR, we take in the entire scene at once and identify and locate every object, much as a person glancing at the desk would.
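
To make "set prediction" a little more concrete: during training, DETR pairs each ground-truth object with exactly one prediction through bipartite matching, and leftover predictions are treated as "no object". The sketch below illustrates that idea only. It is not DETR's implementation: it brute-forces the assignment instead of using the Hungarian algorithm, and its cost (a label-mismatch penalty plus L1 box distance) is a simplification of DETR's actual matching cost. The names `match_cost` and `best_matching` and the toy boxes are hypothetical.

```python
from itertools import permutations


def match_cost(pred, target):
    """Simplified cost: 1.0 for a label mismatch plus L1 distance between boxes."""
    label_cost = 0.0 if pred["label"] == target["label"] else 1.0
    l1_cost = sum(abs(p - t) for p, t in zip(pred["box"], target["box"]))
    return label_cost + l1_cost


def best_matching(preds, targets):
    """Return (assignment, cost), where assignment[i] is the prediction index
    matched to target i and each prediction is used at most once.

    Brute force over permutations: fine for a handful of boxes, too slow for
    real use. Assumes len(preds) >= len(targets).
    """
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(best), best_cost


# Toy data: boxes as normalized (center_x, center_y, width, height).
preds = [
    {"label": "cup", "box": (0.6, 0.6, 0.2, 0.2)},
    {"label": "pen", "box": (0.1, 0.1, 0.1, 0.3)},
    {"label": "cup", "box": (0.9, 0.9, 0.1, 0.1)},  # unmatched -> "no object"
]
targets = [
    {"label": "pen", "box": (0.1, 0.1, 0.1, 0.3)},
    {"label": "cup", "box": (0.6, 0.6, 0.2, 0.2)},
]
assignment, total_cost = best_matching(preds, targets)
print(assignment, total_cost)
```

Because every target gets a unique prediction, duplicate boxes for the same object are penalized during training rather than filtered out afterwards.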

## 3. Object Detection Techniques

### 3.1 YOLO: Single-Stage Detectors

Collaborator:

would be nice to have this as a separate chapter

Contributor Author:

Cool, will do that.


[YOLO](https://pjreddie.com/darknet/yolo/) (You Only Look Once) is a single-stage detector: it predicts boxes and classes for the entire input image in one forward pass rather than examining regions one at a time, which makes it fast. Later versions such as YOLOv2, YOLOv3, YOLOv4, and YOLOv5 build on their predecessors to improve speed and accuracy. YOLO is well suited to monitoring high-speed processes, e.g., assembly lines, where rapid identification matters.

### 3.2 Faster R-CNN: Two-Stage Detectors

Collaborator:

Same for this one

Contributor Author:

Okay, will make section 3 as a new chapter.


[Faster R-CNN](https://arxiv.org/abs/1506.01497) improves significantly on earlier approaches such as R-CNN and Fast R-CNN. It works in two stages, first generating region proposals and then classifying and refining them, and it integrates a Region Proposal Network (RPN) that shares convolutional features with the detector so that proposals are generated efficiently. Compared with its predecessors, Faster R-CNN substantially reduces the computational cost of proposal generation, making it suited to tasks that demand high precision. It is a good fit for challenging problems such as reading handwriting, where variation calls for more sophisticated algorithms.

## 4. Object Detection Applications

Object detection impacts numerous industries, offering valuable insights and automation opportunities. Representative examples include autonomous vehicles navigating roads, surveillance systems covering vast public spaces, healthcare imaging systems detecting diseases, manufacturing plants maintaining output consistency, and augmented reality enriching user experiences.

## 5. Evaluation Metrics

Evaluating object detection performance requires proper metrics:

### 5.1 Intersection over Union (IoU)

Intersection over Union (IoU) measures the overlap between predicted and actual bounding boxes as a percentage ranging from 0% to 100%. Higher IoU percentages indicate better alignments, i.e., improved accuracy. Useful when assessing tracker performance under changing conditions, e.g., following wild animals during migration.

Collaborator:

Suggested change
Intersection over Union (IoU) measures the overlap between predicted and actual bounding boxes as a percentage ranging from 0% to 100%. Higher IoU percentages indicate better alignments, i.e., improved accuracy. Useful when assessing tracker performance under changing conditions, e.g., following wild animals during migration.
Intersection over Union (IoU) measures the overlap between predicted and reference labels as a percentage ranging from 0% to 100%. Higher IoU percentages indicate better alignments, i.e., improved accuracy. Useful when assessing tracker performance under changing conditions, e.g., following wild animals during migration.
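
For bounding boxes, IoU is the area of the intersection divided by the area of the union. The following is a minimal sketch, assuming axis-aligned boxes given as `(xmin, ymin, xmax, ymax)`; the function name `iou` and the example boxes are illustrative choices, and the result is returned as a fraction between 0 and 1 rather than a percentage.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes get an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    # Degenerate (zero-area) boxes: define IoU as 0 to avoid dividing by zero.
    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)
reference = (5, 5, 15, 15)
# Intersection is 5 x 5 = 25, union is 100 + 100 - 25 = 175.
print(f"IoU = {iou(predicted, reference):.3f}")
```

Multiply the result by 100 to express it as the percentage used in the text above.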


### 5.2 Mean Average Precision (mAP)

Mean Average Precision (mAP) summarizes detection quality using both precision (the fraction of predicted objects that are correct) and recall (the fraction of ground-truth objects that are found). A prediction counts as correct when its IoU with a ground-truth object meets a chosen threshold. Average Precision (AP) is the area under the resulting precision-recall curve for one class, and mAP is the mean of AP over classes, often also averaged across several IoU thresholds. It is helpful when comparing detectors on a shared benchmark, e.g., ranking models on MS-COCO.

Collaborator:

could you change this to a computer vision example?
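
As a computer vision illustration of how precision and recall combine into mAP, the sketch below computes Average Precision for one class at a single IoU threshold and then averages over classes. It makes several assumptions: each detection has already been marked as a true or false positive (for example by an IoU test against the ground truth), the precision-recall curve is integrated with all-point interpolation, and the function names and toy detections are hypothetical. Benchmark toolkits differ in details such as interpolation and threshold averaging.

```python
def average_precision(detections, num_ground_truth):
    """AP for one class at one IoU threshold.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Interpolate: precision at each point becomes the max precision to its right.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class):
    """per_class maps class name -> (detections, num_ground_truth)."""
    aps = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy example: two classes with hand-labeled true/false positives.
per_class = {
    "cat": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "dog": ([(0.6, True)], 1),
}
print(f"mAP = {mean_average_precision(per_class):.3f}")
```

For the "cat" class the interpolated precision is 1.0 up to recall 0.5 and 2/3 from there to recall 1.0, giving an AP of 5/6; "dog" has a single correct detection and an AP of 1.0.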


## 6. Hands-On Tutorial: Implementing DETR for Object Detection

To get started with DETR, follow along with [this](https://github.com/johko/computer-vision-course/blob/main/notebooks/Unit%203%20-%20Vision%20Transformers/Fine-tuning%20Vision%20Transformers%20for%20Object%20detection.ipynb) detailed tutorial, walking you through installation, dataset preparation, model training, and visualization steps.

## 7. Conclusion and Future Work

Understanding object detection lays the groundwork for more advanced computer vision techniques and for building accurate, robust solutions to demanding problems. Future research areas include developing lightweight object detection models that are fast and easy to deploy. Object detection in 3D space, e.g., for augmented reality applications, is another avenue to explore.

## 8. References and Additional Resources

Collaborator:

adding common datasets?

Contributor Author:

In the references? Datasets like VOC, MS-COCO?

Collaborator:

@johko What do you think about it?


- [Hugging Face Object Detection Guide](https://huggingface.co/docs/transformers/tasks/object_detection)
- [Object Detection in 20 Years: A Survey](https://arxiv.org/abs/1905.05055)
- [Papers with Code - Real-Time Object Detection](https://paperswithcode.com/task/real-time-object-detection)
- [Papers with Code - Object Detection](https://paperswithcode.com/task/object-detection)