[models] Add detection & recognition models with much lighter backbones #255
Labels: help wanted, module: models, topic: character classification, topic: text detection, topic: text recognition, type: enhancement
Compared with similar vision tasks, the segmentation part of text detection is easier than real-life scene segmentation (on Cityscapes, for instance), which uses ResNet-like backbones, and even easier than medical segmentation, which often uses lighter feature extractors (cf. UNet).
Additionally, the feature extraction in text recognition is similar to image classification of a single-character image (MNIST, but with more classes), which uses very light backbones.
So here is a suggestion:
This means that DocTR will extend the list of supported tasks by adding: