[models] Add detection & recognition models with much lighter backbones #255

fg-mindee · 2021-05-12T09:57:26Z

Compared with similar vision tasks, the segmentation part of text detection is easier than real-life scene segmentation (on Cityscapes, for instance), which uses ResNet-like backbones, and even easier than medical segmentation, which often uses lighter feature extractors (cf. UNet).

Additionally, the feature extraction in text recognition is similar to image classification of single-character images (like MNIST, but with more classes), which uses very light backbones.

So here is a suggestion:

train a very light image classifier for character classification on a specific vocab, to be used as a pretrained backbone for text recognition (feat: Added pretrained PyTorch mobilenets #415; feat: add mobilenet weights for character classification (TF) #421)
train a text recognition model with this backbone (feat: Added rectangular stride MobileNets #483; feat: add pytorch ckpts for crnn & mobilenet_v3_large #487; feat: add crnn_mobilenet_v3_small tf ckpts #517; feat: add crnn_mobilenet small & large ckpts for torch backend #516)
train a light text segmentation model (feat: add db_mobilenet_v3_large (TF) ckpt & benchmark #485; feat: add pytorch ckpts for crnn & mobilenet_v3_large #487)
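To give a rough sense of why MobileNet-style backbones count as "much lighter", here is a minimal sketch comparing the weight count of a standard convolution with that of a depthwise separable one, the building block such backbones rely on. The layer sizes (3x3 kernel, 128 input channels, 256 output channels) and the function names are assumptions chosen for illustration, not values taken from the models referenced above.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print(standard, separable, round(standard / separable, 1))
```

For this single layer the separable variant needs roughly one ninth of the weights; the actual savings for a full detection or recognition model depend on the whole architecture.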

This means that DocTR will extend the list of supported tasks by adding:

character classification (feat: Added character classification training scripts #414)
text segmentation
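As a small illustration of what "character classification on a specific vocab" implies for the label space, the sketch below maps each character of a vocab to one class index, in the same way MNIST maps digits to ten classes. The vocab string and the helper names are hypothetical and are not taken from the project.

```python
import string
from typing import Dict, List

# Hypothetical vocab for illustration; not the vocab used by the project.
VOCAB = string.digits + string.ascii_lowercase


def build_char_to_index(vocab: str) -> Dict[str, int]:
    """Assign one class index per character, in vocab order."""
    if len(set(vocab)) != len(vocab):
        raise ValueError("vocab contains duplicate characters")
    return {char: idx for idx, char in enumerate(vocab)}


def encode_labels(chars: str, char_to_index: Dict[str, int]) -> List[int]:
    """Turn a sequence of single-character labels into class indices."""
    return [char_to_index[char] for char in chars]


char_to_index = build_char_to_index(VOCAB)
num_classes = len(char_to_index)  # size of the classifier's output layer
labels = encode_labels("a1z", char_to_index)
print(num_classes, labels)
```

With this mapping, the classifier head has one output per vocab character, so changing the vocab changes the number of classes.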

fg-mindee · 2021-09-30T15:36:32Z

Closed by #516 & #517

fg-mindee added the type: enhancement (Improvement), help wanted (Extra attention is needed), and module: models (Related to doctr.models) labels May 12, 2021

fg-mindee added this to the 0.3.0 milestone May 12, 2021

fg-mindee modified the milestones: 0.3.0, 0.4.0 Jul 1, 2021

fg-mindee self-assigned this Jul 1, 2021

charlesmindee mentioned this issue Aug 11, 2021

feat: add MNIST-like characters dataset generator #408

Closed

charlesmindee pinned this issue Aug 11, 2021

charlesmindee unpinned this issue Aug 11, 2021

fg-mindee mentioned this issue Aug 13, 2021

feat: Added pretrained PyTorch mobilenets #415

Merged

fg-mindee added the topic: text detection (Related to the task of text detection), topic: text recognition (Related to the task of text recognition), and topic: character classification (Related to the task of character classification) labels Aug 25, 2021

fg-mindee closed this as completed Sep 30, 2021
