
v0.4.1: Enables AMP training and adds support for artefact object detection

@fg-mindee released this 22 Nov 11:22 · 74ff9ff

This patch release brings support for AMP in PyTorch training to docTR, along with artefact object detection.

Note: doctr 0.4.1 requires either TensorFlow 2.4.0 or PyTorch 1.8.0.

Highlights

Automatic Mixed Precision (AMP) ⚡

Training scripts with the PyTorch backend now benefit from AMP to reduce the RAM footprint and potentially increase the maximum batch size! This comes in especially handy for text detection, which requires high spatial resolution inputs! A minimal sketch of the AMP pattern is shown below.
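For context, here is a minimal sketch of the standard PyTorch AMP pattern (torch.cuda.amp) that such training loops rely on. The model, data and loss below are dummies purely for illustration, not docTR's actual training code:

```python
import torch
from torch.cuda.amp import GradScaler, autocast

# Dummy model, optimizer & data for illustration (requires a CUDA device)
model = torch.nn.Linear(32, 2).cuda()
optimizer = torch.optim.Adam(model.parameters())
scaler = GradScaler()  # dynamically scales the loss to avoid float16 underflow
loader = [(torch.randn(8, 32, device="cuda"), torch.randn(8, 2, device="cuda")) for _ in range(10)]

for images, targets in loader:
    optimizer.zero_grad()
    with autocast():  # forward pass runs in mixed float16/float32 precision
        loss = (model(images) - targets).pow(2).mean()
    scaler.scale(loss).backward()  # backpropagate on the scaled loss
    scaler.step(optimizer)         # unscale gradients, then update weights
    scaler.update()                # adjust the scale factor for the next step
```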

Artefact detection 🛸

Document understanding goes beyond textual elements, as information can be encoded in other visual forms. For this reason, we have extended the range of supported tasks by adding object detection. This will be focused on non-textual elements in documents, including QR codes, barcodes, ID pictures, and logos.

Here are some early results:

(image: 2×3 grid of artefact detection samples)

This release comes with a training & validation dataset, DocArtefacts, and a reference training script. Keep an eye out for the models we will be releasing in the next version!
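Assuming DocArtefacts follows the same constructor pattern as the other docTR datasets (such as FUNSD, shown in the breaking changes below), loading it should look like this sketch:

```python
from doctr.datasets import DocArtefacts

# Download & load both splits (constructor pattern assumed from other docTR datasets)
train_set = DocArtefacts(train=True, download=True)
val_set = DocArtefacts(train=False, download=True)

img, target = train_set[0]  # one sample: an image and its object detection target
print(f"{len(train_set)} training samples, {len(val_set)} validation samples")
```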

Get more of docTR with Colab tutorials 📖

You've been waiting for it: from now on, we will be regularly adding new tutorials for docTR in the form of Jupyter notebooks that you can open and run locally or on Google Colab!

Check the new page in the documentation for an up-to-date list of all our community notebooks: https://mindee.github.io/doctr/latest/notebooks.html

Breaking changes

Deprecated support of FP16 for datasets

Floating-point precision can be leveraged in deep learning to decrease the RAM footprint of training. The common data type float32 has a lower-precision counterpart, float16, which is usually only supported on GPU for common deep learning operations. Initially, we were planning to make all our operations available in both precisions to reduce memory footprint.

However, with the latest developments in deep learning frameworks and their Automatic Mixed Precision mechanisms, this is no longer required and only adds constraints on the development side. We have thus deprecated this feature from our datasets and predictors:

0.4.0

```python
>>> from doctr.datasets import FUNSD
>>> ds = FUNSD(train=True, download=True, fp16=True)
>>> print(getattr(ds, "fp16"))
True
```

0.4.1

```python
>>> from doctr.datasets import FUNSD
>>> ds = FUNSD(train=True, download=True)
>>> print(getattr(ds, "fp16"))
None
```

Detailed changes

New features

Bug fixes

Improvements

New Contributors

Our thanks & warm welcome to the following persons for their first contributions: @mzeidhassan @K-for-Code @felixdittrich92 @SiddhantBahuguna @RBMindee @thentgesMindee 🙏

Full Changelog: v0.4.0...v0.4.1