Feat: Make detection training and inference Multiclass #1097
Conversation
Test in CI
After release? 😅
@felixdittrich92 Yes, it seems weird written like that haha.
@aminemindee following @felixdittrich92's remark, I just saw that we need to enforce docTR to be the latest version in the
@odulcy-mindee @aminemindee you also need to increase the major version specifier in the root pyproject.toml (breaking change) and sort the milestones/open tickets after the merge/release :)
Hi @aminemindee @odulcy-mindee (it seems you have both worked on this 😅), thanks for the PR! I added a first general review without going too deep and left some questions and remarks :)
Due to the breaking change and the amount of changes, someone should have a second look at it. @frgfm?
@@ -25,6 +25,11 @@ def _read_sample(self, index: int) -> Tuple[torch.Tensor, Any]:
        if isinstance(target, dict):
            assert "boxes" in target, "Target should contain 'boxes' key"
            assert "labels" in target, "Target should contain 'labels' key"
        elif isinstance(target, tuple):
We have to be careful with this; currently we have two cases:
- more than one annotation: dict (OCR dataset with labels and boxes, or object detection dataset)
- otherwise: a label (recognition) or boxes (text detection)
We need to be as clear as possible here so we don't run into problems with the transformations.
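For illustration, a minimal sketch of the two target shapes being discussed (the classification helper and its return strings are hypothetical; the real `_read_sample` logic lives in the dataset classes):

```python
import numpy as np

def describe_target(target):
    """Classify a sample target into the two cases above (illustrative only)."""
    if isinstance(target, dict):
        # more than one annotation: OCR dataset (labels + boxes) or object detection
        assert "boxes" in target, "Target should contain 'boxes' key"
        assert "labels" in target, "Target should contain 'labels' key"
        return "multi-annotation dict"
    if isinstance(target, str):
        return "recognition label"
    if isinstance(target, np.ndarray):
        return "detection boxes"
    raise TypeError(f"unexpected target type: {type(target)}")
```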
For now, the only case where the target is a tuple comes from the detection dataset when there are multiple classes, to keep that information and include it in the target in _pre_transform.
Yes, we should be careful, but for now this is the only case where it will be used.
@@ -25,6 +25,11 @@ def _read_sample(self, index: int) -> Tuple[tf.Tensor, Any]:
        if isinstance(target, dict):
            assert "boxes" in target, "Target should contain 'boxes' key"
            assert "labels" in target, "Target should contain 'labels' key"
        elif isinstance(target, tuple):
same
I still see this as troublesome (transformations); it would be better to keep a dict for everything that has multiple annotations 🤔
@@ -10,13 +10,16 @@
import os
import sys


CLASS_NAME: str = "words"
This is a text-detection-specific attribute.
We have to add it to the cfg of each text detection model, as is already done for object detection, and then work with model.cfg['class_names'] afterwards:
"classes": ["background", "qr_code", "bar_code", "logo", "photo"],
This also makes the names flexible for training.
NOTE: this change will also need an update of the HF API (factory) and of the config.json for uploaded text detection models.
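A minimal sketch of that lookup, assuming a plain cfg dict with a "class_names" key (the key name and the helper are assumptions, not doctr's current API):

```python
# Hypothetical default, mirroring the hardcoded CLASS_NAME discussed above.
DEFAULT_CLASS_NAMES = ["words"]

def get_class_names(cfg):
    """Return the detection class names from a model cfg, falling back to the
    single default class when the model was not trained with explicit classes."""
    return list(cfg.get("class_names", DEFAULT_CLASS_NAMES))
```

A multiclass checkpoint could then simply ship its own "class_names" list in its config.json.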
I think it is OK to have this default single class and otherwise use the classes of the dataset to build the model, but the user should be able to set this CLASS_NAME to a specific value like "words", "numbers", "default_class", "handwritten"... So it shouldn't be hardcoded here; I rather see it as a parameter given by the user, either in the config or when building the model. WDYT?
Hi @charlesmindee, yep, sounds good to me 👍
I see it as much more problematic to modify the blocks output structure. The results should be added to each predicted word, like:
'house', conf: '0.99', class: 'word'
Instead of:
Blocks:
'word': { Line .. Words .. },
'salary': { Line .. Words .. }
WDYT?
(Written from my mobile phone, sorry 😅)
doctr/io/elements.py
Outdated
@@ -311,65 +316,66 @@ def export_as_xml(self, file_title: str = "docTR - XML export (hOCR)") -> Tuple[
            },
        )
        # iterate over the blocks / lines / words and create the XML elements in body line by line with the attributes
        for block in self.blocks:
It has been a while since I wrote this export-as-XML functionality 😅, but we should test that it still works as expected with the PDF/A notebook.
I looked at it a little bit. In Google Colab, the installation using git doesn't seem to work for me. However, the code should work; the output will just be slightly different, I think.
val_metric.update(gts=boxes_gt, preds=boxes_pred[:, :4])
for target, loc_pred in zip(targets, loc_preds):
    if isinstance(target, np.ndarray):
        target = {CLASS_NAME: target}
Shouldn't this name come from the names specified by the dataset? 🤔
For the training part, you can still use a data format where you don't give any class name; in that case, targets are not a dict, so I change them here. But now that I'm looking at it, I think it's better to just add the CLASS_NAME in the dataset and not handle it outside. Right?
I think we need to avoid this CLASS_NAME completely. Class names should come from the dataset and update the default class_names in model.cfg. To explain: a box can be annotated with different labels, like box 1: name, box 2: salary. I do not fully understand why we need to modify the blocks output structure in the end.
I think what we want is this:
(words): [
    Word(value='ABC Gmbh.', confidence=0.91, class_name='name'),
    Word(value='1200', confidence=0.99, class_name='salary'),
]
And not something like:
(blocks): {'salary':
    (lines): [Line(
        (words): [
            Word(value='1200.', confidence=0.91),
        ]
    )}, {'name': ....
]
Right?
The multiclass detection model predicts bounding boxes that represent the blocks, so in theory the whole block is of that class, and so are the words inside it. A user will want to get all the words of the same block of a given class, and they can get that directly with the blocks-as-dict approach. Different words in the same block will always have the same class, too.
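The two views are not mutually exclusive: if each word carried a hypothetical class_name field, the per-class dict view could be derived by grouping. A sketch with plain dicts standing in for Word objects (the field names are assumptions):

```python
from collections import defaultdict

def group_words_by_class(words):
    """Build the per-class dict view from a flat word list (illustrative only)."""
    grouped = defaultdict(list)
    for word in words:
        grouped[word["class_name"]].append(word)
    return dict(grouped)
```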
val_metric.update(gts=boxes_gt, preds=boxes_pred[:, :4])
for target, loc_pred in zip(targets, loc_preds):
    if isinstance(target, np.ndarray):
        target = {CLASS_NAME: target}
same ?
doctr/io/elements.py
Outdated
@@ -215,7 +220,7 @@ class Page(Element):
    """Implements a page element as a collection of blocks

    Args:
        blocks: list of block elements
        blocks: Dictionary with list of block elements for each detection class
I am still trying to find a solution to avoid this breaking change, but a short explanation would help me understand the use case for multiclass text detection, or rather what you want to achieve with it 😅
Is it not possible to add another key/value pair like 'class_names': 'words' instead of changing the output structure?
Hmm, something like a class_names list that has the same length as blocks and gives the class name of each block?
Mh, are you sure that block is the correct level? If I understood it correctly, each box can have a different label, so it is the words part and not the block (blocks can contain multiple lines, which can contain multiple words):
Document(
(pages): [Page(
dimensions=(340, 600)
(blocks): [Block(
(lines): [Line(
(words): [
Word(value='No.', confidence=0.91),
Word(value='RECEIPT', confidence=0.99),
Word(value='DATE', confidence=0.96),
]
)]
(artefacts): []
)]
)]
)
Actually blocks are not resolved, so this sub-object in the Page might be obsolete. We could either rename it to Layer to have a more meaningful page structure and keep it like that, or add a Layer object between Page and Block in the hierarchy, so that a class would be on a layer and inside each layer we can resolve blocks and lines.
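The Layer proposal could look roughly like this (hypothetical dataclasses sketching the hierarchy, not doctr's actual Element classes):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Word:
    value: str
    confidence: float

@dataclass
class Line:
    words: List[Word] = field(default_factory=list)

@dataclass
class Block:
    lines: List[Line] = field(default_factory=list)

@dataclass
class Layer:
    # one layer per detection class; blocks and lines are resolved inside each layer
    class_name: str
    blocks: List[Block] = field(default_factory=list)

@dataclass
class Page:
    layers: List[Layer] = field(default_factory=list)
```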
img, target = self.sample_transforms(img, target)
if isinstance(target, dict):
    img_transformed = copy_tensor(img)
    for class_name, bboxes in target.items():
An OCRDataset also comes with a dict as target and has no class names (string labels and bbox annotations), so if we transform samples in that case, this should normally fail.
Yes, that's true 😨 I'm trying to think how to handle it. Open to any suggestions x)
As described in the other comments, if we handle it like object detection (classes come from model.cfg, which is flexible and can be updated with classes from the dataset), we could remove this.
Maybe thinking again about this dataset format:
polygons: [....]
classes: ['printed', 'handwritten', 'printed', 'printed', ...] ? @frgfm @odulcy-mindee @charlesmindee wdyt ?
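Under that format, a dataset could regroup the parallel lists into one polygon list per class before building the detection target. A sketch under that assumption (the target layout is not doctr's current one):

```python
from collections import defaultdict

def to_per_class_target(polygons, classes):
    """Group parallel polygons/classes lists into {class_name: [polygons]}."""
    assert len(polygons) == len(classes), "expected one class label per polygon"
    per_class = defaultdict(list)
    for polygon, class_name in zip(polygons, classes):
        per_class[class_name].append(polygon)
    return dict(per_class)
```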
I think we should split the topics of this PR and discuss them in several issues/discussions to find a proper implementation. There are several breaking changes in this PR over quite a lot of files; it's quite hard to ensure the review quality will be both high and exhaustive 😅
I added another check, in addition to the dict check, to ensure the target is the one coming from detection.
references/detection/README.md
Outdated
@@ -57,7 +57,30 @@ labels.json
    ...
}
```
If you want to train a model with multiple classes, you can use the following format
This should be a bit clearer: does multiclass mean box a ['printed', 'handwritten'], or box a ['printed'] and box b ['handwritten']?
OK, I think I got it now 😅
For example, class_name_1 == 'printed' contains all boxes for printed, and class_name_2 == 'handwritten' contains all boxes for handwritten.
Maybe a better format:
polygons: [....]
classes: ['printed', 'handwritten', 'printed', 'printed', ...]
So we can avoid introducing a new dataset format; then it should also be possible to avoid the breaking change and only add a new key/value pair like 'class': 'printed'. @aminemindee @odulcy-mindee wdyt ?
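For comparison, the two candidate per-image annotation layouts could look like this (both hypothetical; neither is the current labels.json schema):

```python
# (a) one polygon list per class name
per_class_entry = {
    "printed": [[[0.1, 0.1], [0.4, 0.1], [0.4, 0.2], [0.1, 0.2]]],
    "handwritten": [[[0.5, 0.5], [0.8, 0.5], [0.8, 0.6], [0.5, 0.6]]],
}

# (b) parallel lists: one class label per polygon, keeping the existing
# single polygon list untouched
parallel_entry = {
    "polygons": [
        [[0.1, 0.1], [0.4, 0.1], [0.4, 0.2], [0.1, 0.2]],
        [[0.5, 0.5], [0.8, 0.5], [0.8, 0.6], [0.5, 0.6]],
    ],
    "classes": ["printed", "handwritten"],
}
```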
Hello @felixdittrich92, the general use case of multiclass text detection is when you want to find not just the words in a document but some specific information. For example, in a receipt from a restaurant, you may want to find the name of the restaurant and the total you paid. Then you can have two classes and predict different bounding boxes for each class.
OK, got it. So in the end it is like the LayoutLM model (without the text input).
Codecov Report
@@ Coverage Diff @@
## main #1097 +/- ##
==========================================
- Coverage 95.23% 94.97% -0.26%
==========================================
Files 142 146 +4
Lines 6017 6389 +372
==========================================
+ Hits 5730 6068 +338
- Misses 287 321 +34
Flags with carried forward coverage won't be shown. Click here to find out more.
Hello everyone 👋
Thanks a lot for the PR! It's only my opinion, but I think we have too many things at stake in this PR to be able to say whether it's OK to merge or not. I would suggest pausing this one shortly and opening GH issues for the following topics:
- dataset target format modification for multiple class of text/objects
- versioning of docTR (when is 1.0.0 within reach, etc.)
- minimal API modifications
- export format modifications
- detection model modifications
- training/evaluation
I might have forgotten a few
Once a topic has reached a good implementation idea / way to handle it, we can cherry pick from this PR or reimplement it :)
I already added comments for the section I checked, but then thought there is too much to handle in a single PR 😅 If you prefer to go with that one in a single PR, no worries!
Cheers ✌️
api/pyproject.toml
Outdated
version = "0.5.2a0"
version = "1.0.1a0"
I'd suggest reverting back
Actually, I realize you want to bump docTR to 1.0.0. Perhaps it might be easier to discuss this in an issue and lower the amount of modifications in this specific PR?
api/pyproject.toml
Outdated
python = ">=3.8,<3.11"
python = ">=3.8.2,<3.11"
Could you add a comment to specify the reason for the minor version bump? (It is much easier to reassess later when we have a comment saying why.)
Yes, good idea! Added it.
api/pyproject.toml
Outdated
python-doctr = ">=0.2.0"
python-doctr = { version = ">=1.0.0", extras = ['tf'] }
Correct me if I'm wrong, but I think this will result in double version specifiers. If we go with extras, we need to remove the "tensorflow" & co. specifiers.
I don't think I understand your comment; can you be more specific?
We added the extras because now, if you don't install the "tf" extra of doctr, you don't have the tf2onnx library and the like, which are needed to run the code.
Yes, sure, I wasn't referring to the extra but to the version specifier: you switched it from >=0.2.0 to >=1.0.0, which is a big deal 😅
Regarding the extra, I'd argue it's up for discussion, but the API doesn't need to convert models to ONNX.
Ideally, we'd like it to run on ONNX, but some postprocessing ops are not compatible yet. So the alternative is to use doctr, and then carefully add as few dependencies as possible to run it.
I agree, but for now, the way doctr is set up, tf2onnx is needed to run the code in the API.
doctr/datasets/datasets/pytorch.py
Outdated
elif isinstance(target, tuple):
    assert isinstance(target[0], str) or isinstance(
        target[0], np.ndarray
    ), "first element of the tuple should be a string or a numpy array"
    assert isinstance(target[1], list), "second element of the tuple should be a list"
The problem with this type of condition is that if the target is a tuple with a single element, the code will crash. Either we have to add assert len(target) == 2, or change that a bit.
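A hardened version of the check, with the length assertion first so a one-element tuple fails with a clear message instead of an IndexError (a sketch; the helper name and message strings are illustrative):

```python
import numpy as np

def check_tuple_target(target):
    """Validate a tuple target before unpacking its two elements."""
    assert isinstance(target, tuple), "target should be a tuple"
    assert len(target) == 2, "tuple target should have exactly two elements"
    assert isinstance(
        target[0], (str, np.ndarray)
    ), "first element of the tuple should be a string or a numpy array"
    assert isinstance(target[1], list), "second element of the tuple should be a list"
```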
Added it.
A few remarks, but minor
doctr/models/_utils.py
Outdated
@@ -161,3 +161,18 @@ def get_language(text: str) -> Tuple[str, float]:
    if len(text) <= 1 or (len(text) <= 5 and lang.prob <= 0.2):
        return "unknown", 0.0
    return lang.lang, lang.prob


def invert_between_dict_of_lists_and_list_of_dicts(
Maybe invert_data_structure, with a clear docstring that explains the typing? So that we keep a short and clear name?
Sounds good :)
Done
Hello @frgfm, I understand that this PR is kind of big, with a lot of changes. The goal is to have it available as soon as possible. I know that the review won't be exhaustive, but we can see this 1.0.0 release as experimental, and once it's there we can improve things. After this PR, I will work on improving the training part (changing losses maybe, and also adding some augmentations and things of the sort).
Hi @aminemindee :)
I could agree with your suggestion if it were 0.7.0, but 1.0.0 in software dev, at least to the best of my experience and knowledge, is a mark of stability of the API (in the broad sense) and a performance milestone. So jumping from 0.6.0 to 1.0.0 looks a bit hasty to me (even the 0.7.0 release tracker is full of things that would be worth considering before releasing a 1.0.0). To put this into context, the day the community saw PyTorch jump from 0.4.0 to 1.0.0, it was because they added the C++ API and the JIT compiler (which were their targets for the first 1.0.0 release); we knew that it could now be considered for running in production. Here, at least, I think the release roadmap should follow a structure readable by the community (cf. release trackers) 😅
Sounds good 👍 |
About the versioning, I agree with @frgfm: 1.0.0 should be the first stable (production-ready) version specifier. (See line 57 in acb9f64.)
I don't think adding an attribute [...]. I think the main problem here is that, once the detection model is not trying to detect all the words in a page, the fact that [...]. So I agree with @charlesmindee, we can add a new object [...].
Mh .. we should keep in mind that this is a feature task ... the core is still OCR (where a user wants to get all the extracted text at once), so I think it would be totally OK if the user needs to iterate through the results ... but yeah, that's my opinion 😅 let's wait on @frgfm @charlesmindee 👍
Hello again! After discussing it internally, I have a new suggestion to add multiclass detection without a big breaking change: I added a new kie_predictor (like ocr_predictor) that handles the different classes and has a dict-like output, while keeping ocr_predictor as it was. So, a summary of the changes:
WDYT @felixdittrich92 @odulcy-mindee @frgfm @charlesmindee ?
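To make the contrast concrete, the two output shapes could be sketched like this with plain dicts (hypothetical shapes only; the real predictors return Document objects, and the kie_predictor key names are assumptions):

```python
# ocr_predictor keeps its flat page structure (unchanged)
ocr_page = {
    "blocks": [
        {"lines": [{"words": [{"value": "1200", "confidence": 0.99}]}]},
    ],
}

# kie_predictor returns one word list per detection class
kie_page = {
    "predictions": {
        "name": [{"value": "ABC Gmbh.", "confidence": 0.91}],
        "salary": [{"value": "1200", "confidence": 0.99}],
    },
}
```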
Hi @aminemindee, thanks for the changes. I added a few comments, but in general there are some questions/remarks, and the PR is still much too big to review it carefully 😅
…lements for kie predictor (#6)
* feat: ✨ add load backbone
* feat: change kie predictor out
* fix new elements for kie, dataset when class is empty and fix and add tests
* fix api kie route
* fix evaluate kie script
* fix black
* remove commented code
* update README
Hello, this PR is to make the detection part of docTR multiclass. blocks of class Page is no longer a list of blocks but a dictionary with keys being the class names. A first review of the code was done on my fork; you can find it here: aminemindee#4
Any feedback is welcome!