Support for ICDAR dataset format #2866

Merged
merged 17 commits into from
Mar 26, 2021
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -30,6 +30,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [Market-1501](https://www.aitribune.com/dataset/2018051063) format support (<https://github.com/openvinotoolkit/cvat/pull/2869>)
- Ability to upload a manifest for a dataset with images (<https://github.com/openvinotoolkit/cvat/pull/2763>)
- Annotations filters UI using react-awesome-query-builder (https://github.com/openvinotoolkit/cvat/issues/1418)
- [ICDAR](https://rrc.cvc.uab.es/?ch=2) format support (<https://github.com/openvinotoolkit/cvat/pull/2866>)

### Changed

1 change: 1 addition & 0 deletions README.md
@@ -65,6 +65,7 @@ For more information about supported formats look at the
| [WIDER Face](http://shuoyang1213.me/WIDERFACE/) | X | X |
| [VGGFace2](https://github.com/ox-vgg/vgg_face2) | X | X |
| [Market-1501](https://www.aitribune.com/dataset/2018051063) | X | X |
| [ICDAR13/15](https://rrc.cvc.uab.es/?ch=2) | X | X |

## Deep learning serverless functions for automatic labeling

90 changes: 80 additions & 10 deletions cvat/apps/dataset_manager/formats/README.md
@@ -23,6 +23,7 @@
- [WIDER Face](#widerface)
- [VGGFace2](#vggface2)
- [Market-1501](#market1501)
- [ICDAR13/15](#icdar)

## How to add a new annotation format support<a id="how-to-add"></a>

@@ -817,17 +818,17 @@ Downloaded file: a zip archive of the following structure:
```bash
# if we save images:
taskname.zip/
── label1/
├── label1_image1.jpg
└── label1_image2.jpg
── label1/
| ├── label1_image1.jpg
| └── label1_image2.jpg
└── label2/
├── label2_image1.jpg
├── label2_image3.jpg
└── label2_image4.jpg

# if we keep only annotation:
taskname.zip/
── <any_subset_name>.txt
── <any_subset_name>.txt
└── synsets.txt

```
@@ -849,12 +850,12 @@ Downloaded file: a zip archive of the following structure:
```bash
taskname.zip/
├── labelmap.txt # optional, required for non-CamVid labels
── <any_subset_name>/
├── image1.png
└── image2.png
── <any_subset_name>annot/
├── image1.png
└── image2.png
── <any_subset_name>/
| ├── image1.png
| └── image2.png
── <any_subset_name>annot/
| ├── image1.png
| └── image2.png
└── <any_subset_name>.txt

# labelmap.txt
@@ -974,3 +975,72 @@ s1 - sequence
Uploaded file: a zip archive of the structure above

- supported annotations: Label `market-1501` with attributes (`query`, `person_id`, `camera_id`)

### [ICDAR13/15](https://rrc.cvc.uab.es/?ch=2)<a id="icdar" />

#### ICDAR13/15 Dumper

Downloaded file: a zip archive of the following structure:

```bash
# word recognition task
taskname.zip/
└── word_recognition/
    └── <any_subset_name>/
        ├── images
        |   ├── word1.png
        |   └── word2.png
        └── gt.txt
# text localization task
taskname.zip/
└── text_localization/
    └── <any_subset_name>/
        ├── images
        |   ├── img_1.png
        |   └── img_2.png
        ├── gt_img_1.txt
        └── gt_img_2.txt
# text segmentation task
taskname.zip/
└── text_segmentation/
    └── <any_subset_name>/
        ├── images
        |   ├── 1.png
        |   └── 2.png
        ├── 1_GT.bmp
        ├── 1_GT.txt
        ├── 2_GT.bmp
        └── 2_GT.txt
```

**Word recognition task**:

- supported annotations: Label `icdar` with attribute `caption`

**Text localization task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attribute `text`

**Text segmentation task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attributes `index`, `text`, `color`, `center`

#### ICDAR13/15 Loader

Uploaded file: a zip archive of the structure above

**Word recognition task**:

- supported annotations: Label `icdar` with attribute `caption`

**Text localization task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attribute `text`

**Text segmentation task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attributes `index`, `text`, `color`, `center`
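
As a rough illustration of the word recognition layout described above, the sketch below builds a minimal archive in memory using only the standard library. The subset name `train`, the empty placeholder image payloads, the helper names, and the `gt.txt` line format (`<image name>, "<transcription>"`) are assumptions made for illustration; none of them come from this PR.

```python
import io
import zipfile


def build_word_recognition_zip(words, subset="train"):
    """Return the bytes of a zip laid out like the word recognition tree.

    `words` maps an image file name to its transcription. Image payloads are
    empty placeholders; a real archive would contain actual image data.
    """
    root = f"word_recognition/{subset}"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        gt_lines = []
        for image_name, text in words.items():
            archive.writestr(f"{root}/images/{image_name}", b"")
            # Assumed gt.txt line format: word1.png, "text"
            gt_lines.append(f'{image_name}, "{text}"')
        archive.writestr(f"{root}/gt.txt", "\n".join(gt_lines) + "\n")
    return buffer.getvalue()


def list_members(zip_bytes):
    """Return the sorted member names of an in-memory zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return sorted(archive.namelist())
```

Listing the members of the result is a quick way to compare an archive against the expected tree before uploading it.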
131 changes: 131 additions & 0 deletions cvat/apps/dataset_manager/formats/icdar.py
@@ -0,0 +1,131 @@
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

import zipfile
from tempfile import TemporaryDirectory

from datumaro.components.dataset import Dataset
from datumaro.components.extractor import (AnnotationType, Caption, Label,
    LabelCategories, Transform)

from cvat.apps.dataset_manager.bindings import (CvatTaskDataExtractor,
    import_dm_annotations)
from cvat.apps.dataset_manager.util import make_zip_archive

from .registry import dm_env, exporter, importer


class AddLabelToAnns(Transform):
    def __init__(self, extractor, label):
        super().__init__(extractor)

        assert isinstance(label, str)
        self._categories = {}
        label_cat = self._extractor.categories().get(AnnotationType.label)
        if not label_cat:
            label_cat = LabelCategories()
        self._label = label_cat.add(label)
        self._categories[AnnotationType.label] = label_cat

    def categories(self):
        return self._categories

    def transform_item(self, item):
        annotations = item.annotations
        for ann in annotations:
            if ann.type in [AnnotationType.polygon,
                    AnnotationType.bbox, AnnotationType.mask]:
                ann.label = self._label
        return item.wrap(annotations=annotations)

class CaptionToLabel(Transform):
    def __init__(self, extractor, label):
        super().__init__(extractor)

        assert isinstance(label, str)
        self._categories = {}
        label_cat = self._extractor.categories().get(AnnotationType.label)
        if not label_cat:
            label_cat = LabelCategories()
        self._label = label_cat.add(label)
        self._categories[AnnotationType.label] = label_cat

    def categories(self):
        return self._categories

    def transform_item(self, item):
        annotations = item.annotations
        captions = [ann for ann in annotations
            if ann.type == AnnotationType.caption]
        for ann in captions:
            annotations.append(Label(self._label,
                attributes={'text': ann.caption}))
            annotations.remove(ann)
        return item.wrap(annotations=annotations)

class LabelToCaption(Transform):
    def transform_item(self, item):
        annotations = item.annotations
        anns = [p for p in annotations
            if 'text' in p.attributes]
        for ann in anns:
            annotations.append(Caption(ann.attributes['text']))
            annotations.remove(ann)
        return item.wrap(annotations=annotations)

@exporter(name='ICDAR Recognition', ext='ZIP', version='1.0')
def _export_recognition(dst_file, task_data, save_images=False):
    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
        task_data, include_images=save_images), env=dm_env)
    dataset.transform(LabelToCaption)
    with TemporaryDirectory() as temp_dir:
        dataset.export(temp_dir, 'icdar_word_recognition', save_images=save_images)
        make_zip_archive(temp_dir, dst_file)

@importer(name='ICDAR Recognition', ext='ZIP', version='1.0')
def _import(src_file, task_data):
    with TemporaryDirectory() as tmp_dir:
        zipfile.ZipFile(src_file).extractall(tmp_dir)
        dataset = Dataset.import_from(tmp_dir, 'icdar_word_recognition', env=dm_env)
        dataset.transform(CaptionToLabel, 'icdar')
        import_dm_annotations(dataset, task_data)


@exporter(name='ICDAR Localization', ext='ZIP', version='1.0')
def _export_localization(dst_file, task_data, save_images=False):
    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
        task_data, include_images=save_images), env=dm_env)
    with TemporaryDirectory() as temp_dir:
        dataset.export(temp_dir, 'icdar_text_localization', save_images=save_images)
        make_zip_archive(temp_dir, dst_file)

@importer(name='ICDAR Localization', ext='ZIP', version='1.0')
def _import(src_file, task_data):
    with TemporaryDirectory() as tmp_dir:
        zipfile.ZipFile(src_file).extractall(tmp_dir)

        dataset = Dataset.import_from(tmp_dir, 'icdar_text_localization', env=dm_env)
        dataset.transform(AddLabelToAnns, 'icdar')
        import_dm_annotations(dataset, task_data)


@exporter(name='ICDAR Segmentation', ext='ZIP', version='1.0')
def _export_segmentation(dst_file, task_data, save_images=False):
    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
        task_data, include_images=save_images), env=dm_env)
    with TemporaryDirectory() as temp_dir:
        dataset.transform('polygons_to_masks')
        dataset.transform('boxes_to_masks')
        dataset.transform('merge_instance_segments')
        dataset.export(temp_dir, 'icdar_text_segmentation', save_images=save_images)
        make_zip_archive(temp_dir, dst_file)

@importer(name='ICDAR Segmentation', ext='ZIP', version='1.0')
def _import(src_file, task_data):
    with TemporaryDirectory() as tmp_dir:
        zipfile.ZipFile(src_file).extractall(tmp_dir)
        dataset = Dataset.import_from(tmp_dir, 'icdar_text_segmentation', env=dm_env)
        dataset.transform(AddLabelToAnns, 'icdar')
        dataset.transform('masks_to_polygons')
        import_dm_annotations(dataset, task_data)
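
The caption/label conversion performed by `CaptionToLabel` and `LabelToCaption` can be followed more easily in isolation. The sketch below mirrors that logic with plain dataclasses instead of Datumaro types, so the `Caption` and `Label` classes and both helper functions here are hypothetical stand-ins, not the library's API. Unlike the transforms in this file, the sketch builds a new list and keeps the original order rather than mutating the item's annotations in place.

```python
from dataclasses import dataclass, field


@dataclass
class Caption:
    """Stand-in for a caption annotation (hypothetical, not Datumaro's class)."""
    caption: str


@dataclass
class Label:
    """Stand-in for a label annotation carrying free-form attributes."""
    label: int
    attributes: dict = field(default_factory=dict)


def caption_to_label(annotations, label_id):
    """Replace every Caption with a Label that stores the text as an attribute."""
    result = []
    for ann in annotations:
        if isinstance(ann, Caption):
            result.append(Label(label_id, attributes={"text": ann.caption}))
        else:
            result.append(ann)
    return result


def label_to_caption(annotations):
    """Replace every annotation that has a 'text' attribute with a Caption."""
    result = []
    for ann in annotations:
        attributes = getattr(ann, "attributes", {})
        if "text" in attributes:
            result.append(Caption(attributes["text"]))
        else:
            result.append(ann)
    return result
```

Applying `caption_to_label` and then `label_to_caption` returns the original captions, which is the property the import and export paths rely on.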
1 change: 1 addition & 0 deletions cvat/apps/dataset_manager/formats/registry.py
@@ -98,3 +98,4 @@ def make_exporter(name):
import cvat.apps.dataset_manager.formats.widerface
import cvat.apps.dataset_manager.formats.vggface2
import cvat.apps.dataset_manager.formats.market1501
import cvat.apps.dataset_manager.formats.icdar
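
Importing the module is what triggers registration, because the `@exporter` and `@importer` decorators run at import time. The body of `registry.py` is not shown in this diff, so the following is only a minimal sketch of how such a decorator-based registry could work. The dictionaries, the `_register` helper, and the `"<name> <version>"` key format are assumptions, with the key format inferred from names such as `'ICDAR Recognition 1.0'` in the tests. It also illustrates why several functions named `_import` can coexist: each is stored in the registry before the module-level name is rebound.

```python
EXPORT_FORMATS = {}
IMPORT_FORMATS = {}


def _register(registry, name, version, ext):
    """Build a decorator that stores a function under '<name> <version>'."""
    def decorator(func):
        display_name = f"{name} {version}"
        if display_name in registry:
            raise ValueError(f"Format '{display_name}' is already registered")
        registry[display_name] = {"handler": func, "ext": ext}
        return func
    return decorator


def exporter(name, version, ext):
    return _register(EXPORT_FORMATS, name, version, ext)


def importer(name, version, ext):
    return _register(IMPORT_FORMATS, name, version, ext)


@importer(name="ICDAR Recognition", ext="ZIP", version="1.0")
def _import(src_file, task_data):
    return "recognition"


@importer(name="ICDAR Localization", ext="ZIP", version="1.0")
def _import(src_file, task_data):  # rebinds the module-level name only
    return "localization"
```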
11 changes: 10 additions & 1 deletion cvat/apps/dataset_manager/tests/test_formats.py
@@ -285,6 +285,9 @@ def test_export_formats_query(self):
'WiderFace 1.0',
'VGGFace2 1.0',
'Market-1501 1.0',
'ICDAR Recognition 1.0',
'ICDAR Localization 1.0',
'ICDAR Segmentation 1.0',
})

def test_import_formats_query(self):
@@ -306,6 +309,9 @@ def test_import_formats_query(self):
'WiderFace 1.0',
'VGGFace2 1.0',
'Market-1501 1.0',
'ICDAR Recognition 1.0',
'ICDAR Localization 1.0',
'ICDAR Segmentation 1.0',
})

def test_exports(self):
@@ -319,7 +325,7 @@ def check(file_path):

format_name = f.DISPLAY_NAME
if format_name == "VGGFace2 1.0":
self.skipTest("Format does not support multiple shapes for one item")
self.skipTest("Format is disabled")

for save_images in { True, False }:
images = self._generate_task_images(3)
@@ -349,6 +355,9 @@ def test_empty_images_are_exported(self):
('WiderFace 1.0', 'wider_face'),
('VGGFace2 1.0', 'vgg_face2'),
('Market-1501 1.0', 'market1501'),
('ICDAR Recognition 1.0', 'icdar_word_recognition'),
('ICDAR Localization 1.0', 'icdar_text_localization'),
('ICDAR Segmentation 1.0', 'icdar_text_segmentation'),
]:
with self.subTest(format=format_name):
if not dm.formats.registry.EXPORT_FORMATS[format_name].ENABLED: