A PyTorch implementation of the paper *ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation*. Parts of the code have been adapted from the official implementation. The purpose of this repository is to provide a clear and simple way to understand and replicate the results of the paper.
- PyTorch v1.6.0 - for all the deep learning components
- PyTorch-FID - for FID score calculation
- OpenCV 3 - for image processing (not required for generating new images)
A complete `requirements.txt` file will be added soon.
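Until then, a minimal requirements file along the lines of the dependency list above might look like the sketch below; apart from the PyTorch version, the package pins are assumptions and may need adjusting to your environment:

```
# sketch only; pin the remaining versions to match your environment
torch==1.6.0
pytorch-fid
opencv-python   # the repo lists OpenCV 3, so pin a 3.x build if needed
tensorboard     # assumed, for the TensorBoard logging mentioned below
```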
- Download the IAM dataset or the RIMES database and keep them in the `data/` directory as shown below:

  ```
  data
  ├── IAM
  │   ├── ascii
  │   │   └── words.txt
  │   ├── words
  │   │   ├── a01
  │   │   ├── a02
  │   │   └── ...
  │   └── original_partition
  │       └── te.lst, tr.lst, va1.lst, va2.lst
  ├── RIMES
  │   ├── ground_truth_training_icdar2011.txt
  │   ├── training
  │   │   ├── lot_1
  │   │   ├── lot_2
  │   │   └── ...
  │   ├── ground_truth_validation_icdar2011.txt
  │   └── validation
  │       ├── lot_14
  │       ├── lot_15
  │       ├── lot_16
  │       └── ...
  └── prepare_data.py
  ```
- Modify the `config.py` file to change the dataset, model architecture, image height, etc. The default parameters are the ones used in the paper (a rough, illustrative sketch of such settings appears after this list).
- From the `data` directory, run:

  ```bash
  python prepare_data.py
  ```

  This will process the ground-truth labels and images, and create a pickle file to be used for training.
- Start model training by running the following command from the main directory:

  ```bash
  python train.py
  ```

  A sample generated image will be saved in the `output` directory after every epoch. TensorBoard logging has also been enabled; see the launch command after this list.
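For reference, the configuration step above refers to settings of roughly the following shape. All names and values here are illustrative assumptions, not the actual contents of `config.py` in this repository:

```python
# Hypothetical sketch of config-style options; the real config.py in this
# repository may use different names, values, and structure.
dataset = 'IAM'            # 'IAM' or 'RIMES'
img_height = 32            # image height fed to the model (illustrative)
partition = 'tr'           # which split to train on (illustrative)
data_file = 'data/IAM/processed_data.pkl'   # hypothetical output of prepare_data.py
lexicon_file = 'data/english_lexicon.txt'   # hypothetical lexicon path
```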
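To monitor the training curves, point TensorBoard at the log directory used by the training script. `runs` is the default location of PyTorch's `SummaryWriter` and is an assumption here, so adjust it if `train.py` logs elsewhere:

```bash
tensorboard --logdir runs
```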
The easiest way to generate images is to use this demo; it has options for generating random text, specific text, random styles, consistent style, etc. Another option is to download these files:
- Pretrained models for English (IAM) or French (RIMES).
- Character mapping for English (IAM) or French (RIMES).
- Lexicon files for English or French.
After downloading the required files, follow the steps below:
- Change the `dataset` and `lexicon_file` paths in `config.py`.
- Run:

  ```bash
  python generate_images.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file'
  ```

  This will generate random images. You can also check the arguments in `generate_images.py` to see more options (see the example after this list).
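Assuming the script parses the `-c`/`-m` flags above with `argparse` (an assumption about this repository, not something stated in it), the full list of options can also be printed from the command line:

```bash
python generate_images.py --help
```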
Create the preprocessed data file as described in steps 1-3 of "Steps for training the ScrabbleGAN model from scratch". Also, either download the model checkpoints for English (IAM) or French (RIMES), or train your own model and save the checkpoints. To check the FID score, run:

```bash
python calculate_metrics.py -c 'path_to_checkpoint_file'
```
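The metrics script presumably builds on the PyTorch-FID dependency listed above. If you just want an FID number between two folders of images that are already on disk (for example, real word crops versus saved generated samples), the `pytorch-fid` package also ships a standalone CLI; the folder paths below are placeholders:

```bash
python -m pytorch_fid path/to/real_images path/to/generated_images
```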
One of the motivations in the paper was to boost HTR performance using synthetic data generated by ScrabbleGAN. The code for HTR training is not included in this repository; to stay consistent with the authors' approach, this code is used for HTR training instead. You can follow the steps below for HTR training:
- Create your own models or download all the files listed in "Steps for generating new images". Also, create the preprocessed data file as described in steps 1-3 of "Steps for training the ScrabbleGAN model from scratch".
- If required, change `dataset`, `partition`, `data_file`, and `lexicon_file` in `config.py`.
- To create the LMDB data files required for HTR training, run:

  ```bash
  python create_lmdb_dataset.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file'
  ```

  to create an LMDB dataset without any synthetic images, or

  ```bash
  python create_lmdb_dataset.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file' -n 100000
  ```

  to add generated images to the original dataset. A sketch for inspecting the resulting LMDB files follows this list.
- Train the HTR model as described here
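If you want to sanity-check the LMDB files created above, a sketch like the following can help. The key layout (`num-samples`, zero-padded `label-%09d` keys) follows the common deep-text-recognition-benchmark convention and is only an assumption here; check `create_lmdb_dataset.py` for the actual format. `path_to_lmdb_output` is a placeholder.

```python
import lmdb

# Sketch: count samples and read the first label from a generated LMDB dataset.
# The key names are assumptions based on the common
# deep-text-recognition-benchmark layout; adapt as needed.
env = lmdb.open('path_to_lmdb_output', readonly=True, lock=False)
with env.begin() as txn:
    num_samples = int(txn.get('num-samples'.encode()))
    print(f'{num_samples} samples in the LMDB dataset')
    label = txn.get('label-000000001'.encode())
    print('first label:', label.decode() if label else None)
```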