Document OCR

About the Project

This project is part of the SJTU ICE4309 - Image Processing & Content Analysis course. We implemented an OCR framework that converts images of in-the-wild documents into machine-readable text.

Features

The model architecture of Document OCR is shown below:

Input images are preprocessed with edge detection, contour detection, perspective transformation, and binarization before text detection.
The text detection module uses the DBNet model with MobileNetV3 as the backbone network.
The text recognition module uses the CRNN model with MobileNetV3 as the backbone network.
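As a rough illustration of the idea DBNet is named for, the sketch below computes the differentiable binarization step from the DBNet paper, which turns a probability map P and a learned threshold map T into an approximate binary map via 1 / (1 + exp(-k (P - T))), with k = 50 in the paper. The function names are hypothetical, and whether this project applies the step at inference time (rather than a plain fixed threshold) is an assumption.

```python
import math


def differentiable_binarization(prob, thresh, k=50.0):
    """Approximate binary value for one pixel: 1 / (1 + exp(-k * (prob - thresh))).

    prob and thresh are expected in [0, 1]; k controls how steep the step is
    (k = 50 is the value reported in the DBNet paper).
    """
    return 1.0 / (1.0 + math.exp(-k * (prob - thresh)))


def binarize_map(prob_map, thresh_map, k=50.0):
    """Apply the step pixel by pixel to two equally sized 2D lists."""
    return [
        [differentiable_binarization(p, t, k) for p, t in zip(prob_row, thresh_row)]
        for prob_row, thresh_row in zip(prob_map, thresh_map)
    ]
```

Pixels whose probability is well above the local threshold come out close to 1, and pixels well below it come out close to 0, which is what makes the step behave like a soft version of thresholding.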

Getting Started

To get started, follow the steps below to set up your environment and install the necessary dependencies.

Create and activate new conda environment

conda create -n ocr python=3.9
conda activate ocr

Install pip requirements

pip install -r requirements.txt

Usage

Run the script

python run.py --img <IMG_DIR> --preprocess

Replace <IMG_DIR> with the path to a single image. Pass --preprocess to run the preprocessing steps on the input image.

Example

python run.py --img input_img/receipt.jpg --preprocess
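For readers who want to see how a command line like this is commonly wired up, here is a minimal sketch using argparse. Only the flag names --img and --preprocess come from the usage above; the function names, help strings, and the choice to make --img required are assumptions, and the actual run.py may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser that accepts the two flags shown in the usage section."""
    parser = argparse.ArgumentParser(description="Document OCR (illustrative CLI sketch)")
    # Assumption: --img is mandatory and takes the path to one image file.
    parser.add_argument("--img", required=True, help="path to a single input image")
    # store_true makes --preprocess a boolean switch that defaults to False.
    parser.add_argument("--preprocess", action="store_true",
                        help="preprocess the input image before running OCR")
    return parser


def parse_args(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)
```

With this layout, omitting --preprocess leaves the flag False, so the same script can run on images that are already cropped and binarized.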

Demonstrations

Edge Detection

(Figure panels: Input Image, Grayscale Conversion, Gaussian Blur, Closing, Canny)
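Of the stages named in this demonstration, morphological closing is probably the least self-explanatory, so here is a small pure-Python sketch of it. Closing is a dilation followed by an erosion, which removes small dark details (such as thin gaps or text strokes) from brighter regions. The 3x3 window, the border handling, and the function names are assumptions for illustration; the project most likely calls an image-processing library rather than code like this.

```python
def _window(img, y, x):
    """Yield the pixels in the 3x3 window around (y, x), clipped at the borders."""
    height, width = len(img), len(img[0])
    for ny in range(max(0, y - 1), min(height, y + 2)):
        for nx in range(max(0, x - 1), min(width, x + 2)):
            yield img[ny][nx]


def dilate(img):
    """Replace each pixel with the brightest value in its 3x3 window."""
    return [[max(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def erode(img):
    """Replace each pixel with the darkest value in its 3x3 window."""
    return [[min(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def closing(img):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(img))
```

Applied to a bright horizontal line with a one-pixel dark gap, closing fills the gap while leaving the surrounding dark rows unchanged.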

Contour Detection

(Figure panels: LSD, Horizontal Line Segments, Vertical Line Segments, Final Contour)
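The split into horizontal and vertical line segments can be sketched as an angle test on each detected segment. This is a minimal illustration, assuming segments arrive as (x1, y1, x2, y2) tuples from a line segment detector; the 20 degree tolerance and the function names are hypothetical, not values taken from the project.

```python
import math


def segment_angle(segment):
    """Return the segment's orientation in degrees, folded into [0, 180)."""
    x1, y1, x2, y2 = segment
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def split_segments(segments, tol_deg=20.0):
    """Group segments into near-horizontal and near-vertical lists.

    Segments whose orientation is not within tol_deg of either axis are dropped.
    """
    horizontal, vertical = [], []
    for segment in segments:
        angle = segment_angle(segment)
        if angle <= tol_deg or angle >= 180.0 - tol_deg:
            horizontal.append(segment)
        elif abs(angle - 90.0) <= tol_deg:
            vertical.append(segment)
    return horizontal, vertical
```

Folding the angle into [0, 180) means a segment is classified the same way regardless of which endpoint is listed first.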

Perspective Transformation & Binarization

(Figure panels: Perspective Transformation, Binarization)
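To make these two steps concrete, the sketch below shows the core arithmetic of each: mapping a point through a 3x3 homography (including the projective divide), and thresholding a grayscale image. How the project estimates the homography from the detected contour is not shown in this README, and the fixed global threshold of 128 is an assumption; the project may well use an adaptive or Otsu-style method instead. All names are hypothetical.

```python
def warp_point(H, x, y):
    """Map (x, y) through the 3x3 homography H (a list of three rows)."""
    xs = H[0][0] * x + H[0][1] * y + H[0][2]
    ys = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Dividing by w is what distinguishes a perspective map from an affine one.
    return xs / w, ys / w


def binarize(gray, threshold=128):
    """Global threshold: pixels at or above the threshold become 255, others 0."""
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in gray]
```

When the last row of H is (0, 0, 1), w is always 1 and the mapping reduces to an affine transform; a non-trivial last row is what lets the warp undo the foreshortening of a document photographed at an angle.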

Text Detection & Recognition

(Figure panels: Text Detection, Text Recognition)
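On the recognition side, CRNN models as originally published are trained with CTC, so their per-frame outputs have to be collapsed into a string. The sketch below shows greedy CTC decoding: take the best class per frame, merge consecutive repeats, then drop blanks. It assumes the blank symbol sits at index 0 of the vocabulary; that layout, the function name, and the assumption that this project decodes this way are all illustrative.

```python
BLANK = 0  # Assumption: index 0 of the vocabulary is the CTC blank symbol.


def ctc_greedy_decode(frame_scores, vocab):
    """Decode per-frame class scores into text with greedy CTC decoding.

    frame_scores is a list of score lists, one per time step, each with
    len(vocab) entries. vocab[BLANK] is a placeholder and is never emitted.
    """
    # Pick the highest-scoring class index for every frame.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    chars = []
    previous = None
    for index in best:
        # Emit a character only when it differs from the previous frame
        # and is not the blank symbol.
        if index != previous and index != BLANK:
            chars.append(vocab[index])
        previous = index
    return "".join(chars)
```

The blank is what allows doubled letters: a frame sequence such as a, blank, a decodes to two characters, while a, a decodes to one.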
