Skip to content

Latest commit

 

History

History
5 lines (4 loc) · 424 Bytes

README.md

File metadata and controls

5 lines (4 loc) · 424 Bytes

OCR-Transformers

A deep learning project that combines Vision Transformers (ViT) and GPT-2 for image-to-text generation, specifically designed for OCR (Optical Character Recognition) tasks. This project uses a vision-language model architecture to generate textual descriptions from images.

result