easyOCR evaluation #8

Open
boltonn opened this issue Dec 12, 2024 · 2 comments

@boltonn

boltonn commented Dec 12, 2024

This is amazing work! Any plans to include easyOCR in the benchmark?

https://github.com/JaidedAI/EasyOCR

A secondary ask: do you have a prioritized list of other languages?

@ouyanglinke
Collaborator

Thank you for your attention. We will gradually add more models to the evaluation leaderboard, and the easyOCR results will be updated in the OCR module.
Currently, we don't have a schedule for expanding to other language data types, as this requires a high level of expertise from annotators. If there is further support for this work, we will update our TODO list accordingly.

@ouyanglinke
Collaborator

Hi, we evaluated EasyOCR on the OCR module of OmniDocBench. Here are the results:

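| Model Type | Model | Language: EN | Language: ZH | Language: Mixed | Text background: White | Text background: Single | Text background: Multi | Text Rotate: Normal | Text Rotate: Rotate90 | Text Rotate: Rotate270 | Text Rotate: Horizontal |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Expert Vision Models | EasyOCR | 0.260 | 0.398 | 0.445 | 0.366 | 0.287 | 0.388 | 0.360 | 0.970 | 0.997 | 0.926 |
| | PaddleOCR | 0.071 | 0.055 | 0.118 | 0.060 | 0.038 | 0.0848 | 0.060 | 0.015 | 0.285 | 0.021 |
| | Tesseract OCR | 0.179 | 0.553 | 0.553 | 0.453 | 0.463 | 0.394 | 0.448 | 0.369 | 0.979 | 0.982 |
| | Surya | 0.057 | 0.123 | 0.164 | 0.093 | 0.186 | 0.235 | 0.104 | 0.634 | 0.767 | 0.255 |
| | GOT-OCR | 0.041 | 0.112 | 0.135 | 0.092 | 0.052 | 0.155 | 0.091 | 0.562 | 0.966 | 0.097 |
| | Mathpix | 0.033 | 0.240 | 0.261 | 0.185 | 0.121 | 0.166 | 0.180 | 0.038 | 0.185 | 0.638 |
| Vision Language Models | Qwen2-VL-72B | 0.072 | 0.274 | 0.286 | 0.234 | 0.155 | 0.148 | 0.223 | 0.273 | 0.721 | 0.067 |
| | InternVL2-Llama3-76B | 0.074 | 0.155 | 0.242 | 0.113 | 0.352 | 0.269 | 0.132 | 0.610 | 0.907 | 0.595 |
| | GPT4o | 0.020 | 0.224 | 0.125 | 0.167 | 0.140 | 0.220 | 0.168 | 0.115 | 0.718 | 0.132 |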

This is our inference code for EasyOCR. It follows the inference code used for the other models, including adding a 50-pixel white border to each image.

import json
import os

import cv2
import easyocr
import numpy as np
from PIL import Image, ImageOps


def model_infer(engine, img, lan, img_name):
    # Pad the cropped text block with a white border before recognition.
    # `lan` and `img_name` are not used by this function.
    img_add_border = add_white_border(img)
    img_ndarray = np.array(img_add_border)
    # NOTE: the array is in RGB order while cv2.imwrite assumes BGR,
    # so the temporary file is saved with red and blue channels swapped.
    tmp_img_path = 'tmp_easyocr.jpg'
    cv2.imwrite(tmp_img_path, img_ndarray)
    result = engine.readtext(tmp_img_path)

    # Each result is (bbox, text, confidence); concatenate the text fields.
    return ''.join(res[1] for res in result)

def add_white_border(img: Image):
    border_width = 50
    border_color = (255, 255, 255) 
    img_with_border = ImageOps.expand(img, border=border_width, fill=border_color)
    return img_with_border


def poly2bbox(poly):
    L = poly[0]
    U = poly[1]
    R = poly[2]
    D = poly[5]
    L, R = min(L, R), max(L, R)
    U, D = min(U, D), max(U, D)
    bbox = [L, U, R, D]
    return bbox

def main():
    engine = easyocr.Reader(['ch_sim','en'])

    with open('./OmniDocBench/OmniDocBench.json', 'r') as f:
        samples = json.load(f)
    for sample in samples:
        img_name = os.path.basename(sample['page_info']['image_path'])
        img_path = os.path.join('./OmniDocBench/images', img_name)
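        # Check that the file exists before opening it; otherwise
        # Image.open raises before the check is ever reached.
        if not os.path.exists(img_path):
            print('Not found: ', img_name)
            continue
        img = Image.open(img_path)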
        for i, anno in enumerate(sample['layout_dets']):
            if not anno.get('text'):
                continue
            # print(anno)
            lan = anno['attribute'].get('text_language', 'mixed')
            bbox = poly2bbox(anno['poly'])
            image = img.crop(bbox).convert('RGB') # crop text block
            outputs = model_infer(engine, image, lan, img_name)

            anno['pred'] = outputs
        with open('./OmniDocBench/result/OmniDocBench_easyocr_text_ocr.jsonl', 'a', encoding='utf-8') as f:
            json.dump(sample, f, ensure_ascii=False)
            f.write('\n')

def save_json():
    with open('./OmniDocBench/result/OmniDocBench_easyocr_text_ocr.jsonl', 'r') as f:
        lines = f.readlines()
    samples = [json.loads(line) for line in lines]
    with open('./OmniDocBench/result/OmniDocBench_easyocr_text_ocr.json', 'w', encoding='utf-8') as f:
        json.dump(samples, f, indent=4, ensure_ascii=False)

if __name__ == '__main__':
    main()
    save_json()
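
One detail worth checking in the script above: `poly2bbox` reads only four of the eight polygon values, which appears to assume that `poly` is a flat list `[x1, y1, x2, y2, x3, y3, x4, y4]` describing an axis-aligned rectangle. The following is a minimal sketch, under that same layout assumption, of a variant that takes the min/max over all four corners. The name `poly2bbox_minmax` is hypothetical and is not part of the evaluation code.

```python
def poly2bbox_minmax(poly):
    """Return [left, top, right, bottom] enclosing all polygon corners.

    Assumes `poly` is a flat list of alternating x and y coordinates.
    """
    xs = poly[0::2]  # x1, x2, x3, x4
    ys = poly[1::2]  # y1, y2, y3, y4
    return [min(xs), min(ys), max(xs), max(ys)]


# Axis-aligned rectangle: same result as the indexing used in poly2bbox.
axis_aligned = poly2bbox_minmax([5, 5, 15, 5, 15, 9, 5, 9])

# Diamond-shaped polygon: the box now covers the leftmost corner as well.
diamond = poly2bbox_minmax([10, 0, 20, 10, 10, 20, 0, 10])
```

For axis-aligned annotations the two versions agree, so this only matters if some polygons in the dataset are rotated or skewed.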

Please let us know if you see any issues in the inference code.
If you have no questions about the results, we will update the EasyOCR model results on the evaluation leaderboard soon.
