How to improve layout analysis model accuracy #10

Open
shiyutang opened this issue Sep 7, 2023 · 3 comments

@shiyutang
Owner

I trained and tested on the CDLA dataset, but the results are not very good. Can anyone help with this?

We provide AceIssueSolver to help answer issues. Do you want it to answer? (Please write yes/no): yes

Please try not to include images in the issue.

@aceissuesolver

aceissuesolver bot commented Sep 7, 2023

Issue Response

  • Introduction: Thanks for opening this issue. I am the AceIssueSolver, and I will help you with your questions!
  • 🎯 First round response: The answer to your question is:

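Based on the information provided, you can try the following to improve the accuracy of the layout analysis model:

  1. Add more data augmentation methods or raise the augmentation probability: using more kinds of augmentation, or applying them with a higher probability, increases the diversity of the training set and reduces overfitting.

  2. Increase the L2 decay value: a larger L2 decay reduces model complexity and helps prevent overfitting.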
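
To make the two suggestions above concrete, here is a minimal, framework-independent sketch. It is not taken from PaddleDetection; the helper names `random_apply` and `sgd_step_l2` are hypothetical, and the L2 penalty is assumed to take the common form 0.5 * coeff * sum(w^2).

```python
import random


def random_apply(transform, sample, prob, rng=random):
    """Apply `transform` to `sample` with probability `prob`, else return it unchanged.

    Raising `prob` means more training samples are augmented per epoch.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be in [0, 1]")
    return transform(sample) if rng.random() < prob else sample


def sgd_step_l2(weights, grads, lr, l2_coeff):
    """One plain SGD step with an L2 penalty of 0.5 * l2_coeff * sum(w^2).

    The penalty adds l2_coeff * w to each gradient, so a larger coefficient
    pulls every weight toward zero more strongly.
    """
    return [w - lr * (g + l2_coeff * w) for w, g in zip(weights, grads)]


if __name__ == "__main__":
    # Hypothetical augmentation: reverse a list standing in for an image row.
    print(random_apply(lambda s: s[::-1], [1, 2, 3], prob=0.5))
    # With zero data gradient, only the decay term moves the weight.
    print(sgd_step_l2([1.0], [0.0], lr=0.1, l2_coeff=0.5))
```

In a real training run both knobs would normally be set in the training configuration rather than written by hand; the sketch only shows what each one changes.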

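In addition, you can refer to PP-OCRv3, an open-source ultra-lightweight OCR system, for developing a license plate recognition system. The model reaches 99% detection accuracy and 94% recognition accuracy on the CCPD dataset, with a model size of 12.8M. Quantization-aware training can further compress the model to 5.8M and speed up inference.

If the recognition model predicts one extra special character, try removing that character; this can improve accuracy significantly.

Also look at the visualized results: the general model tends to detect a whole line of text as one box, while the lightweight model may split a line into two segments, so more detected boxes does not mean better results.

In addition, keep the classes as balanced as possible. If some classes have few samples, supplement them with synthetic data. Experiments show that characters that appear rarely in the training set are recognized poorly, and more training iterations cannot make up for a lack of samples.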
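
One way to check class balance before training is to count annotated instances per category. The sketch below assumes the annotations have already been converted to a COCO-style dictionary with `categories` and `annotations` keys; that format, the helper names, and the 5% threshold are assumptions made for illustration, not something stated in this thread.

```python
from collections import Counter


def category_counts(coco):
    """Count annotated instances per category name in a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    # Start every category at zero so classes with no annotations still show up.
    counts = Counter({name: 0 for name in id_to_name.values()})
    for ann in coco["annotations"]:
        counts[id_to_name[ann["category_id"]]] += 1
    return counts


def rare_categories(counts, min_fraction=0.05):
    """Return the category names whose share of all instances is below `min_fraction`."""
    total = sum(counts.values())
    if total == 0:
        return sorted(counts)
    return sorted(name for name, c in counts.items() if c / total < min_fraction)


if __name__ == "__main__":
    # Tiny hand-made example; a real file would be loaded with json.load().
    toy = {
        "categories": [{"id": 1, "name": "Text"}, {"id": 2, "name": "Table"}],
        "annotations": [{"category_id": 1}] * 9 + [{"category_id": 2}],
    }
    counts = category_counts(toy)
    print(dict(counts), rare_categories(counts, min_fraction=0.2))
```

Categories flagged this way are the candidates for extra or synthetic samples.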

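If the test images are in Chinese, you can download the model pre-trained on the Chinese CDLA dataset, which recognizes 10 types of document regions, including table, figure, figure caption, table caption, header, footer, reference, and equation. The training and inference models are available for download in the layout analysis models section.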

How to use

To invoke the AceIssueSolver, add a comment using one of the following commands:
@Ace-issue-solver <QUESTION>: Pose more questions on AceIssueSolver.

@tangtianyi1998

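If the model has to handle both Chinese and English documents, would adding English annotations to the CDLA training data improve results?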

@codeman008

My own training and testing results are decent. I wanted to compare them against the provided CDLA pre-trained model, but I could not get it to run:

W0222 11:42:21.103763 1932985 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 11.8
W0222 11:42:21.105564 1932985 gpu_resources.cc:164] device: 0, cuDNN Version: 8.2.
Traceback (most recent call last):
  File "tools/infer.py", line 237, in <module>
    main()
  File "tools/infer.py", line 233, in main
    run(FLAGS, cfg)
  File "tools/infer.py", line 165, in run
    trainer.load_weights(cfg.weights)
  File "/home/user/zw-0219/PaddleDetection/ppdet/engine/trainer.py", line 438, in load_weights
    load_pretrain_weight(self.model, weights, ARSL_eval)
  File "/home/user/zw-0219/PaddleDetection/ppdet/utils/checkpoint.py", line 253, in load_pretrain_weight
    param_state_dict = match_state_dict(model_dict, param_state_dict)
  File "/home/user/zw-0219/PaddleDetection/ppdet/utils/checkpoint.py", line 143, in match_state_dict
    weight_state_dict = sorted(weight_state_dict.keys())
AttributeError: 'Tensor' object has no attribute 'keys'
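
The traceback shows `match_state_dict` calling `.keys()` on an object that is a Tensor, so the loaded weights did not deserialize to a dictionary of parameters. One possibility, which is an assumption and not confirmed in this thread, is that the weights path points at a file other than a training checkpoint. A small check like the following can confirm what was actually loaded; `summarize_checkpoint` is a hypothetical helper, not part of PaddleDetection.

```python
from collections.abc import Mapping


def summarize_checkpoint(obj, max_keys=5):
    """Return the first few parameter names if `obj` looks like a state dict.

    Raises TypeError with a readable message when `obj` is not dict-like,
    which is the situation behind the AttributeError above.
    """
    if not isinstance(obj, Mapping):
        raise TypeError(
            f"expected a dict-like state dict, got {type(obj).__name__}; "
            "check that the weights path points at a training checkpoint"
        )
    return sorted(obj.keys())[:max_keys]


# Hypothetical usage with Paddle installed (the path is a placeholder):
#   import paddle
#   state = paddle.load("path/to/checkpoint.pdparams")
#   print(summarize_checkpoint(state))
```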
