DocTr++ layout rectification network: paper reproduction #10379
Comments
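The training set is custom-made, so we would have to build it ourselves.
Claiming this task. It should take about one month to complete.
The dataset construction has been explained further in the issue. If you have any questions, we can keep discussing them here.
I have written up an interpretation of the paper, which may be useful as a reference.
I will write up the training part when I have time.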
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
Hello, how is this progressing?
@zhuxiaobin You can take a look at this PR.
Hi, how is this progressing?
@Li-Yidong You can take a look at this PR.
Thanks for sharing!
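Background
Following the requirements collection in #10334 and the discussion at the weekly technical seminar in #10223, we selected the DocTr++ layout rectification task. Layout rectification matters in scenarios such as document comparison, keyword extraction, and contract tampering verification. Completing this task would significantly improve the granularity of OCR results and would apply to many use cases.
Through quantitative experiments and qualitative comparisons, the authors verified the performance advantages and generalization ability of DocTr++. They report new best results on several existing benchmarks as well as on the benchmark they propose, which makes it the strongest document rectification approach currently available.
No pretrained weights or training code are available yet, so the model has to be retrained following the description in the paper.
Solution steps
Dataset: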