Merge remote-tracking branch 'origin/dygraph' into dy1
Evezerest committed Aug 25, 2022
2 parents d5fddbe + 4eafe00 commit aa2e283
Showing 10 changed files with 46 additions and 42 deletions.
11 changes: 6 additions & 5 deletions doc/doc_ch/algorithm_overview.md
@@ -24,7 +24,7 @@ PaddleOCR将**持续新增**支持OCR领域前沿算法与模型,**欢迎广
### 1.1 Text Detection Algorithms

Supported text detection algorithms (click a link for the tutorial):
- [x] [DB](./algorithm_det_db.md)
- [x] [DB and DB++](./algorithm_det_db.md)
- [x] [EAST](./algorithm_det_east.md)
- [x] [SAST](./algorithm_det_sast.md)
- [x] [PSENet](./algorithm_det_psenet.md)
@@ -41,6 +41,7 @@ PaddleOCR将**持续新增**支持OCR领域前沿算法与模型,**欢迎广
|SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
|PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
|PSE|MobileNetV3|82.20%|70.48%|75.89%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
|DB++|ResNet50|90.89%|82.66%|86.58%|[model pretrained on synthetic data](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|

On the public Total-Text text detection dataset, the results are as follows:

@@ -129,10 +130,10 @@ PaddleOCR将**持续新增**支持OCR领域前沿算法与模型,**欢迎广

Supported key information extraction algorithms (click a link for the tutorial):

- [x] [VI-LayoutXLM](./algorithm_kie_vi_laoutxlm.md)
- [x] [LayoutLM](./algorithm_kie_laoutxlm.md)
- [x] [LayoutLMv2](./algorithm_kie_laoutxlm.md)
- [x] [LayoutXLM](./algorithm_kie_laoutxlm.md)
- [x] [VI-LayoutXLM](./algorithm_kie_vi_layoutxlm.md)
- [x] [LayoutLM](./algorithm_kie_layoutxlm.md)
- [x] [LayoutLMv2](./algorithm_kie_layoutxlm.md)
- [x] [LayoutXLM](./algorithm_kie_layoutxlm.md)
- [x] [SDMGR](././algorithm_kie_sdmgr.md)

On the public wildreceipt receipt dataset, the reproduced results are as follows:
22 changes: 20 additions & 2 deletions doc/doc_en/algorithm_det_db_en.md
@@ -1,4 +1,4 @@
# DB
# DB && DB++

- [1. Introduction](#1)
- [2. Environment](#2)
@@ -21,13 +21,23 @@ Paper:
> Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang
> AAAI, 2020
> [Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion](https://arxiv.org/abs/2202.10304)
> Liao, Minghui and Zou, Zhisheng and Wan, Zhaoyi and Yao, Cong and Bai, Xiang
> TPAMI, 2022
On the ICDAR2015 dataset, the text detection result is as follows:

|Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
| --- | --- | --- | --- | --- | --- | --- |
|DB|ResNet50_vd|[configs/det/det_r50_vd_db.yml](../../configs/det/det_r50_vd_db.yml)|86.41%|78.72%|82.38%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)|
|DB|MobileNetV3|[configs/det/det_mv3_db.yml](../../configs/det/det_mv3_db.yml)|77.29%|73.08%|75.12%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar)|
|DB++|ResNet50|[configs/det/det_r50_db++_ic15.yml](../../configs/det/det_r50_db++_ic15.yml)|90.89%|82.66%|86.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|

On the TD_TR dataset, the text detection result is as follows:

|Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
| --- | --- | --- | --- | --- | --- | --- |
|DB++|ResNet50|[configs/det/det_r50_db++_td_tr.yml](../../configs/det/det_r50_db++_td_tr.yml)|92.92%|86.48%|89.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_td_tr_train.tar)|
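
The Hmean column in these tables is the harmonic mean of Precision and Recall. A minimal sketch, assuming that standard definition (the helper name `hmean` is illustrative and not part of PaddleOCR), re-derives the Hmean of the two DB++ rows added in this commit from their reported Precision and Recall:

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported Hmean) copied from the DB++ rows in the tables above.
rows = {
    "DB++ / ICDAR2015": (90.89, 82.66, 86.58),
    "DB++ / TD_TR": (92.92, 86.48, 89.58),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: computed {hmean(p, r):.2f}%, reported {reported:.2f}%")
```

Because the table values are already rounded to two decimals, recomputing Hmean from them can differ from a reported value in the last digit for other rows.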

<a name="2"></a>
## 2. Environment
@@ -96,4 +106,12 @@ More deployment schemes supported for DB:
pages={11474--11481},
year={2020}
}
```
@article{liao2022real,
title={Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion},
author={Liao, Minghui and Zou, Zhisheng and Wan, Zhaoyi and Yao, Cong and Bai, Xiang},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2022},
publisher={IEEE}
}
```
11 changes: 6 additions & 5 deletions doc/doc_en/algorithm_overview_en.md
@@ -22,7 +22,7 @@ Developers are welcome to contribute more algorithms! Please refer to [add new a
### 1.1 Text Detection Algorithms

Supported text detection algorithms (Click the link to get the tutorial):
- [x] [DB](./algorithm_det_db_en.md)
- [x] [DB && DB++](./algorithm_det_db_en.md)
- [x] [EAST](./algorithm_det_east_en.md)
- [x] [SAST](./algorithm_det_sast_en.md)
- [x] [PSENet](./algorithm_det_psenet_en.md)
@@ -39,6 +39,7 @@ On the ICDAR2015 dataset, the text detection result is as follows:
|SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
|PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
|PSE|MobileNetV3|82.20%|70.48%|75.89%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
|DB++|ResNet50|90.89%|82.66%|86.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|

On Total-Text dataset, the text detection result is as follows:

@@ -127,10 +128,10 @@ On the PubTabNet dataset, the algorithm result is as follows:

Supported KIE algorithms (Click the link to get the tutorial):

- [x] [VI-LayoutXLM](./algorithm_kie_vi_laoutxlm_en.md)
- [x] [LayoutLM](./algorithm_kie_laoutxlm_en.md)
- [x] [LayoutLMv2](./algorithm_kie_laoutxlm_en.md)
- [x] [LayoutXLM](./algorithm_kie_laoutxlm_en.md)
- [x] [VI-LayoutXLM](./algorithm_kie_vi_layoutxlm_en.md)
- [x] [LayoutLM](./algorithm_kie_layoutxlm_en.md)
- [x] [LayoutLMv2](./algorithm_kie_layoutxlm_en.md)
- [x] [LayoutXLM](./algorithm_kie_layoutxlm_en.md)
- [x] [SDMGR](./algorithm_kie_sdmgr_en.md)

On wildreceipt dataset, the algorithm result is as follows:
9 changes: 3 additions & 6 deletions ppocr/postprocess/rec_postprocess.py
@@ -24,7 +24,7 @@ class BaseRecLabelDecode(object):
def __init__(self, character_dict_path=None, use_space_char=False):
self.beg_str = "sos"
self.end_str = "eos"

self.reverse = False
self.character_str = []
if character_dict_path is None:
self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
@@ -38,18 +38,15 @@ def __init__(self, character_dict_path=None, use_space_char=False):
if use_space_char:
self.character_str.append(" ")
dict_character = list(self.character_str)
if 'arabic' in character_dict_path:
self.reverse = True

dict_character = self.add_special_char(dict_character)
self.dict = {}
for i, char in enumerate(dict_character):
self.dict[char] = i
self.character = dict_character

if 'arabic' in character_dict_path:
self.reverse = True
else:
self.reverse = False

def pred_reverse(self, pred):
pred_re = []
c_current = ''
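
This hunk sets `self.reverse = False` at the top of `__init__` and moves the `'arabic'` check next to the dictionary-loading code. One likely motivation is that `'arabic' in character_dict_path` raises a `TypeError` when `character_dict_path` is `None`. The sketch below illustrates that pattern under stated assumptions: `RecLabelDecodeSketch` and `old_style_check` are hypothetical names, `add_special_char` is omitted, and placing the check inside the `else` branch is inferred from the hunk, whose indentation is not preserved on this page.

```python
class RecLabelDecodeSketch:
    """Simplified stand-in for BaseRecLabelDecode.__init__ (illustrative only)."""

    def __init__(self, character_dict_path=None, use_space_char=False):
        # Set the default first so the attribute exists on every code path.
        self.reverse = False
        if character_dict_path is None:
            characters = list("0123456789abcdefghijklmnopqrstuvwxyz")
        else:
            with open(character_dict_path, "rb") as fin:
                characters = [line.decode("utf-8").rstrip("\r\n") for line in fin]
            if use_space_char:
                characters.append(" ")
            # Only inspect the path once we know it is a string.
            if "arabic" in character_dict_path:
                self.reverse = True
        # Map each character to its index, as the original loop does.
        self.dict = {char: i for i, char in enumerate(characters)}
        self.character = characters


def old_style_check(character_dict_path):
    # Pre-change pattern: the membership test runs even when the path is None.
    return "arabic" in character_dict_path
```

Calling `old_style_check(None)` raises `TypeError`, whereas `RecLabelDecodeSketch()` with no dictionary path constructs normally and leaves `reverse` set to `False`.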
4 changes: 1 addition & 3 deletions ppstructure/kie/README.md
@@ -242,9 +242,7 @@ For training, evaluation and inference tutorial for KIE models, please refer to

For training, evaluation and inference tutorial for text detection models, please refer to [text detection doc](../../doc/doc_en/detection_en.md).

For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](../../doc/doc_en/recognition.md).

If you want to finish the KIE tasks in your scene, and don't know what to prepare, please refer to [End cdoc](../../doc/doc_en/recognition.md).
For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](../../doc/doc_en/recognition_en.md).

To complete the key information extraction task in your own scenario from data preparation to model selection, please refer to: [Guide to End-to-end KIE](./how_to_do_kie_en.md)

19 changes: 2 additions & 17 deletions ppstructure/pdf2word/pdf2word.md
@@ -1,11 +1,6 @@
# PDF2WORD

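PDF2WORD is a PDF-to-Word conversion application built by PaddleOCR community developer @whj on top of the PP-Structure intelligent document analysis model. It ships as a directly installable exe, which makes it easy for Windows users to run.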

<div align="center">
<img src="./doc/imgs_results/PP-OCRv3/en/en_4.png" width="200">
</div>

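PDF2WORD is a PDF-to-Word conversion application built by PaddleOCR community developer [whjdark](https://github.com/whjdark) on top of the PP-Structure intelligent document analysis model. It ships as a directly installable exe, which makes it easy for Windows users to run.

## 1. Usage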

@@ -23,17 +18,7 @@ PDF2WORD是PaddleOCR社区开发者@whj 基于PP-Structure智能文档分析模
python pdf2word.py
```

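## 2. Packaging it yourself

The PDF2WORD application is packaged with the [QPT](https://github.com/QPT-Family/QPT) tool. If you have modified the UI code and need to repackage it, run the commands below from the `PaddleOCR` folder.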

```
cd ./
mv ./ppstructure/pdf2word .. -r
python GenEXE.py
```

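## 3. Software download
## 2. Software download

To get the packaged program, scan the QR code below, follow the official account, and fill in the questionnaire. You can then join the official PaddleOCR discussion group and receive a free 20G OCR learning pack. It contains a collection of OCR scenario applications (7 vertical-domain models, including digital tube, LCD screen, license plate, and high-accuracy SVTR models), the e-book "Dive into OCR" (《动手学OCR》), course replay videos, and recent papers.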

2 changes: 1 addition & 1 deletion ppstructure/pdf2word/pdf2word.py
@@ -438,4 +438,4 @@ def main():


if __name__ == "__main__":
main()
main()
4 changes: 3 additions & 1 deletion ppstructure/table/README.md
@@ -51,7 +51,9 @@ The performance indicators are explained as follows:

### 4.1 Quick start

PP-Structure currently provides table recognition models in both Chinese and English. For the model link, see [models_list](../docs/models_list.md). The following takes the Chinese table recognition model as an example to introduce how to recognize a table.
PP-Structure currently provides table recognition models in both Chinese and English. For the model link, see [models_list](../docs/models_list.md). The whl package is also provided for quick use, see [quickstart](../docs/quickstart_en.md) for details.

The following takes the Chinese table recognition model as an example to introduce how to recognize a table.

Use the following commands to quickly complete the identification of a table.

4 changes: 3 additions & 1 deletion ppstructure/table/README_ch.md
@@ -57,7 +57,9 @@

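### 4.1 Quick start

PP-Structure currently provides table recognition models in both Chinese and English; for the model links, see [models_list](../docs/models_list.md). The following uses the Chinese table recognition model as an example to show how to recognize a table.
PP-Structure currently provides table recognition models in both Chinese and English; for the model links, see [models_list](../docs/models_list.md). A whl package is also provided for quick use; see [quickstart](../docs/quickstart.md) for details.

The following uses the Chinese table recognition model as an example to show how to recognize a table.

The following command quickly recognizes a single table.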
```python
2 changes: 1 addition & 1 deletion requirements.txt
@@ -7,7 +7,7 @@ tqdm
numpy
visualdl
rapidfuzz
opencv-contrib-python==4.4.0.46
opencv-contrib-python
cython
lxml
premailer