Merge remote-tracking branch 'origin/dygraph' into dy1

Evezerest · Aug 25, 2022 · aa2e283
2 parents d5fddbe + 4eafe00

Showing 10 changed files with 46 additions and 42 deletions.
diff --git a/doc/doc_ch/algorithm_overview.md b/doc/doc_ch/algorithm_overview.md
@@ -24,7 +24,7 @@ PaddleOCR will **keep adding** support for cutting-edge OCR algorithms and models, and **welcomes
 ### 1.1 Text Detection Algorithms
 
 Supported text detection algorithms (click a link for the tutorial):
-- [x] [DB](./algorithm_det_db.md)
+- [x] [DB and DB++](./algorithm_det_db.md)
 - [x] [EAST](./algorithm_det_east.md)
 - [x] [SAST](./algorithm_det_sast.md)
 - [x] [PSENet](./algorithm_det_psenet.md)
@@ -41,6 +41,7 @@ PaddleOCR will **keep adding** support for cutting-edge OCR algorithms and models, and **welcomes
 |SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
 |PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
 |PSE|MobileNetV3|82.20%|70.48%|75.89%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
+|DB++|ResNet50|90.89%|82.66%|86.58%|[model pretrained on synthetic data](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
 
 On the Total-Text public text detection dataset, the results are as follows:
 
@@ -129,10 +130,10 @@ PaddleOCR will **keep adding** support for cutting-edge OCR algorithms and models, and **welcomes
 
 Supported key information extraction (KIE) algorithms (click a link for the tutorial):
 
-- [x] [VI-LayoutXLM](./algorithm_kie_vi_laoutxlm.md)
-- [x] [LayoutLM](./algorithm_kie_laoutxlm.md)
-- [x] [LayoutLMv2](./algorithm_kie_laoutxlm.md)
-- [x] [LayoutXLM](./algorithm_kie_laoutxlm.md)
+- [x] [VI-LayoutXLM](./algorithm_kie_vi_layoutxlm.md)
+- [x] [LayoutLM](./algorithm_kie_layoutxlm.md)
+- [x] [LayoutLMv2](./algorithm_kie_layoutxlm.md)
+- [x] [LayoutXLM](./algorithm_kie_layoutxlm.md)
 - [x] [SDMGR](././algorithm_kie_sdmgr.md)
 
 On the wildreceipt public receipt dataset, the reproduced results are as follows:

diff --git a/doc/doc_en/algorithm_det_db_en.md b/doc/doc_en/algorithm_det_db_en.md
@@ -1,4 +1,4 @@
-# DB
+# DB && DB++
 
 - [1. Introduction](#1)
 - [2. Environment](#2)
@@ -21,13 +21,23 @@ Paper:
 > Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang
 > AAAI, 2020
 
+> [Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion](https://arxiv.org/abs/2202.10304)
+> Liao, Minghui and Zou, Zhisheng and Wan, Zhaoyi and Yao, Cong and Bai, Xiang
+> TPAMI, 2022
+
 On the ICDAR2015 dataset, the text detection result is as follows:
 
 |Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
 | --- | --- | --- | --- | --- | --- | --- |
 |DB|ResNet50_vd|[configs/det/det_r50_vd_db.yml](../../configs/det/det_r50_vd_db.yml)|86.41%|78.72%|82.38%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)|
 |DB|MobileNetV3|[configs/det/det_mv3_db.yml](../../configs/det/det_mv3_db.yml)|77.29%|73.08%|75.12%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar)|
+|DB++|ResNet50|[configs/det/det_r50_db++_ic15.yml](../../configs/det/det_r50_db++_ic15.yml)|90.89%|82.66%|86.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
+
+On the TD_TR dataset, the text detection result is as follows:
 
+|Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
+| --- | --- | --- | --- | --- | --- | --- |
+|DB++|ResNet50|[configs/det/det_r50_db++_td_tr.yml](../../configs/det/det_r50_db++_td_tr.yml)|92.92%|86.48%|89.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_td_tr_train.tar)|
 
 <a name="2"></a>
 ## 2. Environment
@@ -96,4 +106,12 @@ More deployment schemes supported for DB:
  pages={11474--11481},
  year={2020}
 }
-```
+
+@article{liao2022real,
+ title={Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion},
+ author={Liao, Minghui and Zou, Zhisheng and Wan, Zhaoyi and Yao, Cong and Bai, Xiang},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ year={2022},
+ publisher={IEEE}
+}
+```
diff --git a/doc/doc_en/algorithm_overview_en.md b/doc/doc_en/algorithm_overview_en.md
@@ -22,7 +22,7 @@ Developers are welcome to contribute more algorithms! Please refer to [add new a
 ### 1.1 Text Detection Algorithms
 
 Supported text detection algorithms (Click the link to get the tutorial):
-- [x] [DB](./algorithm_det_db_en.md)
+- [x] [DB && DB++](./algorithm_det_db_en.md)
 - [x] [EAST](./algorithm_det_east_en.md)
 - [x] [SAST](./algorithm_det_sast_en.md)
 - [x] [PSENet](./algorithm_det_psenet_en.md)
@@ -39,6 +39,7 @@ On the ICDAR2015 dataset, the text detection result is as follows:
 |SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
 |PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
 |PSE|MobileNetV3|82.20%|70.48%|75.89%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
+|DB++|ResNet50|90.89%|82.66%|86.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
 
 On Total-Text dataset, the text detection result is as follows:
 
@@ -127,10 +128,10 @@ On the PubTabNet dataset, the algorithm result is as follows:
 
 Supported KIE algorithms (Click the link to get the tutorial):
 
-- [x] [VI-LayoutXLM](./algorithm_kie_vi_laoutxlm_en.md)
-- [x] [LayoutLM](./algorithm_kie_laoutxlm_en.md)
-- [x] [LayoutLMv2](./algorithm_kie_laoutxlm_en.md)
-- [x] [LayoutXLM](./algorithm_kie_laoutxlm_en.md)
+- [x] [VI-LayoutXLM](./algorithm_kie_vi_layoutxlm_en.md)
+- [x] [LayoutLM](./algorithm_kie_layoutxlm_en.md)
+- [x] [LayoutLMv2](./algorithm_kie_layoutxlm_en.md)
+- [x] [LayoutXLM](./algorithm_kie_layoutxlm_en.md)
 - [x] [SDMGR](./algorithm_kie_sdmgr_en.md)
 
 On wildreceipt dataset, the algorithm result is as follows:

diff --git a/ppocr/postprocess/rec_postprocess.py b/ppocr/postprocess/rec_postprocess.py
@@ -24,7 +24,7 @@ class BaseRecLabelDecode(object):
  def __init__(self, character_dict_path=None, use_space_char=False):
  self.beg_str = "sos"
  self.end_str = "eos"
-
+ self.reverse = False
  self.character_str = []
  if character_dict_path is None:
  self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
@@ -38,18 +38,15 @@ def __init__(self, character_dict_path=None, use_space_char=False):
  if use_space_char:
  self.character_str.append(" ")
  dict_character = list(self.character_str)
+ if 'arabic' in character_dict_path:
+ self.reverse = True
 
  dict_character = self.add_special_char(dict_character)
  self.dict = {}
  for i, char in enumerate(dict_character):
  self.dict[char] = i
  self.character = dict_character
 
- if 'arabic' in character_dict_path:
- self.reverse = True
- else:
- self.reverse = False
-
  def pred_reverse(self, pred):
  pred_re = []
  c_current = ''
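
Note on the hunk above: it sets `self.reverse = False` at the top of `__init__` and moves the `'arabic'` check earlier. Indentation was lost in this listing, so the exact nesting is an assumption. A plausible reading is that the check now runs only when a dictionary path is supplied, because `'arabic' in None` raises `TypeError`. The sketch below is a hypothetical stand-in, not the PaddleOCR class; `LabelDecodeSketch`, `check_unguarded`, and the example paths are made up for illustration.

```python
class LabelDecodeSketch:
    """Hypothetical stand-in for BaseRecLabelDecode; only the reverse flag is modeled."""

    def __init__(self, character_dict_path=None):
        # Set the default first so the attribute always exists.
        self.reverse = False
        if character_dict_path is None:
            self.character_str = list("0123456789abcdefghijklmnopqrstuvwxyz")
        else:
            # The real class reads the dictionary file here; the sketch skips file I/O.
            self.character_str = []
            # Assumed placement: the substring check runs only when a path was given.
            if "arabic" in character_dict_path:
                self.reverse = True


def check_unguarded(path):
    """Mimics the removed code: the membership test with no guard against None."""
    return "arabic" in path


if __name__ == "__main__":
    print(LabelDecodeSketch().reverse)
    print(LabelDecodeSketch("dicts/arabic_dict.txt").reverse)
    try:
        check_unguarded(None)
    except TypeError as exc:
        print("unguarded check fails:", exc)
```

With the default assigned up front, constructing the decoder without a dictionary path leaves `reverse` as `False` instead of failing on the membership test.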

diff --git a/ppstructure/kie/README.md b/ppstructure/kie/README.md
@@ -242,9 +242,7 @@ For training, evaluation and inference tutorial for KIE models, please refer to
 
 For training, evaluation and inference tutorial for text detection models, please refer to [text detection doc](../../doc/doc_en/detection_en.md).
 
-For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](../../doc/doc_en/recognition.md).
-
-If you want to finish the KIE tasks in your scene, and don't know what to prepare, please refer to [End cdoc](../../doc/doc_en/recognition.md).
+For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](../../doc/doc_en/recognition_en.md).
 
 To complete the key information extraction task in your own scenario, from data preparation to model selection, please refer to the [Guide to End-to-end KIE](./how_to_do_kie_en.md).
 

diff --git a/ppstructure/pdf2word/pdf2word.md b/ppstructure/pdf2word/pdf2word.md
@@ -1,11 +1,6 @@
 # PDF2WORD
 
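-PDF2WORD is a PDF-to-Word conversion application built by PaddleOCR community developer @whj on top of the PP-Structure intelligent document analysis model. It ships as a directly installable exe for the convenience of Windows users.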
-
-<div align="center">
- <img src="./doc/imgs_results/PP-OCRv3/en/en_4.png" width="200">
-</div>
-
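+PDF2WORD is a PDF-to-Word conversion application built by PaddleOCR community developer [whjdark](https://github.com/whjdark) on top of the PP-Structure intelligent document analysis model. It ships as a directly installable exe for the convenience of Windows users.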
 
 ## 1. Usage
 
@@ -23,17 +18,7 @@ PDF2WORD is a PDF-to-Word conversion application built by PaddleOCR community developer @whj on the PP-Structure intelligent document analysis
 python pdf2word.py
 ```
 
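-## 2. Packaging it yourself
-
-The PDF2WORD application is packaged with the [QPT](https://github.com/QPT-Family/QPT) tool. If you have modified the interface code and need to repackage, run the commands below from the `PaddleOCR` folder.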
-
-```
-cd ./
-mv ./ppstructure/pdf2word .. -r
-python GenEXE.py
-```
-
-## 3. Software download
+## 2. Software download
 
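 To get the packaged program, scan the QR code below, follow the official account, and fill in the questionnaire. You can then join the official PaddleOCR discussion group and receive a free 20 GB OCR learning pack. It includes a collection of OCR scenario applications (7 vertical models, including digital tube, LCD screen, license plate, and high-precision SVTR models), the e-book *Dive into OCR*, course replay videos, and cutting-edge papers.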
 

diff --git a/ppstructure/pdf2word/pdf2word.py b/ppstructure/pdf2word/pdf2word.py
@@ -438,4 +438,4 @@ def main():
 
 
 if __name__ == "__main__":
- main()
+ main()
diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md
@@ -51,7 +51,9 @@ The performance indicators are explained as follows:
 
 ### 4.1 Quick start
 
-PP-Structure currently provides table recognition models in both Chinese and English. For the model link, see [models_list](../docs/models_list.md). The following takes the Chinese table recognition model as an example to introduce how to recognize a table.
+PP-Structure currently provides table recognition models in both Chinese and English. For the model link, see [models_list](../docs/models_list.md). The whl package is also provided for quick use, see [quickstart](../docs/quickstart_en.md) for details.
+
+The following takes the Chinese table recognition model as an example to introduce how to recognize a table.
 
 Use the following commands to quickly complete the identification of a table.
 

diff --git a/ppstructure/table/README_ch.md b/ppstructure/table/README_ch.md
@@ -57,7 +57,9 @@
 
 ### 4.1 Quick start
 
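-PP-Structure currently provides table recognition models for both Chinese and English; for model links, see [models_list](../docs/models_list.md). The following uses the Chinese table recognition model as an example to show how to recognize a table.
+PP-Structure currently provides table recognition models for both Chinese and English; for model links, see [models_list](../docs/models_list.md). A whl package is also provided for quick use; see [quickstart](../docs/quickstart.md) for details.
+
+The following uses the Chinese table recognition model as an example to show how to recognize a table.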
 
 Use the following commands to quickly recognize a table.
 ```python

diff --git a/requirements.txt b/requirements.txt
@@ -7,7 +7,7 @@ tqdm
 numpy
 visualdl
 rapidfuzz
-opencv-contrib-python==4.4.0.46
+opencv-contrib-python
 cython
 lxml
 premailer