Merge remote-tracking branch 'upstream/dygraph' into dy1

Evezerest · Jan 29, 2021 · f20f6d2
2 parents 647db30 + acd479e
commit f20f6d2

Showing 71 changed files with 1,996 additions and 225 deletions.
diff --git a/PPOCRLabel/PPOCRLabel.py b/PPOCRLabel/PPOCRLabel.py
@@ -1031,7 +1031,7 @@ def format_shape(s):
 
  for box in self.result_dic:
  trans_dic = {"label": box[1][0], "points": box[0], 'difficult': False}
- if trans_dic["label"] is "" and mode == 'Auto':
+ if trans_dic["label"] == "" and mode == 'Auto':
  continue
  shapes.append(trans_dic)
 
@@ -1763,7 +1763,7 @@ def reRecognition(self):
  QMessageBox.information(self, "Information", msg)
  return
  result = self.ocr.ocr(img_crop, cls=True, det=False)
- if result[0][0] is not '':
+ if result[0][0] != '':
  result.insert(0, box)
  print('result in reRec is ', result)
  self.result_dic.append(result)
@@ -1794,7 +1794,7 @@ def singleRerecognition(self):
  QMessageBox.information(self, "Information", msg)
  return
  result = self.ocr.ocr(img_crop, cls=True, det=False)
- if result[0][0] is not '':
+ if result[0][0] != '':
  result.insert(0, box)
  print('result in reRec is ', result)
  if result[1][0] == shape.label:
@@ -1999,7 +1999,7 @@ def main():
  resource_file = './libs/resources.py'
  if not os.path.exists(resource_file):
  output = os.system('pyrcc5 -o libs/resources.py resources.qrc')
- assert output is 0, "operate the cmd have some problems ,please check whether there is a in the lib " \
+ assert output == 0, "operate the cmd have some problems ,please check whether there is a in the lib " \
  "directory resources.py "
  import libs.resources
  sys.exit(main())
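
The hunks above replace identity checks (`is`, `is not`) against string and integer literals with equality checks (`==`, `!=`). A minimal sketch of why that matters: `is` compares object identity, and whether two equal strings or integers are the same object is an implementation detail (Python 3.8 and later also emit a `SyntaxWarning` for `is` with a literal). The helper names below are hypothetical and not part of PPOCRLabel.

```python
def is_empty_label(label):
    """Equality check, as in the patched code: true for any empty string."""
    return label == ""


def build_empty_at_runtime():
    """Return an empty string built at runtime rather than written as a literal."""
    return "".join([])


label = build_empty_at_runtime()
print(is_empty_label(label))  # True: equality only looks at the value

# Identity is only guaranteed for singletons such as None, True and False.
big_a = int("100000")
big_b = int("100000")
print(big_a == big_b)  # True: same value
print(big_a is big_b)  # implementation-dependent; do not rely on the result
```

The same reasoning applies to `output is 0`: the exit status returned by `os.system` should be compared by value.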
diff --git a/README.md b/README.md
@@ -5,10 +5,11 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools
 
 ## Notice
 PaddleOCR supports both dynamic graph and static graph programming paradigm
-- Dynamic graph: dygraph branch (default), **supported by paddle 2.0rc1+ ([installation](./doc/doc_en/installation_en.md))**
+- Dynamic graph: dygraph branch (default), **supported by paddle 2.0.0 ([installation](./doc/doc_en/installation_en.md))**
 - Static graph: develop branch
 
 **Recent updates**
+- 2021.1.21 update more than 25+ multilingual recognition models [models list](./doc/doc_en/models_list_en.md), including：English, Chinese, German, French, Japanese，Spanish，Portuguese Russia Arabic and so on. Models for more languages will continue to be updated [Develop Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048).
 - 2020.12.15 update Data synthesis tool, i.e., [Style-Text](./StyleText/README.md)，easy to synthesize a large number of images which are similar to the target scene image.
 - 2020.11.25 Update a new data annotation tool, i.e., [PPOCRLabel](./PPOCRLabel/README.md), which is helpful to improve the labeling efficiency. Moreover, the labeling results can be used in training of the PP-OCR system directly.
 - 2020.9.22 Update the PP-OCR technical article, https://arxiv.org/abs/2009.09941

diff --git a/README_ch.md b/README_ch.md
@@ -4,11 +4,13 @@
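 PaddleOCR aims to build a rich, leading, and practical OCR tool library that helps users train better models and put them into production.
 ## Notice
 PaddleOCR supports both the dynamic graph and static graph programming paradigms
-- Dynamic graph version: dygraph branch (default); requires upgrading paddle to 2.0rc1+ ([quick installation](./doc/doc_ch/installation.md))
+- Dynamic graph version: dygraph branch (default); requires upgrading paddle to 2.0.0 ([quick installation](./doc/doc_ch/installation.md))
 - Static graph version: develop branch
 
 **Recent updates**
-- 2021.1.18 [FAQ](./doc/doc_ch/FAQ.md): added 5 frequently asked questions, 152 in total; updated every Monday, so stay tuned.
+- 2021.1.26,28,29 The official PaddleOCR R&D team presents a three-day live course with in-depth technical explanations, at 19:30 on January 26, 28, and 29, [livestream link](https://live.bilibili.com/21689802)
+- 2021.1.25 [FAQ](./doc/doc_ch/FAQ.md): added 5 frequently asked questions, 157 in total; updated every Monday, so stay tuned.
+- 2021.1.21 Updated the multilingual recognition models; more than 27 languages are now supported ([multilingual model downloads](./doc/doc_ch/models_list.md)), including Simplified Chinese, Traditional Chinese, English, French, German, Korean, Japanese, Italian, Spanish, Portuguese, Russian, and Arabic. For upcoming plans, see the [multilingual development plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048)
 - 2020.12.15 Added the data synthesis tool [Style-Text](./StyleText/README_ch.md), which can batch-synthesize large numbers of images similar to the target scene; validated in multiple scenarios with clear improvements.
 - 2020.11.25 Added the semi-automatic annotation tool [PPOCRLabel](./PPOCRLabel/README_ch.md), which helps developers complete labeling tasks efficiently; its output format plugs directly into PP-OCR training tasks.
 - 2020.9.22 Updated the PP-OCR technical article, https://arxiv.org/abs/2009.09941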

diff --git a/StyleText/README.md b/StyleText/README.md
@@ -72,7 +72,7 @@ fusion_generator:
 python3 tools/synth_image.py -c configs/config.yml --style_image examples/style_images/2.jpg --text_corpus PaddleOCR --language en
 ```
 
-* Note 1: The language options is correspond to the corpus. Currently, the tool only supports English, Simplified Chinese and Korean.
+* Note 1: The language options is correspond to the corpus. Currently, the tool only supports English(en), Simplified Chinese(ch) and Korean(ko).
 * Note 2: Synth-Text is mainly used to generate images for OCR recognition models.
  So the height of style images should be around 32 pixels. Images in other sizes may behave poorly.
 * Note 3: You can modify `use_gpu` in `configs/config.yml` to determine whether to use GPU for prediction.
@@ -120,7 +120,7 @@ In actual application scenarios, it is often necessary to synthesize pictures in
  * `with_label`：Whether the `label_file` is label file list.
  * `CorpusGenerator`：
  * `method`：Method of CorpusGenerator，supports `FileCorpus` and `EnNumCorpus`. If `EnNumCorpus` is used，No other configuration is needed，otherwise you need to set `corpus_file` and `language`.
- * `language`：Language of the corpus.
+ * `language`：Language of the corpus. Currently, the tool only supports English(en), Simplified Chinese(ch) and Korean(ko). 
  * `corpus_file`: Filepath of the corpus. Corpus file should be a text file which will be split by line-endings（'\n'）. Corpus generator samples one line each time.
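
The `corpus_file` description above (split on line endings, sample one line each time) can be sketched as follows. This is an illustration under stated assumptions, not Style-Text's actual implementation: `load_corpus_lines` and `sample_line` are hypothetical names, sampling is assumed to be uniformly random, and skipping blank lines is an added assumption.

```python
import random


def load_corpus_lines(corpus_text):
    """Split corpus text on '\n' and drop blank lines (assumed behavior)."""
    return [line for line in corpus_text.split("\n") if line.strip()]


def sample_line(lines, rng=random):
    """Pick one corpus line at random."""
    return rng.choice(lines)


# Hypothetical corpus content; a real run would read this from `corpus_file`.
corpus_text = "PaddleOCR\nStyle-Text\nhello world\n"
lines = load_corpus_lines(corpus_text)
print(lines)
print(sample_line(lines, random.Random(0)))  # seeded generator for repeatability
```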
 
 

diff --git a/StyleText/README_ch.md b/StyleText/README_ch.md
@@ -63,10 +63,10 @@ fusion_generator:
 ```python
 python3 tools/synth_image.py -c configs/config.yml --style_image examples/style_images/2.jpg --text_corpus PaddleOCR --language en
 ```
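-* Note 1: The language option corresponds to the corpus. Currently the tool only supports English, Simplified Chinese, and Korean.
+* Note 1: The language option corresponds to the corpus. English (en), Simplified Chinese (ch), and Korean (ko) are currently supported.
 * Note 2: Data generated by Style-Text is mainly used for OCR recognition. Given the design of the current PaddleOCR recognition models, we mainly support style images with a height of around 32.
  If the input image size differs too much, results may be poor.
-* Note 3: You can modify the `use_gpu` parameter (true or false) in the configuration file to decide whether to use the GPU for prediction.
+* Note 3: You can modify the `use_gpu` parameter (true or false) in the configuration file `configs/config.yml` to decide whether to use the GPU for prediction.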
 
 
 For example, given the following image and the corpus "PaddleOCR":
@@ -105,7 +105,7 @@ python3 tools/synth_image.py -c configs/config.yml --style_image examples/style_
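  * `with_label`: Whether `label_file` is a label file.
  * `CorpusGenerator`:
  * `method`: Corpus generation method; `FileCorpus` and `EnNumCorpus` are currently available. If `EnNumCorpus` is used, no other configuration is needed; otherwise, set `corpus_file` and `language`;
- * `language`: Language of the corpus;
+ * `language`: Language of the corpus; English (en), Simplified Chinese (ch), and Korean (ko) are currently supported;
  * `corpus_file`: Path to the corpus file. The corpus file should be a text file. The corpus generator first splits the corpus by line, then randomly picks one line each time.
 
  Example corpus file format: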

diff --git a/configs/rec/multi_language/rec_en_number_lite_train.yml b/configs/rec/multi_language/rec_en_number_lite_train.yml
@@ -16,7 +16,7 @@ Global:
  infer_img:
  # for data or label process
  character_dict_path: ppocr/utils/dict/en_dict.txt
- character_type: ch
+ character_type: EN
  max_text_length: 25
  infer_mode: False
  use_space_char: False

diff --git a/configs/rec/rec_mv3_none_bilstm_ctc.yml b/configs/rec/rec_mv3_none_bilstm_ctc.yml
@@ -1,5 +1,5 @@
 Global:
- use_gpu: true
+ use_gpu: True
  epoch_num: 72
  log_smooth_window: 20
  print_batch_step: 10
@@ -59,7 +59,7 @@ Metric:
 
 Train:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/training/
  transforms:
  - DecodeImage: # load image
@@ -78,7 +78,7 @@ Train:
 
 Eval:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/validation/
  transforms:
  - DecodeImage: # load image
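
The dataset `name` is a plain string in the config, so a misspelling such as `LMDBDateSet` only surfaces when the name is looked up at load time. A minimal sketch of that failure mode, assuming a name-to-class registry: `DATASET_REGISTRY`, `build_dataset`, and the stub class are hypothetical and do not reproduce PaddleOCR's actual loader.

```python
class LMDBDataSet:
    """Stub standing in for a real dataset class (hypothetical)."""

    def __init__(self, data_dir):
        self.data_dir = data_dir


# Hypothetical registry mapping config names to classes.
DATASET_REGISTRY = {"LMDBDataSet": LMDBDataSet}


def build_dataset(dataset_config):
    """Instantiate the dataset class named in the config, or fail loudly."""
    name = dataset_config["name"]
    if name not in DATASET_REGISTRY:
        supported = ", ".join(sorted(DATASET_REGISTRY))
        raise ValueError(f"unknown dataset {name!r}; supported: {supported}")
    return DATASET_REGISTRY[name](dataset_config["data_dir"])


good = {"name": "LMDBDataSet", "data_dir": "./train_data/data_lmdb_release/training/"}
print(type(build_dataset(good)).__name__)

bad = dict(good, name="LMDBDateSet")  # the old, misspelled name
try:
    build_dataset(bad)
except ValueError as err:
    print(err)
```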

diff --git a/configs/rec/rec_mv3_none_none_ctc.yml b/configs/rec/rec_mv3_none_none_ctc.yml
@@ -58,7 +58,7 @@ Metric:
 
 Train:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/training/
  transforms:
  - DecodeImage: # load image
@@ -77,7 +77,7 @@ Train:
 
 Eval:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/validation/
  transforms:
  - DecodeImage: # load image

diff --git a/configs/rec/rec_mv3_tps_bilstm_ctc.yml b/configs/rec/rec_mv3_tps_bilstm_ctc.yml
@@ -63,7 +63,7 @@ Metric:
 
 Train:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/training/
  transforms:
  - DecodeImage: # load image
@@ -82,7 +82,7 @@ Train:
 
 Eval:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/validation/
  transforms:
  - DecodeImage: # load image

diff --git a/configs/rec/rec_r34_vd_none_bilstm_ctc.yml b/configs/rec/rec_r34_vd_none_bilstm_ctc.yml
@@ -58,7 +58,7 @@ Metric:
 
 Train:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/training/
  transforms:
  - DecodeImage: # load image
@@ -77,7 +77,7 @@ Train:
 
 Eval:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/validation/
  transforms:
  - DecodeImage: # load image

diff --git a/configs/rec/rec_r34_vd_none_none_ctc.yml b/configs/rec/rec_r34_vd_none_none_ctc.yml
@@ -56,7 +56,7 @@ Metric:
 
 Train:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/training/
  transforms:
  - DecodeImage: # load image
@@ -75,7 +75,7 @@ Train:
 
 Eval:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/validation/
  transforms:
  - DecodeImage: # load image

diff --git a/configs/rec/rec_r34_vd_tps_bilstm_ctc.yml b/configs/rec/rec_r34_vd_tps_bilstm_ctc.yml
@@ -62,7 +62,7 @@ Metric:
 
 Train:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/training/
  transforms:
  - DecodeImage: # load image
@@ -81,7 +81,7 @@ Train:
 
 Eval:
  dataset:
- name: LMDBDateSet
+ name: LMDBDataSet
  data_dir: ./train_data/data_lmdb_release/validation/
  transforms:
  - DecodeImage: # load image

diff --git a/configs/rec/rec_r50_fpn_srn.yml b/configs/rec/rec_r50_fpn_srn.yml
@@ -0,0 +1,107 @@
+Global:
+ use_gpu: True
+ epoch_num: 72
+ log_smooth_window: 20
+ print_batch_step: 5
+ save_model_dir: ./output/rec/srn_new
+ save_epoch_step: 3
+ # evaluation is run every 5000 iterations after the 4000th iteration
+ eval_batch_step: [0, 5000]
+ # if pretrained_model is saved in static mode, load_static_weights must set to True
+ cal_metric_during_train: True
+ pretrained_model: 
+ checkpoints:
+ save_inference_dir:
+ use_visualdl: False
+ infer_img: doc/imgs_words/ch/word_1.jpg
+ # for data or label process
+ character_dict_path: 
+ character_type: en
+ max_text_length: 25
+ num_heads: 8
+ infer_mode: False
+ use_space_char: False
+
+
+Optimizer:
+ name: Adam
+ beta1: 0.9
+ beta2: 0.999
+ clip_norm: 10.0
+ lr:
+ learning_rate: 0.0001
+
+Architecture:
+ model_type: rec
+ algorithm: SRN
+ in_channels: 1
+ Transform:
+ Backbone:
+ name: ResNetFPN
+ Head:
+ name: SRNHead
+ max_text_length: 25
+ num_heads: 8
+ num_encoder_TUs: 2
+ num_decoder_TUs: 4
+ hidden_dims: 512
+
+Loss:
+ name: SRNLoss
+
+PostProcess:
+ name: SRNLabelDecode
+
+Metric:
+ name: RecMetric
+ main_indicator: acc
+
+Train:
+ dataset:
+ name: LMDBDataSet
+ data_dir: ./train_data/srn_train_data_duiqi
+ transforms:
+ - DecodeImage: # load image
+ img_mode: BGR
+ channel_first: False
+ - SRNLabelEncode: # Class handling label
+ - SRNRecResizeImg:
+ image_shape: [1, 64, 256]
+ - KeepKeys:
+ keep_keys: ['image',
+ 'label',
+ 'length',
+ 'encoder_word_pos',
+ 'gsrm_word_pos',
+ 'gsrm_slf_attn_bias1',
+ 'gsrm_slf_attn_bias2'] # dataloader will return list in this order
+ loader:
+ shuffle: False
+ batch_size_per_card: 64
+ drop_last: False
+ num_workers: 4
+
+Eval:
+ dataset:
+ name: LMDBDataSet
+ data_dir: ./train_data/data_lmdb_release/evaluation
+ transforms:
+ - DecodeImage: # load image
+ img_mode: BGR
+ channel_first: False
+ - SRNLabelEncode: # Class handling label
+ - SRNRecResizeImg:
+ image_shape: [1, 64, 256]
+ - KeepKeys:
+ keep_keys: ['image',
+ 'label',
+ 'length',
+ 'encoder_word_pos',
+ 'gsrm_word_pos',
+ 'gsrm_slf_attn_bias1',
+ 'gsrm_slf_attn_bias2'] 
+ loader:
+ shuffle: False
+ drop_last: False
+ batch_size_per_card: 32
+ num_workers: 4
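
Two settings in this new config are easier to read with a sketch. `eval_batch_step: [0, 5000]` is read here as `[start_step, interval]`, and `KeepKeys` as a transform that turns the sample dict into a list ordered by `keep_keys`. Both readings are assumptions drawn from the comments in the file; `should_evaluate` and `keep_keys_transform` are hypothetical helpers, not PaddleOCR code. Under this reading a start step of 0 puts the first evaluation at iteration 5000, whereas the comment in the file mentions the 4000th iteration.

```python
def should_evaluate(global_step, eval_batch_step):
    """Return True when an evaluation is due at `global_step`.

    Assumes eval_batch_step is [start_step, interval].
    """
    start_step, interval = eval_batch_step
    if global_step <= start_step:
        return False
    return (global_step - start_step) % interval == 0


def keep_keys_transform(sample, keep_keys):
    """Return the values of `sample` as a list, in `keep_keys` order."""
    return [sample[key] for key in keep_keys]


eval_steps = [s for s in range(1, 15001) if should_evaluate(s, [0, 5000])]
print(eval_steps)  # [5000, 10000, 15000]

# Keys not listed in keep_keys (here "path") are dropped.
sample = {"label": [3, 1], "image": "pixels", "length": 2, "path": "a.jpg"}
print(keep_keys_transform(sample, ["image", "label", "length"]))
# ['pixels', [3, 1], 2]
```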
diff --git a/deploy/slim/quantization/README.md b/deploy/slim/quantization/README.md
@@ -42,7 +42,7 @@ python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global
 # For example, download the provided trained model
 wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar
 tar -xf ch_ppocr_mobile_v2.0_det_train.tar
-python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./ch_ppocr_mobile_v2.0_det_train/best_accuracy Global.save_model_dir=./output/quant_model
+python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./ch_ppocr_mobile_v2.0_det_train/best_accuracy Global.save_inference_dir=./output/quant_inference_model
 
 ```
 To train a quantized recognition model, just modify the configuration file and the loaded model parameters.
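
The quantization commands pass config overrides as dotted `-o Section.key=value` pairs, such as `Global.save_inference_dir=./output/quant_inference_model`. A minimal sketch of how such overrides could be merged into a nested config, assuming values are kept as strings: `apply_overrides` is a hypothetical helper, not the parser PaddleOCR ships.

```python
def apply_overrides(config, overrides):
    """Merge dotted `a.b=value` overrides into a nested dict, in place."""
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            # Create intermediate sections that do not exist yet.
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


config = {"Global": {"save_model_dir": "./output/quant_model"}}
apply_overrides(config, ["Global.save_inference_dir=./output/quant_inference_model"])
print(config["Global"]["save_inference_dir"])
print(config["Global"]["save_model_dir"])  # untouched by the override
```

Because the override only sets the key it names, switching from `save_model_dir` to `save_inference_dir` changes which setting receives the output path rather than renaming an existing one.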

diff --git a/deploy/slim/quantization/README_en.md b/deploy/slim/quantization/README_en.md
@@ -58,7 +58,7 @@ python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global
 After getting the model after pruning and finetuning we, can export it as inference_model for predictive deployment:
 
 ```bash
-python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
+python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_inference_dir=./output/quant_inference_model
 ```
 
 ### 5. Deploy