Commit

update rec_r31_sar.yml and sar docs
andyjiang1116 committed Aug 31, 2021
1 parent 69f9bdd commit d43688a
Showing 3 changed files with 6 additions and 2 deletions.
4 changes: 2 additions & 2 deletions configs/rec/rec_r31_sar.yml
@@ -56,7 +56,7 @@ Train:
dataset:
name: SimpleDataSet
delimiter: ' '
- label_file_list: ['/paddle/data/concat_data/icdar_2013_train20.txt', '/paddle/data/concat_data/icdar_2015_train20.txt', '/paddle/data/concat_data/coco_text_train20.txt', '/paddle/data/concat_data/IIIt5k_train20.txt', '/paddle/data/concat_data/SynthAdd_train.txt', '/paddle/data/concat_data/SynthText_train.txt', '/paddle/data/concat_data/Syn90k_train.txt']
+ label_file_list: ['/paddle/data/concat_data/train_list.txt']
data_dir: /paddle/data/concat_data/
ratio_list: 1.0
transforms:
@@ -71,7 +71,7 @@ Train:
keep_keys: ['image', 'label', 'valid_ratio'] # dataloader will return list in this order
loader:
shuffle: True
- batch_size_per_card: 64 # 32
+ batch_size_per_card: 64
drop_last: True
num_workers: 8
use_shared_memory: False
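
The first hunk replaces seven per-dataset label files with a single `train_list.txt`. The commit does not show how that file was produced. One plausible way, sketched below, is a plain concatenation of the seven files named on the removed line. The helper `merge_label_files` is hypothetical and is not part of PaddleOCR; it assumes every label file holds one sample per line and that image paths stay valid because `data_dir` is unchanged.

```python
from pathlib import Path
from typing import Iterable


def merge_label_files(sources: Iterable[Path], target: Path) -> int:
    """Concatenate label files into `target` and return the number of lines written.

    Hypothetical helper: assumes one "image_path label" sample per line.
    Blank lines are dropped and every written line ends with a newline, so a
    source file without a trailing newline cannot fuse with the next file.
    """
    written = 0
    with target.open("w", encoding="utf-8") as out:
        for source in sources:
            with source.open("r", encoding="utf-8") as src:
                for raw in src:
                    line = raw.rstrip("\n")
                    if not line.strip():
                        continue  # skip empty lines between samples
                    out.write(line + "\n")
                    written += 1
    return written
```

For the paths in this config, the call would take the seven files from the removed `label_file_list` entry (for example `Path("/paddle/data/concat_data/icdar_2013_train20.txt")`) as `sources` and `Path("/paddle/data/concat_data/train_list.txt")` as `target`. The newline normalization is the only reason to prefer this over a shell `cat`.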
2 changes: 2 additions & 0 deletions doc/doc_ch/recognition.md
@@ -90,6 +90,8 @@ train_data/rec/train/word_002.jpg 用科技让复杂的世界更简单

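To reproduce the results reported in the SRN paper, download the offline [augmented data](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA), extraction code: y3ry. The augmented data is obtained by applying rotation and perturbation to MJSynth and SynthText. After downloading, unzip the data into {your_path}/PaddleOCR/train_data/data_lmdb_release/training/.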

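To reproduce the results reported in the SAR paper, download [SynthAdd](https://pan.baidu.com/share/init?surl=uV0LtoNmcxbO-0YA7Ch4dg), extraction code: 627x. In addition, the real datasets icdar2013, icdar2015, cocotext, and IIIT5k are also used as part of the training data. For details on the data, refer to the SAR paper.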
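
After downloading and unpacking these datasets, it can help to sanity-check the label file before starting a long training run. The sketch below is a hypothetical helper, not a PaddleOCR API. It assumes the `SimpleDataSet` layout used in `rec_r31_sar.yml`: one sample per line, an image path relative to `data_dir`, then the delimiter (`' '` in that config), then the label. Splitting on the first delimiter only is a choice made for this sketch, so that labels containing spaces stay intact; whether the loader does the same is not confirmed by this commit.

```python
from pathlib import Path
from typing import List, Tuple


def check_label_file(
    label_file: Path, data_dir: Path, delimiter: str = " "
) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for samples that look unusable.

    Hypothetical helper: a line is flagged if it has no label after the
    delimiter, or if the referenced image does not exist under `data_dir`.
    """
    problems: List[Tuple[int, str]] = []
    with label_file.open("r", encoding="utf-8") as f:
        for number, raw in enumerate(f, start=1):
            line = raw.rstrip("\n")
            if not line:
                continue  # ignore empty lines
            parts = line.split(delimiter, 1)  # split on the first delimiter only
            if len(parts) != 2 or not parts[1]:
                problems.append((number, "missing label"))
                continue
            if not (data_dir / parts[0]).is_file():
                problems.append((number, f"image not found: {parts[0]}"))
    return problems
```

An empty return value means every line parsed and every referenced image was found. A non-empty list gives the line numbers to inspect.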

```
# Training set labels
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_train.txt
2 changes: 2 additions & 0 deletions doc/doc_en/recognition_en.md
@@ -90,6 +90,8 @@ If you do not have a dataset locally, you can download it on the official websit

To reproduce the results reported in the SRN paper, download the offline [augmented data](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA), extraction code: y3ry. The augmented data is obtained by applying rotation and perturbation to MJSynth and SynthText. Unzip the data to the {your_path}/PaddleOCR/train_data/data_lmdb_release/training/ path.

To reproduce the results reported in the SAR paper, download the extra dataset [SynthAdd](https://pan.baidu.com/share/init?surl=uV0LtoNmcxbO-0YA7Ch4dg), extraction code: 627x. In addition, the icdar2013, icdar2015, cocotext, and IIIT5k datasets are also used for training. For details, refer to the SAR paper.

PaddleOCR provides label files for training the icdar2015 dataset, which can be downloaded in the following ways:

```