fix dead link
andyjiang1116 committed Feb 16, 2022
1 parent bf396a5 commit e797692
Showing 10 changed files with 29 additions and 34 deletions.
7 changes: 1 addition & 6 deletions README.md
@@ -181,16 +181,11 @@ For a new language request, please refer to [Guideline for new language_requests
<a name="language_requests"></a>
## Guideline for New Language Requests

If you want to request a new language support, a PR with 2 following files are needed:
If you want to request a new language support, a PR with 1 following files are needed:

1. In folder [ppocr/utils/dict](./ppocr/utils/dict),
it is necessary to submit the dict text to this path and name it with `{language}_dict.txt` that contains a list of all characters. Please see the format example from other files in that folder.

2. In folder [ppocr/utils/corpus](./ppocr/utils/corpus),
it is necessary to submit the corpus to this path and name it with `{language}_corpus.txt` that contains a list of words in your language.
Maybe, 50000 words per language is necessary at least.
Of course, the more, the better.

If your language has unique elements, please tell me in advance within any way, such as useful links, wikipedia and so on.

More details, please refer to [Multilingual OCR Development Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048).
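
As a rough illustration of the dict file described above, the sketch below collects every distinct character from a few text lines and writes them to `{language}_dict.txt`. The one-character-per-line layout is an assumption here (the README only says to follow the existing files in `ppocr/utils/dict`), and `build_char_dict` and `write_char_dict` are hypothetical helper names, not part of PaddleOCR.

```python
from pathlib import Path
from typing import Iterable, List


def build_char_dict(lines: Iterable[str]) -> List[str]:
    """Return the sorted unique non-whitespace characters found in `lines`."""
    chars = set()
    for line in lines:
        chars.update(ch for ch in line if not ch.isspace())
    return sorted(chars)


def write_char_dict(chars: List[str], language: str, out_dir: str = ".") -> Path:
    """Write one character per line to `{language}_dict.txt` (assumed layout)."""
    path = Path(out_dir) / f"{language}_dict.txt"
    path.write_text("\n".join(chars) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical sample text; a real dict would be built from a full corpus.
    sample = ["ciao mondo", "città"]
    print(build_char_dict(sample))
```

Before opening a PR, it is worth comparing the result against an existing file in that folder, since the expected ordering and any special tokens are not stated in this diff.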
2 changes: 1 addition & 1 deletion deploy/slim/prune/README.md
@@ -45,7 +45,7 @@ python3 setup.py install
'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
}
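Loading the sensitivity file returns a dict. Its keys are the names of the network model's parameters, and each value is a dict holding the pruning sensitivity information of the corresponding layer. In the example, pruning 10% of the filters of the layer corresponding to conv10_expand_weights lowers model performance by 0.65% relative to the original model. For details, see [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
Loading the sensitivity file returns a dict. Its keys are the names of the network model's parameters, and each value is a dict holding the pruning sensitivity information of the corresponding layer. In the example, pruning 10% of the filters of the layer corresponding to conv10_expand_weights lowers model performance by 0.65% relative to the original model. For details, see [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/release/2.0-alpha/docs/zh_cn/algo/algo.md)

Enter the PaddleOCR root directory and run sensitivity-analysis training on the model with the following command: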
```bash
6 changes: 3 additions & 3 deletions deploy/slim/prune/README_en.md
@@ -3,7 +3,7 @@

Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model. Model Pruning is a technique that reduces this redundancy by removing the sub-models in the neural network model, so as to reduce model calculation complexity and improve model inference performance.

This example uses PaddleSlim provided[APIs of Pruning](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/) to compress the OCR model.
This example uses PaddleSlim provided[APIs of Pruning](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/docs/zh_cn/api_cn/dygraph/pruners) to compress the OCR model.
[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim), an open source library which integrates model pruning, quantization (including quantization training and offline quantization), distillation, neural network architecture search, and many other commonly used and leading model compression technique in the industry.

It is recommended that you could understand following pages before reading this example:
@@ -35,7 +35,7 @@ PaddleOCR also provides a series of [models](../../../doc/doc_en/models_list_en.

### 3. Pruning sensitivity analysis

After the pre-trained model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, and save a sensitivity file which named: sen.pickle. After that, user could load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determining the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see:[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
After the pre-trained model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, and save a sensitivity file which named: sen.pickle. After that, user could load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determining the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see:[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/en/tutorials/image_classification_sensitivity_analysis_tutorial_en.md)
The data format of sensitivity file:
sen.pickle(Dict){
'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
@@ -47,7 +47,7 @@ PaddleOCR also provides a series of [models](../../../doc/doc_en/models_list_en.
'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
}
The function would return a dict after loading the sensitivity file. The keys of the dict are name of parameters in each layer. And the value of key is the information about pruning sensitivity of corresponding layer. In example, pruning 10% filter of the layer corresponding to conv10_expand_weights would lead to 0.65% degradation of model performance. The details could be seen at: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
The function would return a dict after loading the sensitivity file. The keys of the dict are name of parameters in each layer. And the value of key is the information about pruning sensitivity of corresponding layer. In example, pruning 10% filter of the layer corresponding to conv10_expand_weights would lead to 0.65% degradation of model performance. The details could be seen at: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/release/2.0-alpha/docs/zh_cn/algo/algo.md)
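
To make the sensitivity dict more concrete, here is a minimal sketch that reads such a pickle and, for each layer, picks the largest pruning ratio whose recorded accuracy loss stays within a budget. It uses only the standard library; `read_sensitivity_file` and `pick_ratios` are hypothetical names, and the selection rule is a simplification for illustration, not PaddleSlim's own method for deriving ratios.

```python
import pickle
from typing import Dict

# layer name -> {pruning ratio -> accuracy loss}, as described above
Sensitivities = Dict[str, Dict[float, float]]


def read_sensitivity_file(path: str) -> Sensitivities:
    """Load a sensitivity dict such as sen.pickle (only unpickle trusted files)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def pick_ratios(sens: Sensitivities, max_loss: float) -> Dict[str, float]:
    """For each layer, return the largest ratio whose loss is <= max_loss.

    Layers with no ratio inside the budget get 0.0 (do not prune).
    """
    ratios = {}
    for layer, losses in sens.items():
        allowed = [ratio for ratio, loss in losses.items() if loss <= max_loss]
        ratios[layer] = max(allowed, default=0.0)
    return ratios


# Values copied from the example sensitivity file above.
example = {
    "conv10_expand_weights": {
        0.1: 0.006509952684312718, 0.2: 0.01827734339798862,
        0.3: 0.014528405644659832, 0.6: 0.06536008804270439,
        0.8: 0.11798612250664964, 0.7: 0.12391408417493704,
        0.4: 0.030615754498018757, 0.5: 0.047105205602406594,
    },
    "conv10_linear_weights": {
        0.1: 0.05113190831455035, 0.2: 0.07705573833558801,
        0.3: 0.12096721757739311, 0.6: 0.5135061352930738,
        0.8: 0.7908166677143281, 0.7: 0.7272187676899062,
        0.4: 0.1819252083008504, 0.5: 0.3728054727792405,
    },
}

if __name__ == "__main__":
    print(pick_ratios(example, max_loss=0.03))
```

With a 3% budget, the far more sensitive `conv10_linear_weights` layer is left unpruned, which is the kind of per-layer decision the sensitivity file is meant to support.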


Enter the PaddleOCR root directory,perform sensitivity analysis on the model with the following command:
4 changes: 2 additions & 2 deletions deploy/slim/quantization/README_en.md
@@ -5,11 +5,11 @@ Generally, a more complex model would achieve better performance in the task, bu
Quantization is a technique that reduces this redundancy by reducing the full precision data to a fixed number,
so as to reduce model calculation complexity and improve model inference performance.
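
As a toy illustration of the idea, assuming "a fixed number" refers to low-precision integers such as int8, the sketch below maps floats to the range [-127, 127] with a single scale factor and then maps them back. This is generic symmetric quantization in plain Python, not the scheme PaddleSlim implements; the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: q = round(x / scale), clamped to int8."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to scale 1.0 for an empty or all-zero input to avoid dividing by zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.25, 1.0]
    q, scale = quantize_int8(weights)
    print(q, dequantize(q, scale))
```

Each restored value differs from the original by at most half a quantization step, which is the precision traded away for smaller and cheaper arithmetic.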

This example uses PaddleSlim provided [APIs of Quantization](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) to compress the OCR model.
This example uses PaddleSlim provided [APIs of Quantization](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/quanter/qat.rst) to compress the OCR model.

It is recommended that you could understand following pages before reading this example:
- [The training strategy of OCR model](../../../doc/doc_en/quickstart_en.md)
- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/)
- [PaddleSlim Document](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/quanter/qat.rst)

## Quick Start
Quantization is mostly suitable for the deployment of lightweight models on mobile terminals.
8 changes: 4 additions & 4 deletions doc/doc_ch/FAQ.md
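@@ -11,7 +11,7 @@ PaddleOCR has collected and organized, from issues and user groups since it was open-sourced, the frequently asked
There are many experts in the OCR field. The answers in this document rely mainly on limited project experience, so omissions are inevitable. If anything is missing or inadequate, **we hope knowledgeable readers will help supplement and correct it**. Many thanks.

- [FAQ](#faq)

  * [1. General questions](#1)
    + [1.1 Detection](#11)
    + [1.2 Recognition](#12)
@@ -20,7 +20,7 @@ There are many experts in the OCR field, and the answers in this document rely mainly on limited project experience, so
    + [1.5 Implementation ideas for vertical-domain scenarios](#15)
    + [1.6 Training process and model tuning](#16)
    + [1.7 Supplementary materials](#17)

  * [2. PaddleOCR practical questions](#2)
    + [2.1 PaddleOCR repo](#21)
    + [2.2 Installation environment](#22)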
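@@ -734,7 +734,7 @@ C++ TensorRT inference requires an inference library with TRT support and, at compile time, enabling [-DWITH_T

#### Q: In PaddleOCR, what options are there for speeding up inference on CPU? What requirements does TensorRT-based GPU acceleration place on the input?

**A**: (1) On CPU, mkldnn can be used for acceleration. For Python inference, set enable_mkldnn to true, see the [reference code](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/tools/infer/utility.py#L99); for C++ inference, set use_mkldnn 1 in the config file, see the [reference code](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/deploy/cpp_infer/tools/config.txt#L6)
**A**: (1) On CPU, mkldnn can be used for acceleration. For Python inference, set enable_mkldnn to true, see the [reference code](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/tools/infer/utility.py#L99); for C++ inference, see the [documentation](https://github.com/andyjpaddle/PaddleOCR/tree/dygraph/deploy/cpp_infer)

(2) On GPU, watch out for issues such as variable-length input; only TRT6 and later support variable-length input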

@@ -838,4 +838,4 @@ nvidia-smi --lock-gpu-clocks=1590 -i 0

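#### Q: GPU memory blow-up or memory leaks during inference?

**A**: Turning on the GPU/host memory optimization switch `enable_memory_optim` resolves this problem. The relevant code has been merged, [see details](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/tools/infer/utility.py#L153)
**A**: Turning on the GPU/host memory optimization switch `enable_memory_optim` resolves this problem. The relevant code has been merged, [see details](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/tools/infer/utility.py#L153)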
10 changes: 5 additions & 5 deletions doc/doc_ch/config.md
@@ -66,7 +66,7 @@
| :---------------------: | :---------------------: | :--------------: | :--------------------: |
| model_type | Network type | rec | Currently supports `rec`, `det`, `cls` |
| algorithm | Model name | CRNN | See [algorithm_overview](./algorithm_overview.md) for the supported list |
| **Transform** | Set the transformation method | - | Currently supported only by rec-type algorithms; see [ppocr/modeling/transform](../../ppocr/modeling/transform) for details |
| **Transform** | Set the transformation method | - | Currently supported only by rec-type algorithms; see [ppocr/modeling/transforms](../../ppocr/modeling/transforms) for details |
| name | Transformation class name | TPS | Currently supports `TPS` |
| num_fiducial | Number of TPS control points | 20 | Ten each on the top and bottom edges |
| loc_lr | Learning rate of the localization network | 0.1 | \ |
@@ -176,7 +176,7 @@ PaddleOCR currently supports recognition of 80 languages (besides Chinese), `configs/rec/multi
--dict {path/of/dict} \ # path to the dictionary file
-o Global.use_gpu=False # whether to use the GPU
...

```

Italian is written in the Latin alphabet, so running the command produces a configuration file named rec_latin_lite_train.yml.
@@ -191,21 +191,21 @@ PaddleOCR currently supports recognition of 80 languages (besides Chinese), `configs/rec/multi
  epoch_num: 500
  ...
  character_dict_path: {path/of/dict} # path to the dictionary file
Train:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/ # root directory of the data
    label_file_list: ["./train_data/train_list.txt"] # path to the training-set labels
  ...
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/ # root directory of the data
    label_file_list: ["./train_data/val_list.txt"] # path to the validation-set labels
  ...
```

The multilingual algorithms currently supported by PaddleOCR are:
2 changes: 1 addition & 1 deletion doc/doc_ch/serving_inference.md
@@ -20,7 +20,7 @@

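**Python guide:**

Some of the Serving features used for OCR are still being tested, so here we provide the [Serving latest package](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md)
Some of the Serving features used for OCR are still being tested, so here we provide the [Serving latest package](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Latest_Packages_CN.md)
Choose the whl package that matches your environment. For example, for Python 3.5, run the following commands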
```
# Choose either the CPU or the GPU version
2 changes: 1 addition & 1 deletion doc/doc_en/config_en.md
@@ -66,7 +66,7 @@ In PaddleOCR, the network is divided into four stages: Transform, Backbone, Neck
| :---------------------: | :---------------------: | :--------------: | :--------------------: |
| model_type | Network Type | rec | Currently support`rec`,`det`,`cls` |
| algorithm | Model name | CRNN | See [algorithm_overview](./algorithm_overview_en.md) for the support list |
| **Transform** | Set the transformation method | - | Currently only recognition algorithms are supported, see [ppocr/modeling/transform](../../ppocr/modeling/transform) for details |
| **Transform** | Set the transformation method | - | Currently only recognition algorithms are supported, see [ppocr/modeling/transforms](../../ppocr/modeling/transforms) for details |
| name | Transformation class name | TPS | Currently supports `TPS` |
| num_fiducial | Number of TPS control points | 20 | Ten on the top and bottom |
| loc_lr | Localization network learning rate | 0.1 | \ |