GitHub - bjjwwang/PaddleOCR at dygraph


PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them in practice.

PaddleOCR supports both the dynamic graph and static graph programming paradigms:

Dynamic graph: dygraph branch (default), supported by paddle 2.0.0 (installation)
Static graph: develop branch

Recent updates

2021.1.21 Added more than 25 multilingual recognition models (see the models list), including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, Arabic, and others. Models for more languages will continue to be added (see the Develop Plan).
2020.12.15 Added a data synthesis tool, Style-Text, which makes it easy to synthesize a large number of images similar to the target scene image.
2020.11.25 Added a new data annotation tool, PPOCRLabel, which improves labeling efficiency. The labeling results can be used directly to train the PP-OCR system.
2020.9.22 Update the PP-OCR technical article, https://arxiv.org/abs/2009.09941
more

PPOCR series of high-quality pre-trained models, with results comparable to commercial products
- Ultra lightweight ppocr_mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
- General ppocr_server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
- Support Chinese, English, and digit recognition, vertical text recognition, and long text recognition
- Support multi-language recognition: Korean, Japanese, German, French
Rich toolkits related to the OCR areas
- Semi-automatic data annotation tool, i.e., PPOCRLabel: support fast and efficient data annotation
- Data synthesis tool, i.e., Style-Text: easy to synthesize a large number of images which are similar to the target scene image
Support user-defined training and provide rich inference deployment solutions
Support PIP installation, easy to use
Support Linux, Windows, macOS, and other systems
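
As a rough illustration of the PIP-installed package mentioned above, here is a minimal sketch. It assumes that `paddlepaddle` 2.0 and the `paddleocr` package are already installed, and that `ocr.ocr(...)` returns one `[box, (text, score)]` entry per detected line, as the 2.x releases of the package do; other releases may nest or shape the result differently. The helper names `run_ocr` and `confident_text` are hypothetical and are not part of PaddleOCR.

```python
from typing import Any, List, Tuple


def run_ocr(image_path: str, lang: str = "en") -> List[Any]:
    """Run detection, direction classification, and recognition on one image.

    Assumption: the paddleocr 2.x interface (PaddleOCR class with an ocr method).
    """
    # Imported lazily so the pure-Python helper below works without paddleocr.
    from paddleocr import PaddleOCR

    # Pre-trained models are downloaded on first use.
    ocr = PaddleOCR(use_angle_cls=True, lang=lang)
    return ocr.ocr(image_path, cls=True)


def confident_text(lines: List[Any], min_score: float = 0.5) -> List[Tuple[str, float]]:
    """Keep (text, score) pairs whose score is at least min_score.

    Assumption: each entry has the layout [box, (text, score)].
    """
    kept = []
    for _box, (text, score) in lines:
        if score >= min_score:
            kept.append((text, float(score)))
    return kept
```

With an image on disk, `confident_text(run_ocr("path/to/image.jpg"))` would list the recognized lines whose score clears the threshold; the path here is a placeholder.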

The above pictures are visualizations of the general ppocr_server model. For more examples, please see More visualizations.

Scan the QR code below with WeChat to join the official technical exchange group. We look forward to your participation.