A PyTorch (>= v1.4.0) implementation of "Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition" (AAAI 2019).
- Backbone model
- Encoder model
- Decoder model
- Integrated model
- Data processing
- Training pipeline
- Inference pipeline
- Street View Text: https://vision.ucsd.edu/~kai/svt/
- IIIT5K: https://cvit.iiit.ac.in/research/projects/cvit-projects/the-iiit-5k-word-dataset
- Syn90k: https://www.robots.ox.ac.uk/~vgg/data/text/
- SynthText: https://www.robots.ox.ac.uk/~vgg/data/scenetext/
python train.py --batch 32 --epoch 5000 --dataset ./svt --dataset_type svt --gpu True
python inference.py --batch 32 --input input_folder --model model_path --gpu True
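Both commands pass `--gpu True` as a flag with a string value. The following is a minimal sketch of how flags like these could be parsed with `argparse`. It is not taken from `train.py`: the helper `str2bool`, the function `build_train_parser`, and every default value are assumptions for illustration. It shows one common pitfall: `type=bool` would treat the string `"False"` as true, so an explicit converter is used instead.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert 'True'/'False'-style strings to bool.

    argparse's type=bool would turn any non-empty string (including
    'False') into True, so the string is inspected explicitly.
    """
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes", "y"}:
        return True
    if lowered in {"false", "0", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_train_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the flags of the training command;
    # the defaults are assumptions, not the repository's values.
    parser = argparse.ArgumentParser(description="SAR training (illustrative sketch)")
    parser.add_argument("--batch", type=int, default=32, help="batch size")
    parser.add_argument("--epoch", type=int, default=5000, help="number of epochs")
    parser.add_argument("--dataset", type=str, required=True, help="dataset root folder")
    parser.add_argument("--dataset_type", type=str, default="svt", help="dataset format")
    parser.add_argument("--gpu", type=str2bool, default=False, help="use the GPU if True")
    return parser


# Parse the same arguments as the training command.
args = build_train_parser().parse_args(
    ["--batch", "32", "--epoch", "5000", "--dataset", "./svt",
     "--dataset_type", "svt", "--gpu", "True"]
)
print(args.batch, args.epoch, args.dataset, args.dataset_type, args.gpu)
```

With a converter like this, `--gpu False` disables the GPU instead of silently enabling it.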
[Figures: three input images, each followed by its output attention map per character]
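A per-character attention map of this kind can be read as a set of non-negative weights over the spatial positions of the feature map, summing to 1 at each decoding step. The weighted sum of the features (the "glimpse" in the paper [1]) is what the decoder uses to predict that character. The sketch below illustrates only this normalization and weighting step. It uses NumPy instead of PyTorch to stay self-contained, random scores stand in for the learned attention scores, and the sizes `H`, `W`, `D`, `T` and the function names are assumptions for illustration, not this repository's code.

```python
import numpy as np


def attention_map(scores: np.ndarray) -> np.ndarray:
    """Softmax over all spatial positions of an (H, W) score map."""
    shifted = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum()


def glimpse(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Attention-weighted sum of an (H, W, D) feature map, giving a (D,) vector."""
    return np.tensordot(weights, features, axes=([0, 1], [0, 1]))


rng = np.random.default_rng(0)
H, W, D, T = 6, 40, 8, 5  # illustrative sizes: feature map H x W x D, T characters

features = rng.normal(size=(H, W, D))  # stand-in for the backbone output
scores = rng.normal(size=(T, H, W))    # stand-in for learned attention scores

# One attention map and one glimpse vector per decoded character.
maps = np.stack([attention_map(s) for s in scores])
glimpses = np.stack([glimpse(features, m) for m in maps])

print(maps.shape, glimpses.shape)
```

Each slice `maps[t]` can be resized to the input resolution and overlaid on the image to produce a visualization like the ones shown here.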
[1] Original paper: https://arxiv.org/abs/1811.00751
[2] Official Torch code by the authors: https://github.com/wangpengnorman/SAR-Strong-Baseline-for-Text-Recognition
[3] A TensorFlow implementation: https://github.com/Pay20Y/SAR_TF