
KD-LTR


The official PyTorch implementation of the paper "One-stage Low-resolution Text Recognition with High-resolution Knowledge Transfer" (MM 2023).

Authors: Hang Guo, Tao Dai, Mingyan Zhu, Guanghao Meng, Bin Chen, Zhi Wang, Shu-Tao Xia.

Motivation

This work focuses on the problem of text recognition on low-resolution images. We propose a novel knowledge distillation framework that directly adapts a text recognizer to low-resolution inputs. We hope our work can inspire more studies on one-stage low-resolution text recognition.

Pipeline

The architecture of the proposed framework is as follows.

[Figure: architecture of the proposed framework]
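The exact distillation objectives are defined in the paper; as a rough illustration of the general idea of transferring knowledge from the HR teacher branch to the LR student branch, a minimal logit-level distillation loss (standard Hinton-style KD, not the paper's full loss; all shapes hypothetical) might look like:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    # Generic logit-level KD (not the paper's full objective): soften both
    # distributions with a temperature, then match them with KL divergence.
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Hypothetical shapes: batch of 8, 25 character positions, 97-class charset.
loss = distillation_loss(torch.randn(8, 25, 97), torch.randn(8, 25, 97))
```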

Pre-trained Weight

We refer to the student models adapted to low-resolution inputs as ABINet-LTR, MATRN-LTR, and PARSeq-LTR, respectively. As pointed out in the paper, since the input images of the two branches have different resolutions, we modify the convolution stride (for the CNN backbone) or the patch size (for the ViT backbone) to keep the deep visual features consistent. The pre-trained weights can be downloaded from the table below.

| Model | ABINet-LTR | MATRN-LTR | PARSeq-LTR |
| --- | --- | --- | --- |
| Performance | 72.45% | 73.27% | 78.23% |
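To make the feature-alignment point concrete, here is a minimal sketch assuming a 32×128 HR input, a 16×64 LR input, and a 384-dim ViT patch embedding (all sizes are illustrative assumptions; the real values live in config.yaml). Halving the patch size on the LR branch keeps the token grid identical:

```python
import torch
import torch.nn as nn

# Hypothetical patch embeddings: the LR branch uses half the patch size,
# so both branches produce the same spatial grid of deep features.
hr_embed = nn.Conv2d(3, 384, kernel_size=4, stride=4)  # teacher: 32x128 input
lr_embed = nn.Conv2d(3, 384, kernel_size=2, stride=2)  # student: 16x64 input

hr_feat = hr_embed(torch.randn(1, 3, 32, 128))  # -> (1, 384, 8, 32)
lr_feat = lr_embed(torch.randn(1, 3, 16, 64))   # -> (1, 384, 8, 32)
assert hr_feat.shape == lr_feat.shape  # aligned features enable distillation
```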

Please note that the pre-trained HR teacher model is still needed for both training and testing; you can download it from the corresponding official GitHub repository, i.e., ABINet, MATRN, or PARSeq.

Datasets

In this work, we use the STISR dataset TextZoom and five STR benchmarks, i.e., ICDAR2013, ICDAR2015, CUTE80, SVT, and SVTP, for model comparison. All datasets are in LMDB format and can be downloaded from the table below.

| Datasets | TextZoom | IC13 | IC15 | CUTE80 | SVT | SVTP |
| --- | --- | --- | --- | --- | --- | --- |
| Download Link | link | link | link | link | link | link |
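Common STR toolkits (including those behind ABINet, MATRN, and PARSeq) store LMDB samples under the keys num-samples, image-%09d, and label-%09d with 1-based indices; the sketch below assumes that scheme (note that TextZoom stores paired HR/LR images under separate keys, so adapt accordingly; the path is a placeholder):

```python
import io
import lmdb
from PIL import Image

# Minimal LMDB reader under the assumed key scheme; verify the keys
# against the actual dataset before relying on this.
env = lmdb.open("path/to/lmdb_dataset", readonly=True, lock=False)
with env.begin() as txn:
    n = int(txn.get(b"num-samples"))
    for i in range(1, n + 1):
        label = txn.get(f"label-{i:09d}".encode()).decode()
        img_bytes = txn.get(f"image-{i:09d}".encode())
        img = Image.open(io.BytesIO(img_bytes)).convert("RGB")
        # ... feed (img, label) to the recognizer
```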

How to Run?

We have set reasonable default hyper-parameters in config.yaml and main.py, so you can run training and testing directly after modifying the paths to the datasets and the pre-trained models.

Training

python main.py

Testing

python main.py --go_test
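The performance numbers above and in the results below are recognition accuracy; the usual STR evaluation protocol, which we assume applies here, counts a prediction as correct when the case-insensitive alphanumeric strings match:

```python
import re

def word_accuracy(preds, gts):
    # Standard STR word accuracy: lowercase and strip non-alphanumerics
    # before comparing (an assumption about the exact protocol used here).
    norm = lambda s: re.sub(r"[^0-9a-z]", "", s.lower())
    correct = sum(norm(p) == norm(g) for p, g in zip(preds, gts))
    return 100.0 * correct / len(gts)

print(word_accuracy(["Hello!", "w0rld"], ["hello", "world"]))  # 50.0
```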

Main Results

Quantitative Comparison

[Figure: quantitative comparison]

Qualitative Comparison

Robustness Comparison

Citation

If you find our work helpful, please consider citing us:

@inproceedings{guo2023one,
  title={One-stage Low-resolution Text Recognition with High-resolution Knowledge Transfer},
  author={Guo, Hang and Dai, Tao and Zhu, Mingyan and Meng, Guanghao and Chen, Bin and Wang, Zhi and Xia, Shu-Tao},
  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
  pages={2189--2198},
  year={2023}
}
