GitHub - thiagopx/deeprec-pr20

Self-supervised deep reconstruction of mixed strip-shredded text documents

This repository contains the datasets and source code used in our Pattern Recognition (2020) paper.

Preparing the environment with virtualenv

Main dependencies:

python==3.6
tensorflow-gpu==1.11.0
scikit-image==0.14.0
scikit-learn==0.23.1
opencv-python==3.3.1.11
numba==0.51.2
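
To confirm that an existing environment matches the pinned versions above, a small check along the following lines may help. This is a minimal sketch and not part of the repository: the names `PINNED`, `installed_version`, and `find_mismatches` are hypothetical, and the lookup assumes the packages are installed under the distribution names listed above.

```python
# Pinned versions copied from the dependency list above (Python itself excluded).
PINNED = {
    "tensorflow-gpu": "1.11.0",
    "scikit-image": "0.14.0",
    "scikit-learn": "0.23.1",
    "opencv-python": "3.3.1.11",
    "numba": "0.51.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for Python 3.6 (provided by setuptools)
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in sorted(find_mismatches(PINNED).items()):
        print("{}: expected {}, found {}".format(name, wanted, found))
```

An empty result from `find_mismatches(PINNED)` means every listed package is installed at its pinned version; a `found` value of `None` means the package is not installed at all.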

For a fully-automatic setup of the virtual environment (tested on Linux Ubuntu 18.04), set the variable BASE_DIR in scripts/install.sh to a valid directory, and then run source scripts/install.sh from within the repository root directory. BASE_DIR indicates where additional directories (envs, concorde, qsopt) will be created.

You need sudo privileges to run the installation script properly. By default, the virtual environment is created at $BASE_DIR/envs/deeprec-pr20. When it finishes, the script automatically activates the newly created environment.
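
After the installation script has run, the directories it is described as creating can be checked with a short helper. This is a sketch under the assumptions stated in this section only (the envs, concorde, and qsopt directories under BASE_DIR, plus the environment at envs/deeprec-pr20); the function names `expected_layout` and `missing_paths` are hypothetical and do not come from the repository.

```python
import os


def expected_layout(base_dir, env_name="deeprec-pr20"):
    """Return the paths this README says the install script creates."""
    return {
        "envs": os.path.join(base_dir, "envs"),
        "concorde": os.path.join(base_dir, "concorde"),
        "qsopt": os.path.join(base_dir, "qsopt"),
        "virtualenv": os.path.join(base_dir, "envs", env_name),
    }


def missing_paths(base_dir):
    """List the expected directories that do not exist under base_dir."""
    return sorted(
        path for path in expected_layout(base_dir).values()
        if not os.path.isdir(path)
    )
```

Calling `missing_paths` with the value assigned to BASE_DIR returns an empty list when all four directories are present.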

Download the datasets

The datasets include (i) the integral documents from which the (small) training samples are extracted and (ii) the collections of mechanically shredded documents, S-MARQUES (D1), S-ISRI-OCR (D2), and S-CDIP (D3), used in the tests. To download them, run bash scripts/get_dataset.sh.

The script creates a datasets directory in the repository root.

Download the results

You can download the results by running bash scripts/get_results.sh.

The script creates a results directory in the repository root, with three subdirectories (one for each experiment).
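
Once both download scripts have finished, the outcome can be verified with a quick check such as the one below. It is only a sketch: `subdirectories` and `check_downloads` are hypothetical names, and the code relies solely on what this section states (a datasets directory and a results directory in the repository root). The names of the results subdirectories are not assumed; they are simply listed.

```python
import os


def subdirectories(path):
    """Return the sorted names of the immediate subdirectories of path."""
    return sorted(
        entry for entry in os.listdir(path)
        if os.path.isdir(os.path.join(path, entry))
    )


def check_downloads(repo_root="."):
    """Report whether datasets/ exists and which subdirectories results/ has."""
    datasets = os.path.join(repo_root, "datasets")
    results = os.path.join(repo_root, "results")
    return {
        "datasets_present": os.path.isdir(datasets),
        "results_subdirs": subdirectories(results) if os.path.isdir(results) else [],
    }
```

When run from the repository root after both downloads, `check_downloads()` should report `datasets_present` as `True` and, according to the description above, three entries in `results_subdirs`.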

Demo

A reconstruction demo is available by running python demo.py. By default, the script uses a pretrained model available in the traindata directory. Here is an example of output of the demo script:

For details of the parameters, you may run python demo.py --help.

