data_augementation

This project relies on Python 3.6.5 and uses PIL, skimage, scipy, and hashlib. PIL, skimage, and scipy need to be installed; hashlib is part of the Python standard library.

This repository shows how to perform data augmentation for seq2seq models. When an OCR model processes digit sequences, an important issue is that the digit sequences in the training data are often not uniformly distributed, which causes the OCR model to fail to recognize some digit sequences. For example, suppose the training data contain ten-digit sequences such as 0000012345, whose first five digits are always 00000. When a sequence such as 1234500000 appears at prediction time, the trained model cannot predict it correctly. In this case, the best practice is to distribute the data evenly. The main functions are as follows:

a. Independent and identically distributed (i.i.d.) augmentation of ten-digit sequence data.

b. Independent and identically distributed (i.i.d.) augmentation of date sequence data.

c. The image length follows a normal distribution. For example, for an image resolution of 250*50 pixels, the lengths of the augmented images are drawn from a normal distribution with a mean of 250 and a variance of 10.

d. To improve the robustness of the OCR model, Gaussian noise and salt-and-pepper noise are added during data augmentation.

e. Randomly rotate 50% of the generated data to improve the generalization ability of the model.
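The steps in items a to e can be sketched roughly as follows. This is a minimal illustration using NumPy only, not the code in `data_augementation.py`: all function names, the default `sigma` and `amount` values, and the use of a blank gray image are assumptions made for the example.

```python
import datetime
import numpy as np

rng = np.random.default_rng(0)  # seeded so the sketch is reproducible

def random_code(length=10):
    """Item a: draw each digit independently and uniformly from 0-9."""
    return "".join(str(d) for d in rng.integers(0, 10, size=length))

def random_date(start_year=2018, end_year=2019):
    """Item b: draw a date uniformly between Jan 1 of start_year and Dec 31 of end_year."""
    start = datetime.date(start_year, 1, 1)
    end = datetime.date(end_year, 12, 31)
    offset = int(rng.integers(0, (end - start).days + 1))
    return start + datetime.timedelta(days=offset)

def random_width(mean=250, variance=10):
    """Item c: draw an image length from a normal distribution (std = sqrt(variance))."""
    return max(1, int(round(rng.normal(mean, np.sqrt(variance)))))

def add_gaussian_noise(img, sigma=10.0):
    """Item d: add zero-mean Gaussian noise to a uint8 grayscale image."""
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper_noise(img, amount=0.02):
    """Item d: set roughly `amount` of the pixels to black (pepper) or white (salt)."""
    out = img.copy()
    u = rng.random(img.shape)
    out[u < amount / 2] = 0          # pepper
    out[u > 1 - amount / 2] = 255    # salt
    return out

def should_rotate(p=0.5):
    """Item e: decide with probability p whether a sample gets rotated."""
    return rng.random() < p

# Example: one synthetic sample (a blank gray canvas stands in for rendered text).
label = random_code()
canvas = np.full((50, random_width()), 128, dtype=np.uint8)
sample = add_salt_pepper_noise(add_gaussian_noise(canvas))
rotate_this_one = should_rotate()
```

The rotation itself is left out to keep the sketch dependency-free; with PIL, an image selected by `should_rotate()` could be rotated with `Image.rotate(angle)`. skimage also offers `skimage.util.random_noise` as an alternative to the hand-written noise functions.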

The script accepts the following options:

--code_images  if not None, generate code data images

--date_images  if not None, generate date data images

--image_years  end year for the date data images (dates range from 2018 to image_years)

--image_nums   number of code data images to generate

--noise_sigma  

--use_rotate 

--normal_mean

Usage:

Generate code data images:

python data_augementation.py --code_images --image_nums 100

Generate date data images:

python data_augementation.py --date_images --image_years 2019
