amirabbasasadi/Shotor

Shotor

Word Level OCR Dataset for Persian Language

Shotor (meaning "camel" in Persian) is a free synthetic dataset for word-level OCR.

Sample Images

The current version contains 120,000 grayscale 50*100 images, each paired with its corresponding word. The words contain only alphabetic characters.
Note: To train a robust model, apply augmentations such as scaling, translation, and additive noise to the images.
For an example of using the Shotor dataset, see this notebook:
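
The augmentations mentioned in the note can be sketched with NumPy alone. This is only an illustrative sketch, not the pipeline used in the notebook: the function name `augment`, its parameters and default values, the white background fill, and the assumption that each image is a `uint8` array of shape (50, 100) are all hypothetical choices made for the example.

```python
import numpy as np

def augment(image, rng, max_shift=3, scale_range=(0.9, 1.1), noise_std=8.0, fill=255):
    """Randomly scale, translate, and add Gaussian noise to one grayscale uint8 image."""
    h, w = image.shape
    scale = rng.uniform(*scale_range)                          # random zoom factor
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)   # random shift in pixels

    # Inverse mapping: for every output pixel, look up the nearest source pixel
    # after undoing the shift and the scaling about the image center.
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    src_y = np.rint((ys - cy - dy) / scale + cy).astype(int)
    src_x = np.rint((xs - cx - dx) / scale + cx).astype(int)

    # Pixels that map outside the source image keep the background fill value
    # (assumed here to be white; adjust to match the actual data).
    inside = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.full((h, w), fill, dtype=np.float32)
    out[inside] = image[src_y[inside], src_x[inside]]

    # Additive Gaussian noise, clipped back to the valid 8-bit range.
    out += rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0, 255).astype(np.uint8)

# Usage on a dummy image (a dark rectangle on a white background).
rng = np.random.default_rng(0)
dummy = np.full((50, 100), 255, dtype=np.uint8)
dummy[20:30, 30:70] = 0
augmented = augment(dummy, rng)
```

Applying a fresh random augmentation each time an image is drawn during training, rather than once up front, gives the model a different variant of every word in every epoch.
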
A simple word level OCR for Persian Language using Pytorch and OpenCV

I used these resources to create the word lists:

The images have been generated using multiple fonts:

Created by: Amirabbas Asadi ([email protected])
