Online Ocr

This is an open source online OCR service built by the Django framework.

1. Requirements

1.1. Run directly

python 3.8
tesseract 4.1.x

1.1.1. install Tesseract

https://github.com/tesseract-ocr/tessdoc/blob/master/Home.md

macOS

A macOS wrapper for the Tesseract API is also available at Tesseract macOS.

MacPorts

To install Tesseract run this command:

sudo port install tesseract

To install any language data, run:

sudo port install tesseract-<langcode>

List of available langcodes can be found on MacPorts tesseract page.

Homebrew

To install Tesseract run this command:

brew install tesseract

The tesseract directory can then be found using brew info tesseract, e.g. /usr/local/Cellar/tesseract/3.05.02/share/tessdata/.

Linux

Tesseract is available directly from many Linux distributions. The package is generally called 'tesseract' or 'tesseract-ocr' - search your distribution's repositories to find it. Thus you can install Tesseract 4.x and its developer tools on Ubuntu 18.x bionic by simply running:

sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

1.2. Run by Docker

Docker Engine

2. What's included

Within the download, you'll find the following directories and files. You'll see something like this:

├────.gitattributes
├────.gitignore
├────app/
│    ├────__init__.py
│    ├────admin.py
│    ├────apps.py
│    ├────migrations/
│    │    ├────0001_initial.py
│    │    ├────0002_auto_20200505_0909.py
│    │    └────__init__.py
│    ├────models.py
│    ├────ocr.py
│    ├────serializers.py
│    ├────urls_api.py
│    └────views.py
├────db/
│    └────__init__.py
├────dockerfile_base
├────dockerfile_product
├────manage.py
├────media/
│    └────test/
├────nginx.conf
├────onlineocr/
│    ├────__init__.py
│    ├────asgi.py
│    ├────settings.py
│    ├────urls.py
│    └────wsgi.py
├────README.md
├────requirements.txt
├────sources.list
├────start.sh
├────static/
├────templates/
├────uwsgi/
├────uwsgi.ini
└────uwsgi_params

3. Run directly

Before run it you need first install Tesseract.

3.1. Clone the project

git clone https://github.com/ginguocun/onlineocr.git
cd onlineocr

3.2. Create the virtual environment

python3 -m venv venv
source venv/bin/activate

3.3. Install the reqirements

pip install -r requirements.txt

3.4. Migrate the database

python3 manage.py migrate

3.5. Runserver

Start development server at https://127.0.0.1:9999/

python3 manage.py runserver 127.0.0.1:9999

4. Run through docker

4.1. Build the basic docker image

git clone https://github.com/ginguocun/onlineocr.git
cd onlineocr
sudo docker build --rm -t onlineocr:base -f dockerfile_base .

4.2. Build the product docker image

sudo docker build --rm -t onlineocr:latest -f dockerfile_product .

4.3. Runserver by docker run

Start development server at https://127.0.0.1:9999/

sudo docker run -it -p 9999:80 onlineocr:latest /bin/bash

5. API documents

The online API documents is at https://127.0.0.1:9999/docs/

The main APIs are listed below:

/api/ocr/
/api/history/
/api/register/
/api/token_obtain_pair/
/api/token_refresh/

5.1. /api/ocr/

POST /api/ocr/

This is an API which can take an uploaded image(jpg, png) and find any letters in it.

Request Body

The request body should be a "application/json" encoded object, containing the following items.

Parameter	Description
`image` required	A image file, the size should be less than 2Mb

5.2. /api/history/

GET /api/history/

This is an API used to obtain the historical upload records.

Query Parameters

The following parameters can be included as part of a URL query string.

Parameter	Description
`page`	A page number within the paginated result set.
`search`	A search term.
`ordering`	Which field to use when ordering the results.

5.3. /api/register/

POST /api/register/

This is an API for user registration.

Request Body

The request body should be a "application/json" encoded object, containing the following items.

Parameter	Description
`username` required	Required. 150 characters or fewer. Letters, digits and @/./+/-/_ only.
`email`
`password` required

5.4. /api/token_obtain_pair/

POST /api/token_obtain_pair/

Takes a set of user credentials and returns an access and refresh JSON web token pair to prove the authentication of those credentials.

Request Body

The request body should be a "application/json" encoded object, containing the following items.

Parameter	Description
`username` required
`password` required

5.5. /api/token_refresh/

POST /api/token_refresh/

Takes a refresh type JSON web token and returns an access type JSON web token if the refresh token is valid.

Request Body

The request body should be a "application/json" encoded object, containing the following items.

Parameter	Description
`refresh` required

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Online Ocr

1. Requirements

1.1. Run directly

1.1.1. install Tesseract

macOS

Linux

1.2. Run by Docker

2. What's included

3. Run directly

3.1. Clone the project

3.2. Create the virtual environment

3.3. Install the reqirements

3.4. Migrate the database

3.5. Runserver

4. Run through docker

4.1. Build the basic docker image

4.2. Build the product docker image

4.3. Runserver by docker run

5. API documents

5.1. /api/ocr/

Request Body

5.2. /api/history/

Query Parameters

5.3. /api/register/

Request Body

5.4. /api/token_obtain_pair/

Request Body

5.5. /api/token_refresh/

Request Body

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
app		app
db		db
media/test		media/test
onlineocr		onlineocr
static		static
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
dockerfile_base		dockerfile_base
dockerfile_product		dockerfile_product
manage.py		manage.py
nginx.conf		nginx.conf
requirements.txt		requirements.txt
sources.list		sources.list
start.sh		start.sh
uwsgi.ini		uwsgi.ini
uwsgi_params		uwsgi_params

ginguocun/onlineocr

Folders and files

Latest commit

History

Repository files navigation

Online Ocr

1. Requirements

1.1. Run directly

1.1.1. install Tesseract

macOS

Linux

1.2. Run by Docker

2. What's included

3. Run directly

3.1. Clone the project

3.2. Create the virtual environment

3.3. Install the reqirements

3.4. Migrate the database

3.5. Runserver

4. Run through docker

4.1. Build the basic docker image

4.2. Build the product docker image

4.3. Runserver by docker run

5. API documents

5.1. /api/ocr/

Request Body

5.2. /api/history/

Query Parameters

5.3. /api/register/

Request Body

5.4. /api/token_obtain_pair/

Request Body

5.5. /api/token_refresh/

Request Body

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages