Welcome to ImageToLatex 👋

A neural network capable of translating handwriting into LaTeX. The project also provides A-Z tools for generating raw LaTeX, producing images, and transforming those images as if they were written by a human. Project scheme:

Raw ==> Set ==> Visual ==> Model
where Raw, Set, and Visual are tools and Model is the neural network that recognizes LaTeX.

Model [Python]

The model is based on the paper:

Image to LaTeX via Neural Networks
Avinash More
San Jose State University

General Idea

ITL detects each character separately and merges the detections into one sequence.

Details

Let s be the number of supported characters. ITL uses s clones of the same architecture; the j-th neural network recognizes the j-th character using one-hot encoding. The project currently supports the following characters: +, -, ^, {, }, \cdot, a, x, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0
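
One reading of this per-character scheme, as a minimal sketch in Python/Keras (the layer sizes and the binary-output head are assumptions, not the repository's actual architecture):

from tensorflow.keras import layers, models

# Illustrative per-character classifier; one clone is built for each
# of the s supported characters and answers "is character j present?".
def build_character_classifier():
    model = models.Sequential([
        layers.Input(shape=(64, 64, 1)),   # 64x64 inputs, as in the dataset below
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

s = 18  # number of supported characters listed above
classifiers = [build_character_classifier() for _ in range(s)]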

Dataset

Input shape = (64, 64).
Input features are named in the format eq_n_b_c, where:

  • n is the label number
  • b is the background number
  • c is the number of the effect pack applied to a given feature; for given n and b, all features eq_n_b_{0...number of effects} represent the same label
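
For illustration, a minimal sketch of parsing this naming scheme in Python (the helper name and the .png extension are assumptions):

def parse_feature_name(filename):
    # Return (label n, background b, effect pack c) from e.g. 'eq_3_1_0.png'.
    stem = filename.rsplit(".", 1)[0]   # drop the extension
    tag, n, b, c = stem.split("_")
    assert tag == "eq", "unexpected prefix: " + filename
    return int(n), int(b), int(c)

print(parse_feature_name("eq_3_1_0.png"))   # -> (3, 1, 0)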

A link to the dataset used during training will be available soon on mvxxx.github.io as exp.tar.gz.
The whole dataset was generated entirely with tool/{raw, set, visual} chained as a pipeline.

Accuracy

For a training dataset of length 7, the mean accuracy was ~0.70-0.95 (depending on the difficulty of the dataset) after 10 epochs of training.

Tool/Raw [OCaml]

A functional tool written in OCaml: random LaTeX expression generators with various syntactic levels, plus concepts describing the exact behavior within each level. Together they form a set of generators capable of supplying the model with properly generated random LaTeX expressions that match strict expectations, for training purposes.
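
The idea, as a hypothetical sketch in Python (the real tool is written in OCaml and is far more configurable; the grammar below is an assumption restricted to the supported characters):

import random

ATOMS = list("ax0123456789")
BINOPS = ["+", "-", r"\cdot"]

# Recursively build a random raw LaTeX expression of bounded depth.
def gen_expr(depth=2):
    if depth == 0:
        base = random.choice(ATOMS)
        if random.random() < 0.3:              # occasionally attach an exponent
            return base + "^{" + random.choice(ATOMS) + "}"
        return base
    left, right = gen_expr(depth - 1), gen_expr(depth - 1)
    return left + " " + random.choice(BINOPS) + " " + right

print(gen_expr())   # e.g. '3 + a^{2} \cdot x'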

Performance:

Type       Expressions/sec
Complex    ~1,600,000
Standard   ~2,000,000
Basic      ~4,000,000

Tool/Set [Asymptote, Bash]

A script tool that takes several input files with raw LaTeX and converts them into basic .png images of the expressions. It executes a worker for each input file (a kind of thread pooling). Usage from bash: bash set.sh *.in.
It produces all content inside temporary folders, then moves all images to the output folder. These images are the input for the visual part. All input *.in labels are concatenated and stacked into the labels file.
It takes raw text like 3-\frac{100}{88} and returns the rendered image.
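
The per-file worker pattern, as a minimal Python sketch (set.sh is the real entry point; render_one.sh below is a hypothetical placeholder for the LaTeX-to-PNG step):

import glob
import subprocess
from concurrent.futures import ProcessPoolExecutor

# Render one *.in file of raw LaTeX to .png via a placeholder script.
def render(infile):
    subprocess.run(["bash", "render_one.sh", infile], check=True)

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:   # one worker per input file
        list(pool.map(render, glob.glob("*.in")))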

Tool/Visual [C++]

The biggest tool, written in C++ and using OpenCV, capable of creating millions of math equations that look handwritten. It applies many different effects to make the math appear as if it was written by people. It can be configured via config.hpp. The final result is the basis of the dataset for machine learning. It takes a raw .png like the one above and returns a transformed version (in the example, only rotation was applied).

The following effects are predefined:
Type                            Brief
left rotate                     rotates images by an angle in (-C, 0)
right rotate                    rotates images by an angle in (0, C)
symmetric scaling upward        scales both x and y upward
symmetric scaling downward      scales both x and y downward
non-symmetric scaling upward    scales x and y upward independently
non-symmetric scaling downward  scales x and y downward independently

There are also effects applied outside the effect manager:

Type        Brief
position    changes the position of the sprite on the background
background  changes the background
perlin      applies a Perlin noise mask (in progress)
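
For illustration, a minimal sketch of the left-rotate effect using OpenCV's Python bindings (the real tool is C++; the constant C and the function name here are assumptions):

import random
import cv2

C = 15.0  # hypothetical maximum rotation angle (degrees), set in config.hpp

# Rotate by a random angle in (-C, 0), as in the 'left rotate' effect.
def left_rotate(img):
    angle = random.uniform(-C, 0.0)
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(img, m, (w, h))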

Each effect is either applied or not. For each image, we apply all possible combinations of effects. Say we have effects e1, e2, e3 and image p. Then the output will be:
p ---(!e1,!e2,!e3)---> p0
p ---(!e1,!e2,e3)----> p1
p ---(!e1,e2,!e3)----> p2
p ---(e1,!e2,!e3)----> p3
p ---(e1,e2,!e3)-----> p4
p ---(!e1,e2,e3)-----> p5
p ---(e1,!e2,e3)-----> p6
p ---(e1,e2,e3)------> p7

!e means that e is not applied. So for each image, the output is 2^k modified images, where k is the number of effects.
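
This power-set expansion, as a minimal Python sketch (the effect names and signatures are assumptions):

from itertools import combinations

# Hypothetical effects: each maps an image to a modified image.
effects = {"e1": lambda p: p, "e2": lambda p: p, "e3": lambda p: p}

# Yield one variant of image p per subset of effects: 2^k in total.
def all_variants(p):
    names = sorted(effects)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            q = p
            for name in subset:
                q = effects[name](q)
            yield subset, q

for subset, variant in all_variants("p"):
    print(subset)   # (), ('e1',), ..., ('e1', 'e2', 'e3') -> 8 variants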
Example of use:

$ /usr/bin/time -f %e ./visual -la ../data/ .png
[INFO] Initializing module: PN3itl5StateE
[INFO] Program uses multithreading. Threads number: 12
[INFO] Initializing module: PN3itl9TransformE
[INFO] Initializing module: PN3itl12ImageManagerE
[INFO] Initializing module: PN3itl13EffectManagerE
[INFO] Initializing module: PN3itl11PerlinNoiseE
[INFO] Work finished successfully
3.82

for 96,000 produced 64x64 images. The speed is about 25,000 images/s on an AMD Ryzen 5 2600. As you can see, multiple flags are available:

constexpr char const* testing = "-t";
constexpr char const* printing_steps = "-p";
constexpr char const* log_errors = "-le";
constexpr char const* log_info = "-li";
constexpr char const* log_suggestions = "-ls";
constexpr char const* log_warnings = "-lw";
constexpr char const* log_all = "-la";
constexpr char const* log_time = "-lt";

If you don't want any logging, just run the program without flags.