A work-in-progress project that uses Tesseract-OCR to pull text from images, then lets you manipulate and export it.
HOW TO USE: Run imgtext.py from the src folder, passing the path to your image as a command-line argument. Currently works on Linux only.
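For readers curious what a call like this involves, here is a minimal sketch of reading an image path from the command line and handing it to the Tesseract command-line tool. It is not the code in imgtext.py. The function names (build_command, ocr_image, main) are hypothetical, and it assumes the `tesseract` binary is on your PATH and that your version accepts `stdout` as the output base.

```python
import argparse
import subprocess


def build_command(image_path, tesseract_bin="tesseract"):
    """Build the argument list for one Tesseract run (hypothetical helper)."""
    # Passing "stdout" as the output base asks Tesseract to print the
    # recognized text instead of writing it to a .txt file.
    return [tesseract_bin, image_path, "stdout"]


def ocr_image(image_path, runner=subprocess.run):
    """Run Tesseract on one image and return the recognized text."""
    # check=True raises CalledProcessError if Tesseract exits with an error.
    result = runner(
        build_command(image_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def main(argv):
    """Parse the image path from argv and print the OCR result."""
    parser = argparse.ArgumentParser(description="Extract text from an image.")
    parser.add_argument("image", help="path to the image to OCR")
    args = parser.parse_args(argv)
    print(ocr_image(args.image))
    return 0

# In a standalone script you would finish with:
#     import sys; sys.exit(main(sys.argv[1:]))
```

The `runner` parameter exists only so the Tesseract call can be swapped out for a stub when Tesseract is not installed; in normal use you would leave it at its default.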
Dependencies:
- Tesseract: https://code.google.com/p/tesseract-ocr/
- PyPDF2: https://github.com/mstamy2/PyPDF2
Planned Updates:
- Ability to OCR more than one image at a time
- PDF export
- Output report formatting
- GUI