- Install Tesseract
- Install the Python dependencies listed in requirements.txt:
$ pip install -r requirements.txt
Warning: this is a large zip file (~4GB).
Remember to set your Tesseract path in main.py (line 9).
For example:
pytesseract.pytesseract.tesseract_cmd = '/usr/local/Cellar/tesseract/5.1.0/bin/tesseract'
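If you would rather not hard-code the path, one option is to look the binary up on your PATH first and only fall back to a fixed location. The following is a minimal sketch, not the code in main.py; `resolve_tesseract_cmd` is a hypothetical helper name, and the fallback is simply the example path shown above.

```python
import shutil


def resolve_tesseract_cmd(fallback: str) -> str:
    """Return the tesseract binary found on PATH, or `fallback` if none is found."""
    found = shutil.which("tesseract")
    return found if found is not None else fallback


if __name__ == "__main__":
    # Hypothetical usage: the fallback is the example path from this README.
    cmd = resolve_tesseract_cmd("/usr/local/Cellar/tesseract/5.1.0/bin/tesseract")
    print(cmd)
    # In main.py the result would then be assigned as usual:
    # pytesseract.pytesseract.tesseract_cmd = cmd
```

On macOS with Homebrew or on most Linux distributions the binary is usually on PATH already, so the fallback only matters for non-standard installs.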
Extracted text is written to <filename>.txt.
For example, img2.png will output to img2.txt.
Example execution:
$ python main.py --file img2.png
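For readers who want to see how the `--file` flag and the output naming could fit together, here is a rough sketch. It is an assumption about the general shape of such a script, not a copy of main.py: the function names are hypothetical, and writing the .txt file next to the input image is a guess about where the output lands.

```python
import argparse
from pathlib import Path


def output_path_for(image_path: str) -> Path:
    """Map an image path to its text output path, e.g. img2.png -> img2.txt."""
    return Path(image_path).with_suffix(".txt")


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the single --file argument used in the example execution."""
    parser = argparse.ArgumentParser(description="Extract text from an image.")
    parser.add_argument("--file", required=True, help="path to the input image")
    return parser.parse_args(argv)


def extract_text(image_path: str) -> str:
    """Run Tesseract OCR on one image and return the recognized text."""
    # Imported lazily so the helpers above work without the OCR stack installed.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path))


def main(argv=None) -> None:
    """Read --file, run OCR on it, and write the text beside the image."""
    args = parse_args(argv)
    out = output_path_for(args.file)
    out.write_text(extract_text(args.file), encoding="utf-8")
```

Calling `main(["--file", "img2.png"])` would, under these assumptions, write the recognized text to img2.txt.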
To process a whole collection, make sure the One_Piece folder is in the root directory of this project, then run the script below. It writes one .txt file per manga panel into each folder.
$ bash automate_extraction.sh
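The contents of automate_extraction.sh are not shown here, so the following is only a sketch of the same idea in Python: walk the One_Piece folder, find every panel image, and pair it with the .txt file that would sit beside it. The helper names and the list of image suffixes are assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Tuple

# Assumed panel formats; adjust to whatever the One_Piece folder actually holds.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_panels(root: str) -> List[Path]:
    """Return every image file under `root`, sorted for a stable processing order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def planned_outputs(root: str) -> List[Tuple[Path, Path]]:
    """Pair each panel with the .txt file that would be written beside it."""
    return [(p, p.with_suffix(".txt")) for p in find_panels(root)]


if __name__ == "__main__":
    # Dry run: print what would be processed without invoking OCR.
    for panel, text_file in planned_outputs("One_Piece"):
        print(f"{panel} -> {text_file}")
```

Running it as a dry run first is a cheap way to confirm the folder layout before starting a long OCR job.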