This repository contains a Python-based Text Detection App that uses Tesseract OCR and OpenCV to extract and process text from images. It handles simple text extraction as well as advanced image preprocessing for better accuracy. A Flask web app is also included, offering an interface for uploading images and viewing the extracted text.

himankgupta1/Text-Detection-Recognition

Text Detection App

Overview

This Text Detection App uses the Tesseract OCR engine (through the pytesseract wrapper) together with the Python libraries OpenCV and Pillow to extract and process text from images. The app handles simple text extraction as well as image preprocessing steps that improve text detection accuracy. A Flask web application is also included to provide a user-friendly interface to the text detection functionality.

Features

  • Extract text from an image using Tesseract OCR.
  • Remove irrelevant symbols from extracted text.
  • Perform various image preprocessing operations using OpenCV:
    • Grayscale conversion
    • Noise removal
    • Thresholding
    • Erosion
    • Morphology operations
    • Canny edge detection
    • Skew correction
    • Template matching
  • Draw rectangles around detected text.
  • Highlight specific words or patterns in the image.

Requirements

  • Python 3.x
  • Requests
  • Pillow
  • pytesseract
  • OpenCV
  • numpy
  • re (part of the Python standard library; no installation needed)
  • Flask

Installation

  1. Install the Tesseract OCR engine and ensure that the tesseract executable is on your system path.
  2. Install the required Python packages:
    pip install requests
    pip install pillow
    pip install pytesseract
    pip install opencv-python
    pip install numpy
    pip install flask
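
One way to confirm that step 1 worked is to check from Python whether the Tesseract executable can be found on the system path. The snippet below is a small sketch using only the standard library; the helper name `find_tesseract` is hypothetical, and it assumes the executable is named `tesseract`.

```python
import shutil
from typing import Optional


def find_tesseract() -> Optional[str]:
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


path = find_tesseract()
if path is None:
    print("tesseract was not found on PATH; check the installation")
else:
    print(f"tesseract found at {path}")
```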

Usage

  1. Ensure Tesseract is correctly installed and its path is configured in the script.
  2. Navigate to the folder containing the app:
    cd path/to/your/app
  3. Run the Flask application:
    python app.py
  4. Open a web browser and go to http://127.0.0.1:5000/ to access the application (Flask's development server uses plain HTTP by default).
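
Step 1 mentions configuring the Tesseract path in the script. A minimal sketch of what that might look like with pytesseract is shown below, combined with a simple cleanup of the extracted text. The function names `extract_text` and `clean_text`, the set of characters kept, and the image file name are assumptions for illustration; the repository's own script may differ.

```python
import re
from typing import Optional


def clean_text(raw: str) -> str:
    """Drop symbols outside a small allowed set and collapse repeated spaces."""
    kept = re.sub(r"[^A-Za-z0-9\s.,:;!?()-]", "", raw)
    return re.sub(r"[ \t]+", " ", kept).strip()


def extract_text(image_path: str, tesseract_cmd: Optional[str] = None) -> str:
    """Run Tesseract on an image file and return the cleaned text."""
    # Imported here so clean_text stays usable without the OCR stack installed.
    import pytesseract
    from PIL import Image

    if tesseract_cmd:
        # Needed only when the tesseract executable is not on the system path.
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    with Image.open(image_path) as img:
        return clean_text(pytesseract.image_to_string(img))


print(clean_text("Hello ~~ W@orld|  2024!"))
```

With Tesseract installed, a call such as `extract_text("sample.png")` (a hypothetical file name) would return the recognized text with stray symbols removed.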
