This package aims to classify documents, extract text from a stream of documents, and divide the documents into groups based on their contents, for Indian financial institutions.

PrachetShah/Document-Classifier-and-Text-Extracter

Project Description

Xtracter

This package aims to classify documents, extract text from a stream of documents, and divide the documents into groups based on their contents, for Indian financial institutions.

🚀 Getting Started

To start using this package, clone it from GitHub:

git clone https://github.com/PrachetShah/Document-Classifier-and-Text-Extracter.git

Then, in the project directory, install the dependencies:

pip install -r requirements.txt

👩‍💻 Usage Guide

In Development

Requirements

To activate the virtual environment (Windows): venv\Scripts\activate

Required Milestones:

  1. Create a Library for Data Classification and Extraction
  • Given a single submitted file (image, PDF, or Word document) that contains many documents, the documents must be identified, classified, and divided into multiple groups. To achieve this:
    1. Create a class named Extract
    2. Add relevant docstrings
    3. Implement the functions in the class: identify (takes the input and calls the OCR function), save, and the OCR function (applies conditions for the different document types and splits based on the result)
  2. Once a document is classified and split, create a library that accepts the split document and extracts the data from it.

Non-Tech Issues

  1. Project README
  2. Docstrings for functions
  3. Identify which additional functions could be added to improve efficiency and workflow
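
The two Required Milestones could be approached along the lines of the sketch below. It is only an illustration under stated assumptions, not the project's implementation (usage is still in development). The method names identify and save come from the milestone list. Everything else is hypothetical: the split method, the extract_fields helper, the keyword table, the document labels, and the choice of a PAN number as the example field. The OCR step is passed in as a plain callable, so no particular OCR library is assumed.

```python
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword table: document label -> phrases that suggest it.
# These labels and phrases are illustrative assumptions only.
KEYWORDS: Dict[str, List[str]] = {
    "pan_card": ["income tax department", "permanent account number"],
    "bank_statement": ["account statement", "closing balance"],
}

# A PAN is written as five letters, four digits, then one letter.
PAN_PATTERN = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")


@dataclass
class Extract:
    """Classify the pages of a multi-document file and group them by type."""

    # Callable that turns one page into text. In a real pipeline this would
    # wrap an OCR engine; here it is injected so the sketch stays runnable.
    ocr: Callable[[object], str]
    groups: Dict[str, List[str]] = field(default_factory=dict)

    def identify(self, page) -> Tuple[str, str]:
        """Run OCR on one page and return (label, text)."""
        text = self.ocr(page)
        lowered = text.lower()
        for label, phrases in KEYWORDS.items():
            if any(phrase in lowered for phrase in phrases):
                return label, text
        return "unknown", text

    def split(self, pages) -> Dict[str, List[str]]:
        """Group the OCR text of every page under its label."""
        self.groups = {}
        for page in pages:
            label, text = self.identify(page)
            self.groups.setdefault(label, []).append(text)
        return self.groups

    def save(self, out_dir: str) -> List[Path]:
        """Write each group to <out_dir>/<label>.txt and return the paths."""
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)
        written = []
        for label, texts in self.groups.items():
            path = out / f"{label}.txt"
            path.write_text("\n\n".join(texts), encoding="utf-8")
            written.append(path)
        return written


def extract_fields(label: str, text: str) -> Dict[str, str]:
    """Second milestone: pull structured data out of one split document."""
    fields: Dict[str, str] = {}
    if label == "pan_card":
        match = PAN_PATTERN.search(text)
        if match:
            fields["pan"] = match.group(0)
    return fields


if __name__ == "__main__":
    # Pages are already text here, so the "OCR" callable just returns them.
    pages = [
        "INCOME TAX DEPARTMENT\nPermanent Account Number\nABCDE1234F",
        "Account Statement\nClosing Balance 1000",
    ]
    extractor = Extract(ocr=lambda page: page)
    for label, texts in extractor.split(pages).items():
        print(label, [extract_fields(label, t) for t in texts])
```

Injecting the OCR step keeps classification and extraction testable without image files; a real OCR wrapper can be swapped in later without changing the class.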
