DocuSmart is an AI-powered tool built with crewAI that extracts and summarizes content from CSV, PDF, DOCX, TXT, and JSON files. Using a large language model, it generates detailed descriptions and reports in response to user queries.
- 📄 PDF Content Extraction: Extract and summarize content from PDF files.
- 🗂️ General File Content Extraction: Handle multiple file formats (CSV, JSON) and extract relevant information.
- 📝 Text File Analysis: Provide detailed descriptions of content within plain text files.
- 📃 DOCX File Analysis: Summarize and extract information from DOCX files.
- 📊 CSV File Analysis: Perform detailed analysis and summarization of CSV data.
- 🖋️ Content Writing: Summarize extracted information into comprehensive reports or blogs.
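The per-format handling listed above can be pictured as a dispatch on file extension. The following is a minimal sketch using only the standard library, not DocuSmart's actual code: the function name `extract_text` is hypothetical, and PDF and DOCX are left unimplemented because they need a third-party parser.

```python
import csv
import json
from pathlib import Path


def extract_text(path):
    """Return a file's content as plain text, chosen by file extension.

    Hypothetical helper for illustration; not part of DocuSmart.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        # Join cells with " | " so column boundaries survive in the text.
        return "\n".join(" | ".join(row) for row in rows)
    if suffix == ".json":
        data = json.loads(path.read_text(encoding="utf-8"))
        # Re-serialize with stable key order so the text is predictable.
        return json.dumps(data, indent=2, sort_keys=True)
    if suffix in {".pdf", ".docx"}:
        # These formats need an external parser, which is omitted here.
        raise NotImplementedError(f"{suffix} needs an external parser")
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
```

Calling `extract_text("data.csv")` on a small CSV returns one line per row, which is a convenient shape to pass to a language model.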
- 🐍 Python: The core programming language for implementation.
- 🔬 TensorFlow: Used for machine learning models.
- 🌐 Gradio: For creating the web interface.
- 🤖 Google Generative AI: The language model used for text processing.
- 💡 Hugging Face Transformers: For embedding and language model support.
- Python 3.x
- Required Python libraries (install via requirements.txt)
- Clone the repository: git clone https://github.com/Surabhi-26/DocuSmart.git
- Navigate to the project directory: cd DocuSmart
- Install the required packages: pip install -r requirements.txt
- Set up your API key: export GOOGLE_API_KEY="YOUR-API-KEY"
- Run the application: python app.py
- Open your web browser and go to http://localhost:7860 to access the Smart File Assistant interface.
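A missing or placeholder API key is an easy setup mistake to make in the steps above. As a hedged sketch, a startup check along these lines could catch it early. The function `require_api_key` is hypothetical and is not taken from app.py; only the variable name `GOOGLE_API_KEY` comes from the setup instructions.

```python
import os


def require_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    # Treat the placeholder from the setup instructions as "not set".
    if not key or key == "YOUR-API-KEY":
        raise RuntimeError(f"{name} is not set; export it before running app.py")
    return key
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.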
Hugging Face Space: https://huggingface.co/spaces/SurabhiT/DocuSmart
- Upload a file: Select and upload a file (CSV, PDF, DOCX, TXT, or JSON).
- Enter your query: Type in the query you want the AI to address.
- Specify the answer format: Describe the format in which you want the answer to be provided.
- Get the result: The application processes the file and your query to generate a detailed response.
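The three inputs in the steps above (file, query, answer format) eventually have to reach the language model together. One plausible way to combine them is a single prompt string, sketched below. This is an assumption about the general approach, not DocuSmart's actual prompt: `build_prompt`, its layout, and the `max_chars` limit are all hypothetical.

```python
def build_prompt(file_text, query, answer_format, max_chars=4000):
    """Combine file content, query, and answer format into one prompt.

    Hypothetical helper for illustration; not part of DocuSmart.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    # Truncate long files so the prompt stays within a fixed size.
    excerpt = file_text[:max_chars]
    # Fall back to plain text when no answer format is given.
    fmt = answer_format.strip() or "plain text"
    return (
        "File content:\n"
        f"{excerpt}\n\n"
        f"Question: {query.strip()}\n"
        f"Answer format: {fmt}"
    )
```

For example, `build_prompt(text, "Summarize the key points.", "bullet list")` yields a prompt with the file excerpt first, followed by the question and the requested format.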
- "Summarize the key points from this PDF document."
- "Provide a detailed analysis of the data in this CSV file."
- "Extract and describe the main topics covered in this DOCX file."
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create your feature branch (git checkout -b feature/AmazingFeature).
- Commit your changes (git commit -m 'Add some AmazingFeature').
- Push to the branch (git push origin feature/AmazingFeature).
- Open a pull request.
This project is licensed under the MIT License. See the MIT License file for details.
For any inquiries or feedback, please reach out to:
- Name: Surabhi Tilekar
- Email: [email protected]
- LinkedIn: https://www.linkedin.com/in/surabhi-tilekar-87b437284/
Developed by Surabhi Tilekar and Ishan Pardeshi.