Welcome to the RAG (Retrieval-Augmented Generation) application repository! This project reads PDF documents, embeds their content, stores the embeddings in ChromaDB, and uses the Phi3 model to perform retrieval-augmented generation.
This repository contains a RAG application that reads PDF files, generates embeddings with the Alibaba-NLP/gte-large-en-v1.5 model, stores them in ChromaDB, and answers queries using context retrieved from the embedded content.
- PDF Reading: Extracts text content from PDF documents.
- Embedding Generation: Utilizes the Alibaba-NLP/gte-large-en-v1.5 model to generate embeddings for the extracted text.
- Database Storage: Stores the generated embeddings in ChromaDB.
- Retrieval-Augmented Generation: Retrieves the stored content most relevant to a query from the database and uses it as context when generating the response.
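
The flow described by the features above can be illustrated with a small, dependency-free sketch. It is not the code from this repository: `embed`, `InMemoryStore`, and `build_prompt` are hypothetical names, a bag-of-words count stands in for the Alibaba-NLP/gte-large-en-v1.5 embeddings, a plain Python list stands in for ChromaDB, and hard-coded sample strings stand in for text extracted from a PDF. The sketch stops at building the prompt; sending it to the Phi3 model is left out.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Toy stand-in for a vector database: keeps (text, vector) pairs in a list."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Counter]] = []

    def add(self, chunks: List[str]) -> None:
        # Embed each chunk once at insert time and keep it next to its text.
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def query(self, question: str, k: int = 2) -> List[str]:
        # Rank stored chunks by similarity to the question, highest first.
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt string."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Sample chunks (illustrative data, standing in for extracted PDF text).
store = InMemoryStore()
store.add([
    "The warranty covers parts for two years.",
    "Returns are accepted within 30 days of purchase.",
    "Shipping takes three to five business days.",
])

question = "How long does the warranty last?"
top_chunks = store.query(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real setup the prompt would be passed to the language model, and the quality of retrieval would depend on the embedding model rather than on word overlap as it does here.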
To get started with the RAG application, follow these steps:
- Clone the repository:

      git clone https://github.com/sankethsj/phi3-rag-application.git
      cd phi3-rag-application

- Create a virtual environment and activate it:

      python -m venv venv
      source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

- Install the required dependencies:

      pip install -r requirements.txt

- Open RAG-Workbook.ipynb and run all cells.
We welcome contributions to enhance the capabilities of this RAG application. To contribute, please follow these steps:
- Fork the repository.

- Create a new branch for your feature or bugfix:

      git checkout -b feature-name

- Make your changes and commit them with descriptive messages.

- Push your changes to your fork:

      git push origin feature-name

- Create a pull request to the main repository.
This project is licensed under the MIT License. See the LICENSE file for more details.
Thank you for using the RAG application with the Phi3 model and ChromaDB. If you encounter any issues or have any questions, please feel free to open an issue on this repository.