This project demonstrates the use of Pinecone as a vector embedding database and LangChain for orchestrating language operations with OpenAI's chat models. The goal is to build a question-answering chatbot over PDFs: ingest the documents, create a vector index, and query the data efficiently.
1. Clone the repository:

        git clone https://github.com/VirajDeshwal/mdc-RAG
        cd mdc-RAG
2. Install Anaconda:
   - Download and install Anaconda from the official Anaconda website.
3. Create and activate a new conda environment:

        conda create --name rag python=3.10
        conda activate rag
4. Install the required dependencies:

        pip install -r requirements.txt
5. Configure environment variables:
   - Create a `.env` file in the root directory.
   - Add your OpenAI and Pinecone API keys:

            OPENAI_API_KEY=your_openai_api_key
            PINECONE_API_KEY=your_pinecone_api_key
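Projects like this usually read the `.env` file at startup (the `python-dotenv` package is the common choice). As a minimal illustration of the `KEY=value` format the file uses — the parser below is a sketch for this README, not code from the repository:

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=value lines from a .env file into os.environ.

    Minimal illustration of the file format; real projects typically
    call python-dotenv's load_dotenv() instead.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and malformed lines.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()

# Demonstrate with a throwaway file using the placeholder values from above.
with open(".env.example", "w") as fh:
    fh.write("OPENAI_API_KEY=your_openai_api_key\n"
             "PINECONE_API_KEY=your_pinecone_api_key\n")
load_env(".env.example")
print(os.environ["PINECONE_API_KEY"])  # your_pinecone_api_key
```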
6. Add your PDF files:
   - Place the PDF files you want to process in the `input_src` folder.
7. Create the vector database and index the data:

        python run.py
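Conceptually, indexing splits each PDF's extracted text into overlapping chunks before embedding and upserting them to Pinecone. The repository's `run.py` internals are not shown here; the sketch below illustrates only the chunking idea in plain Python (function name and parameters are illustrative — LangChain's text splitters are the usual tool):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks suitable for embedding.

    Illustrative sketch of the preprocessing step; parameter values
    here are examples, not the project's actual settings.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap  # each chunk repeats `overlap` chars of the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "".join(str(i % 10) for i in range(1200))  # stand-in for PDF text
pieces = chunk_text(doc)
print(len(pieces))  # 3
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side.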
8. Run a query against the indexed data:

        python pinecone_query.py
- `run.py`: Script to create the vector index and store the data.
- `pinecone_query.py`: Script to query the Pinecone vector database.
- `requirements.txt`: List of dependencies required for the project.
- `input_src/`: Directory for the PDF files to be processed.
- `utils/`: Utility functions and modules used in the project.
- `.env`: Environment file storing the API keys (not included in the repository).
- Indexing Data:
  - Ensure your PDFs are in the `input_src` folder.
  - Run `python run.py` to create the vector index.
- Querying Data:
  - Run `python pinecone_query.py` to perform queries on the indexed data.
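Under the hood, a query embeds the question and asks Pinecone for the stored vectors most similar to it. As a toy illustration of that ranking step (pure Python, tiny 3-dimensional vectors standing in for real embeddings with hundreds of dimensions — not the project's actual query code):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, index, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [item_id for item_id, _ in scored[:k]]

# Toy "index" of (id, vector) pairs, as a vector database would hold.
index = [("chunk-a", [1.0, 0.0, 0.0]),
         ("chunk-b", [0.0, 1.0, 0.0]),
         ("chunk-c", [0.9, 0.1, 0.0])]
print(top_k([1.0, 0.0, 0.0], index))  # ['chunk-a', 'chunk-c']
```

The retrieved chunks are then passed to the chat model as context for answering the question.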
- Make sure to replace `your_openai_api_key` and `your_pinecone_api_key` with your actual API keys in the `.env` file.