RAG Chunking Evaluation

This repository contains code and datasets for evaluating chunking strategies in Retrieval-Augmented Generation (RAG) systems. The project includes various benchmarks, data loaders, and utility functions to facilitate the evaluation process.

Setup

Clone this repository

Create a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install dependencies:
```
pip install -r requirements.txt
```
Set up environment variables: Copy .env.example to .env and fill in the required values.
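
Before running the notebooks, it can help to confirm that every variable named in .env.example has a value in .env. The following is a minimal sketch using only the standard library. The helper names `parse_env` and `missing_env_keys` are hypothetical and not part of this repository, and the parser assumes simple `KEY=VALUE` lines.

```python
from pathlib import Path


def parse_env(path):
    """Return {key: value} for simple KEY=VALUE lines; skips blanks and comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_env_keys(example_path=".env.example", env_path=".env"):
    """List keys that appear in the example file but are absent or empty in .env."""
    expected = parse_env(example_path)
    env_file = Path(env_path)
    actual = parse_env(env_file) if env_file.exists() else {}
    return [key for key in expected if not actual.get(key)]
```

Calling `missing_env_keys()` from the repository root should return an empty list once every required value has been filled in.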

Usage

Follow the instructions in the my_benchmark.ipynb notebook to run the proposed chunking evaluation framework. The chunking strategies under evaluation are described in the chunking_strategies.ipynb notebook.

Each step in the evaluation pipeline generates intermediate results, which are saved in the data directory for later review and loading.
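
The save-and-reload pattern described above can be sketched as follows. This is an illustration only, not the repository's actual code: the helper `load_or_compute`, the JSON storage format, and the file naming scheme are all assumptions.

```python
import json
from pathlib import Path


def load_or_compute(name, compute, data_dir="data"):
    """Return the cached result for `name` if present; otherwise compute and save it."""
    path = Path(data_dir) / f"{name}.json"
    if path.exists():
        # Reuse the result saved by an earlier run of this step.
        return json.loads(path.read_text(encoding="utf-8"))
    result = compute()
    # Create the data directory on first use, then persist the result.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return result
```

With this sketch, `load_or_compute("chunks", build_chunks)` would call the hypothetical `build_chunks` function only on the first run and read `data/chunks.json` on later runs. The result must be JSON-serializable for this to work.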

The experimental directory includes tests for other benchmarks and evaluation frameworks, such as Ragas, TruLens, and MultiHop-RAG.

Leo310/rag-chunking-evaluation
