# haystack-evaluation

Using Haystack to benchmark different RAG architectures over different datasets

Evaluation over the selected datasets is used to optimise parameters that are commonly tweaked in RAG pipelines (a sketch of such a sweep follows this list):

  • top_k
  • chunk_size
  • embedding model
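
As a minimal sketch, a sweep over `chunk_size` and `top_k` could look like the snippet below. It uses components from the haystack-ai 2.x API (`InMemoryDocumentStore`, `DocumentSplitter`, the SentenceTransformers embedders, `InMemoryEmbeddingRetriever`, `DocumentMRREvaluator`); the toy documents, the embedding model and the parameter grids are illustrative assumptions, not the benchmark code in this repository.

```python
# Illustrative sweep over chunk_size and top_k with a fixed embedding model.
# The corpus, questions and ground-truth documents are toy placeholders;
# in the benchmarks they would come from one of the selected datasets.
from haystack import Document
from haystack.components.embedders import (
    SentenceTransformersDocumentEmbedder,
    SentenceTransformersTextEmbedder,
)
from haystack.components.evaluators import DocumentMRREvaluator
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"  # another parameter worth varying

docs = [
    Document(content="Haystack is an open source framework for building search and RAG applications."),
    Document(content="Retrieval-augmented generation combines a retriever with a large language model."),
]
questions = ["What is Haystack?", "What does RAG combine?"]
ground_truth_docs = [[docs[0]], [docs[1]]]


def build_store(chunk_size: int) -> InMemoryDocumentStore:
    """Split the corpus into chunks of `chunk_size` words, embed and index them."""
    store = InMemoryDocumentStore()
    chunks = DocumentSplitter(split_by="word", split_length=chunk_size).run(documents=docs)["documents"]
    doc_embedder = SentenceTransformersDocumentEmbedder(model=EMBEDDING_MODEL)
    doc_embedder.warm_up()
    store.write_documents(doc_embedder.run(documents=chunks)["documents"])
    return store


text_embedder = SentenceTransformersTextEmbedder(model=EMBEDDING_MODEL)
text_embedder.warm_up()
evaluator = DocumentMRREvaluator()

for chunk_size in (64, 128, 256):
    store = build_store(chunk_size)
    retriever = InMemoryEmbeddingRetriever(document_store=store)
    for top_k in (1, 3, 5):
        retrieved = [
            retriever.run(query_embedding=text_embedder.run(text=q)["embedding"], top_k=top_k)["documents"]
            for q in questions
        ]
        result = evaluator.run(ground_truth_documents=ground_truth_docs, retrieved_documents=retrieved)
        print(f"chunk_size={chunk_size:>3}  top_k={top_k}  MRR={result['score']:.3f}")
```

The same loop extends to the other knobs, for example swapping `EMBEDDING_MODEL`, or replacing the retrieval metric with other evaluators shipped with Haystack (e.g. semantic answer similarity or faithfulness once a generator is part of the pipeline).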

The project has three goals:

  • Give users practical guidance on which techniques to try out on their dataset and use case.
  • Show that there is no “silver bullet” solution: the best setup depends on the dataset and the use case, but Haystack can support them all.
  • Showcase Haystack’s advanced evaluation/experimentation API (the most advanced compared to competitors).

This is not a research paper, so it should not be too “academic”: it is not restricted to a fixed set of metrics or datasets, and it is not meant to be peer-reviewed or submitted to an academic conference.