Using Haystack to benchmark different RAG architectures across multiple datasets
Run evaluations on the selected datasets to optimise parameters that are commonly tuned in RAG pipelines:
- top_k
- chunk_size
- embedding model
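
The parameter sweep described above can be sketched as a plain grid search. This is a minimal, library-agnostic sketch and not Haystack's evaluation API: `SEARCH_SPACE`, `iter_configs`, `run_sweep`, and `dummy_evaluate` are hypothetical names, the parameter values and model names are placeholders, and the scoring function is a stand-in for building and evaluating a real pipeline on a dataset.

```python
from itertools import product

# Hypothetical search space; values are placeholders, not recommendations.
SEARCH_SPACE = {
    "top_k": [1, 3, 5],
    "chunk_size": [64, 128, 256],
    "embedding_model": ["model-a", "model-b"],  # placeholder model names
}


def iter_configs(space):
    """Yield one dict per combination of parameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def run_sweep(space, evaluate):
    """Score every configuration with a caller-supplied evaluate(config) -> float.

    Returns (score, config) pairs sorted from best to worst score.
    """
    results = [(evaluate(config), config) for config in iter_configs(space)]
    # Sort on the score only, because dicts cannot be compared with each other.
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


def dummy_evaluate(config):
    """Stand-in score so the sketch runs; swap in a real pipeline evaluation."""
    return config["top_k"] / config["chunk_size"]
```

Calling `run_sweep(SEARCH_SPACE, dummy_evaluate)` returns every combination ranked by score. In a real benchmark, `evaluate` would index the dataset with the given `chunk_size` and `embedding_model`, retrieve with `top_k`, and return the chosen metric, once per dataset.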
Goal 1: give users practical guidance on which techniques to try on their own dataset or use case.
Goal 2: show that there is no “silver bullet” solution, that the best setup depends on the dataset and use case, and that Haystack can support all of them.
Goal 3: showcase the advanced evaluation/experimentation API (the most advanced compared to competitors).
This is not a research paper, so it should not be too “academic” (i.e. not too restricted in the metrics or datasets it uses, and not meant to be peer-reviewed or submitted to an academic conference).