deepset-ai / haystack-evaluation Public

Using Haystack to benchmark different RAG architectures over different datasets

Repository contents:

architectures
datasets
.gitignore
LICENSE
README.md
arago_evaluation.py
mini_esg_evaluation.py
squad_evaluation.py

haystack-evaluation

Evaluate on the selected datasets to optimise parameters that are commonly tuned in RAG pipelines:

top_k
chunk_size
embedding model
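
One way to explore these three parameters together is a plain grid search. The sketch below is a minimal illustration and is not taken from this repository: the search-space values, the model names "model-a" and "model-b", and the `dummy_evaluate` function are all hypothetical placeholders. In a real run, `evaluate` would build and score a Haystack RAG pipeline for the given settings.

```python
from itertools import product

# Hypothetical search space: the values are illustrative assumptions,
# not the settings used by the scripts in this repository.
SEARCH_SPACE = {
    "top_k": [1, 3, 5],
    "chunk_size": [64, 128, 256],
    "embedding_model": ["model-a", "model-b"],
}


def parameter_grid(space):
    """Yield one dict per combination of parameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def run_grid(space, evaluate):
    """Score every combination with `evaluate` and return
    (params, score) pairs sorted from best to worst score."""
    results = [(params, evaluate(**params)) for params in parameter_grid(space)]
    return sorted(results, key=lambda item: item[1], reverse=True)


def dummy_evaluate(top_k, chunk_size, embedding_model):
    # Placeholder score so the sketch runs on its own; it has no meaning.
    # Replace with a function that runs a pipeline and returns a metric.
    return top_k / chunk_size


ranked = run_grid(SEARCH_SPACE, dummy_evaluate)
best_params, best_score = ranked[0]
print(len(ranked), best_params, best_score)
```

Because the grid grows multiplicatively (here 3 x 3 x 2 = 18 runs), it may be worth keeping each list short when every run involves indexing and querying a full dataset.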

Goal 1: give users practical guidance on which techniques to try on their own dataset or use case.

Goal 2: show that there is no "silver bullet" solution. The best setup depends on the dataset and the use case, and Haystack can support them all.

Goal 3: showcase Haystack's advanced evaluation and experimentation API (the most advanced compared to competitors).

This is not a research paper, so it should not be too "academic": it is not restricted in the metrics or datasets it uses, and it is not meant to be peer-reviewed or submitted to an academic conference.
