Add HaystackEvaluator #6904

tholor · 2024-02-05T08:54:23Z

We are adding external integrations to enable eval with uptrain, deepeval and RAGAs.
While those come with many nice out-of-the-box capabilities, we also see some limitations that motivate us to create our own Evaluator, natively in Haystack:

Dependencies: deepeval and RAGAs come with several extra dependencies. We'd like to keep environments clean and small.
Choose a different model: Most of them default to GPT-4. We want to allow users to pick different providers, especially for privacy-relevant use cases where only open-source models are acceptable.
Customization of model-based metrics: Slightly lower priority than the other two but still important. Users should be able to customize the prompt behind a metric (e.g. because slightly different wording is needed when they change the LLM, or because they want to provide different few-shot examples).
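
To make the last two points concrete, here is a rough sketch of the shape such a metric could take. It is only an illustration, not a proposed API: `LLMMetric`, `DEFAULT_FAITHFULNESS_PROMPT` and `stub_model` are hypothetical names that do not exist in Haystack, and the yes/no scoring is a placeholder. The idea is that the model is any callable from prompt to completion (so any provider or local open-source model can be plugged in) and the prompt is a plain template the user can override. It uses only the standard library, so it adds no extra dependencies.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical default prompt; not taken from Haystack or any other library.
DEFAULT_FAITHFULNESS_PROMPT = (
    "Context: {context}\n"
    "Answer: {answer}\n"
    "Is the answer supported by the context? Reply with yes or no."
)


@dataclass
class LLMMetric:
    """Hypothetical model-based metric with a swappable model and prompt."""

    # Any callable mapping a prompt string to a completion string, e.g. a
    # wrapper around a hosted API or a locally running open-source model.
    generate: Callable[[str], str]
    # Users can pass their own template (different wording, few-shot examples).
    prompt_template: str = DEFAULT_FAITHFULNESS_PROMPT

    def score(self, context: str, answer: str) -> float:
        """Return 1.0 if the model replies 'yes', otherwise 0.0."""
        prompt = self.prompt_template.format(context=context, answer=answer)
        reply = self.generate(prompt).strip().lower()
        return 1.0 if reply.startswith("yes") else 0.0

    def run(self, contexts: List[str], answers: List[str]) -> float:
        """Return the mean score over paired contexts and answers."""
        if len(contexts) != len(answers):
            raise ValueError("contexts and answers must have the same length")
        if not contexts:
            return 0.0
        scores = [self.score(c, a) for c, a in zip(contexts, answers)]
        return sum(scores) / len(scores)


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM so the sketch runs offline.

    It reads the 'Context: ' and 'Answer: ' lines of the prompt and says
    'yes' when the answer text occurs in the context.
    """
    fields = {}
    for line in prompt.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    answer = fields.get("Answer", "").lower()
    context = fields.get("Context", "").lower()
    return "yes" if answer in context else "no"


# Default prompt with the stub model.
metric = LLMMetric(generate=stub_model)

# Same metric with a user-supplied prompt, e.g. reworded for another LLM.
custom_metric = LLMMetric(
    generate=stub_model,
    prompt_template=(
        "You are a strict grader.\n"
        "Context: {context}\n"
        "Answer: {answer}\n"
        "Reply yes or no."
    ),
)
```

Swapping the provider would then mean passing a different `generate` callable, and changing the wording or few-shot examples would mean passing a different `prompt_template`, without touching the metric itself.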

masci · 2024-02-17T09:20:42Z

Closing in favor of #7022

tholor added the topic:eval, P1 (High priority, add to the next sprint) and 2.x (Related to Haystack v2.0) labels Feb 5, 2024

shadeMe assigned shadeMe, silvanocerza and julian-risch and unassigned silvanocerza Feb 5, 2024

masci added the P2 (Medium priority, add to the next sprint if no P1 available) label and removed the P1 (High priority, add to the next sprint) label Feb 12, 2024

masci closed this as completed Feb 17, 2024
