
Unify evaluation API of model-based and statistical metrics #6903

Closed · tholor opened this issue Feb 5, 2024 · 1 comment
Labels: 2.x (Related to Haystack v2.0) · P1 (High priority, add to the next sprint) · topic:eval


tholor (Member) commented Feb 5, 2024

We recently introduced two different ways to evaluate a pipeline:

  1. Statistical metrics
  2. Model-based metrics

Let's unify the interface so that the UX is consistent.
We already discussed that we will go for the new "model-based interface":

  • A single StatisticalEvaluator component (see the sketch below)
    • One metric per instance
    • Inputs determined by the chosen metric
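
To make this concrete, below is a rough sketch of what such a component could look like with the Haystack 2.x @component API. Everything specific here (the StatisticalMetric enum, the two example metrics, and the run() inputs) is an illustrative assumption, not a settled design:

```python
from enum import Enum
from statistics import mean
from typing import Any, Dict, List

from haystack import component


class StatisticalMetric(Enum):
    # Illustrative metric names only; the actual set is not decided in this issue.
    EXACT_MATCH = "exact_match"
    F1 = "f1"


@component
class StatisticalEvaluator:
    """A single evaluator component, configured with exactly one metric per instance."""

    def __init__(self, metric: StatisticalMetric):
        self.metric = metric

    @component.output_types(result=Dict[str, Any])
    def run(self, predictions: List[str], labels: List[str]):
        # Which inputs run() expects would depend on the metric; both example
        # metrics here compare predicted answers against ground-truth labels.
        if self.metric is StatisticalMetric.EXACT_MATCH:
            scores = [float(p.strip() == l.strip()) for p, l in zip(predictions, labels)]
        else:
            scores = [_token_f1(p, l) for p, l in zip(predictions, labels)]
        return {"result": {"metric": self.metric.value, "score": mean(scores)}}


def _token_f1(prediction: str, label: str) -> float:
    """Token-overlap F1 between a single prediction and its label."""
    pred_tokens, label_tokens = prediction.split(), label.split()
    common = set(pred_tokens) & set(label_tokens)
    if not pred_tokens or not label_tokens or not common:
        return 0.0
    precision = len(common) / len(pred_tokens)
    recall = len(common) / len(label_tokens)
    return 2 * precision * recall / (precision + recall)
```

Metric-dependent inputs could alternatively be declared dynamically via component.set_input_types() in __init__, which is how a signature that varies per metric would be expressed in Haystack 2.x.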
tholor added the topic:eval, P1 (High priority, add to the next sprint), and 2.x (Related to Haystack v2.0) labels on Feb 5, 2024
shadeMe (Collaborator) commented Feb 7, 2024

The DeepEvalEvaluator implementation can serve as a blueprint for the statistical evaluator - deepset-ai/haystack-core-integrations#346
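
For context, that integration configures one metric per evaluator instance and derives the run() inputs from the chosen metric. A usage sketch along those lines might look as follows; the import path, enum values, and run() signature are recalled from the linked PR and should be treated as assumptions:

```python
from haystack_integrations.components.evaluators.deepeval import (
    DeepEvalEvaluator,
    DeepEvalMetric,
)

# One metric per evaluator instance; metric_params carries metric-specific
# settings (assumed names, per the hedge above).
evaluator = DeepEvalEvaluator(
    metric=DeepEvalMetric.FAITHFULNESS,
    metric_params={"model": "gpt-4"},
)

# The expected run() inputs depend on the chosen metric; faithfulness is
# assumed to compare responses against the retrieved contexts.
results = evaluator.run(
    questions=["When was the first transistor built?"],
    contexts=[["The first transistor was built in 1947 at Bell Labs."]],
    responses=["It was built in 1947."],
)
```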
