feat: Implement `UpTrainEvaluator` #272

shadeMe · 2024-01-25T16:14:16Z

Related to #248.

We introduce UpTrainEvaluator, a component that uses the UpTrain LLM evaluation framework to calculate evaluation metrics for RAG pipelines (among others). Refer to deepset-ai/haystack#6784 for an overview of the API design.

This PR introduces the following user-facing classes:

UpTrainMetric - An enumeration that lists the supported UpTrain metrics.
UpTrainEvaluator - The pipeline component that interfaces with the evaluation framework. It accepts a single metric and its optional parameters, plus extra optional parameters to configure the API client. The inputs to the component are configured dynamically depending on the metric. This is done with the help of a metric descriptor table that contains metadata about input/output conversion formats, expected inputs/outputs, etc.
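
To illustrate the descriptor-table idea, here is a minimal, self-contained sketch. It is not the code from this PR: `SketchMetric`, `MetricDescriptor`, `DESCRIPTORS`, `validate_inputs`, and the metric and input names are all hypothetical, chosen only to show how a table keyed by metric could drive which inputs are expected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Tuple


class SketchMetric(Enum):
    # Hypothetical metric names, for illustration only.
    CONTEXT_RELEVANCE = "context_relevance"
    FACTUAL_ACCURACY = "factual_accuracy"


@dataclass(frozen=True)
class MetricDescriptor:
    # Metadata describing what a metric expects as input.
    metric: SketchMetric
    expected_inputs: Tuple[str, ...]


# The "descriptor table": one entry per metric (input names are assumptions).
DESCRIPTORS: Dict[SketchMetric, MetricDescriptor] = {
    SketchMetric.CONTEXT_RELEVANCE: MetricDescriptor(
        SketchMetric.CONTEXT_RELEVANCE, ("questions", "contexts")
    ),
    SketchMetric.FACTUAL_ACCURACY: MetricDescriptor(
        SketchMetric.FACTUAL_ACCURACY, ("questions", "contexts", "responses")
    ),
}


def validate_inputs(metric: SketchMetric, **inputs: List[Any]) -> List[Dict[str, Any]]:
    """Check the inputs against the descriptor and return one row per example."""
    expected = set(DESCRIPTORS[metric].expected_inputs)
    if set(inputs) != expected:
        raise ValueError(f"{metric.value} expects inputs {sorted(expected)}, got {sorted(inputs)}")
    if len({len(values) for values in inputs.values()}) > 1:
        raise ValueError("All inputs must have the same length")
    # Transpose column-wise inputs into one dictionary per example.
    return [dict(zip(inputs, row)) for row in zip(*inputs.values())]


rows = validate_inputs(
    SketchMetric.CONTEXT_RELEVANCE,
    questions=["Which is the largest planet?"],
    contexts=["Jupiter is the largest planet in the Solar System."],
)
```

Keeping the per-metric metadata in one table means that supporting a new metric is mostly a matter of adding an entry, rather than adding a new branch to the component's logic.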

The output of the component is a nested list of metric results. Each input can have one or more results, depending on the metric. Each result is a dictionary containing the following keys and values:

name - The name of the metric.
score - The score of the metric.
explanation - An optional explanation of the score.
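
The nested output shape described above can be sketched as follows. The values are made up, and the helper `mean_scores` is hypothetical; it only shows one way a consumer might walk the outer list (one entry per input) and the inner list (one entry per result). Whether a missing explanation is represented as `None` or as an absent key is an assumption here, so the sketch reads it with `.get`.

```python
from collections import defaultdict
from typing import Any, Dict, List

# One inner list per input; each inner list holds one or more result dictionaries.
# All names and values below are invented for illustration.
results: List[List[Dict[str, Any]]] = [
    [{"name": "metric_a", "score": 0.5, "explanation": "Partially supported by the context."}],
    [{"name": "metric_a", "score": 1.0, "explanation": None}],
]


def mean_scores(nested: List[List[Dict[str, Any]]]) -> Dict[str, float]:
    """Average the scores per metric name across all inputs."""
    scores = defaultdict(list)
    for per_input in nested:
        for result in per_input:
            scores[result["name"]].append(result["score"])
    return {name: sum(values) / len(values) for name, values in scores.items()}


averages = mean_scores(results)

# Collect only the explanations that are actually present.
explanations = [
    result["explanation"]
    for per_input in results
    for result in per_input
    if result.get("explanation")
]
```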

CLAassistant · 2024-01-25T16:14:23Z

All committers have signed the CLA.

julian-risch

This PR is in really great shape! I have commented what we also briefly talked about.
Could you please also add an example of how the new component can be used in a pipeline? Here is an example of another integration's example: https://github.com/deepset-ai/haystack-core-integrations/blob/main/integrations/chroma/example/example.py

integrations/uptrain/src/uptrain_haystack/evaluator.py

integrations/uptrain/tests/test_evaluator.py

integrations/uptrain/src/uptrain_haystack/evaluator.py

julian-risch

You can add an entry about UpTrain also to the inventory in the readme as part of this PR. https://github.com/deepset-ai/haystack-core-integrations/tree/main?tab=readme-ov-file#inventory

Update project structure to use the `haystack_integrations` namespace

README.md

julian-risch

LGTM! 👍 The example is really helpful. The topic of list processing we can postpone and for the output format, let's see whether there are any use cases that would benefit from having separate edges instead of one dict. We talked about the pipeline visualization that unfortunately hides the contents of the dict from the user in its current implementation. Let's get this merged fast and collect feedback from users! Thanks for the fruitful discussions and great job! 🙂

julian-risch · 2024-01-26T13:57:26Z

And let's see whether somebody from UpTrain can help with the integration test of the response matching metric that fails with a 500 Internal Server Error

shadeMe added the `new integration` and `integration: uptrain` labels Jan 25, 2024

github-actions bot added the topic:CI label Jan 25, 2024

shadeMe force-pushed the feat/uptrain-evaluator branch from 984144c to 4f95b33 Compare January 25, 2024 16:17

shadeMe linked an issue Jan 25, 2024 that may be closed by this pull request

Add UpTrain evaluation framework integration #248

Closed

10 tasks

shadeMe removed a link to an issue Jan 25, 2024

Add UpTrain evaluation framework integration #248

Closed

10 tasks

feat: Implement UpTrainEvaluator and co.

68e7338

shadeMe force-pushed the feat/uptrain-evaluator branch from 4f95b33 to 68e7338 Compare January 25, 2024 16:20

shadeMe marked this pull request as ready for review January 25, 2024 16:25

shadeMe requested a review from a team as a code owner January 25, 2024 16:25

shadeMe requested review from vblagoje and julian-risch and removed request for a team and vblagoje January 25, 2024 16:25

julian-risch requested changes Jan 26, 2024

julian-risch reviewed Jan 26, 2024

shadeMe added 2 commits January 26, 2024 14:27

Address review comments

554bd18

Update project structure to use the `haystack_integrations` namespace

Update README

ff31733

julian-risch reviewed Jan 26, 2024

Merge branch 'main' into feat/uptrain-evaluator

0dd6a0b

shadeMe requested a review from julian-risch January 26, 2024 13:33

Fix typo

84958f5

julian-risch mentioned this pull request Jan 26, 2024

Add UpTrain Integration deepset-ai/haystack-integrations#149

Merged

julian-risch approved these changes Jan 26, 2024

shadeMe merged commit 4ddcd5e into deepset-ai:main Jan 26, 2024
9 checks passed

shadeMe deleted the feat/uptrain-evaluator branch January 26, 2024 14:15
