User Feedback in LLM apps
User feedback is a great source for evaluating the quality of an LLM app's outputs.
What are common feedback types?
Depending on the application, different kinds of feedback can be collected, varying in quality, detail, and quantity.
- Explicit Feedback: Directly prompt the user to give feedback, e.g. a rating, a like/dislike, a scale, or a comment. While simple to implement, explicit feedback is often low in both quality and quantity.
- Implicit Feedback: Measure the user's behavior, e.g. time spent on a page, click-through rate, or accepting/rejecting a model-generated output. This kind of feedback is more difficult to implement, but it is often more frequent and more reliable.
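To make the implicit case concrete, here is a minimal sketch of how behavioral events could be collapsed into a single numeric value before it is recorded as a score. The event names and weights are illustrative assumptions for this sketch, not part of any Langfuse API.

```typescript
// Hypothetical implicit-feedback events observed in the UI.
type ImplicitEvent =
  | { kind: "accepted" }              // user kept the model output
  | { kind: "rejected" }              // user discarded it
  | { kind: "copied" }                // user copied the answer
  | { kind: "dwellMs"; ms: number };  // time spent reading the response

// Collapse a session's events into one value in [0, 1],
// starting from a neutral prior of 0.5.
function implicitScore(events: ImplicitEvent[]): number {
  let score = 0.5;
  for (const e of events) {
    switch (e.kind) {
      case "accepted": score += 0.4; break;
      case "rejected": score -= 0.4; break;
      case "copied":   score += 0.2; break;
      case "dwellMs":  score += e.ms > 10_000 ? 0.1 : 0; break;
    }
  }
  // Clamp to the [0, 1] range.
  return Math.min(1, Math.max(0, score));
}
```

The resulting number can then be attached to the trace like any explicit score.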
What is a common workflow of using user feedback?
A common workflow is to:
- Collect feedback alongside LLM traces in Langfuse
  Example: Negative, Langchain not included in response
- Browse all user feedback, especially when collecting comments from users
- Identify the root cause of the low-quality response (can be managed via annotation queues)
  Example: Docs on the Langchain integration are not included in the embedding similarity search
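The first workflow step attaches feedback, including any free-text comment, to a trace. As a sketch of that data model, the snippet below builds a score payload with the fields used throughout this page (traceId, name, value, comment); treat it as an illustration of the shape, not the SDK's exact type definition.

```typescript
// Illustrative shape of a user-feedback score attached to a trace.
interface UserFeedbackScore {
  traceId: string;   // execution trace the feedback belongs to
  name: string;      // e.g. "user_feedback"
  value: number;     // e.g. 0 = negative, 1 = positive
  comment?: string;  // optional free-text comment from the user
}

// Build the payload, omitting the comment field entirely when absent.
function buildFeedbackScore(
  traceId: string,
  value: number,
  comment?: string
): UserFeedbackScore {
  return {
    traceId,
    name: "user_feedback",
    value,
    ...(comment ? { comment } : {}),
  };
}
```

Keeping the comment on the score is what makes the later "browse all user feedback" and root-cause steps useful.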
Demo
Langfuse UI on the left, demo application on the right
→ Try the Q&A chatbot yourself and browse the collected feedback in the public Langfuse project
Get started
In Langfuse, any kind of user feedback can be collected as a score via the Langfuse SDKs or API and attached to an execution trace or an individual LLM generation (tracing data model).
User feedback collected as a score can:
- Be numeric (e.g. 1-5 stars)
- Be categorical (e.g. thumbs up/down)
- Be boolean (e.g. accept/reject)
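Whichever shape the feedback takes, it usually needs to be mapped to a value before it is recorded. The helpers below sketch one such mapping for the three shapes above; the numeric encodings (e.g. thumbs up → 1) are assumptions for this sketch, so pick whatever convention your project standardizes on.

```typescript
// Numeric feedback: pass the star rating through, validating the range.
function starsToScore(stars: number): number {
  if (stars < 1 || stars > 5) throw new Error("stars must be 1-5");
  return stars;
}

// Categorical feedback: thumbs up/down encoded as 1/0.
function thumbsToScore(thumb: "up" | "down"): number {
  return thumb === "up" ? 1 : 0;
}

// Boolean feedback: accept/reject encoded as 1/0.
function acceptToScore(accepted: boolean): number {
  return accepted ? 1 : 0;
}
```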
Integration Example
In this example, we use the LangfuseWeb SDK to collect user feedback right from the browser.
Note: You can also use the SDKs server-side or add scores via the API.
User feedback on individual responses
```tsx
import { LangfuseWeb } from "langfuse";

export function UserFeedbackComponent(props: { traceId: string }) {
  const langfuseWeb = new LangfuseWeb({
    // public API key; safe to expose in the browser
    publicKey: process.env.NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY,
  });

  // Attach the user's feedback as a score on the given trace
  const handleUserFeedback = async (value: number) =>
    await langfuseWeb.score({
      traceId: props.traceId,
      name: "user_feedback",
      value,
    });

  return (
    <div>
      <button onClick={() => handleUserFeedback(1)}>👍</button>
      <button onClick={() => handleUserFeedback(0)}>👎</button>
    </div>
  );
}
```