# evaluation

Here are 32 public repositories matching this topic...

langfuse / langfuse

🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

open-source playground monitoring analytics evaluation self-hosted ycombinator openai gpt observability large-language-models llm prompt-engineering langchain llmops llama-index prompt-management evals llm-evaluation

Updated Jul 29, 2024
TypeScript

promptfoo / promptfoo

Test your prompts, agents, and RAGs. Redteaming, pentesting, vulnerability scanning for LLMs. Improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

testing ci evaluation ci-cd pentesting cicd vulnerability-scanners prompts evaluation-framework red-teaming rag llm prompt-engineering llmops prompt-testing llm-eval llm-evaluation llm-evaluation-framework

Updated Jul 29, 2024
TypeScript

ianarawjo / ChainForge

An open-source visual programming environment for battle-testing prompts to LLMs.

ai evaluation large-language-models prompt-engineering llms llmops

Updated Jul 22, 2024
TypeScript

lunary-ai / lunary

The production toolkit for LLMs. Observability, prompt management and evaluations.

testing ai monitoring evaluation logs self-hosted openai hacktoberfest observability prompts llm langchain

Updated Jul 29, 2024
TypeScript

Wscats / compile-hero

🔰 Visual Studio Code extension for compiling languages

javascript gulp sass less json typescript es6 jsx evaluation scss pug jade automatic compile

Updated Jan 23, 2024
TypeScript

radarlabs / api-diff

A command-line tool for diffing JSON REST APIs

api diff json rest evaluation side-by-side compare regression-testing sxs quality-metrics side-by-sidediff ranking-quality

Updated Jun 13, 2022
TypeScript

langwatch / langwatch

🤖 Build AI applications with confidence ✅ DSPy Visualizer ✅ Understand how your users are using your LLM-app ✅ Get a full picture of the quality performance of your LLM-app ✅ Collaborate with your stakeholders in ONE platform ✅ Iterate towards the most valuable & reliable LLM-app.

ai analytics evaluation openai gpt datasets observability llm prompt-engineering

Updated Jul 29, 2024
TypeScript

initminal / run

Safely execute untrusted code with ESM syntax support, dynamic injection of ESM modules from URL or plain JS code, and granular access control based on whitelisting for each JS object.

nodejs javascript security vm browser modules es6 worker sandbox evaluation eval jailed esm js-interpreter untrusted-code

Updated Apr 5, 2023
TypeScript

poyro / poyro

Test your web app LLM integrations using existing testing frameworks. Confidently launch AI-driven webapps to production.

nodejs testing ai evaluation prompt prompts vitest llm prompt-engineering llmops

Updated Jul 28, 2024
TypeScript

CarlosNZ / fig-tree-evaluator

A highly configurable custom expression tree evaluator

json tree evaluation configuration evaluator expression-evaluator expression-tree json-forms configuration-files

Updated Jul 29, 2024
TypeScript

kanugurajesh / Career-Guide

An application that helps users make good career choices

open-source learning-path evaluation career-guide educational-project tailwindcss career-path hackathon-winner wellfare nextjs14 gemini-pro personal-problem-solution realworld-problem-solution

Updated Jan 23, 2024
TypeScript

MaastrichtU-IDS / fairificator

Tool to evaluate how FAIR a resource URL is, using the F-UJI API

metrics evaluation fair fair-data fair-principles

Updated Dec 14, 2022
TypeScript

dysbulic / serial-pairs

Continual development workflow developed for HackFS 2023

video evaluation pair-programming

Updated Sep 17, 2023
TypeScript

paradite / 16x-eval

Evaluation framework for LLMs and prompts on real world coding tasks in JavaScript, Python and SQL

evaluation prompt eval llm prompt-engineering

Updated Jun 23, 2024
TypeScript

kocmitom / MT-Thresholds

metrics machine-translation evaluation

Updated Jan 27, 2024
TypeScript

cdaringe / programming-language-selector

Programming Language Selector based on language metadata and user-specified values.

decision-making evaluation languages

Updated Jul 29, 2024
TypeScript

mrblack360 / PSAIMS

Primary School Academic Information Management System (PSAIMS) makes it easy to collect, process, analyse and disseminate academic information for Tanzania-based primary schools.

teachers education evaluation academic assessment students-information

Updated Aug 21, 2021
TypeScript

CarlosNZ / jsonforms-with-figtree-demo

Integrate FigTree Evaluator with JSON Forms

json json-schema evaluation configuration configuration-files figtree

Updated Feb 21, 2024
TypeScript

bxr1nG / boolean-evaluator

Package for transforming a string with logical operators into the result of an expression

typescript calculus logic evaluation boolean

Updated May 16, 2023
TypeScript

koishijs / koishi-plugin-eval

Execute JavaScript in Koishi

plugin code sandbox evaluation koishi

Updated Jul 22, 2023
TypeScript
