Stars
RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.
Unified storage framework for the entire machine learning lifecycle
Simple LaTeX parser providing latex-to-unicode and unicode-to-latex conversion
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
🤖 The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformer…
Test your prompts, agents, and RAGs. Use LLM evals to improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with com…
A high-throughput and memory-efficient inference and serving engine for LLMs
[ACL'24 Oral] Data and code for L-Eval, a comprehensive benchmark for evaluating long-context language models
Paper list for the 86-page survey "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
Conceptual Captions is a dataset of (image-URL, caption) pairs designed for training and evaluating machine-learned image captioning systems.
Inference code for Persimmon-8B
🆓 List of free ChatGPT mirror sites, continuously updated.
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
Source code for the paper "GPTScore: Evaluate as You Desire"
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
Chat凉宫春日, an open-source role-playing chatbot, by Cheng Li, Ziang Leng, and others.
A modular active learning framework for Python
Code for our paper "ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate"
Z-Bench 1.0 by 真格基金 (ZhenFund): a muggle's Chinese test set for large language models. Z-Bench is an LLM prompt dataset for non-technical users, developed by an enthusiastic AI-focused team at ZhenFund.
A collection of open-source datasets for training instruction-following LLMs (ChatGPT, LLaMA, Alpaca)
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Dromedary: towards helpful, ethical and reliable LLMs.
Instruction Tuning with GPT-4