OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 3,556 370 Updated Aug 17, 2024

f / awesome-chatgpt-prompts

This repo includes ChatGPT prompt curation to use ChatGPT better.

HTML 108,153 14,820 Updated Aug 16, 2024

HowieHwong / TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Python 392 33 Updated Jul 31, 2024

MLGroupJLU / LLM-eval-survey

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

1,351 86 Updated Jun 3, 2024

JailbreakBench / jailbreakbench

An Open Robustness Benchmark for Jailbreaking Language Models [arXiv 2024]

Python 148 15 Updated Aug 15, 2024

huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences

Python 4,360 374 Updated Aug 15, 2024

Zjh-819 / LLMDataHub

A quick guide (especially) for trending instruction finetuning datasets

2,347 154 Updated Nov 28, 2023

TransformerLensOrg / TransformerLens

A library for mechanistic interpretability of GPT-style language models

Python 1,332 264 Updated Aug 16, 2024

flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Python 13,779 2,083 Updated Aug 17, 2024

chawins / llm-sp

Papers and resources related to the security and privacy of LLMs 🤖

Python 363 29 Updated Aug 7, 2024

pytorch / captum

Model interpretability and understanding for PyTorch

Python 4,757 481 Updated Aug 17, 2024

inseq-team / inseq

Interpretability for sequence generation models 🐛 🔍

Python 348 37 Updated Aug 15, 2024

controllability / jailbreak-evaluation

The jailbreak-evaluation is an easy-to-use Python package for language model jailbreak evaluation.

Python 19 2 Updated May 15, 2024

ThuCCSLab / Awesome-LM-SSP

A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).

661 40 Updated Aug 17, 2024

niconi19 / LLM-Conversation-Safety

[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey

62 5 Updated Aug 7, 2024

corca-ai / awesome-llm-security

A curation of awesome tools, documents and projects about LLM Security.

838 83 Updated Aug 17, 2024

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 21,258 2,032 Updated Aug 9, 2024

florin-git / The-Power-of-Noise

Code and data for "The Power of Noise: Redefining Retrieval for RAG Systems"

Jupyter Notebook 36 1 Updated Aug 13, 2024

casperllm / CASPER

Jupyter Notebook 9 2 Updated Apr 27, 2024

deepset-ai / haystack

🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your d…

Python 15,161 1,750 Updated Aug 17, 2024

IntelLabs / fastRAG

Efficient Retrieval Augmentation and Generation Framework

Python 1,219 104 Updated Aug 8, 2024

EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.

Python 6,157 1,630 Updated Aug 17, 2024

Tongji-KGLLM / RAG-Survey

1,638 118 Updated May 8, 2024

google / python-fire

Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.

Python 26,758 1,439 Updated Aug 9, 2024

swansonk14 / typed-argument-parser

Typed argument parser for Python

Python 489 39 Updated Aug 12, 2024

vinid / safety-tuned-llamas

ICLR2024 Paper. Showing properties of safety tuning and exaggerated safety.

Python 57 6 Updated May 9, 2024

MathewSachin / Captura

Capture Screen, Audio, Cursor, Mouse Clicks and Keystrokes

C# 9,588 1,793 Updated Apr 9, 2023

tjunlp-lab / Awesome-LLMs-Evaluation-Papers

The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.

668 41 Updated May 8, 2024

fyyfu FYYFU

Lists (1)

security

Stars

booydar / babilong

gkamradt / LLMTest_NeedleInAHaystack

open-compass / opencompass