FlagEval is an evaluation toolkit for large AI foundation models.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Collection of evals for Inspect AI
A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API
RuLES: a benchmark for evaluating rule-following in language models
Contains all assets needed to run the Moonshot Library (Connectors, Datasets, and Metrics)
S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
A fast + lightweight implementation of the GCG algorithm in PyTorch
A curated list of awesome resources dedicated to Scaling Laws for LLMs
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
A benchmark for prompt injection detection systems.
Make your GenAI Apps Safe & Secure 🚀 Test & harden your system prompt
A framework for few-shot evaluation of language models.
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…
A fast, clean, responsive Hugo theme.
Inspect: A framework for large language model evaluations
[ACL 2024] SALAD benchmark & MD-Judge
[NeurIPS 2024] SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4 or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.
A Comprehensive Assessment of Trustworthiness in GPT Models
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal