We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

Python 201 17 Updated Feb 23, 2024

leptonai / search_with_lepton

Building a quick conversation-based search demo with Lepton AI.

TypeScript 7,511 964 Updated Jun 22, 2024

dair-ai / Prompt-Engineering-Guide

🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering

MDX 45,978 4,422 Updated Jul 7, 2024

run-llama / llama_index

LlamaIndex is a data framework for your LLM applications

Python 33,352 4,668 Updated Jul 7, 2024

Jarviswang94 / Multilingual_safety_benchmark

Multilingual safety benchmark for Large Language Models

16 1 Updated Oct 13, 2023

alex000kim / nsfw_data_scraper

Collection of scripts to aggregate image data for the purposes of training an NSFW Image Classifier

Shell 12,197 2,877 Updated Jan 21, 2024

NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Python 3,734 331 Updated Jul 5, 2024

rotaryhammer / code-autodan

Forked from llm-attacks/llm-attacks

An unofficial implementation of the AutoDAN attack on LLMs (arXiv:2310.15140)

Python 23 6 Updated Feb 8, 2024

meta-llama / PurpleLlama

Set of tools to assess and improve LLM security.

Python 2,142 355 Updated Jul 3, 2024

amitsangani / Llama

All the projects related to Llama

Jupyter Notebook 353 69 Updated May 29, 2024

atfortes / Awesome-LLM-Reasoning

Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.

1,264 68 Updated Jun 30, 2024

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

15,991 1,279 Updated Jul 6, 2024

greshake / llm-security

New ways of breaking app-integrated LLMs

Jupyter Notebook 1,739 111 Updated Jun 17, 2023

thunlp / OpenDelta

A plug-and-play library for parameter-efficient-tuning (Delta Tuning)

Python 957 77 Updated Aug 16, 2023

facebookresearch / end-to-end-negotiator

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Python 1,376 276 Updated May 4, 2020

ahxt / fair_fairness_benchmark

FFB: A Fair Fairness Benchmark for In-Processing Group Fairness Methods.

Python 23 2 Updated May 10, 2024

JFChi / PLUE

Python 10 2 Updated May 25, 2023

automl / TabPFN

Official implementation of the TabPFN paper (https://arxiv.org/abs/2207.01848) and the tabpfn package.

Python 1,134 104 Updated Apr 28, 2024

EgoAlpha / prompt-in-context-learning

Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.

Jupyter Notebook 1,423 89 Updated Jul 7, 2024

Shark-NLP / OpenICL

OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning.

Python 519 27 Updated Oct 3, 2023

OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,119 819 Updated Jul 7, 2024

microsoft / robustlearn

Robust machine learning for responsible AI

Python 438 52 Updated Mar 19, 2024

mmoradi-iut / NLP-perturbation

Text perturbation methods to evaluate the robustness of NLP models

Python 21 2 Updated Oct 6, 2021

JFChi / CSCL4FTC

Python 4 2 Updated Feb 7, 2023

pliang279 / LM_bias

[ICML 2021] Towards Understanding and Mitigating Social Biases in Language Models

Python 56 8 Updated Nov 2, 2022


Jianfeng Chi JFChi
