#

llm-security

Here are 40 public repositories matching this topic...

awesome-software / llm-attacks

Universal and Transferable Attacks on Aligned Language Models

Updated Sep 19, 2023
Python

awesome-software / llm-guard

The Security Toolkit for LLM Interactions

Updated Sep 21, 2023
Python

llm-platform-security / chatgpt-plugin-eval

LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins

openai llm chatgpt chatgpt-plugins llm-security llm-privacy llm-platform llm-platform-security

Updated Oct 5, 2023
HTML

rohilrg / CatchPromptInjection

This repo focus on how to deal with prompt injection problem faced by LLMs

openai-api transformers-models llm langchain prompt-injection llm-security

Updated Oct 19, 2023
Python

sinanw / llm-security-prompt-injection

This project investigates the security of large language models by performing binary classification of a set of input prompts to discover malicious prompts. Several approaches have been analyzed using classical ML algorithms, a trained LLM model, and a fine-tuned LLM model.

cybersecurity transformers-models prompt-injection llm-prompting llm-security

Updated Dec 18, 2023
Jupyter Notebook

M507 / HackMeGPT

Vulnerable LLM Application

gandalf security-tools damn-vulnerable prompt-engineering prompt-injection llm-security jailbreak-prompt vulnerable-llm-application

Updated Jan 1, 2024
Python

nodite / llm-guard-ts

The Security Toolkit for LLM Interactions (TS version)

typescript transformers security-tools adversarial-machine-learning large-language-models llm prompt-engineering chatgpt llmops prompt-injection llm-security

Updated Jan 5, 2024

levitation-opensource / Manipulative-Expression-Recognition

MER is a software that identifies and highlights manipulative communication in text from human conversations and AI-generated responses. MER benchmarks language models for manipulative expressions, fostering development of transparency and safety in AI. It also supports manipulation victims by detecting manipulative patterns in human communication.

Updated Jan 31, 2024
HTML

deadbits / vigil-llm

⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs

security-tools adversarial-machine-learning adversarial-attacks yara-scanner large-language-models llmops prompt-injection llm-security

Updated Jan 31, 2024
Python

balavenkatesh3322 / guardrails-demo

LLM Security Project with Llama Guard

security attack-defense llm aisecurity generative-ai llmops llm-security llama-2 prompt-injection-tool llama-guard

Updated Feb 18, 2024
Python

mickymultani / TestingGemma2B

Evaluation of Google's Instruction Tuned Gemma-2B, an open-source Large Language Model (LLM). Aimed at understanding the breadth of the model's knowledge, its reasoning capabilities, and adherence to ethical guardrails, this project presents a systematic assessment across a diverse array of domains.

gemma responsible-ai huggingface-transformers llm llms llmops genai llm-security llm-inference genai-usecase largelanguagemodels gemma-2b

Updated Feb 26, 2024
Jupyter Notebook

matthernet / LLM-security-check

CLI tool that uses the Lakera API to perform security checks in LLM inputs

ai artificial-intelligence ai-security large-language-models llm llm-security

Updated Mar 13, 2024
Python

CyberAlbSecOP / MINOTAUR_Impossible_GPT_Security_Challenge

MINOTAUR: The STRONGEST Secure Prompt EVER! Prompt Security Challenge, Impossible GPT Security, Prompts Cybersecurity, Prompting Vulnerabilities, FlowGPT, Secure Prompting, Secure LLMs, Prompt Hacker, Cutting-edge Ai Security, Unbreakable GPT Agent, Anti GPT Leak, System Prompt Security.

cyber-security security-challenge ai-security prompt-engineering prompt-injection gpt-security llm-security ai-jailbreak ai-jailbreak-prompts prompt-security system-prompt super-prompt prompt-security-challenge ai-cyber-security gpts-security flow-gpt

Updated Mar 27, 2024

Awesome-LLMs-ICLR-24

azminewasi / Awesome-LLMs-ICLR-24

It is a comprehensive resource hub compiling all LLM papers accepted at the International Conference on Learning Representations (ICLR) in 2024.

pretrained-models pretrained-weights pretrained-language-model large-language-models llm llms llmops large-language-model llm-serving llm-prompting llm-agent llm-security llm-training llm-inference llm-framework llm-privacy llm-evaluation large-language-models-for-graph-learning large-language-models-and-translation-systems

Updated Apr 4, 2024

lastlayer / last-layer-vercel

Example of running last_layer with FastAPI on vercel

llm-security llm-privacy llm-guard llm-guardrails

Updated Apr 5, 2024
Python

microsoft / BIPIA

A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.

Updated Apr 15, 2024
Python

briland / LLM-security-and-privacy

LLM security and privacy

security awesome privacy awesome-list llm generative-ai llm-security llm-framework llm-threats llm-vulnerabilities llm-privacy awesome-llm-security-and-privacy

Updated Apr 15, 2024
TeX

lakeraai / chainguard

Guard your LangChain applications against prompt injection with Lakera ChainGuard.

llm langchain prompt-injection langchain-python llm-security

Updated Apr 17, 2024
Python

LostOxygen / llm-confidentiality

Whispers in the Machine: Confidentiality in LLM-integrated Systems

security machine-learning framework deep-learning transformers openai prompt-toolkit gpt confidentiality systems-security llm prompt-engineering chatgpt prompt-injection llm-security

Updated Apr 24, 2024
Python

EasyJailbreak / EasyJailbreak

An easy-to-use Python framework to generate adversarial jailbreak prompts.

jailbreak discrete-optimization large-language-model llm-security llm-safety-benchmark jailbreak-framework

Updated Apr 25, 2024
Python

Improve this page

Add a description, image, and links to the llm-security topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the llm-security topic, visit your repo's landing page and select "manage topics."