GitHub - superf0sh/AI-Red-Teaming: All things specific to LLM Red Teaming Generative AI

AI Red Teaming a.k.a. Awesome-LLM-Red-Teaming

Back to TOC

About Me	Blog

All things specific to Generative AI LLM Red Teaming

My Blog "What the heck is AI Red Teaming" https://bit.ly/ai-red-teaming

As of 11.30.23, I am working hard to build the repos; it takes time to review and curate. I appreciate your patience. Thanks!

As of 2.1.24, I have started transcribing and curating the links from my OmniOutline to this GitHub page.

Best Practices	NIST	Survey & Analytical Paper Collection	Metrics	Benchmarks	Datasets	Other Repos

Best Practices

Top

Year	Title	Notes
	My Blog "What the heck is AI Red Teaming"	A quick general blog
	What’s the Difference Between Traditional Red-Teaming and AI Red-Teaming?	There is a slight cognitive dissonance between traditional Red Teaming and AI Red Teaming
2023.07	Google's AI Red Team: the ethical hackers making AI safer	Good Conceptual Diagrams
2023.10	Best Practices for Securing LLM-Enabled Applications	Nvidia
2023.06	NVIDIA AI Red Team: An Introduction
	Use Cases
	Adversarial Intelligence: Red Teaming Malicious Use Cases for AI
	Sensational Press
2023.08	Hackers red-teaming A.I. are ‘breaking stuff left and right,’ but don’t expect quick fixes from DefCon: ‘There are no good guardrails’

NIST

Top

All NIST documents, ideas, responses et al

I originally planned to split this section into a separate Awesome-NIST repository. I now have; see Awesome-NIST.

Survey & Analytical Papers

Top

Year	Title	Notes
	Survey Papers
2024.01	Gradient-Based Language Model Red Teaming	Hot off the press (at least for now, as of Stardate -299100.57). I had written in my Red Teaming blog, “Tests follow a progressive nature, where a response could lead to another prompt deeper in the knowledge graph on the same topic.” There I was thinking of a prompt hierarchy; this paper does adaptive red teaming by creating new, modified prompts using backprop!
2024.01	Red Teaming Visual Language Models
2024.01	Red-Teaming for Generative AI: Silver Bullet or Security Theater?
2023.11	Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild
2023.08	Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment
2023.06	Explore, Establish, Exploit: Red Teaming Language Models from Scratch
2022.09	Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
	LLMs vs. LLMs
2022.02	Red Teaming Language Models with Language Models
	Analytical Papers
2023.10	Risk Assessment and Statistical Significance in the Age of Foundation Models
	Star Trek Stardate Calculator

Metrics

Top

LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section.

Benchmarks

Top

LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section.

Datasets

Top

LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section.

Other Repos

Top

LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section.

Title	Notes
Awesome Security
Awesome Controls	Links to various security frameworks. Last updated 4 years ago, but still useful
Awesome Infosec	A curated list of awesome information security resources

