All things specific to LLM Red Teaming Generative AI

superf0sh/AI-Red-Teaming

 
 


AI Red Teaming a.k.a. Awesome-LLM-Red-Teaming

About Me | Blog

All things specific to Generative AI LLM Red Teaming

My Blog "What the heck is AI Red Teaming" https://bit.ly/ai-red-teaming

As of 11.30.23, I am working hard to build the repos; it takes time to review and curate. Appreciate your patience. Thanks!
As of 2.1.24, I have started transcribing and curating the links from my OmniOutline to this GitHub page.

Best Practices | NIST | Survey & Analytical Paper Collection | Metrics | Benchmarks | Datasets | Other Repos

Best Practices

Top

| Year | Title | Notes |
| --- | --- | --- |
| | My Blog "What the heck is AI Red Teaming" | A quick general blog |
| | What’s the Difference Between Traditional Red-Teaming and AI Red-Teaming? | There is a slight cognitive dissonance between traditional Red Teaming and AI Red Teaming |
| 2023.07 | Google's AI Red Team: the ethical hackers making AI safer | Good conceptual diagrams |
| 2023.10 | Best Practices for Securing LLM-Enabled Applications | Nvidia |
| 2023.06 | NVIDIA AI Red Team: An Introduction | |
| | Use Cases | |
| | Adversarial Intelligence: Red Teaming Malicious Use Cases for AI | |
| | Sensational Press | |
| 2023.08 | Hackers red-teaming A.I. are ‘breaking stuff left and right,’ but don’t expect quick fixes from DefCon: ‘There are no good guardrails’ | |

NIST

Top

All NIST documents, ideas, responses, et al.

Most probably will split into an Awesome-NIST repository. Update: I have; see Awesome-NIST.


Survey & Analytical Papers

Top

| Year | Title | Notes |
| --- | --- | --- |
| | Survey Papers | |
| 2024.01 | Gradient-Based Language Model Red Teaming | Hot off the press (at least for now, as of Stardate -299100.57). I had written, in my Red Teaming blog, “Tests follow a progressive nature, where a response could lead to another prompt deeper in the knowledge graph on the same topic” (Here). I was thinking of a prompt hierarchy; this paper does adaptive Red Teaming by creating new, modified prompts using backprop!! |
| 2024.01 | Red Teaming Visual Language Models | |
| 2024.01 | Red-Teaming for Generative AI: Silver Bullet or Security Theater? | |
| 2023.11 | Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild | |
| 2023.08 | Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment | |
| 2023.06 | Explore, Establish, Exploit: Red Teaming Language Models from Scratch | |
| 2022.09 | Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned | |
| | LLMs vs. LLMs | |
| 2022.02 | Red Teaming Language Models with Language Models | |
| | Analytical Papers | |
| 2023.10 | Risk Assessment and Statistical Significance in the Age of Foundation Models | |

Star Trek Stardate Calculator

Metrics

Top

LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section


Benchmarks

Top

LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section


Datasets

Top

LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section


Other Repos

Top

LLM benchmarks (See LLM Evaluation Topics for a quick intro)

I will start populating this section

| Title | Notes |
| --- | --- |
| Awesome Security | |
| Awesome Controls | Links to various security frameworks. Last updated 4 years ago, still useful |
| Awesome Infosec | A curated list of awesome information security resources |
