
WideOpenAI

THIS REPO IS FOR EDUCATIONAL PURPOSES ONLY!

This is a list of jailbreak prompts using indirect prompt injection based on SQL, Splunk, and other query-language syntax. Based on my testing, these types of prompts can get LLMs to behave outside their normal ethical boundaries, and any tool or service using the OpenAI API appears susceptible. These were inspired by elder-plinius's work here: https://github.com/elder-plinius/L1B3RT45

Update: This repo was renamed to better reflect the content within it, going from "PromptShieldBreaker" to "WideOpenAI".

Update: The prompts have so far been tested and confirmed to work on:

  • Custom Azure OpenAI applications (original research, as of June 7, 2024)
  • Stock Microsoft Copilot - Balanced (new, as of June 19, 2024)
  • Stock ChatGPT GPT-4o (new, as of June 19, 2024)

Azure OpenAI Test Environment Configuration

Note: The apps tested had the following configuration (a rough sketch of a comparable client call follows the list):

  • Deployment: GPT-4o
  • Data Source: Azure Blob Storage + Azure AI Search
    • CORS enabled
    • Results were not limited to the uploaded test data
  • Test Data:
    • 3 mock radiology reports (PHI)
    • 3 mock home improvement retail invoices (PCI)
    • 3 medical industry white papers (public)
  • Content Filters:
    • Default Prompt and Completion filters
    • Enabled additional content safety models:
      • Prompt Shield for jailbreak attacks enabled
      • Prompt Shield for indirect attacks enabled
      • Protected material text enabled
      • Protected material code enabled
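
For reference, here is a minimal sketch of how an app with this kind of setup might call the deployment. This is not the exact application tested; it assumes the `openai` Python SDK's `AzureOpenAI` client and the "On Your Data" Azure AI Search extension, and the endpoint URLs, index name, API version, and environment variable names are placeholders.

```python
import os
from openai import AzureOpenAI  # assumes openai>=1.x with Azure support

# Hypothetical client setup; endpoint, key, and api_version are placeholders.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4o-deployment",  # the Azure deployment name, not the model family
    messages=[{"role": "user", "content": "Summarize the uploaded documents."}],
    # "On Your Data" extension: ground completions in an Azure AI Search index
    # (e.g., one built from the Blob Storage test data).
    extra_body={
        "data_sources": [
            {
                "type": "azure_search",
                "parameters": {
                    "endpoint": "https://<your-search>.search.windows.net",
                    "index_name": "test-data-index",
                    "authentication": {
                        "type": "api_key",
                        "key": os.environ["AZURE_SEARCH_KEY"],
                    },
                },
            }
        ]
    },
)

print(response.choices[0].message.content)
```

Note that the content filters and Prompt Shield models listed above are configured on the Azure OpenAI resource and deployment, not per request, so they do not appear in the client code.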

Querying Tips

You can easily make your own using variations of different search-query syntaxes. By far the most important elements to include are: a variable indicating a user prompt or query, instructions to the LLM, and a pointer to that user query within the new LLM instructions. If your initial query doesn't seem to work, note that simply adding or removing a search operator or character can be effective. The specific query guides that I used for this repo are below:

Original Azure OpenAI Examples

Normally, when unsuccessful, a prompt injection attempt receives a refusal like the following:

(screenshot of the refusal output)

Here are some examples of successful queries getting Azure OpenAI chat apps to leak mock PHI and PCI data (redacted in case of accidental likenesses to real persons or organizations):

(screenshots of successful data-leak responses)

The following example shows credit card information in the output: (screenshot)

New OpenAI Examples

Failed ChatGPT GPT-4o keylogger attempt: (screenshot)

Successful ChatGPT GPT-4o keylogger attempt using a Splunk-based query: (screenshot)

Failed Copilot keylogger attempt: (screenshot)

Successful Copilot keylogger attempt using a Splunk-based query: (screenshot)

THIS REPO IS FOR EDUCATIONAL PURPOSES ONLY!