💪 reliableGPT: Stop OpenAI Errors in Production 🚀

⚡️ Never worry about overloaded OpenAI servers, rotated keys, or context window limitations again! ⚡️

reliableGPT handles:

  • OpenAI APIError, OpenAI Timeout, OpenAI Rate Limit Errors, OpenAI Service UnavailableError / Overloaded
  • Context Window Errors
  • Invalid API Key errors

👉 Code Examples


How does it handle failures?

  • Specify a fallback strategy for handling failed requests: for instance, you can define fallback_strategy=['gpt-3.5-turbo', 'gpt-4', 'gpt-3.5-turbo-16k', 'text-davinci-003'], and if a request fails, reliableGPT retries with the specified models in the given order until it receives a valid response. This is optional; reliableGPT also has a default strategy it uses (see the sketch after this list).

  • Specify backup keys: Using your OpenAI keys across multiple servers and just got one rotated? You can pass backup keys using add_keys(). We store these and rotate through them in case any of your keys get rotated by OpenAI. For security, we use special tokens, and you can delete all your keys at any time using delete_keys().

  • Context Window Errors: reliableGPT automatically retries your request with models that have larger context windows.

  • Rate Limit Errors: Set queue_requests=True and we put your requests in a queue and run parallel batches, while accounting for your OpenAI or Azure OpenAI request + token limits (works with Langchain/LlamaIndex/Azure as well).
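
For example, a minimal setup that only customizes the fallback order might look like this (the model names here are illustrative; the wrapping pattern itself is covered in Getting Started below):

import openai
from reliablegpt import reliableGPT

# If a gpt-4 request fails, reliableGPT retries with gpt-3.5-turbo-16k,
# then text-davinci-003, until one returns a valid response.
openai.ChatCompletion.create = reliableGPT(
  openai.ChatCompletion.create,
  user_email="[email protected]",
  fallback_strategy=['gpt-4', 'gpt-3.5-turbo-16k', 'text-davinci-003']
)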

Getting Started

Step 1. pip install package

pip install reliableGPT

Step 2. The core package is 1 line of code

Integrating with OpenAI, Azure OpenAI, Langchain, LlamaIndex

import openai
from reliablegpt import reliableGPT

openai.ChatCompletion.create = reliableGPT(openai.ChatCompletion.create, user_email='[email protected]')
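
Once wrapped, your calls look exactly like standard OpenAI usage; reliableGPT does its retries behind the scenes. A quick sanity check (model and prompt are just placeholders):

# Call the wrapped method exactly like the original
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[{"role": "user", "content": "Hello!"}]
)
print(response["choices"][0]["message"]["content"])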

Advanced Usage - Queue Requests for Token & Request Limits

Get guaranteed responses from Azure + OpenAI GPT-4 and GPT-3.5 Turbo by handling rate limit errors.

Use queue_requests=True and set your token limits with model_limits_dir = {"gpt-3.5-turbo": {"max_token_capacity": 1000000, "max_request_capacity": 10000}}. You can find your account rate limits here: https://platform.openai.com/account/rate-limits

Here's an example request using queuing to handle rate limits:

import openai
from reliablegpt import reliableGPT

openai.ChatCompletion.create = reliableGPT(
  openai.ChatCompletion.create,
  user_email=["[email protected]", "[email protected]"],
  queue_requests=True,
  model_limits_dir={"gpt-3.5-turbo": {"max_token_capacity": 1000000, "max_request_capacity": 10000}},
  fallback_strategy=['gpt-3.5-turbo', 'text-davinci-003', 'gpt-3.5-turbo']
)

[👋 Give us feedback on how we could make this easier: email us ([email protected]) or text/WhatsApp us (+17708783106).]

Breakdown of params

Here's everything you can pass to reliableGPT

  • openai.ChatCompletion.create (OpenAI method, Required): The OpenAI method used to call the chat endpoints.
  • user_email (string/list, Required): Where we alert you about spikes in errors. Set user_email to a single email (user_email = "[email protected]") or to multiple (user_email = ["[email protected]", "[email protected]"]) if you want alerts sent to several addresses.
  • fallback_strategy (list, Optional): A custom fallback strategy of OpenAI models to try. To try one model several times, just repeat it, e.g. ['gpt-4', 'gpt-4', 'gpt-3.5-turbo'] tries gpt-4 twice before trying gpt-3.5-turbo.
  • queue_requests (bool, Optional): Set to True to handle rate limit errors with a request-queuing mechanism.
  • model_limits_dir (dict, Optional): Required if queue_requests = True. For each model you want rate limits handled for, set model_limits_dir = {"gpt-3.5-turbo": {"max_token_capacity": 1000000, "max_request_capacity": 10000}}. You can find your account rate limits here: https://platform.openai.com/account/rate-limits
  • user_token (string, Optional): Pass your user token if you want us to handle OpenAI Invalid Key Errors; we'll rotate through your stored keys (more on this below 👇) until we get one that works.
  • backup_openai_key (string, Optional): Pass your OpenAI API key if you're using Azure and want to switch to OpenAI in case your requests start failing.
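
backup_openai_key isn't demonstrated elsewhere in this README, so here's a minimal sketch of an Azure setup that falls back to OpenAI; the endpoint, API version, and keys are placeholders:

import openai
from reliablegpt import reliableGPT

# Azure OpenAI settings (placeholders; fill in your own resource details)
openai.api_type = "azure"
openai.api_base = "https://your-resource.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "your-azure-openai-key"

# If Azure requests start failing, reliableGPT can retry against OpenAI
# using the backup key below.
openai.ChatCompletion.create = reliableGPT(
  openai.ChatCompletion.create,
  user_email="[email protected]",
  backup_openai_key="your-openai-api-key"
)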

Reliable Data Loaders

Use reliableGPT for retry + alerting on Langchain data loaders.

  • Fix + retry malformed URLs
  • Retry different data loaders for PDFs and CSVs
  • Get email alerts with the failing file/URL if errors persist

Here's an example:

from reliablegpt import reliableData
# initialize your langchain text splitter
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000,
                                               chunk_overlap=200,
                                               length_function=len)

# initialize reliableData object. Pass in your email, any metadata you want to receive in your email alerts, and your initialized langchain text splitter
rDL = reliableData(user_emails=["[email protected]"], metadata={"environment": "local"}, text_splitter=text_splitter)

# identify the impacted user (can be email/id/etc.)
rDL.set_user("[email protected]")

# just wrap your ingestion function with reliableDataLoaders
chunks = rDL.reliableDataLoaders(ingest(file, api_url, input_url), filepath="data/" + file.filename, web_url=input_url)
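
The ingest function above is your own ingestion logic; reliableGPT only wraps it. As a rough sketch of what it might look like (the loaders and return type here are assumptions, not part of reliableGPT):

from langchain.document_loaders import PyPDFLoader, WebBaseLoader

def ingest(file, api_url, input_url):
  # Load documents from an uploaded file or a web URL,
  # then split them with the text splitter initialized above
  if file is not None:
    loader = PyPDFLoader("data/" + file.filename)
  else:
    loader = WebBaseLoader(input_url)
  docs = loader.load()
  return text_splitter.split_documents(docs)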

Track LLM Server Dropped Requests, Errors

Use reliableGPT to track incoming requests, dropped requests (timeouts), and error requests.

To get started, add the @reliable_query decorator to the query endpoint on your app server and make sure to pass in your user_email to access your logs. Your logs will be available at https://berri-sentry.vercel.app/

from reliablegpt import reliable_query
from flask import Flask, request

app = Flask(__name__)

@app.route("/berri_query")
@reliable_query(user_email='[email protected]')
def berri_query():
  print('Request received: ', request)
  # parse input params
  query = request.args.get("query")
  # ... run your LLM query logic here ...
  return {"query": query}

View of the reliableGPT dashboard at: https://berri-sentry.vercel.app/<user_email>


Handle rotated keys

Step 1. Add your keys

from reliablegpt import add_keys, delete_keys, reliableGPT
# Storing your keys 🔒
user_email = "[email protected]" # 👈 Replace with your email
token = add_keys(user_email, ["openai_key_1", "openai_key_2", "openai_key_3"])

Pass in a list of your OpenAI keys. We store these and rotate through them in case any get rotated by OpenAI. You'll get back a special token; pass that to reliableGPT.

Step 2. Initialize reliableGPT

import openai 
openai.api_key = "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" ## Invalid OpenAI key

print("Initializing reliableGPT 💪")
openai.ChatCompletion.create = reliableGPT(openai.ChatCompletion.create, user_email= user_email, user_token = token)

reliableGPT💪 catches the Invalid API Key error thrown by OpenAI and rotates through the remaining keys to ensure you have zero downtime in production.
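
To see the rotation in action, make a normal request with the invalid key still set; reliableGPT should swap in one of the keys stored via add_keys() (model and prompt are placeholders):

# This call would normally fail with an invalid-key error;
# reliableGPT rotates through the stored keys instead.
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[{"role": "user", "content": "Are you working?"}]
)
print(response["choices"][0]["message"]["content"])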

Step 3. Delete keys

# Deleting your keys from reliableGPT 🫡
delete_keys(user_email=user_email, user_token=token)

You own your keys, and can delete them whenever you want.

Support

Reach out to us on Discord, or email us at [email protected] & [email protected]
