Extend the generate/generate_events APIs to support generation options #256

Merged: 13 commits merged into develop on Feb 14, 2024

drazvan (Collaborator) commented Jan 16, 2024

This PR is work in progress. Feedback requested.

Goal

Have a more flexible LLMRails interface that can accommodate the following requirements:

  • Allow forwarding of LLM Response fields back to the user
  • Pass additional parameters to the main LLM call
  • Explicitly invoke input/output rails
  • Allow rails to be triggered without blocking the response
  • Return LLM call errors
  • Return additional data (e.g., relevant chunks)
  • Return additional logging (e.g., token usage, raw responses, etc.)
  • Return a state object, to be used instead of the cache in multi-turn conversations. [TODO: include in the design]
  • Return basic timing information for each category of rails (e.g., input 0.6s, main call 1.2s, output 0.8s).
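
As a rough illustration of the last requirement, per-category timing could be collected with a small accumulator like the one below. This is only a sketch: `RailTimings`, `measure`, and `summary` are hypothetical names that do not come from this PR, and the category labels simply mirror the example in the list above.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class RailTimings:
    """Hypothetical helper: accumulates wall-clock seconds per rail category."""

    def __init__(self):
        self.seconds = defaultdict(float)

    @contextmanager
    def measure(self, category):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Add to the running total so repeated phases are summed.
            self.seconds[category] += time.perf_counter() - start

    def summary(self):
        # Plain dict with rounded values, suitable for returning to the caller.
        return {name: round(value, 3) for name, value in self.seconds.items()}


timings = RailTimings()
with timings.measure("input"):
    pass  # input rails would run here
with timings.measure("main call"):
    pass  # the main LLM call would run here
with timings.measure("output"):
    pass  # output rails would run here
```

Summing per category rather than per rail keeps the returned structure as small as the "simple information" the requirement asks for.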

Example Usage

To run only the input rails:

# Since everything is enabled by default, we explicitly disable the others
options = {
    "rails": {
        "output": False,
        "dialog": False,
        "retrieval": False
    }
}
messages = [{
    "role": "user",
    "content": "Am I allowed to say this?"
}]
rails.generate(messages=messages, options=options)
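
Because everything is enabled by default, callers who want a single category have to list all the others. A small helper could build that dictionary. This is a sketch only: `only_rails` and `RAIL_CATEGORIES` are hypothetical names, and the four categories are the ones that appear in the example above.

```python
# Categories as named in the example above.
RAIL_CATEGORIES = ("input", "output", "dialog", "retrieval")


def only_rails(*enabled):
    """Hypothetical helper: build options that disable every category not listed."""
    unknown = set(enabled) - set(RAIL_CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown rail categories: {sorted(unknown)}")
    # Enabled categories are left out, relying on the enabled-by-default behavior.
    return {"rails": {name: False for name in RAIL_CATEGORIES if name not in enabled}}


options = only_rails("input")
```

The resulting `options` value has the same content as the hand-written dictionary above and would be passed to `rails.generate` in the same way.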

To invoke only some specific input/output rails:

rails.generate(messages=messages, options={
    "rails": {
        "input": ["check jailbreak"],
        "output": ["output moderation v2"]
    }
})
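
The two examples so far suggest that each entry under `"rails"` is either a boolean or a list of rail names. One possible reading of those semantics is sketched below. `is_rail_enabled` is a hypothetical function, not part of the PR, and treating a missing category as enabled is an assumption based on the "enabled by default" comment above.

```python
def is_rail_enabled(options, category, rail_name):
    """Hypothetical check: should `rail_name` in `category` run under `options`?"""
    # A missing category falls back to True (assumed default: everything enabled).
    spec = options.get("rails", {}).get(category, True)
    if isinstance(spec, bool):
        # True/False switches the whole category on or off.
        return spec
    # Otherwise the value is taken to be a list of the rail names to run.
    return rail_name in spec


opts = {"rails": {"input": ["check jailbreak"], "output": False}}
```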

To provide additional parameters to the main LLM call:

rails.generate(messages=messages, options={
    "llm_params": {
        "temperature": 0.5
    }
})

To return additional information from the generation (i.e., context variables):

# This will include the relevant chunks in the returned response, as part
# of the `output_data` field.
rails.generate(messages=messages, options={
    "output_vars": ["relevant_chunks"]
})

To skip enforcing the rails, and only inform the user if they were triggered:

rails.generate(messages=messages, options={
    "enforce": False
})
# {..., log: {"triggered_rails": {"type": "input", "name": "check jailbreak"}}}
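
With enforcement off, the caller has to inspect the log to decide what to do. A minimal sketch of that check follows. It assumes the response can be treated as a plain dictionary shaped like the comment above. `triggered_rails` (the function) is a hypothetical helper, and accepting either a single entry or a list of entries is an assumption, since the PR shows only one entry.

```python
def triggered_rails(response):
    """Hypothetical helper: return the triggered rails as a list of dicts."""
    entry = response.get("log", {}).get("triggered_rails")
    if entry is None:
        return []
    if isinstance(entry, dict):
        # The example in the PR shows a single entry; wrap it for uniform handling.
        return [entry]
    return list(entry)


# Sample response shaped like the comment in the example above.
sample = {"log": {"triggered_rails": {"type": "input", "name": "check jailbreak"}}}
names = [rail["name"] for rail in triggered_rails(sample)]
```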

To get more details on the LLM calls that were executed, including the raw responses:

rails.generate(messages=messages, options={
    "log": {
        "llm_calls": True
    }
})
# {..., log: {"llm_calls": [...]}}
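
Each example above sets a single top-level key, so it seems natural to combine several of them in one call. Whether every combination is supported is an assumption here, and `merge_options` is a hypothetical helper rather than part of the PR. The sketch merges option dictionaries recursively so that nested keys such as `"rails"` are combined instead of overwritten.

```python
import copy


def merge_options(*parts):
    """Hypothetical helper: recursively merge option dicts; later values win."""
    merged = {}
    for part in parts:
        for key, value in part.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are dicts: merge them key by key.
                merged[key] = merge_options(merged[key], value)
            else:
                # Copy so the result never shares state with the inputs.
                merged[key] = copy.deepcopy(value)
    return merged


options = merge_options(
    {"rails": {"output": False}},
    {"rails": {"dialog": False}, "llm_params": {"temperature": 0.5}},
    {"log": {"llm_calls": True}},
)
```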

drazvan self-assigned this Jan 16, 2024
drazvan added the "enhancement" and "status: help wanted" labels Jan 16, 2024
drazvan added this to the v0.8.0 milestone Jan 16, 2024
drazvan (Collaborator, Author) commented Jan 16, 2024

Are there any other features that should be supported by the "generation options" mechanism?

drazvan (Collaborator, Author) commented Feb 14, 2024

I'm merging this so we can start QA on it. @prasoonvarshney / @trebedea have a look at it from a usage perspective when you have a moment.

drazvan marked this pull request as ready for review February 14, 2024 14:13
drazvan merged commit b4dd063 into develop Feb 14, 2024
4 checks passed