Extend the `generate`/`generate_events` APIs to support generation options #256

drazvan · 2024-01-16T09:41:47Z

This PR is work in progress. Feedback requested.

Goal

Have a more flexible LLMRails interface that can accommodate the following requirements:

Allow forwarding of LLM Response fields back to the user
Pass additional parameters to the main LLM call
Explicitly invoke input/output rails
Allow rails to be triggered without blocking the response
Return LLM call errors
Return additional data (e.g., relevant chunks)
Return additional logging (e.g., token usage, raw responses, etc.)
Return a state object, which should be used instead of the cache for multi-turn conversations. [TODO: include in the design]
Return simple timing information for each category of rails (e.g., input 0.6s, main call 1.2s, output 0.8s).

Example Usage

To run only the input rails:

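from nemoguardrails import LLMRails, RailsConfig

# Placeholder path: point this at your own guardrails configuration.
config = RailsConfig.from_path("path/to/config")
rails = LLMRails(config)

# Since everything is enabled by default, we explicitly disable the others.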
options = {
    "rails": {
        "output": False,
        "dialog": False,
        "retrieval": False
    }
}
messages = [{
    "role": "user",
    "content": "Am I allowed to say this?"
}]
rails.generate(messages=messages, options=options)

To invoke only some specific input/output rails:

rails.generate(messages=messages, options={
    "rails": {
        "input": ["check jailbreak"],
        "output": ["output moderation v2"]
    }
})
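The two examples above suggest that each entry under `rails` can be either a boolean (switch a whole category on or off) or a list of rail names (run only those rails), with unmentioned categories left enabled. The sketch below illustrates that reading. `resolve_rails` and `CATEGORIES` are hypothetical names introduced here for illustration; this is not the library's implementation, and the category names are simply the keys used in the examples.

```python
from typing import Dict, List, Optional, Union

# Hypothetical category names, taken from the keys used in the examples above.
CATEGORIES = ("input", "output", "dialog", "retrieval")

RailsValue = Union[bool, List[str]]


def resolve_rails(rails: Optional[Dict[str, RailsValue]] = None) -> Dict[str, RailsValue]:
    """Return, per category, True (run all), False (skip), or a list of rail names."""
    rails = rails or {}
    unknown = set(rails) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown rail categories: {sorted(unknown)}")

    resolved: Dict[str, RailsValue] = {}
    for category in CATEGORIES:
        # Categories that are not mentioned stay enabled (the assumed default).
        value = rails.get(category, True)
        resolved[category] = list(value) if isinstance(value, list) else bool(value)
    return resolved


# Only the input rails remain enabled.
print(resolve_rails({"output": False, "dialog": False, "retrieval": False}))
```

Under this reading, `{"input": ["check jailbreak"]}` restricts the input category to a single rail while the dialog and retrieval rails keep running.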

To provide additional parameters to the main LLM call:

rails.generate(messages=messages, options={
    "llm_params": {
        "temperature": 0.5
    }
})

To return additional information from the generation (i.e., context variables):

# This will include the relevant chunks in the returned response, as part
# of the `output_data` field.
rails.generate(messages=messages, options={
    "output_vars": ["relevant_chunks"]
})

To skip enforcing the rails, and only inform the user if they were triggered:

rails.generate(messages=messages, options={
    "enforce": False
})
# {..., log: {"triggered_rails": {"type": "input", "name": "check jailbreak"}}}
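With `enforce` set to `False`, the caller has to inspect the log to find out whether a rail fired. The following is a minimal sketch of that check, assuming the response is a plain dict shaped like the comment above. The helper `triggered_rails` is hypothetical, and the final response type may differ from this assumption.

```python
from typing import Any, Dict, List


def triggered_rails(response: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return the triggered rails from a dict-shaped response as a list."""
    entry = response.get("log", {}).get("triggered_rails")
    if entry is None:
        return []
    # The comment above shows a single dict; accept a list of dicts as well.
    if isinstance(entry, dict):
        return [entry]
    return list(entry)


# Assumed response, copied from the comment in the example above.
response = {"log": {"triggered_rails": {"type": "input", "name": "check jailbreak"}}}

for rail in triggered_rails(response):
    print(f"{rail['type']} rail triggered: {rail['name']}")
```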

To get more details on the LLM calls that were executed, including the raw responses:

rails.generate(messages=messages, options={
    "log": {
        "llm_calls": True
    }
})
# {..., log: {"llm_calls": [...]}}
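Each example above sets one option at a time. If the options can be combined in a single call (an assumption, since the description does not show it), the fragments could be merged into one dict first. `merge_options` is a hypothetical helper written for this sketch and is not part of the library.

```python
import copy
from typing import Any, Dict


def merge_options(*fragments: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge option fragments; later fragments win on conflicts."""
    merged: Dict[str, Any] = {}
    for fragment in fragments:
        for key, value in fragment.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Merge nested dicts such as "rails" or "log" key by key.
                merged[key] = merge_options(merged[key], value)
            else:
                # Copy so the caller's fragments are never mutated.
                merged[key] = copy.deepcopy(value)
    return merged


options = merge_options(
    {"rails": {"input": ["check jailbreak"]}},
    {"rails": {"dialog": False}},
    {"llm_params": {"temperature": 0.5}},
    {"output_vars": ["relevant_chunks"]},
    {"log": {"llm_calls": True}},
)
print(options)
```

The resulting dict would then be passed as `rails.generate(messages=messages, options=options)`.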

drazvan · 2024-01-16T09:44:02Z

Are there any other features that should be supported by the "generation options" mechanism?

drazvan · 2024-02-14T14:13:42Z

I'm merging this so we can start QA on it. @prasoonvarshney / @trebedea have a look at it from a usage perspective when you have a moment.

drazvan self-assigned this Jan 16, 2024

drazvan added the enhancement and status: help wanted labels Jan 16, 2024

drazvan added this to the v0.8.0 milestone Jan 16, 2024

This was referenced Jan 16, 2024

Execution of action in parallel with input guardrails #257

Open

Any updates regarding state? #232

Open

Draft design for generation options support.

af99de3

drazvan force-pushed the feature/generation-options branch from 7728422 to af99de3 Compare February 8, 2024 04:42

drazvan added 12 commits February 8, 2024 07:49

Small fix related to FastEmbed model caching.

919425f

Add support for output vars instead of return_context.

ef68057

Update RunnableRails to use generation options instead of `return_context`.

7236121

Add test for output_vars in generation options.

718033b

Add documentation for output variables.

3cf2e96

Add support for detailed logging information through generation options.

9c23d04

Deactivate test which needs to be fixed.

a040533

Add back the pytest.ini file.

0c01bd9

Remove generation options from generate_events.

256e1b0

Add support for additional LLM parameters.

d80450b

Add support for additional LLM output.

c5806c8

Add support for choosing individual categories of rails.

2e2c9ea

drazvan marked this pull request as ready for review February 14, 2024 14:13

drazvan merged commit b4dd063 into develop Feb 14, 2024
4 checks passed
