All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- #288 Replace SentenceTransformers with FastEmbed.
- #254 Support for Llama Guard input and output content moderation.
- #253 Support for server-side threads.
- #235 Improved LangChain integration through `RunnableRails` (see the sketch below).
- #190 Add example for using `generate_events_async` with streaming.
- Support for Python 3.11.
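A minimal sketch of the `RunnableRails` integration; the config path `config/` and the OpenAI model are placeholder assumptions:

```python
# Minimal sketch: wrap a LangChain chain with guardrails via RunnableRails.
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails

config = RailsConfig.from_path("config/")  # placeholder path
guardrails = RunnableRails(config)

prompt = ChatPromptTemplate.from_template("Answer briefly: {question}")
llm = ChatOpenAI()

# Piping the chain through RunnableRails runs the input/output rails around it.
chain_with_guardrails = guardrails | (prompt | llm)
print(chain_with_guardrails.invoke({"question": "What is NeMo Guardrails?"}))
```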
- #288 Fixed topical rails evaluation not producing the same evaluation set for a given random seed.
- #239 Fixed logging issue where `verbose=true` flag did not trigger expected log output.
- #228 Fix docstrings for various functions.
- #242 Fix Azure LLM support.
- #225 Fix `annoy` import, to allow using the package without it installed.
- #209 Fix user messages missing from prompt.
- #261 Fix small bug in `print_llm_calls_summary`.
- #252 Fixed duplicate loading for the default config.
- Fixed the dependency pinning, allowing a wider range of dependency versions.
- Fixed several security issues related to uncontrolled data used in path expression and information exposure through an exception.
- Support for `--version` flag in the CLI.
- Upgraded `langchain` to `0.0.352`.
- Upgraded `httpx` to `0.24.1`.
- Replaced deprecated `text-davinci-003` model with `gpt-3.5-turbo-instruct`.
- #191: Fix chat generation chunk issue.
- Support for explicit definition of input/output/retrieval rails.
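A minimal sketch (assumed YAML layout) of declaring rails explicitly; the flow names are hypothetical:

```python
# Minimal sketch: explicitly declare which flows run as input/output rails.
from nemoguardrails import RailsConfig

YAML_CONTENT = """
rails:
  input:
    flows:
      - my input rail
  output:
    flows:
      - my output rail
"""

config = RailsConfig.from_content(yaml_content=YAML_CONTENT)
```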
- Support for custom tasks and their prompts.
- Support for fact-checking using AlignScore.
- Support for NeMo LLM Service as an LLM provider.
- Support for making a single LLM call for both the guardrails process and generating the response (by setting `rails.dialog.single_call.enabled` to `True`; see the sketch below).
- Support for sensitive data detection guardrails using Presidio.
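A minimal config sketch for the single-call option; the model entry is a placeholder:

```python
# Minimal sketch: enable the single LLM call option when loading a config.
from nemoguardrails import RailsConfig

YAML_CONTENT = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct

rails:
  dialog:
    single_call:
      enabled: True
"""

config = RailsConfig.from_content(yaml_content=YAML_CONTENT)
```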
- Example using NeMo Guardrails with the LLaMa2-13B model.
- Dockerfile for building a Docker image.
- Support for prompting modes using
prompting_mode
. - Support for TRT-LLM as an LLM provider.
- Support for streaming the LLM responses when no output rails are used.
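A minimal streaming sketch, assuming the loaded config sets `streaming: True` and that `stream_async` is the streaming entry point (both assumptions):

```python
# Minimal sketch: consume streamed LLM output chunk by chunk.
import asyncio

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("config/")  # placeholder path
rails = LLMRails(config)

async def main():
    # Assumes streaming is enabled in the config and no output rails are used.
    async for chunk in rails.stream_async(
        messages=[{"role": "user", "content": "Tell me a short story."}]
    ):
        print(chunk, end="", flush=True)

asyncio.run(main())
```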
- Integration of ActiveFence ActiveScore API as an input rail.
- Support for `--prefix` and `--auto-reload` in the guardrails server.
- Example authentication dialog flow.
- Example RAG using Pinecone.
- Support for loading a configuration from a dictionary, i.e. `RailsConfig.from_content(config=...)` (see the sketch below).
- Guidance on LLM support.
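A minimal sketch of loading a configuration from a plain dictionary; the dictionary mirrors the `config.yml` schema and the model entry is a placeholder:

```python
# Minimal sketch: build a RailsConfig from an in-memory dictionary.
from nemoguardrails import RailsConfig

config = RailsConfig.from_content(
    config={
        "models": [
            {"type": "main", "engine": "openai", "model": "gpt-3.5-turbo-instruct"}
        ]
    }
)
```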
- Support for `LLMRails.explain()` (see the Getting Started guide for sample usage).
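A minimal sketch of inspecting the LLM calls made during the last generation; the config path is a placeholder:

```python
# Minimal sketch: inspect the LLM calls behind the last generation.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("config/")  # placeholder path
rails = LLMRails(config)
rails.generate(messages=[{"role": "user", "content": "Hello!"}])

info = rails.explain()
info.print_llm_calls_summary()  # number of calls, durations, token counts
print(info.colang_history)      # Colang view of the last exchange
```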
- Allow context data directly in the `/v1/chat/completion` endpoint using messages with the type `"role"`.
- Allow calling a subflow whose name is in a variable, e.g. `do $some_name`.
- Allow using actions which are not `async` functions (see the sketch below).
- Disabled pretty exceptions in CLI.
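A minimal sketch of registering a plain (non-async) function as an action; the action name and its use from a Colang flow are hypothetical:

```python
# Minimal sketch: register a synchronous function as an action.
from nemoguardrails import LLMRails, RailsConfig

def check_keywords(text: str) -> bool:
    # Ordinary synchronous function; `async def` is no longer required.
    return "forbidden" not in text.lower()

config = RailsConfig.from_path("config/")  # placeholder path
rails = LLMRails(config)
rails.register_action(check_keywords, name="check_keywords")
```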
- Upgraded dependencies.
- Updated the Getting Started Guide.
- Main README now provides more details.
- Merged original examples into a single ABC Bot and removed the original ones.
- Documentation improvements.
- Fix going over the maximum prompt length using the `max_length` attribute in Prompt Templates.
- Fixed problem with `nest_asyncio` initialization.
- #144 Fixed TypeError in logging call.
- #121 Detect chat model using openai engine.
- #109 Fixed minor logging issue.
- Parallel flow support.
- Fix `HuggingFacePipeline` bug related to LangChain version upgrade.
- Support for custom configuration data.
- Example for using a custom LLM and multiple KBs.
- Support for `PROMPTS_DIR`.
- #101 Support for using OpenAI embeddings models in addition to SentenceTransformers.
- First set of end-to-end QA tests for the example configurations.
- Support for configurable embedding search providers.
- Moved to using `nest_asyncio` for implementing the blocking API. Fixes #3 and #32.
- Improved event property validation in `new_event_dict`.
- Refactored imports to allow installing from source without Annoy/SentenceTransformers (would need a custom embedding search provider to work).
- Fixed when the `init` function from `config.py` is called, to allow custom LLM providers to be registered inside (see the sketch below).
- #93: Removed redundant `hasattr` check in `nemoguardrails/llm/params.py`.
- #91: Fixed how default context variables are initialized.
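A minimal `config.py` sketch of registering a custom LLM provider from the `init` hook; `MyCustomLLM` is a hypothetical stub:

```python
# config.py -- minimal sketch: register a custom LLM provider so it can be
# referenced as `engine: my_custom_llm` in config.yml.
from typing import List, Optional

from langchain.llms.base import LLM

from nemoguardrails import LLMRails
from nemoguardrails.llm.providers import register_llm_provider


class MyCustomLLM(LLM):
    @property
    def _llm_type(self) -> str:
        return "my_custom_llm"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs) -> str:
        return "stub response"  # call the real model here


def init(app: LLMRails):
    register_llm_provider("my_custom_llm", MyCustomLLM)
```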
- Event-based API for guardrails (see the sketch below).
- Support for message with type "event" in `LLMRails.generate_async`.
- Support for bot message instructions.
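A minimal sketch of driving the runtime through the event-based API; the event type and field follow the UMIM naming used at the time and are assumptions:

```python
# Minimal sketch: send a user utterance event and print the resulting events.
import asyncio

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("config/")  # placeholder path
rails = LLMRails(config)

async def main():
    new_events = await rails.generate_events_async(
        events=[{"type": "UtteranceUserActionFinished", "final_transcript": "Hello!"}]
    )
    for event in new_events:
        print(event["type"])

asyncio.run(main())
```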
- Support for using variables inside bot message definitions.
- Support for `vicuna-7b-v1.3` and `mpt-7b-instruct`.
- Topical evaluation results for `vicuna-7b-v1.3` and `mpt-7b-instruct`.
- Support to use different models for different LLM tasks.
- Support for red-teaming using challenges.
- Support to disable the Chat UI when running the server using `--disable-chat-ui`.
- Support for accessing the API request headers in server mode.
- Support to enable CORS settings for the guardrails server.
- Changed the naming of the internal events to align to the upcoming UMIM spec (Unified Multimodal Interaction Management).
- If there are no user message examples, the bot messages examples lookup is disabled as well.
- #58: Fix install on Mac OS 13.
- #55: Fix bug in example causing config.py to crash on computers with no CUDA-enabled GPUs.
- Fixed the model name initialization for LLMs that use the `model` kwarg.
- Fixed the Cohere prompt templates.
- #55: Fix bug related to LangChain callbacks initialization.
- Fixed generation of "..." on value generation.
- Fixed the parameters type conversion when invoking actions from colang (previously everything was string).
- Fixed `model_kwargs` property for the `WrapperLLM`.
- Fixed bug when `stop` was used inside flows.
- Fixed Chat UI bug when an invalid guardrails configuration was used.
- Support for defining subflows.
- Improved support for customizing LLM prompts.
- Support for using filters to change how variables are included in a prompt template.
- Output parsers for prompt templates.
- The `verbose_v1` formatter and output parser, to be used for smaller models that don't understand Colang very well in a few-shot manner.
- Support for including context variables in prompt templates.
- Support for chat models, i.e., prompting with a sequence of messages.
- Experimental support for allowing the LLM to generate multi-step flows.
- Example of using Llama Index from a guardrails configuration (#40).
- Example for using HuggingFace Endpoint LLMs with a guardrails configuration.
- Example for using HuggingFace Pipeline LLMs with a guardrails configuration.
- Support to alter LLM parameters passed as `model_kwargs` in LangChain.
- CLI tool for running evaluations on the different steps (e.g., canonical form generation, next steps, bot message) and on existing rails implementations (e.g., moderation, jailbreak, fact-checking, and hallucination).
- Initial evaluation results for `text-davinci-003` and `gpt-3.5-turbo`.
- The `lowest_temperature` can be set through the guardrails config (to be used for deterministic tasks).
- The core templates now use Jinja2 as the rendering engine.
- Improved the internal prompting architecture, now using an LLM Task Manager.
- Fixed bug related to invoking a chain with multiple output keys.
- Fixed bug related to tracking the output stats.
- #51: Bug fix - avoid str concat with None when logging user_intent.
- #54: Fix UTF-8 encoding issue and add embedding model configuration.
- Support to connect any LLM that implements the BaseLanguageModel interface from LangChain.
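A minimal sketch of passing any LangChain `BaseLanguageModel` directly to `LLMRails`; `ChatOpenAI` stands in for a custom implementation and the config path is a placeholder:

```python
# Minimal sketch: plug a LangChain LLM into the guardrails runtime.
from langchain.chat_models import ChatOpenAI

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("config/")  # placeholder path
rails = LLMRails(config, llm=ChatOpenAI())
print(rails.generate(messages=[{"role": "user", "content": "Hi!"}]))
```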
- Support for customizing the prompts for specific LLM models.
- Support for custom initialization when loading a configuration through `config.py`.
- Support to extract user-provided values from utterances.
- Improved the logging output for Chat CLI (clear events stream, prompts, completion, timing information).
- Updated system actions to use temperature 0 where it makes sense, e.g., canonical form generation, next step generation, fact checking, etc.
- Excluded the default system flows from the "next step generation" prompt.
- Updated `langchain` to `0.0.167`.
- Fixed initialization of LangChain tools.
- Fixed the overriding of general instructions #7.
- Fixed action parameters inspection bug #2.
- Fixed bug related to multi-turn flows #13.
- Fixed Wolfram Alpha error reporting in the sample execution rail.
- First alpha release.