Releases · NVIDIA/NeMo-Guardrails

10 May 07:51

v0.9.0

b3c6bb8

This release introduces Colang 2.0, the next version of Colang, and a revamped NeMo Guardrails Documentation.

Colang 2.0 brings a more solid foundation for building complex guardrail configurations (with better parallelism support), advanced RAG orchestration (e.g., with multi-query, contextual relevance check), agents (e.g., driving business process logic), and multi-modal LLM-driven interaction (e.g., interactive avatars). Colang 2.0 is a complete overhaul of the Colang language and runtime, and key enhancements include:

A more powerful flow engine supporting multiple parallel flows and advanced pattern matching over the stream of events.
Adoption of terminology and syntax akin to Python to reduce the learning curve for new developers.
A standard library and an import mechanism to streamline development.
Explicit entry point through the main flow and explicit activation of flows.
Smaller set of core abstractions: flows, events, and actions.
The new generation operator (...).
Asynchronous action execution.

NOTE: The version of Colang included in v0.8.* is referred to as Colang 2.0-alpha. In v0.9.0, Colang 2.0 moved to Beta, which we refer to as Colang 2.0-beta. We expect Colang 2.0 to go out of Beta and replace Colang 1.0 as the default option in NeMo Guardrails v0.11.0.

Current limitations include not being able to use the Guardrails Library from within Colang 2.0 and no support for generation options (e.g., logs, activated rails). These limitations will be addressed in v0.10.0 and v0.11.0, along with additional features and example guardrail configurations.

To get started with Colang 2.0, if you’ve used Colang 1.0 before, you should check out What’s Changed page. If not, you can get started with the Hello World example.

Full Changelog: v0.8.3...v0.9.0

Assets 2

0 Join discussion

18 Apr 15:03

drazvan

v0.8.3

63ec36d

Release v0.8.3

This minor release updates the NVIDIA API Catalog integration documentation and fixes two bugs.

What's Changed

Changed

#453 Update documentation for NVIDIA API Catalog example.

Fixed

#382 Fix issue with lowest_temperature in self-check and hallucination rails.
#454 Redo fix for #385.
#442 Fix README type by @dileepbapat.

New Contributors

@dileepbapat made their first contribution in #442

Full Changelog: v0.8.2...v0.8.3

Contributors

dileepbapat

Assets 2

0 Join discussion

01 Apr 20:52

drazvan

v0.8.2

88da745

Release v0.8.2

This minor release adds support for integrating NeMo Guardrails with NVIDIA AI Endpoints and Vertex AI. It also introduces the research overview page, which guides the development of future guardrails. Last but not least, it adds another round of improvements for Colang 2.0 and multiple getting-started examples.

Colang 2.0 is the next version of Colang and will replace Colang 1.0 in a future release. It adds a more powerful flow engine, improved syntax, multi-modal support, parallelism for actions and flows, a standard library of flows, and more. This release still targets alpha testers and does not include the new documentation, which will be added in 0.9.0. Colang 2.0 and 1.0 will be supported side-by-side until Colang 1.0 is deprecated and removed.

What's Changed

Added

#402 Integrate Vertex AI Models into Guardrails by @aishwaryap.
#403 Add support for NVIDIA AI Endpoints by @patriciapampanelli
#396 Docs/examples nv ai foundation models.
#438 Add research roadmap documentation.

Changed

#389 Expose the verbose parameter through RunnableRails by @d-mariano.
#415 Enable print(...) and log(...).
#389 Expose verbose arg in RunnableRails by @d-mariano.
#414 Feature/colang march release.
#416 Refactor and improve the verbose/debug mode.
#418 Feature/colang flow context sharing.
#425 Feature/colang meta decorator.
#427 Feature/colang single flow activation.
#426 Feature/colang 2.0 tutorial.
#428 Feature/Standard library and examples.
#431 Feature/colang various improvements.
#433 Feature/Colang 2.0 improvements: generate_async support, stateful API.

Fixed

#412 Fix #411 - explain rails not working for chat models.
#413 Typo fix: Comment in llm_flows.co by @habanoz.
#420 Fix typo for hallucination message.

New Contributors

@aishwaryap made their first contribution in #402
@patriciapampanelli made their first contribution in #403
@habanoz made their first contribution in #413
@d-mariano made their first contribution in #389

Full Changelog: v0.8.1...v0.9.0

Contributors

aishwaryap, habanoz, and 2 other contributors

Assets 2

0 Join discussion

15 Mar 10:32

drazvan

v0.8.1

4bc1d52

Release v0.8.1

This minor release mainly focuses on fixing Colang 2.0 parser and runtime issues. It fixes a bug related to logging the prompt for chat models in verbose mode and a small issue in the installation guide. It also adds an example of using streaming with a custom action.

What's Changed

Added

#377 Add example for streaming from custom action.

Changed

#380 Update installation guide for OpenAI usage.
#401 Replace YAML import with new import statement in multi-modal example.

Fixed

#398 Colang parser fixes and improvements.
#394 Fixes and improvements for Colang 2.0 runtime.
#381 Fix typo by @serhatgktp.
#379 Fix missing prompt in verbose mode for chat models.
#400 Fix Authorization header showing up in logs for NeMo LLM.

Full Changelog: v0.8.0...v0.8.1

Contributors

serhatgktp

Assets 2

0 Join discussion

28 Feb 15:19

drazvan

v0.8.0

8bb50af

Release v0.8.0

This release adds three main new features:

A new type of input rail that uses a set of jailbreak heuristics. More heuristics will be added in the future.
Support for generation options allowing fine-grained control on what types of rails should be triggered, what data should be returned and what logging information should be included in the response.
Support for making API calls to the guardrails server using multiple configuration ids.

This release also improves the support for working with embeddings (better async support, batching and caching), adds support for stop tokens per task template, and adds streaming support for HuggingFace pipelines. Last but not least, this release includes the core implementation for Colang 2.0 as a preview for early testing (version 0.9.0 will include documentation and examples).

What's Changed

Added

#292 Jailbreak heuristics by @erickgalinkin.
#256 Support generation options.
#307 Added support for multi-config api calls by @makeshn.
#293 Adds configurable stop tokens by @zmackie.
#334 Colang 2.0 - Preview by @schuellc.
#208 Implement cache embeddings (resolves #200) by @Pouyanpi.
#331 Huggingface pipeline streaming by @trebedea.

Documentation:

#311 Update documentation to demonstrate the use of output rails when using a custom RAG by @niels-garve.
#347 Add detailed logging docs by @erickgalinkin.
#354 Input and output rails only guide by @trebedea.
#359 Added user guide for jailbreak detection heuristics by @makeshn.
#363 Add multi-config API call user guide.
#297 Example configurations for using only the guardrails, without LLM generation.

Changed

#309 Change the paper citation from ArXiV to EMNLP 2023 by @manuelciosici
#319 Enable embeddings model caching.
#267 Make embeddings computing async and add support for batching.
#281 Follow symlinks when building knowledge base by @piotrm0.
#280 Add more information to results of retrieve_relevant_chunks by @piotrm0.
#332 Update docs for batch embedding computations.
#244 Docs/edit getting started by @DougAtNvidia.
#333 Follow-up to PR 244.
#341 Updated 'fastembed' version to 0.2.2 by @NirantK.

Fixed

#286 Fixed #285 - using the same evaluation set given a random seed for topical rails by @trebedea.
#336 Fix #320. Reuse the asyncio loop between sync calls.
#337 Fix stats gathering in a parallel async setup.
#342 Fixes OpenAI embeddings support.
#346 Fix issues with KB embeddings cache, bot intent detection and config ids validator logic.
#349 Fix multi-config bug, asyncio loop issue and cache folder for embeddings.
#350 Fix the incorrect logging of an extra dialog rail.
#358 Fix Openai embeddings async support.
#362 Fix the issue with the server being pointed to a folder with a single config.
#352 Fix a few issues related to jailbreak detection heuristics.
#356 Redo followlinks PR in new code by @piotrm0.

New Contributors

@manuelciosici made their first contribution in #309
@erickgalinkin made their first contribution in #292
@trebedea made their first contribution in #286
@piotrm0 made their first contribution in #281
@Pouyanpi made their first contribution in #208
@niels-garve made their first contribution in #311
@zmackie made their first contribution in #293
@DougAtNvidia made their first contribution in #244
@NirantK made their first contribution in #341
@makeshn made their first contribution in #359

Full Changelog: v0.7.1...v0.8.0

Contributors

manuelciosici, piotrm0, and 9 other contributors

Assets 2

0 Join discussion

01 Feb 14:24

drazvan

v0.7.1

2a3a5ce

Release v0.7.1

What's Changed

Replace SentenceTransformers with FastEmbed by @drazvan in #288

Full Changelog: v0.7.0...v0.7.1

Contributors

drazvan

Assets 2

0 Join discussion

31 Jan 14:54

drazvan

v0.7.0

7cb05d4

Release v0.7.0

This release adds three new features: support for Llama Guard, improved LangChain integration, and support for server-side threads. It also adds support for Python 3.11 and solves the issue with pinned dependencies (e.g., langchain>=0.1.0,<2.0, typer>=0.7.0). Last but not least, it includes multiple feature and security-related fixes.

What's Changed

Added

#254 Support for Llama Guard input and output content moderation.
#253 Support for server-side threads.
#235 Improved LangChain integration through RunnableRails.
#190 Add example for using generate_events_async with streaming.
Support for Python 3.11.

Changed

#240 Switch to pyproject.
#276 Upgraded Typer to 0.9.

Fixed

#239 Fixed logging issue where verbose=true flag did not trigger expected log output.
#228 Fix docstrings for various functions.
#242 Fix Azure LLM support.
#225 Fix annoy import, to allow using without.
#209 Fix user messages missing from prompt.
#261 Fix small bug in print_llm_calls_summary.
#252 Fixed duplicate loading for the default config.
Fixed the dependencies pinning, allowing a wider range of dependencies versions.
Fixed sever security issues related to uncontrolled data used in path expression and information exposure through an exception.

New Contributors

@spehl-max made their first contribution in #239
@rajveer43 made their first contribution in #228
@smartestrobotdai made their first contribution in #242
@prasoonvarshney made their first contribution in #269
@eneadodi made their first contribution in #276
@baggiponte made their first contribution in #240

Full Changelog: v0.6.1...v0.7.0

Contributors

prasoonvarshney, magiccpp, and 4 other contributors

Assets 2

0 Join discussion

20 Dec 22:13

drazvan

v0.6.1

3273ca7

Release v0.6.1

This patch release upgrades two dependencies (langchain and httpx) and replaces the deprecated text-davinci-003 model with gpt-3.5-turbo-instruct in all configurations and examples.

Added

Support for --version flag in the CLI.

Changed

Upgraded langchain to 0.0.352.
Upgraded httpx to 0.24.1.
Replaced deprecated text-davinci-003 model with gpt-3.5-turbo-instruct.

Fixed

#191: Fix chat generation chunk issue.

Assets 2

0 Join discussion

13 Dec 21:59

drazvan

v0.6.0

cc598c3

Release v0.6.0

This release builds on the feedback received over the last few months and brings many improvements and new features. It is also the first beta release for NeMo Guardrails. Equally important, this release is the first to include LLM vulnerability scan results for one of the sample bots.

Release highlights include:

Better configuration and support for input, output, dialog, retrieval, and execution rails.
Ability to reduce the overall latency using single_call mode or embeddings_only mode for dialog rails.
Support for streaming.
First version of the Guardrails Library.
Fast fact-checking using AlignScore.
Updated Getting Started guide.
Docker image for easy deployment.

Detailed changes are included below.

Added

Support for explicit definition of input/output/retrieval rails.
Support for custom tasks and their prompts.
Support for fact-checking using AlignScore.
Support for NeMo LLM Service as an LLM provider.
Support for making a single LLM call for both the guardrails process and generating the response (by setting rails.dialog.single_call.enabled to True).
Support for sensitive data detection guardrails using Presidio.
Example using NeMo Guardrails with the LLaMa2-13B model.
Dockerfile for building a Docker image.
Support for prompting modes using prompting_mode.
Support for TRT-LLM as an LLM provider.
Support for streaming the LLM responses when no output rails are used.
Integration of ActiveFence ActiveScore API as an input rail.
Support for --prefix and --auto-reload in the guardrails server.
Support for loading a configuration from dictionary, i.e. RailsConfig.from_content(config=...).
Guidance on LLM support.
Support for LLMRails.explain() (see the Getting Started guide for sample usage).

Changed

Allow context data directly in the /v1/chat/completion using messages with the type "role".
Allow calling a subflow whose name is in a variable, e.g. do $some_name.
Allow using actions which are not async functions.
Disabled pretty exceptions in CLI.
Upgraded dependencies.
Updated the Getting Started Guide.
Main README now provides more details.
Merged original examples into a single ABC Bot and removed the original ones.
Documentation improvements.

Fixed

Fix going over the maximum prompt length using the max_length attribute in Prompt Templates.
Fixed problem with nest_asyncio initialization.
#144 Fixed TypeError in logging call.
#121 Detect chat model using openai engine.
#109 Fixed minor logging issue.
Parallel flow support.
Fix HuggingFacePipeline bug related to LangChain version upgrade.

Assets 2

0 Join discussion

04 Sep 20:36

drazvan

v0.5.0

cb07be6

Release v0.5.0 Pre-release

Pre-release

This release adds support for custom embedding search providers (not using Annoy/SentenceTransformers) and support for OpenAI embeddings for the default embedding search provider. This release adds an advanced example for using multiple knowledge bases (i.e., a tabular and regular one). This release also fixes an old issue related to using the generate method inside an async environment (e.g., a notebook) and includes multiple small fixes. Detailed change log below.

Added

Support for custom configuration data.
Example for using custom LLM and multiple KBs
Support for PROMPTS_DIR.
#101 Support for using OpenAI embeddings models in addition to SentenceTransformers.
First set of end-to-end QA tests for the example configurations.
Support for configurable embedding search providers

Changed

Moved to using nest_asyncio for implementing the blocking API. Fixes #3 and #32.
Improved event property validation in new_event_dict.
Refactored imports to allow installing from source without Annoy/SentenceTransformers (would need a custom embedding search provider to work).

Fixed

Fixed when the init function from config.py is called to allow custom LLM providers to be registered inside.
#93: Removed redundant hasattr check in nemoguardrails/llm/params.py.
#91: Fixed how default context variables are initialized.

Assets 2

0 Join discussion

Releases: NVIDIA/NeMo-Guardrails

Release v0.9.0

Release v0.8.3

What's Changed

Changed

Fixed

New Contributors

Contributors

Release v0.8.2

What's Changed

Added

Changed

Fixed

New Contributors

Contributors

Release v0.8.1

What's Changed

Added

Changed

Fixed

Contributors

Release v0.8.0

What's Changed

Added

Changed

Fixed

New Contributors

Contributors

Release v0.7.1

What's Changed

Contributors

Release v0.7.0

What's Changed

Added

Changed

Fixed

New Contributors

Contributors

Release v0.6.1

Added

Changed

Fixed

Release v0.6.0

Added

Changed

Fixed

Release v0.5.0

Added

Changed

Fixed