From 16e806d0fa199f6a6aa8426c695c5a99aada8739 Mon Sep 17 00:00:00 2001
From: Sandro Cavallari
Date: Tue, 29 Apr 2025 08:07:27 +0000
Subject: [PATCH 1/3] add enable: True to streaming output documentation

---
 docs/getting-started/5-output-rails/README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/getting-started/5-output-rails/README.md b/docs/getting-started/5-output-rails/README.md
index c8f0be042..0c71d6e25 100644
--- a/docs/getting-started/5-output-rails/README.md
+++ b/docs/getting-started/5-output-rails/README.md
@@ -185,9 +185,11 @@ You can enable streaming to provide asynchronous responses and reduce the time t
        streaming:
          chunk_size: 200
          context_size: 50
+         enabled: True

    streaming: True
    ```
+   Note that, the `enabled: True` filed is needed to enable streaming output rails while `streaming: True` is needed to enable streaming generation.

 1. Call the `stream_async` method and handle the chunked response:

From 043b1e1e15b2b555af8d5a4b1f89dcf3af394c56 Mon Sep 17 00:00:00 2001
From: Sandro Cavallari
Date: Tue, 29 Apr 2025 08:12:56 +0000
Subject: [PATCH 2/3] fix `Streaming Output Configuration` section with streaming.enabled = True

---
 docs/user-guides/configuration-guide.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/docs/user-guides/configuration-guide.md b/docs/user-guides/configuration-guide.md
index 2f5b1d532..70f0df214 100644
--- a/docs/user-guides/configuration-guide.md
+++ b/docs/user-guides/configuration-guide.md
@@ -679,7 +679,7 @@ You can enable streaming to begin receiving responses from the output rail soone

 You must set the top-level `streaming: True` field in your `config.yml` file.

-For each output rail, add the `streaming` field and configuration parameters.
+For the output rails, add the `streaming` field and configuration parameters.

 ```yaml
 rails:
@@ -689,6 +689,7 @@ rails:
       chunk_size: 200
       context_size: 50
       stream_first: True
+      enabled: True

 streaming: True
 ```
@@ -742,6 +743,11 @@ The following table describes the subfields for the `streaming` field:
     By default, the toolkit streams the chunks as soon as possible and before applying output rails to them.

   - `True`
+
+* - streaming.enabled
+  - When set to True enable the execution of the output rails in streaming mode.
+
+  - `False`
 ```

 The following table shows how the number of tokens, chunk size, and context size interact to trigger the number of rails invocations.
From 7dcc798e15ea826074f6aa541d17347d18181f5c Mon Sep 17 00:00:00 2001
From: Mike McKiernan
Date: Fri, 9 May 2025 10:40:48 -0400
Subject: [PATCH 3/3] docs: nitpicks

Signed-off-by: Mike McKiernan
---
 docs/getting-started/5-output-rails/README.md | 5 +++--
 docs/user-guides/configuration-guide.md | 13 +++++++------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/docs/getting-started/5-output-rails/README.md b/docs/getting-started/5-output-rails/README.md
index 0c71d6e25..43965c61e 100644
--- a/docs/getting-started/5-output-rails/README.md
+++ b/docs/getting-started/5-output-rails/README.md
@@ -183,13 +183,14 @@ You can enable streaming to provide asynchronous responses and reduce the time t
        flows:
          - self check output
        streaming:
+         enabled: True
          chunk_size: 200
          context_size: 50
-         enabled: True

    streaming: True
    ```
-   Note that, the `enabled: True` filed is needed to enable streaming output rails while `streaming: True` is needed to enable streaming generation.
+
+   The `enabled: True` field is required to enable streaming output rails while the `streaming: True` field is needed to enable streaming generation.

 1. Call the `stream_async` method and handle the chunked response:

diff --git a/docs/user-guides/configuration-guide.md b/docs/user-guides/configuration-guide.md
index 70f0df214..073cc7b93 100644
--- a/docs/user-guides/configuration-guide.md
+++ b/docs/user-guides/configuration-guide.md
@@ -103,6 +103,7 @@ nemoguardrails find-providers [--list]
 ```

 The command supports two modes:
+
 - Interactive mode (default): Guides you through selecting a provider type (text completion or chat completion) and then shows available providers for that type
 - List mode (`--list`): Simply lists all available providers without interactive selection

@@ -686,10 +687,10 @@ rails:
   output:
     - rail name
     streaming:
+      enabled: True
       chunk_size: 200
       context_size: 50
       stream_first: True
-      enabled: True

 streaming: True
 ```
@@ -736,6 +737,11 @@ The following table describes the subfields for the `streaming` field:
     Specifying approximately 25% of `chunk_size` provides a good compromise.
   - `50`

+* - streaming.enabled
+  - When set to `True`, the toolkit executes output rails in streaming mode.
+
+  - `False`
+
 * - streaming.stream_first
   - When set to `False`, the toolkit applies the output rails to the chunks before streaming them to the client.
     If you set this field to `False`, you can avoid streaming chunks of blocked content.
@@ -743,11 +749,6 @@ The following table describes the subfields for the `streaming` field:
     By default, the toolkit streams the chunks as soon as possible and before applying output rails to them.

   - `True`
-
-* - streaming.enabled
-  - When set to True enable the execution of the output rails in streaming mode.
-
-  - `False`
 ```

 The following table shows how the number of tokens, chunk size, and context size interact to trigger the number of rails invocations.
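
The series documents two separate switches: the top-level `streaming: True` field enables streaming generation, and `rails.output.streaming.enabled` (documented default `False`) enables streaming output rails. The sketch below shows that distinction as a check over an already-parsed `config.yml`. It is only an illustration under stated assumptions: `streaming_status` is a hypothetical helper, not part of the toolkit, and it works on a plain dictionary rather than on any toolkit configuration object.

```python
from typing import Any, Dict, Mapping


def streaming_status(config: Mapping[str, Any]) -> Dict[str, bool]:
    """Hypothetical helper (not a toolkit API): report which streaming switches are set."""
    rails = config.get("rails") or {}
    output = rails.get("output") or {}
    output_streaming = output.get("streaming") or {}
    return {
        # Top-level `streaming: True` enables streaming generation.
        "generation": bool(config.get("streaming", False)),
        # `rails.output.streaming.enabled` enables streaming output rails;
        # the patched table documents its default as False.
        "output_rails": bool(output_streaming.get("enabled", False)),
    }


# The README example from this series, written as the dict a YAML loader would produce.
config = {
    "rails": {
        "output": {
            "flows": ["self check output"],
            "streaming": {"enabled": True, "chunk_size": 200, "context_size": 50},
        }
    },
    "streaming": True,
}

print(streaming_status(config))
```

A config that sets only the top-level `streaming: True` would report `output_rails` as `False` here, which is the situation the added `enabled: True` line is meant to avoid.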