Commit 002a520

docs: output rails are supported with streaming (#1007)
Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
1 parent d1e5558 commit 002a520

File tree: 1 file changed (+8, -3 lines)


docs/user-guides/advanced/streaming.md

Lines changed: 8 additions & 3 deletions
@@ -1,9 +1,11 @@
 # Streaming

-To use a guardrails configuration in streaming mode, the following must be met:
+If the application LLM supports streaming, you can configure NeMo Guardrails to stream tokens as well.

-1. The main LLM must support streaming.
-2. There are no output rails.
+For information about configuring streaming with output guardrails, refer to the following:
+
+- For configuration, refer to [streaming output configuration](../../user-guides/configuration-guide.md#streaming-output-configuration).
+- For sample Python client code, refer to [streaming output](../../getting-started/5-output-rails/README.md#streaming-output).

 ## Configuration
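
The new links above defer to the streaming output configuration guide. As a rough illustration of what that configuration looks like, here is a minimal sketch that loads it through the Python API. The field names under `rails.output.streaming` (`enabled`, `chunk_size`, `context_size`) are assumptions based on the configuration guide linked in the diff, not part of this commit; the output rail flows and their prompts are elided for brevity.

```python
from nemoguardrails import RailsConfig

# Sketch of a config that streams tokens and applies output rails
# chunk-by-chunk. The keys under `rails.output.streaming` are assumptions
# taken from the configuration guide referenced in the diff above.
yaml_content = """
streaming: True          # stream tokens from the main LLM

rails:
  output:
    streaming:
      enabled: True      # run output rails on chunks of the stream
      chunk_size: 200    # tokens checked per output-rail invocation
      context_size: 50   # overlapping tokens carried between chunks
"""

# Parse the configuration; pass the result to LLMRails(config) to use it.
config = RailsConfig.from_content(yaml_content=yaml_content)
```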

@@ -26,6 +28,7 @@ nemoguardrails chat --config=examples/configs/streaming --streaming
 ### Python API

 You can use the streaming directly from the python API in two ways:
+
 1. Simple: receive just the chunks (tokens).
 2. Full: receive both the chunks as they are generated and the full response at the end.
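
To make the two ways concrete, here is a minimal sketch using the `nemoguardrails` Python API. It assumes the `stream_async` and `generate_async` methods and the `StreamingHandler` class described in the surrounding documentation, plus the example config shipped with the repository; adjust the path and messages to your setup.

```python
import asyncio

from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.streaming import StreamingHandler

config = RailsConfig.from_path("examples/configs/streaming")  # example config from the repo
app = LLMRails(config)
history = [{"role": "user", "content": "What is the capital of France?"}]

async def simple():
    # Way 1 (simple): iterate over the chunks (tokens) as they are generated.
    async for chunk in app.stream_async(messages=history):
        print(chunk, end="", flush=True)

async def full():
    # Way 2 (full): consume chunks through a StreamingHandler while
    # generate_async also returns the complete response at the end.
    handler = StreamingHandler()

    async def consume():
        async for chunk in handler:
            print(chunk, end="", flush=True)

    consumer = asyncio.create_task(consume())
    result = await app.generate_async(messages=history, streaming_handler=handler)
    await consumer
    print("\nFULL RESPONSE:", result)

asyncio.run(simple())
asyncio.run(full())
```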

@@ -73,9 +76,11 @@ For the complete working example, check out this [demo script](https://github.co
 ### Server API

 To make a call to the NeMo Guardrails Server in streaming mode, you have to set the `stream` parameter to `True` inside the JSON body. For example, to get the completion for a chat session using the `/v1/chat/completions` endpoint:
+
 ```
 POST /v1/chat/completions
 ```
+
 ```json
 {
     "config_id": "some_config_id",
