docs: Possible update to injection detection #1144
base: develop
Changes from 1 commit
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -952,29 +952,37 @@ Times reported below in are **averages** and are reported in milliseconds. | |||||
| Docker | 2057 | 115 | | ||||||
| In-Process | 3227 | 157 | | ||||||
|
||||||
|
||||||
### Injection Detection | ||||||
NeMo Guardrails offers detection of potential injection attempts (_e.g._ code injection, cross-site scripting, SQL injection, template injection) using [YARA rules](https://yara.readthedocs.io/en/stable/index.html), a technology familiar to many security teams. | ||||||
NeMo Guardrails ships with some basic rules for the following categories: | ||||||
* Code injection (Python) | ||||||
* Cross-site scripting (Markdown and JavaScript) | ||||||
* SQL injection | ||||||
* Template injection (Jinja) | ||||||
|
||||||
Additional rules can be added by including them in the `library/injection_detection/yara_rules` folder or specifying a `yara_path` with all the rules. | ||||||
NeMo Guardrails offers detection of potential injection attempts such as code injection, cross-site scripting, SQL injection, and template injection. | ||||||
Injection detection is primarily intended to be used in agentic systems to enhance other security controls as part of a defense-in-depth strategy. | ||||||
|
||||||
The first part of injection detection is [YARA rules](https://yara.readthedocs.io/en/stable/index.html). | ||||||
A YARA rule specifies a set of strings--text or binary patterns--to match and a Boolean expression that specifies the logic of the rule. | ||||||
YARA rules are familiar to many security teams. | ||||||
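The "strings plus Boolean condition" structure described above can be pictured with a small Python analogy. This is only a sketch: it uses the standard `re` module rather than YARA, and the pattern names, the patterns themselves, and the `rule_matches` helper are hypothetical and are not taken from the rules that ship with NeMo Guardrails.

```python
import re
from typing import Dict, Set

# Hypothetical stand-in for the "strings" section of a rule:
# each named pattern is something to look for in the text.
STRINGS: Dict[str, "re.Pattern[str]"] = {
    "$select": re.compile(r"\bselect\b", re.IGNORECASE),
    "$union": re.compile(r"\bunion\b", re.IGNORECASE),
    "$comment": re.compile(r"--"),
}


def condition(hits: Set[str]) -> bool:
    """Stand-in for the rule's Boolean expression:
    $select and ($union or $comment)."""
    return "$select" in hits and ("$union" in hits or "$comment" in hits)


def rule_matches(text: str) -> bool:
    """Collect the names of the patterns found in the text,
    then evaluate the condition over those names."""
    hits = {name for name, pattern in STRINGS.items() if pattern.search(text)}
    return condition(hits)
```

A single pattern hit is not enough to trigger this illustrative rule; the condition decides how the individual matches combine.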
|
||||||
|
||||||
The second part of injection detection is specifying the action to take when a rule is triggered. | ||||||
You can *reject* the text, in which case the response is "I'm sorry, the desired output triggered rule(s) designed to mitigate exploitation of {detections}." | ||||||
Alternatively, you can *omit* the triggering text from the response. | ||||||
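The difference between the two actions can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: the `apply_action` helper, its parameters, and the comma-separated formatting of `{detections}` are hypothetical; only the rejection message text comes from the description above.

```python
from typing import List

# Message text quoted from the documentation above.
REJECT_TEMPLATE = (
    "I'm sorry, the desired output triggered rule(s) designed to "
    "mitigate exploitation of {detections}."
)


def apply_action(action: str, text: str, detections: List[str], matched: List[str]) -> str:
    """Hypothetical helper showing what `reject` and `omit` do to a response.

    detections: names of the rules that were triggered.
    matched: the text fragments that triggered them.
    """
    if not detections:
        return text  # nothing was detected, so the text passes through unchanged
    if action == "reject":
        # Assumption: rule names are joined with commas in the message.
        return REJECT_TEMPLATE.format(detections=", ".join(detections))
    if action == "omit":
        for fragment in matched:
            text = text.replace(fragment, "")
        return text
    raise ValueError(f"unsupported action: {action!r}")
```

With `reject`, the whole response is replaced by the message; with `omit`, the rest of the response survives and only the triggering fragments are removed.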
|
||||||
|
||||||
#### About the Default Rules | ||||||
|
||||||
Injection detection has a number of action options that indicate what to do when potential exploitation is detected. | ||||||
Two options are currently available: `reject` and `omit`, with `sanitize` planned for a future release. | ||||||
By default, NeMo Guardrails provides the following rules: | ||||||
|
||||||
* `reject` will return a message to the user indicating that their query could not be handled and they should try again. | ||||||
* `omit` will return the model's output, removing the offending detected content. | ||||||
* `sanitize` attempts to "de-fang" the malicious content, returning the output in a way that is less likely to result in exploitation. This action is generally considered unsuitable for production use. | ||||||
- Code injection (Python): Recommended if the LLM output is used as an argument to downstream functions or passed to a code interpreter. | ||||||
- SQL injection: Recommended if the LLM output is used as part of a SQL query to a database. | ||||||
- Template injection (Jinja): Recommended for use if LLM output is rendered using templating languages like Jinja. | ||||||
The syntax used is specific to Jinja, so even if you're using something Jinja-like, it won't work.
Thanks for the clarification about "Jinja-like" not working. I'm not a fan of "currently" because, depending on priorities, it might read "currently" for three or four years. How about...
When we add more languages, we'll just have to update this section--and would have to even if "currently" was in the text.
||||||
This rule is usually paired with code injection rules. | ||||||
- Cross-site scripting (Markdown and JavaScript): Recommended if the LLM output is rendered directly in HTML or Markdown. | ||||||
|
||||||
You can view the default rules in the [yara_rules directory](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/injection_detection/yara_rules) of the GitHub repository. | ||||||
|
||||||
#### Configuring Injection Detection | ||||||
To activate injection detection, you must include the `injection detection` output flow. | ||||||
|
||||||
To activate injection detection, you must specify the rules to apply and the action to take as well as include the `injection detection` output flow. | ||||||
As an example config: | ||||||
|
||||||
```colang | ||||||
```yaml | ||||||
rails: | ||||||
config: | ||||||
injection_detection: | ||||||
|
@@ -991,14 +999,73 @@ rails: | |||||
- injection detection | ||||||
``` | ||||||
|
||||||
**SECURITY WARNING:** It is _strongly_ advised that the `sanitize` action not be used in production systems, as there is no guarantee of its efficacy, and it may lead to adverse security outcomes. | ||||||
Refer to the following table for the `rails.config.injection_detection` field syntax reference: | ||||||
|
||||||
```{list-table} | ||||||
:header-rows: 1 | ||||||
|
||||||
* - Field | ||||||
- Description | ||||||
- Default Value | ||||||
|
||||||
* - `injections` | ||||||
- Specifies the injection detection rules to use. | ||||||
The following injections are part of the library: | ||||||
|
||||||
- `code` for Python code injection | ||||||
- `sqli` for SQL injection | ||||||
- `template` for Jinja template injection | ||||||
- `xss` for cross-site scripting | ||||||
- None (required) | ||||||
|
||||||
* - `action` | ||||||
- Specifies the action to take when injection is detected. | ||||||
Refer to the following actions: | ||||||
|
||||||
- `reject` returns a message to the user indicating that the query could not be handled and they should try again. | ||||||
- `omit` returns the model response, removing the offending detected content. | ||||||
- None (required) | ||||||
|
||||||
* - `yara_path` | ||||||
- Specifies the path to a directory that contains custom YARA rules. | ||||||
- `library/injection_detection/yara_rules` in the NeMo Guardrails package. | ||||||
``` | ||||||
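As a rough illustration of the field reference above, the following sketch checks an `injection_detection` section that has already been parsed from YAML into a dictionary. The `check_injection_detection` helper is hypothetical and is not part of NeMo Guardrails. The decision to accept unrecognized rule names whenever `yara_path` is set is an assumption made for this example.

```python
from typing import Any, Dict, List

# Values listed in the field reference table above.
KNOWN_INJECTIONS = {"code", "sqli", "template", "xss"}
KNOWN_ACTIONS = {"reject", "omit"}


def check_injection_detection(section: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems: List[str] = []

    injections = section.get("injections")
    if not injections:
        problems.append("`injections` is required")
    else:
        unknown = sorted(set(injections) - KNOWN_INJECTIONS)
        # Assumption: names outside the library are acceptable only
        # when a custom rules directory is configured.
        if unknown and "yara_path" not in section:
            problems.append(f"unknown injections without `yara_path`: {unknown}")

    action = section.get("action")
    if action is None:
        problems.append("`action` is required")
    elif action not in KNOWN_ACTIONS:
        problems.append(f"unsupported action: {action!r}")

    return problems
```

A check like this could run before the configuration is handed to the library, so that a missing `action` or a misspelled rule name surfaces early.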
|
||||||
#### Example | ||||||
|
||||||
Before you begin, install the `yara-python` package, or install the NeMo Guardrails package with `pip install nemoguardrails[jailbreak]`. | ||||||
|
||||||
1. Set your NVIDIA API key as an environment variable: | ||||||
|
||||||
```console | ||||||
$ export NVIDIA_API_KEY=<nvapi-...> | ||||||
``` | ||||||
|
||||||
1. Create a configuration directory, such as `config`, and add a `config.yml` file with contents like the following: | ||||||
|
||||||
```{literalinclude} ../../examples/configs/injection_detection/config/config.yml | ||||||
:language: yaml | ||||||
``` | ||||||
|
||||||
1. Load the guardrails configuration: | ||||||
|
||||||
```{literalinclude} ../../examples/configs/injection_detection/demo.py | ||||||
:language: python | ||||||
:start-after: "# start-load-config" | ||||||
:end-before: "# end-load-config" | ||||||
``` | ||||||
|
||||||
1. Send a possibly unsafe request: | ||||||
|
||||||
```{literalinclude} ../../examples/configs/injection_detection/demo.py | ||||||
:language: python | ||||||
:start-after: "# start-unsafe-response" | ||||||
:end-before: "# end-unsafe-response" | ||||||
``` | ||||||
|
||||||
This rail is primarily intended to be used in agentic systems to _enhance_ other security controls as part of a defense-in-depth strategy. | ||||||
The provided rules are recommended to be used in the following settings: | ||||||
* `code`: Recommended if the LLM's output will be used as an argument to downstream functions or passed to a code interpreter. | ||||||
* `sqli`: Recommended if the LLM's output will be used as part of a SQL query to a database. | ||||||
* `template`: Recommended for use if LLM output is rendered using templating languages like Jinja. This rule should usually be paired with `code` rules. | ||||||
* `xss`: Recommended if LLM output will be rendered directly in HTML or Markdown. | ||||||
*Example Output* | ||||||
|
||||||
The included rules are in no way comprehensive. | ||||||
They can and should be extended by security teams for use in your application's particular context and paired with additional security controls. | ||||||
```{literalinclude} ../../examples/configs/injection_detection/demo-out.txt | ||||||
:start-after: "# start-unsafe-response" | ||||||
:end-before: "# end-unsafe-response" | ||||||
``` |
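When the action is `reject`, calling code may want to know whether a given response was replaced by the rejection message. The sketch below assumes that the response is a dictionary with `role` and `content` keys, as in the example output, and that a rejected response begins with the message text quoted earlier in this section. The `was_rejected` helper is hypothetical.

```python
from typing import Any, Dict

# Fixed prefix of the rejection message quoted in the documentation;
# the rule names that follow it vary by detection.
REJECT_PREFIX = "I'm sorry, the desired output triggered rule(s)"


def was_rejected(response: Dict[str, Any]) -> bool:
    """Return True if the assistant content looks like the rejection message."""
    content = response.get("content") or ""
    return content.startswith(REJECT_PREFIX)
```

Matching on a message prefix is fragile if the wording changes, so treat this as a convenience check rather than a security control.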
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
models: | ||
- type: main | ||
engine: nvidia_ai_endpoints | ||
model: meta/llama-3.3-70b-instruct | ||
|
||
rails: | ||
config: | ||
injection_detection: | ||
injections: | ||
- code | ||
- sqli | ||
- template | ||
- xss | ||
action: reject | ||
|
||
output: | ||
streaming: | ||
enabled: True | ||
chunk_size: 200 | ||
context_size: 50 | ||
|
||
streaming: True |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# start-unsafe-response | ||
{'role': 'assistant', 'content': '**Getting the Weather in Santa Clara using Python**\n=====================================================\n\nTo get the weather in Santa Clara, we can use the OpenWeatherMap API, which provides current and forecasted weather conditions. We will use the `requests` library to make an HTTP request to the API and the `json` library to parse the response.\n\n**Prerequisites**\n---------------\n\n* Python 3.x\n* `requests` library (`pip install requests`)\n* OpenWeatherMap API key (sign up for free at [OpenWeatherMap](https://home.openweathermap.org/users/sign_up))\n\n**Code**\n-----\n\n```python\nimport requests\nimport json\n\ndef get_weather(api_key, city, units=\'metric\'):\n """\n Get the current weather in a city.\n\n Args:\n api_key (str): OpenWeatherMap API key\n city (str): City name\n units (str, optional): Units of measurement (default: \'metric\')\n\n Returns:\n dict: Weather data\n """\n base_url = \'http://api.openweathermap.org/data/2.5/weather\'\n params = {\n \'q\': city,\n \'units\': units,\n \'appid\': api_key\n }\n response = requests.get(base_url, params=params)\n response.raise_for_status()\n return response.json()\n\ndef main():\n api_key = \'YOUR_API_KEY\' # replace with your OpenWeatherMap API key\n city = \'Santa Clara\'\n weather_data = get_weather(api_key, city)\n print(\'Weather in {}:\'.format(city))\n print(\'Temperature: {}°C\'.format(weather_data[\'main\'][\'temp\']))\n print(\'Humidity: {}%\'.format(weather_data[\'main\'][\'humidity\']))\n print(\'Conditions: {}\'.format(weather_data[\'weather\'][0][\'description\']))\n\nif __name__ == \'__main__\':\n main()\n```\n\n**Explanation**\n--------------\n\n1. We import the required libraries: `requests` for making HTTP requests and `json` for parsing the response.\n2. We define a function `get_weather` that takes the API key, city name, and units of measurement as arguments.\n3. 
We construct the API URL and parameters using the `base_url` and `params` dictionary.\n4. We make a GET request to the API using `requests.get` and raise an exception if the response status code is not 200 (OK).\n5. We parse the response data using `response.json()` and return the result as a dictionary.\n6. In the `main` function, we set the API key and city name, call `get_weather`, and print the weather data.\n\n**Notes**\n-------\n\n* Replace `YOUR_API_KEY` with your actual OpenWeatherMap API key.\n* You can adjust the units of measurement by passing a different value for the `units` parameter (e.g., \'imperial\' for Fahrenheit).\n* This script only retrieves the current weather conditions. For forecasted weather, you can use the [OpenWeatherMap forecast API](https://openweathermap.org/forecast5).\n\nI hope this helps! Let me know if you have any questions or need further assistance.'} | ||
# end-unsafe-response |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
# SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
# fmt: off | ||
|
||
import atexit | ||
import os | ||
import sys | ||
from pathlib import Path | ||
|
||
curdir = os.getcwd() | ||
|
||
@atexit.register | ||
def cleanup(): | ||
os.chdir(curdir) | ||
|
||
os.chdir(Path(__file__).parent) | ||
|
||
# start-load-config | ||
import os | ||
|
||
from nemoguardrails import LLMRails, RailsConfig | ||
|
||
config = RailsConfig.from_path("./config") | ||
rails = LLMRails(config) | ||
# end-load-config | ||
|
||
# start-unsafe-response | ||
messages = [ | ||
{ | ||
"role": "user", | ||
"content": "Write a Python script that uses the requests library, and maybe urllib, to get the weather in Santa Clara.", | ||
}, | ||
] | ||
|
||
response = rails.generate(messages=messages) | ||
print(response) | ||
# end-unsafe-response | ||
|
||
stdout = sys.stdout | ||
with open("demo-out.txt", "w") as sys.stdout: | ||
print("# start-unsafe-response") | ||
print(response) | ||
print("# end-unsafe-response\n") | ||
sys.stdout = stdout |