docs/source/torch/attention.md
It contains the following predefined fields:

| Field | Type | Description |
|---|---|---|
| request_ids | List[int] | The request ID of each sequence in the batch. |
| prompt_lens | List[int] | The prompt length of each sequence in the batch. |
| kv_cache_params | KVCacheParams | The parameters for the KV cache. |
During `AttentionMetadata.__init__`, you can initialize additional fields for the new attention metadata. For example, the Flashinfer metadata initializes its `decode_wrapper` here.
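The pattern above can be sketched as follows. This is a minimal, hypothetical illustration: `AttentionMetadata` here is a small dataclass stand-in for the real class (which has many more members), and `MyBackendAttentionMetadata` is an invented subclass analogous to the Flashinfer metadata that builds its `decode_wrapper` during initialization.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Minimal stand-in for the real AttentionMetadata; the field names follow
# the table above, but the actual class defines many more members.
@dataclass
class AttentionMetadata:
    request_ids: List[int] = field(default_factory=list)
    prompt_lens: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Hook where subclasses set up backend-specific fields.
        pass


# Hypothetical backend-specific metadata, analogous to how the Flashinfer
# metadata creates its decode_wrapper during initialization.
@dataclass
class MyBackendAttentionMetadata(AttentionMetadata):
    decode_wrapper: Optional[object] = None

    def __post_init__(self) -> None:
        super().__post_init__()
        if self.decode_wrapper is None:
            # A real backend would construct its wrapper object here
            # (e.g. a FlashInfer paged-KV decode wrapper).
            self.decode_wrapper = object()


meta = MyBackendAttentionMetadata(request_ids=[0, 1], prompt_lens=[16, 32])
```

The base class's fields remain available on the subclass, so per-batch data such as `request_ids` and `prompt_lens` is populated the same way regardless of which attention backend's metadata is in use.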