Issues: vllm-project/vllm
[Bug]: [v0.8.4][Critical] Tools calling broken: xgrammar rejects minItems in JSON Schema, blocking agent functionality
bug · #16880 · opened Apr 19, 2025 by okamiRvS

[Usage]: Request scheduling when using LoRA
usage · #16876 · opened Apr 19, 2025 by chenhongyu2048
[Bug]: KeyError in mm_input_cache when processing multimodal requests with Qwen2.5-VL-72B
bug · #16875 · opened Apr 19, 2025 by uyzhang

[New Model]: jinaai/jina-embeddings-v2-base-code
new-model · #16874 · opened Apr 19, 2025 by cynial

[Bug]: Bug while using deepspeed with TRL with vLLM
bug · #16867 · opened Apr 18, 2025 by abeerag
[Bug]: vllm 0.8.3 abnormal TTFT (too long) in the first serving
bug · #16858 · opened Apr 18, 2025 by sjtu-zwh

[Feature]: Support Gemma 3 QAT series
feature request · #16856 · opened Apr 18, 2025 by rbavery

[Bug]: Two BOS when using chat
bug · #16853 · opened Apr 18, 2025 by efsotr

[Bug]: Invalid JSON schema disconnects the container from the GPU without user notice
bug · #16851 · opened Apr 18, 2025 by Rictus
[Bug]: Calling the load_weights method of the MOE model failed
bug · #16842 · opened Apr 18, 2025 by lyz22233

[Bug]: ROCm Memory Access Fault
bug · rocm · #16840 · opened Apr 18, 2025 by zhang-yu-wei
[Bug]: 0.8.3 V1 engine /completions with prompt_logprobs outputs Ġ instead of a space in decoded_token
bug · #16838 · opened Apr 18, 2025 by tripathiarpan20

[Bug]: 0.8.3 V1 engine /completions breaks when prompt_logprobs set to 10; works when set between 1-9
bug · #16836 · opened Apr 18, 2025 by tripathiarpan20
[Bug]: PreemptionMode.RECOMPUTE is incorrect
bug · #16832 · opened Apr 18, 2025 by efsotr

[Bug]: An error occurred when deploying DeepSeek-R1-Channel-INT8 on two A100 machines using lws
bug · #16827 · opened Apr 18, 2025 by MouseSun846

[Bug]: The Transformers implementation of My Model is not compatible with vLLM
bug · #16826 · opened Apr 18, 2025 by SnowCharmQ
[Bug]: Bug in LRUEvictor: priority_queue and free_table desynchronization causes an error
bug · #16825 · opened Apr 18, 2025 by loricxy0707

[Usage]: How does vLLM split the model when pipeline parallel size > 1? Is it divided equally by the number of hidden layers?
usage · #16822 · opened Apr 18, 2025 by janelu9

[Bug]: benchmark with mii backend raises an error
bug · #16821 · opened Apr 18, 2025 by tishizaki

[Feature]: suggest passing a split tensor to RLHF vLLM's load_weights when tp>1
feature request · #16820 · opened Apr 18, 2025 by janelu9
[Bug]: When configuring Ray with a custom temporary directory using the --temp-dir parameter, the distributed multi-node inference cluster fails to deploy
bug · #16819 · opened Apr 18, 2025 by shalousun

[Installation]: how to add vllm[audio] when building from source on an arm64 platform
installation · #16816 · opened Apr 18, 2025 by fanfan-lucky

[Bug]: 0.8.4 serve QwQ-32B-AWQ failed
bug · #16811 · opened Apr 18, 2025 by hicodo
[Bug]: RuntimeError: operator _C::machete_gemm does not exist
bug · #16810 · opened Apr 18, 2025 by KilJaeeun

[Bug]: Cannot use FlashAttention-2 backend for head size 88 for serving llama4
bug · #16808 · opened Apr 18, 2025 by zhaoclaire