Issues: vllm-project/vllm
[Bug]: [v0.8.4][Critical] Tools calling broken: xgrammar rejects minItems in JSON Schema, blocking agent functionality
bug · #16880 · opened Apr 19, 2025 by okamiRvS

[Usage]: Request scheduling when using LoRA
usage · #16876 · opened Apr 19, 2025 by chenhongyu2048
[Bug]: KeyError in mm_input_cache when processing multimodal requests with Qwen2.5-VL-72B
bug · #16875 · opened Apr 19, 2025 by uyzhang

[New Model]: jinaai/jina-embeddings-v2-base-code
new-model · #16874 · opened Apr 19, 2025 by cynial

[Bug]: Bug while using deepspeed with TRL with vLLM
bug · #16867 · opened Apr 18, 2025 by abeerag
[Bug]: vllm 0.8.3 abnormal TTFT (too long) in the first serving
bug · #16858 · opened Apr 18, 2025 by sjtu-zwh

[Feature]: Support Gemma 3 QAT series
feature request · #16856 · opened Apr 18, 2025 by rbavery

[Bug]: Two BOS when using chat
bug · #16853 · opened Apr 18, 2025 by efsotr

[Bug]: Invalid JSON schema disconnects the container from the GPU without user notice
bug · #16851 · opened Apr 18, 2025 by Rictus
[Bug]: Calling the load_weights method of the MOE model failed
bug · #16842 · opened Apr 18, 2025 by lyz22233

[Bug]: ROCm Memory Access Fault
bug · rocm · #16840 · opened Apr 18, 2025 by zhang-yu-wei
[Bug]: 0.8.3 V1 engine /completions with prompt_logprobs outputs Ġ instead of a space in decoded_token
bug · #16838 · opened Apr 18, 2025 by tripathiarpan20

[Bug]: 0.8.3 V1 engine /completions breaks when prompt_logprobs set to 10; works when set between 1-9
bug · #16836 · opened Apr 18, 2025 by tripathiarpan20
[Bug]: PreemptionMode.RECOMPUTE is incorrect
bug · #16832 · opened Apr 18, 2025 by efsotr

[Bug]: An error occurred when deploying DeepSeek-R1-Channel-INT8 on two A100 machines using lws
bug · #16827 · opened Apr 18, 2025 by MouseSun846

[Bug]: The Transformers implementation of My Model is not compatible with vLLM
bug · #16826 · opened Apr 18, 2025 by SnowCharmQ
[Bug]: Bug in LRUEvictor: priority_queue and free_table desynchronization causes an error
bug · #16825 · opened Apr 18, 2025 by loricxy0707

[Usage]: How does vLLM split the model when pipeline parallel size > 1? Is it divided equally by the number of hidden layers?
usage · #16822 · opened Apr 18, 2025 by janelu9

[Bug]: benchmark with mii backend raises an error
bug · #16821 · opened Apr 18, 2025 by tishizaki

[Feature]: suggest passing a split tensor to RLHF vLLM's load_weights when tp>1
feature request · #16820 · opened Apr 18, 2025 by janelu9
[Bug]: When configuring Ray with a custom temporary directory using the --temp-dir parameter, the distributed multi-node inference cluster fails to deploy
bug · #16819 · opened Apr 18, 2025 by shalousun

[Installation]: how to add vllm[audio] when building from source on an arm64 platform
installation · #16816 · opened Apr 18, 2025 by fanfan-lucky

[Bug]: 0.8.4 serve QwQ-32B-AWQ failed
bug · #16811 · opened Apr 18, 2025 by hicodo
[Bug]: RuntimeError: operator _C::machete_gemm does not exist
bug · #16810 · opened Apr 18, 2025 by KilJaeeun

[Bug]: Cannot use FlashAttention-2 backend for head size 88 for serving llama4
bug · #16808 · opened Apr 18, 2025 by zhaoclaire