Pull requests: vllm-project/vllm
Add multimodal input method in the documentation
documentation
Improvements or additions to documentation
#31601
opened Jan 1, 2026 by
labAxiaoming
5 tasks
Fix: Add None check in step_with_batch_queue for async scheduling
v1
#31600
opened Jan 1, 2026 by
rickychen-infinirc
[Qwen3-Omni] Prefer CUDA for faster Whisper audio feature extraction
nvidia
qwen
Related to Qwen models
#31598
opened Jan 1, 2026 by
Jzz1943
[ROCm][CI] Fix language generation test accuracy by disabling HF flash_sdp and mem_efficient_sdp
rocm
Related to AMD ROCm
#31597
opened Jan 1, 2026 by
AndreasKaratzas
[MoE] Fix output_shape calculation in Attention layer to handle 3D query inputs
#31596
opened Jan 1, 2026 by
AndreasKaratzas
Fix audio mono dimension
documentation
Improvements or additions to documentation
multi-modality
Related to multi-modality (#4194)
qwen
Related to Qwen models
#31595
opened Jan 1, 2026 by
jeremyteboul
Fix flashinfer experts quant config hack
llama
Related to Llama models
nvidia
#31593
opened Jan 1, 2026 by
robertgshaw2-redhat
Draft
5 tasks
feat(kv-cache): support multiple sliding window groups in HybridKVCac…
v1
#31592
opened Jan 1, 2026 by
DZADSL72-00558
5 tasks
[Misc] Tidy up some spec decode logic in GPUModelRunner
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#31591
opened Dec 31, 2025 by
njhill
[Bugfix] Replace BaseException with specific exceptions in FLA utils
ready
ONLY add when PR is ready to merge/full CI is needed
#31590
opened Dec 31, 2025 by
c0de128
2 of 3 tasks
[Bugfix] Narrow broad exceptions in rank detection functions
#31589
opened Dec 31, 2025 by
c0de128
2 of 3 tasks
[Bugfix][Hardware][ROCm] Narrow broad exception in PyNCCL library loading
rocm
Related to AMD ROCm
#31587
opened Dec 31, 2025 by
c0de128
2 of 3 tasks
[Bugfix] Narrow broad exception in custom all-reduce detection
#31586
opened Dec 31, 2025 by
c0de128
2 of 3 tasks
[Bug] Revert torch warning fix
bug
Something isn't working
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#31585
opened Dec 31, 2025 by
yewentao256
[BugFix] scheduler: Fix resuming of preempted requests after async load
v1
#31583
opened Dec 31, 2025 by
orozery
[Frontend] [Bugfix] respect server-level default chat template kwargs in reasoning parser
frontend
#31581
opened Dec 31, 2025 by
cjackal
3 of 5 tasks
[Bugfix]: update global_rank when adjusting rpc_rank to fix layer key error
v1
#31580
opened Dec 31, 2025 by
zhaoninge
3 of 5 tasks
feat: add vllm.utils.device_utils module
#31576
opened Dec 31, 2025 by
codebasecomprehension
[Model] Support IQuestCoder model
new-model
Requests to new models
#31575
opened Dec 31, 2025 by
yxing-bj
5 tasks
[P/D] Refactor mooncake connector sender thread using async coroutines
kv-connector
#31573
opened Dec 31, 2025 by
dtcccc
5 tasks
[Bugfix] Fix activation quantization for compressed-tensors W4A16
ready
ONLY add when PR is ready to merge/full CI is needed
#31572
opened Dec 31, 2025 by
Tmn07
[Quantization][MoE] remove unused ep logic from moe marlin
#31571
opened Dec 31, 2025 by
jinzhen-lin
feat: support LoRA for DeepSeek-OCR(Language Model part)
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#31569
opened Dec 31, 2025 by
zhima771
[Core] Optimize group size selection for hybrid KV cache
v1
#31568
opened Dec 31, 2025 by
DZADSL72-00558
5 tasks
[Model] Support SentenceTransformers V6 reranker config
documentation
Improvements or additions to documentation
frontend