Pull requests: vllm-project/vllm
Pull requests list
[Bugfix] Add revision to transformers.Auto*.from_pretrained processors
#17948
opened May 10, 2025 by
xinli-centml
[v1] Support multiple KV cache groups in GPU model runner
needs-rebase
tpu
Related to Google TPUs
v1
#17945
opened May 10, 2025 by
heheda12345
[doc] list the hf downloaded models
documentation
Improvements or additions to documentation
#17940
opened May 10, 2025 by
reidliu41
[BugFix] Correct max_model_len derivation from config.json for Mistral format
#17937
opened May 10, 2025 by
princepride
[doc] update lora doc
documentation
Improvements or additions to documentation
#17936
opened May 10, 2025 by
reidliu41
[Bugfix] Avoid repeatedly creating dummy data during engine startup
multi-modality
Related to multi-modality (#4194)
v1
#17935
opened May 10, 2025 by
DarkLight1337
[kernel] integrate permute/unpermute kernel into deepgemm moe
#17934
opened May 10, 2025 by
CalebDu
[Misc][RFC] Add automated profiling sweep and heatmap visualization tools
#17933
opened May 10, 2025 by
ConstBob
[WIP] automatically bind CPU OMP Threads of a rank to CPU ids of a NUMA node.
ci/build
#17930
opened May 10, 2025 by
louie-tsai
[Frontend] [Core] Add Tensorizer support for LoRA adapter serialization and deserialization
documentation
Improvements or additions to documentation
#17926
opened May 9, 2025 by
sangstar
TESTING CI test completion - no need to merge.
ci/build
needs-rebase
#17921
opened May 9, 2025 by
Alexei-V-Ivanov-AMD
[Hardware][Intel-Gaudi] enable text embedding for Intel-Gaudi backend
#17920
opened May 9, 2025 by
libinta
WIP: fix_llama4_tool_call
documentation
Improvements or additions to documentation
frontend
tool-calling
#17917
opened May 9, 2025 by
wukaixingxp
•
Draft
[Bugfix][V1] Only get input embeddings w/ multi-modal models if first PP
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#17916
opened May 9, 2025 by
jinhuang12
[Misc] Add compressed-tensors NVFP4A16 emulation support
quantization
ready
ONLY add when PR is ready to merge/full CI is needed
#17914
opened May 9, 2025 by
dsikka
[BugFix][AMD] Compatible patch for AITER lib after 04/20
#17912
opened May 9, 2025 by
qli88
[ROCm] Skip tests for quantizations incompatible with ROCm
ready
ONLY add when PR is ready to merge/full CI is needed
#17905
opened May 9, 2025 by
hissu-hyvarinen
[Bugfix] Engine crash when using large integer maximum in JSON schema
structured-output
v1
#17897
opened May 9, 2025 by
chaunceyjiang
•
Draft
[UT] Add ut for none hash
ci/build
documentation
Improvements or additions to documentation
frontend
multi-modality
Related to multi-modality (#4194)
speculative-decoding
structured-output
tool-calling
v1
#17892
opened May 9, 2025 by
andyxning