Pull requests: vllm-project/vllm
[Bugfix] Remove comments re: pytorch for outlines + compressed-tensors dependencies
#12260 · opened Jan 21, 2025 by tdoublep · labels: ci/build, ready
[V1][Bugfix] Fix data item ordering in mixed-modality inference
#12259 · opened Jan 21, 2025 by ywang96 · labels: ready
[core] separate builder init and builder prepare for each batch
#12253 · opened Jan 21, 2025 by youkaichao
[Model] Enable Inference Support for the New Baichuan-M1 Model
#12251 · opened Jan 21, 2025 by rainkert · labels: documentation
[torch.compile] decouple compile sizes and cudagraph sizes
#12243 · opened Jan 21, 2025 by youkaichao
[Frontend] Set server's maximum number of generated tokens using generation_config.json
#12242 · opened Jan 21, 2025 by mhendrey · labels: frontend
[Kernel] fix moe_align_block_size error condition
#12239 · opened Jan 21, 2025 by jinzhen-lin
[Docs] Update FP8 KV Cache documentation
#12238 · opened Jan 21, 2025 by mgoin · labels: documentation
[Misc] Set default backend to SDPA for get_vit_attn_backend
#12235 · opened Jan 21, 2025 by wangxiyuan · labels: ready
[Misc] Move find_loaded_library to platform_aware_utils.py
#12231 · opened Jan 20, 2025 by houseroad
[V1][Spec Decode] Ngram Spec Decode
#12193 · opened Jan 19, 2025 by LiuXiaoxuanPKU · Draft · 1 of 7 tasks
[Bugfix] fix race condition that leads to wrong order of tokens returned
#12192 · opened Jan 19, 2025 by joennlae
[Kernel] add triton fused moe kernel for gptq/awq
#12185 · opened Jan 18, 2025 by jinzhen-lin
[Hardware][Gaudi][Bugfix] Fix HPU tensor parallelism, enable multiprocessing executor
#12167 · opened Jan 17, 2025 by kzawora-intel
[Quantization/Parameter] WIP: Another Implementation of the Quantization Parameter Subclass Substitution
#12158 · opened Jan 17, 2025 by cennn
[Core] Optimize topp/topk calculation in sampler
#12156 · opened Jan 17, 2025 by afierka-intel
[WIP][Hardware][CPU] testing branch for mlperf
#12141 · opened Jan 17, 2025 by bigPYJ1151 · Draft · labels: ci/build, documentation, needs-rebase
[Misc] Update to Transformers 4.48
#12120 · opened Jan 16, 2025 by tlrmchlsmth · labels: ci/build, ready
[BUILD] Add VLLM_BUILD_EXT to control custom op build
#12116 · opened Jan 16, 2025 by MengqingCao · labels: ci/build
[Misc] add modules_to_not_convert attribute to gptq series
#12103 · opened Jan 16, 2025 by 1096125073