[V1] Multiprocessing Tensor Parallel Support for v1 #9856
Merged
Changes from 12 commits
68 commits
5ad9c60 initial v1 tp support (tlrmchlsmth)
49869fa V1 TP with zmq-based boostrapping (tlrmchlsmth)
71e08aa improve check for USE_SCHED_YIELD (tlrmchlsmth)
4930246 Merge branch 'main' into tms/v1_tp (tlrmchlsmth)
3ea0cae fixup (tlrmchlsmth)
d4b55ae workers must be daemonic (tlrmchlsmth)
feeed73 We can now terminate properly (tlrmchlsmth)
e3c9c5c Merge branch 'main' into tms/v1_tp (tlrmchlsmth)
254714d fixes from merge (tlrmchlsmth)
10a627e Fixup termination (tlrmchlsmth)
d95c01e Appease mypy (tlrmchlsmth)
c08bae4 Allow shm_broadcast to enqueue by eithr pickle or msgpack (tlrmchlsmth)
2392755 Switch back to pickle for shm_broadcast serialization (tlrmchlsmth)
bf3705c Finish msgpack -> pickle (tlrmchlsmth)
d4ea706 wrap sched_yield and time.sleep in a fn (tlrmchlsmth)
2174a5b Review comments (tlrmchlsmth)
25270ab Rename executors to uniproc and multiproc (tlrmchlsmth)
9322db5 more review comments (tlrmchlsmth)
b5bac31 format (tlrmchlsmth)
c4fcfce hacky hacky hacky cleanup (tlrmchlsmth)
bedd593 Fix spawn vs fork issue using approach from #8823 (tlrmchlsmth)
c03ef6d skip non-distributed tests in test_basic_correctness to see what happens (tlrmchlsmth)
8d9d557 fix async_llm (tlrmchlsmth)
5f3a570 format (tlrmchlsmth)
b59babc Fixes for testing (tlrmchlsmth)
66116c7 Abstract executor class for typing (tlrmchlsmth)
eaeebc3 remove enforce_eager, format (tlrmchlsmth)
6d53d6e remove stop_remote_worker_execution_loop (tlrmchlsmth)
a7025fb Remove profiling (tlrmchlsmth)
6a3f2da ExecutorMsg -> WorkerExecRequest (tlrmchlsmth)
d4e3813 Merge branch 'main' into tms/v1_tp (tlrmchlsmth)
52ef894 Merge branch 'main' into tms/v1_tp (tlrmchlsmth)
9f9883e ensure_termination (tlrmchlsmth)
1990433 Move ensure_termination to executor to avoid futures (tlrmchlsmth)
f8a1b9b minor updates (tlrmchlsmth)
963c97f call destroy_distributed_environment atexit (tlrmchlsmth)
0678911 more graceful shutdown (tlrmchlsmth)
3d71b53 Simplify worker termination (tlrmchlsmth)
88c9c7b atexit -> weakref.finalize (tlrmchlsmth)
ab7cb89 minor cleanup (tlrmchlsmth)
024bcad core client cleanup rework (tlrmchlsmth)
d77bab5 poke CI (tlrmchlsmth)
24ffb8a fix V1 test, temporarily delete some noisy log statements (tlrmchlsmth)
be4260f nccl/issues/1234 (tlrmchlsmth)
cb4b363 Cleanup, _add_prefix (tlrmchlsmth)
365ea06 fixup noise a bit (tlrmchlsmth)
c94e11b tweaks (tlrmchlsmth)
2a36db7 Merge branch 'main' into tms/v1_tp (tlrmchlsmth)
536e5f2 back to atexit (tlrmchlsmth)
998eb1d Clean up process termination (tlrmchlsmth)
ebb2544 robcomments (tlrmchlsmth)
c81b7f5 format (tlrmchlsmth)
0817336 client now kills workers directly to avoid zombies (tlrmchlsmth)
f10e5e8 remote rpc (tlrmchlsmth)
e49b071 use WorkerWrapperBase (tlrmchlsmth)
661278f Merge branch 'main' into tms/v1_tp (tlrmchlsmth)
fce9696 de-duplicate env setup (tlrmchlsmth)
c61a3e0 Use collective_rpc for initialization (tlrmchlsmth)
8bb2430 add RPCParams for readability (tlrmchlsmth)
5271ec6 fixup (tlrmchlsmth)
50a12bc Merge branch 'main' into tms/v1_tp: instance id (tlrmchlsmth)
edab869 review comments (tlrmchlsmth)
e0aea84 Merge branch 'main' into tms/v1_tp (tlrmchlsmth)
ce08cb2 profile (tlrmchlsmth)
65b79c4 move vllm envs import to work with run_with_both_engines (tlrmchlsmth)
143ed09 Merge branch 'main' into tms/v1_tp (tlrmchlsmth)
ab6bf27 review comments. (tlrmchlsmth)
819b229 collective rpc function signature sanity (tlrmchlsmth)
@@ -0,0 +1,24 @@
from enum import Enum, auto
from typing import Optional

import msgspec

from vllm.v1.core.scheduler import SchedulerOutput


#TODO: Move this file
class ExecutorMsgType(Enum):
    TOIL = auto()
    TERMINATE = auto()


class ExecutorMsg(msgspec.Struct,
                  array_like=True,
                  omit_defaults=True,
                  gc=False):
    """A directive from the core process to its worker processes.

    Wraps SchedulerOutput with a message type to distinguish between
    regular work assignments and termination orders."""
    message_type: ExecutorMsgType
    payload: Optional[SchedulerOutput]
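
As a rough illustration of how a worker might consume the message type above, here is a minimal stdlib-only sketch. It is not the PR's implementation: the dataclass stands in for the `msgspec.Struct`, the payload is a plain dict rather than a `SchedulerOutput`, and `worker_loop` is a hypothetical helper invented for this example. Note also that the commit list shows `ExecutorMsg` being renamed to `WorkerExecRequest` later in the PR.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, Iterable, List, Optional


class ExecutorMsgType(Enum):
    TOIL = auto()
    TERMINATE = auto()


@dataclass
class ExecutorMsg:
    # Stand-in for the msgspec.Struct in the diff. The payload is typed as
    # Any because SchedulerOutput is not reproduced in this sketch.
    message_type: ExecutorMsgType
    payload: Optional[Any] = None


def worker_loop(messages: Iterable[ExecutorMsg]) -> List[Any]:
    """Hypothetical worker loop: handle TOIL messages until TERMINATE."""
    handled = []
    for msg in messages:
        if msg.message_type is ExecutorMsgType.TERMINATE:
            # A termination order stops the loop; later messages are ignored.
            break
        # A real worker would execute the scheduled work here; this sketch
        # only records the payload it was handed.
        handled.append(msg.payload)
    return handled


handled = worker_loop([
    ExecutorMsg(ExecutorMsgType.TOIL, {"step": 0}),
    ExecutorMsg(ExecutorMsgType.TOIL, {"step": 1}),
    ExecutorMsg(ExecutorMsgType.TERMINATE),
    ExecutorMsg(ExecutorMsgType.TOIL, {"step": 2}),
])
```

Carrying the message type alongside an optional payload lets a termination order travel over the same channel as regular work, so the worker needs only one receive path.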
Why is this skipped?
This test fails on V1 but I don't know why. It's not related to this PR as it's not running TP and fails on current main (just enabled it on #10864)