
[CI/Build] improve python-only dev setup #9621

Merged on Dec 4, 2024 (23 commits; changes from all commits shown).
78d16eb  [CI/Build] improve dev setup (dtrifiro, Oct 22, 2024)
548433b  remove python_only_dev.py script, update docs (dtrifiro, Oct 23, 2024)
75f2a2a  bump python version in installation guide (dtrifiro, Oct 23, 2024)
bb14531  docs: add sccache section (dtrifiro, Oct 23, 2024)
6675435  docs: fix VLLM_USE_PRECOMPILED env var usage, fix typos/rewording (dtrifiro, Oct 24, 2024)
8076513  fix inclusion of vllm_flash_attn python/compiled files (dtrifiro, Oct 28, 2024)
f83ade0  fix build isolation (dtrifiro, Oct 28, 2024)
ff364ae  fixup (dtrifiro, Oct 28, 2024)
9c41aba  extract pre-compiled wheel logic into repackage_wheel() (dtrifiro, Oct 28, 2024)
94a2fac  use files_to_copy (dtrifiro, Dec 3, 2024)
0acb1b2  allow to set VLLM_PRECOMPILED_WHEEL_LOCATION for custom wheel location (dtrifiro, Dec 3, 2024)
b17f15b  setup.py: use use wheel location instead of wheel filename (dtrifiro, Dec 3, 2024)
42c9a45  fix docs linting complaints (dtrifiro, Dec 3, 2024)
83dcfb6  explicit files to copy (youkaichao, Dec 4, 2024)
400d1e2  use wheel_path (youkaichao, Dec 4, 2024)
60532e4  use member (youkaichao, Dec 4, 2024)
4c0c89b  fix format (youkaichao, Dec 4, 2024)
c129f9c  add notes (youkaichao, Dec 4, 2024)
69e6c4e  setup.py: refactor repackage_wheel into custom build_ext class (dtrifiro, Dec 4, 2024)
73c07fc  setup.py: use vllm-wheels prefix for nightly wheels download dir (dtrifiro, Dec 4, 2024)
cef112a  remove verbose flag (youkaichao, Dec 4, 2024)
b7f0c3b  Merge branch 'main' into improve-python-only-dev-setup (youkaichao, Dec 4, 2024)
57ee0c1  remove super run (youkaichao, Dec 4, 2024)
41 changes: 12 additions & 29 deletions docs/source/getting_started/installation.rst
@@ -21,7 +21,7 @@ You can install vLLM using pip:
.. code-block:: console

$ # (Recommended) Create a new conda environment.
$ conda create -n myenv python=3.10 -y
$ conda create -n myenv python=3.12 -y
$ conda activate myenv

$ # Install vLLM with CUDA 12.1.
@@ -89,45 +89,24 @@ Build from source
Python-only build (without compilation)
---------------------------------------

If you only need to change Python code, you can simply build vLLM without compilation.

The first step is to install the latest vLLM wheel:

.. code-block:: console

pip install https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl

You can find more information about vLLM's wheels `above <#install-the-latest-code>`_.

After verifying that the installation is successful, you can use `the following script <https://github.com/vllm-project/vllm/blob/main/python_only_dev.py>`_:
If you only need to change Python code, you can build and install vLLM without compilation. With `pip's ``--editable`` flag <https://pip.pypa.io/en/stable/topics/local-project-installs/#editable-installs>`_, changes you make to the code are reflected when you run vLLM:

.. code-block:: console

$ git clone https://github.com/vllm-project/vllm.git
$ cd vllm
$ python python_only_dev.py
$ VLLM_USE_PRECOMPILED=1 pip install --editable .

The script will:
This will download the latest nightly wheel and use the compiled libraries from there in the install.

* Find the installed vLLM package in the current environment.
* Copy built files to the current directory.
* Rename the installed vLLM package.
* Symbolically link the current directory to the installed vLLM package.

Now, you can edit the Python code in the current directory, and the changes will be reflected when you run vLLM.

Once you have finished editing or want to install another vLLM wheel, you should exit the development environment using `the same script <https://github.com/vllm-project/vllm/blob/main/python_only_dev.py>`_ with the ``--quit-dev`` (or ``-q`` for short) flag:
The ``VLLM_PRECOMPILED_WHEEL_LOCATION`` environment variable can be used instead of ``VLLM_USE_PRECOMPILED`` to specify a custom path or URL to the wheel file. For example, to use the `0.6.3.post1 PyPI wheel <https://pypi.org/project/vllm/#files>`_:

.. code-block:: console

$ python python_only_dev.py --quit-dev

The ``--quit-dev`` flag will:

* Remove the symbolic link from the current directory to the vLLM package.
* Restore the original vLLM package from the backup.
$ export VLLM_PRECOMPILED_WHEEL_LOCATION=https://files.pythonhosted.org/packages/4a/4c/ee65ba33467a4c0de350ce29fbae39b9d0e7fcd887cc756fa993654d1228/vllm-0.6.3.post1-cp38-abi3-manylinux1_x86_64.whl
Review comment (Member):
can we unify these two env vars into one VLLM_PRECOMPILED_WHEEL=location ?

Reply (Contributor, Author):

Good call, I updated the workflow so that if VLLM_PRECOMPILED_WHEEL_LOCATION is set, there's no need to also provide VLLM_USE_PRECOMPILED. If VLLM_USE_PRECOMPILED is provided but the location is unset, it will use the default nightly url.

$ pip install --editable .

If you update the vLLM wheel and rebuild from source to make further edits, you will need to repeat the `Python-only build <#python-only-build>`_ steps.
You can find more information about vLLM's wheels `above <#install-the-latest-code>`_.

.. note::

@@ -148,9 +127,13 @@ If you want to modify C++ or CUDA code, you'll need to build vLLM from source. T
.. tip::

Building from source requires a lot of compilation. If you are building from source repeatedly, it's more efficient to cache the compilation results.

For example, you can install `ccache <https://github.com/ccache/ccache>`_ using ``conda install ccache`` or ``apt install ccache``.
As long as the ``which ccache`` command can find the ``ccache`` binary, it will be used automatically by the build system. After the first build, subsequent builds will be much faster.

`sccache <https://github.com/mozilla/sccache>`_ works similarly to ``ccache``, but has the capability to utilize caching in remote storage environments.
The following environment variables can be set to configure the vLLM ``sccache`` remote: ``SCCACHE_BUCKET=vllm-build-sccache SCCACHE_REGION=us-west-2 SCCACHE_S3_NO_CREDENTIALS=1``. We also recommend setting ``SCCACHE_IDLE_TIMEOUT=0``.
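Putting the documented settings together, a sketch of a build session using the vLLM ``sccache`` remote might look like the following (this assumes ``sccache`` is installed and on ``PATH``; the values are the ones quoted above):

```shell
# Configure the vLLM remote sccache cache described in the docs above.
export SCCACHE_BUCKET=vllm-build-sccache
export SCCACHE_REGION=us-west-2
export SCCACHE_S3_NO_CREDENTIALS=1   # cache is readable without AWS credentials
export SCCACHE_IDLE_TIMEOUT=0        # recommended: keep the sccache server alive

echo "sccache remote: s3://$SCCACHE_BUCKET ($SCCACHE_REGION)"
# then build from source; sccache is picked up automatically if found on PATH:
#   pip install -e .
```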


Use an existing PyTorch installation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
96 changes: 9 additions & 87 deletions python_only_dev.py
@@ -1,92 +1,14 @@
Added (the file is reduced to a deprecation notice):

# enable python only development
# copy compiled files to the current directory directly
msg = """Old style python only build (without compilation) is deprecated, please check https://docs.vllm.ai/en/latest/getting_started/installation.html#python-only-build-without-compilation for the new way to do python only build (without compilation).

TL;DR:

VLLM_USE_PRECOMPILED=1 pip install -e .

or

export VLLM_COMMIT=33f460b17a54acb3b6cc0b03f4a17876cff5eafd # use full commit hash from the main branch
export VLLM_PRECOMPILED_WHEEL_LOCATION=https://vllm-wheels.s3.us-west-2.amazonaws.com/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
pip install -e .
"""  # noqa

print(msg)

Removed (the old implementation):

import argparse
import os
import shutil
import subprocess
import sys
import warnings

parser = argparse.ArgumentParser(
    description="Development mode for python-only code")
parser.add_argument('-q',
                    '--quit-dev',
                    action='store_true',
                    help='Set the flag to quit development mode')
args = parser.parse_args()

# cannot directly `import vllm`, because it will try to
# import from the current directory
output = subprocess.run([sys.executable, "-m", "pip", "show", "vllm"],
                        capture_output=True)

assert output.returncode == 0, "vllm is not installed"

text = output.stdout.decode("utf-8")

package_path = None
for line in text.split("\n"):
    if line.startswith("Location: "):
        package_path = line.split(": ")[1]
        break

assert package_path is not None, "could not find package path"

cwd = os.getcwd()

assert cwd != package_path, "should not import from the current directory"

files_to_copy = [
    "vllm/_C.abi3.so",
    "vllm/_moe_C.abi3.so",
    "vllm/vllm_flash_attn/vllm_flash_attn_c.abi3.so",
    "vllm/vllm_flash_attn/flash_attn_interface.py",
    "vllm/vllm_flash_attn/__init__.py",
    # "vllm/_version.py", # not available in nightly wheels yet
]

# Try to create _version.py to avoid version related warning
# Refer to https://github.com/vllm-project/vllm/pull/8771
try:
    from setuptools_scm import get_version
    get_version(write_to="vllm/_version.py")
except ImportError:
    warnings.warn(
        "To avoid warnings related to vllm._version, "
        "you should install setuptools-scm by `pip install setuptools-scm`",
        stacklevel=2)

if not args.quit_dev:
    for file in files_to_copy:
        src = os.path.join(package_path, file)
        dst = file
        print(f"Copying {src} to {dst}")
        shutil.copyfile(src, dst)

    pre_built_vllm_path = os.path.join(package_path, "vllm")
    tmp_path = os.path.join(package_path, "vllm_pre_built")
    current_vllm_path = os.path.join(cwd, "vllm")

    print(f"Renaming {pre_built_vllm_path} to {tmp_path} for backup")
    shutil.copytree(pre_built_vllm_path, tmp_path)
    shutil.rmtree(pre_built_vllm_path)

    print(f"Linking {current_vllm_path} to {pre_built_vllm_path}")
    os.symlink(current_vllm_path, pre_built_vllm_path)
else:
    vllm_symlink_path = os.path.join(package_path, "vllm")
    vllm_backup_path = os.path.join(package_path, "vllm_pre_built")
    current_vllm_path = os.path.join(cwd, "vllm")

    print(f"Unlinking {current_vllm_path} to {vllm_symlink_path}")
    assert os.path.islink(
        vllm_symlink_path
    ), f"not in dev mode: {vllm_symlink_path} is not a symbolic link"
    assert current_vllm_path == os.readlink(
        vllm_symlink_path
    ), "current directory is not the source code of package"
    os.unlink(vllm_symlink_path)

    print(f"Recovering backup from {vllm_backup_path} to {vllm_symlink_path}")
    os.rename(vllm_backup_path, vllm_symlink_path)
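The copy/backup/symlink dance that the removed script performed can be reproduced in miniature. This sketch uses scratch temp directories standing in for site-packages and the git checkout (all paths here are hypothetical), and exercises both the enter-dev-mode and quit-dev-mode halves:

```python
import os
import shutil
import tempfile

# Stand-ins for the real locations the old script operated on.
site = tempfile.mkdtemp(prefix="site-")      # plays the role of site-packages
workdir = tempfile.mkdtemp(prefix="src-")    # plays the role of the git checkout

installed = os.path.join(site, "vllm")       # installed package directory
backup = os.path.join(site, "vllm_pre_built")
source = os.path.join(workdir, "vllm")       # source tree being developed
os.makedirs(installed)
os.makedirs(source)

# Enter dev mode: back up the installed package, then symlink the source tree
# into its place so `import vllm` resolves to the working copy.
shutil.copytree(installed, backup)
shutil.rmtree(installed)
os.symlink(source, installed)
assert os.path.islink(installed)
assert os.readlink(installed) == source

# Quit dev mode (the old --quit-dev path): remove the link, restore the backup.
os.unlink(installed)
os.rename(backup, installed)
assert os.path.isdir(installed) and not os.path.islink(installed)
```

The fragility of exactly this state juggling (a half-applied switch leaves the environment broken) is part of why the PR replaces it with a plain editable install.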
83 changes: 79 additions & 4 deletions setup.py
@@ -249,6 +249,74 @@ def run(self):
            self.copy_file(file, dst_file)


class repackage_wheel(build_ext):
    """Extracts libraries and other files from an existing wheel."""
    default_wheel = "https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"

    def run(self) -> None:
        wheel_location = os.getenv("VLLM_PRECOMPILED_WHEEL_LOCATION",
                                   self.default_wheel)

        assert _is_cuda(
        ), "VLLM_USE_PRECOMPILED is only supported for CUDA builds"

        import zipfile

        if os.path.isfile(wheel_location):
            wheel_path = wheel_location
            print(f"Using existing wheel={wheel_path}")
        else:
            # Download the wheel from a given URL, assume
            # the filename is the last part of the URL
            wheel_filename = wheel_location.split("/")[-1]

            import tempfile

            # create a temporary directory to store the wheel
            temp_dir = tempfile.mkdtemp(prefix="vllm-wheels")
            wheel_path = os.path.join(temp_dir, wheel_filename)

            print(f"Downloading wheel from {wheel_location} to {wheel_path}")

            from urllib.request import urlretrieve

            try:
                urlretrieve(wheel_location, filename=wheel_path)
            except Exception as e:
                from setuptools.errors import SetupError

                raise SetupError(
                    f"Failed to get vLLM wheel from {wheel_location}") from e

        with zipfile.ZipFile(wheel_path) as wheel:
            files_to_copy = [
                "vllm/_C.abi3.so",
                "vllm/_moe_C.abi3.so",
                "vllm/vllm_flash_attn/vllm_flash_attn_c.abi3.so",
                "vllm/vllm_flash_attn/flash_attn_interface.py",
                "vllm/vllm_flash_attn/__init__.py",
                # "vllm/_version.py", # not available in nightly wheels yet
            ]
            file_members = filter(lambda x: x.filename in files_to_copy,
                                  wheel.filelist)

            for file in file_members:
                print(f"Extracting and including {file.filename} "
                      "from existing wheel")
                package_name = os.path.dirname(file.filename).replace("/", ".")
                file_name = os.path.basename(file.filename)

                if package_name not in package_data:
                    package_data[package_name] = []

                wheel.extract(file)
                if file_name.endswith(".py"):
                    # python files shouldn't be added to package_data
                    continue

                package_data[package_name].append(file_name)
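The member-filtering logic in ``repackage_wheel.run()`` above can be exercised in isolation. This sketch (hypothetical member names, standard library only) builds a throwaway zip standing in for the wheel, then extracts only the allow-listed members and records the non-Python ones in ``package_data``, mirroring the loop in the diff:

```python
import os
import tempfile
import zipfile

tmp = tempfile.mkdtemp(prefix="wheel-demo-")
demo_wheel = os.path.join(tmp, "demo.whl")

# A stand-in wheel with one wanted member and one that should be skipped.
with zipfile.ZipFile(demo_wheel, "w") as zf:
    zf.writestr("vllm/_C.abi3.so", b"fake shared object")
    zf.writestr("vllm/unrelated.txt", b"not on the allow-list")

files_to_copy = ["vllm/_C.abi3.so"]
package_data = {}

with zipfile.ZipFile(demo_wheel) as wheel:
    # wheel.filelist holds ZipInfo objects; keep only allow-listed names.
    members = [m for m in wheel.filelist if m.filename in files_to_copy]
    for member in members:
        package_name = os.path.dirname(member.filename).replace("/", ".")
        file_name = os.path.basename(member.filename)
        package_data.setdefault(package_name, [])
        wheel.extract(member, path=tmp)
        if not file_name.endswith(".py"):
            # .py files ship via packages=, only binaries go in package_data
            package_data[package_name].append(file_name)

print(package_data)  # {'vllm': ['_C.abi3.so']}
```

The same shape explains the ``.py`` special case in the diff: Python sources are already collected by setuptools, so only the compiled artifacts need to be registered as package data.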


def _is_hpu() -> bool:
    is_hpu_available = True
    try:
@@ -403,6 +471,8 @@ def get_vllm_version() -> str:
        # skip this for source tarball, required for pypi
        if "sdist" not in sys.argv:
            version += f"{sep}cu{cuda_version_str}"
        if envs.VLLM_USE_PRECOMPILED:
            version += ".precompiled"
    elif _is_hip():
        # Get the HIP version
        hipcc_version = get_hipcc_rocm_version()
@@ -514,13 +584,18 @@ def _read_requirements(filename: str) -> List[str]:
package_data = {
    "vllm": ["py.typed", "model_executor/layers/fused_moe/configs/*.json"]
}
if envs.VLLM_USE_PRECOMPILED:
    ext_modules = []
    package_data["vllm"].append("*.so")

if _no_device():
    ext_modules = []

if not ext_modules:
    cmdclass = {}
else:
    cmdclass = {
        "build_ext":
        repackage_wheel if envs.VLLM_USE_PRECOMPILED else cmake_build_ext
    }

setup(
    name="vllm",
    version=get_vllm_version(),
@@ -557,7 +632,7 @@ def _read_requirements(filename: str) -> List[str]:
        "audio": ["librosa", "soundfile"],  # Required for audio processing
        "video": ["decord"]  # Required for video processing
    },
    cmdclass={"build_ext": cmake_build_ext} if len(ext_modules) > 0 else {},
    cmdclass=cmdclass,
    package_data=package_data,
    entry_points={
        "console_scripts": [
3 changes: 2 additions & 1 deletion vllm/envs.py
@@ -113,7 +113,8 @@ def get_default_config_root():

    # If set, vllm will use precompiled binaries (*.so)
    "VLLM_USE_PRECOMPILED":
    lambda: bool(os.environ.get("VLLM_USE_PRECOMPILED")),
    lambda: bool(os.environ.get("VLLM_USE_PRECOMPILED")) or bool(
        os.environ.get("VLLM_PRECOMPILED_WHEEL_LOCATION")),

    # CMake build type
    # If not set, defaults to "Debug" or "RelWithDebInfo"
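The ``envs.py`` change is what implements the reviewer's request: setting only the wheel location now also enables precompiled mode. A minimal reproduction of that lambda (outside vLLM, with a scratch environment dict in place of ``os.environ``):

```python
# Same shape as the envs.py entry: either variable flips the flag on.
def use_precompiled(env: dict) -> bool:
    return bool(env.get("VLLM_USE_PRECOMPILED")) or bool(
        env.get("VLLM_PRECOMPILED_WHEEL_LOCATION"))

assert use_precompiled({}) is False
assert use_precompiled({"VLLM_USE_PRECOMPILED": "1"}) is True
# Location alone is enough -- no need to also set VLLM_USE_PRECOMPILED,
# in which case the custom wheel is used instead of the default nightly URL.
assert use_precompiled(
    {"VLLM_PRECOMPILED_WHEEL_LOCATION": "/tmp/vllm.whl"}) is True
```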