[perf] use uv for venv creation and pip install #4414

Merged: 16 commits, Dec 3, 2024
134 changes: 10 additions & 124 deletions sky/setup_files/setup.py
@@ -20,17 +20,24 @@
import re
import subprocess
import sys
from typing import Dict, List

import setuptools

# __file__ is setup.py at the root of the repo. We shouldn't assume it's a
# symlink - e.g. in the sdist it's resolved to a normal file.
ROOT_DIR = os.path.dirname(__file__)
SETUP_FILE_DIR = os.path.join(ROOT_DIR, 'sky', 'setup_files')
INIT_FILE_PATH = os.path.join(ROOT_DIR, 'sky', '__init__.py')
_COMMIT_FAILURE_MESSAGE = (
'WARNING: SkyPilot fail to {verb} the commit hash in '
f'{INIT_FILE_PATH!r} (SkyPilot can still be normally used): '
'{error}')

# setuptools does not include the script dir on the search path, so manually add
# it so that we can import the dependencies file.
sys.path.append(SETUP_FILE_DIR)
import dependencies
Review comment (Collaborator):

The dependencies module does not seem to be included in this PR?
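
For context, here is a minimal sketch of what the `dependencies` module would need to expose for the `import dependencies` line to work. It assumes the module simply holds the lists that this diff removes from setup.py. The file path in the docstring and the abbreviated lists are illustrative assumptions, since the module itself is not shown in this diff.

```python
"""Hypothetical sky/setup_files/dependencies.py (abbreviated sketch)."""
from typing import Dict, List

# A few entries copied from the list removed from setup.py; the rest are elided.
install_requires: List[str] = [
    'wheel',
    'click >= 7.0',
    'jinja2 >= 3.0',
]

local_ray: List[str] = ['ray[default] >= 2.2.0, != 2.6.0']

extras_require: Dict[str, List[str]] = {
    'kubernetes': ['kubernetes>=20.0.0'],
    'docker': ['docker'] + local_ray,
    'lambda': local_ray,
}

# 'all' concatenates every extra's list, as in the removed setup.py code.
# Duplicates are kept; pip tolerates repeated requirement strings.
extras_require['all'] = sum(extras_require.values(), [])
```

With this layout, setup.py only needs `dependencies.install_requires` and `dependencies.extras_require`, which matches the two keyword arguments changed further down in this file.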


original_init_content = None

system = platform.system()
@@ -130,127 +137,6 @@ def parse_readme(readme: str) -> str:
return readme


install_requires = [
'wheel',
'cachetools',
# NOTE: ray requires click>=7.0.
'click >= 7.0',
'colorama',
'cryptography',
# Jinja has a bug in older versions because of the lack of pinning
# the version of the underlying markupsafe package. See:
# https://github.com/pallets/jinja/issues/1585
'jinja2 >= 3.0',
'jsonschema',
'networkx',
'pandas>=1.3.0',
'pendulum',
# PrettyTable with version >=2.0.0 is required for the support of
# `add_rows` method.
'PrettyTable >= 2.0.0',
'python-dotenv',
'rich',
'tabulate',
# Light weight requirement, can be replaced with "typing" once
# we deprecate Python 3.7 (this will take a while).
'typing_extensions',
'filelock >= 3.6.0',
'packaging',
'psutil',
'pulp',
# Cython 3.0 release breaks PyYAML 5.4.* (https://github.com/yaml/pyyaml/issues/601)
# <= 3.13 may encounter https://github.com/ultralytics/yolov5/issues/414
'pyyaml > 3.13, != 5.4.*',
'requests',
]

local_ray = [
# Lower version of ray will cause dependency conflict for
# click/grpcio/protobuf.
# Excluded 2.6.0 as it has a bug in the cluster launcher:
# https://github.com/ray-project/ray/releases/tag/ray-2.6.1
'ray[default] >= 2.2.0, != 2.6.0',
]

remote = [
# Adopted from ray's setup.py: https://github.com/ray-project/ray/blob/ray-2.4.0/python/setup.py
# SkyPilot: != 1.48.0 is required to avoid the error where ray dashboard fails to start when
# ray start is called (#2054).
# Tracking issue: https://github.com/ray-project/ray/issues/30984
"grpcio >= 1.32.0, <= 1.49.1, != 1.48.0; python_version < '3.10' and sys_platform == 'darwin'", # noqa:E501
"grpcio >= 1.42.0, <= 1.49.1, != 1.48.0; python_version >= '3.10' and sys_platform == 'darwin'", # noqa:E501
# Original issue: https://github.com/ray-project/ray/issues/33833
"grpcio >= 1.32.0, <= 1.51.3, != 1.48.0; python_version < '3.10' and sys_platform != 'darwin'", # noqa:E501
"grpcio >= 1.42.0, <= 1.51.3, != 1.48.0; python_version >= '3.10' and sys_platform != 'darwin'", # noqa:E501
# Adopted from ray's setup.py:
# https://github.com/ray-project/ray/blob/ray-2.9.3/python/setup.py#L343
'protobuf >= 3.15.3, != 3.19.5',
# Some pydantic versions are not compatible with ray. Adopted from ray's
# setup.py: https://github.com/ray-project/ray/blob/ray-2.9.3/python/setup.py#L254
'pydantic!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3',
]

# NOTE: Change the templates/jobs-controller.yaml.j2 file if any of the
# following packages dependencies are changed.
aws_dependencies = [
# botocore does not work with urllib3>=2.0.0, according to https://github.com/boto/botocore/issues/2926
# We have to explicitly pin the version to optimize the time for
# poetry install. See https://github.com/orgs/python-poetry/discussions/7937
'urllib3<2',
# NOTE: this installs CLI V1. To use AWS SSO (e.g., `aws sso login`), users
# should instead use CLI V2 which is not pip-installable. See
# https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html.
'awscli>=1.27.10',
'botocore>=1.29.10',
'boto3>=1.26.1',
# NOTE: required by awscli. To avoid ray automatically installing
# the latest version.
'colorama < 0.4.5',
]

extras_require: Dict[str, List[str]] = {
'aws': aws_dependencies,
# TODO(zongheng): azure-cli is huge and takes a long time to install.
# Tracked in: https://github.com/Azure/azure-cli/issues/7387
# azure-identity is needed in node_provider.
# We need azure-identity>=1.13.0 to enable the customization of the
# timeout of AzureCliCredential.
'azure': [
'azure-cli>=2.65.0', 'azure-core>=1.31.0', 'azure-identity>=1.19.0',
'azure-mgmt-network>=27.0.0', 'azure-mgmt-compute>=33.0.0',
'azure-storage-blob>=12.23.1', 'msgraph-sdk'
] + local_ray,
# We need google-api-python-client>=2.69.0 to enable 'discardLocalSsd'
# parameter for stopping instances.
# Reference: https://github.com/googleapis/google-api-python-client/commit/f6e9d3869ed605b06f7cbf2e8cf2db25108506e6
'gcp': ['google-api-python-client>=2.69.0', 'google-cloud-storage'],
'ibm': [
'ibm-cloud-sdk-core', 'ibm-vpc', 'ibm-platform-services', 'ibm-cos-sdk'
] + local_ray,
'docker': ['docker'] + local_ray,
'lambda': local_ray,
'cloudflare': aws_dependencies,
'scp': local_ray,
'oci': ['oci'] + local_ray,
'kubernetes': ['kubernetes>=20.0.0'],
'remote': remote,
'runpod': ['runpod>=1.5.1'],
'fluidstack': [], # No dependencies needed for fluidstack
'cudo': ['cudo-compute>=0.1.10'],
'paperspace': [], # No dependencies needed for paperspace
'vsphere': [
'pyvmomi==8.0.1.0.2',
# vsphere-automation-sdk is also required, but it does not have
# pypi release, which cause failure of our pypi release.
# https://peps.python.org/pep-0440/#direct-references
# We have the instruction for its installation in our
# docs instead.
# 'vsphere-automation-sdk @ git+https://github.com/vmware/[email protected]'
],
}

extras_require['all'] = sum(extras_require.values(), [])

long_description = ''
readme_filepath = 'README.md'
# When sky/backends/wheel_utils.py builds wheels, it will not contain the
@@ -277,8 +163,8 @@ def parse_readme(readme: str) -> str:
long_description_content_type='text/markdown',
setup_requires=['wheel'],
requires_python='>=3.7',
install_requires=install_requires,
extras_require=extras_require,
install_requires=dependencies.install_requires,
extras_require=dependencies.extras_require,
entry_points={
'console_scripts': ['sky = sky.cli:cli'],
},
44 changes: 33 additions & 11 deletions sky/skylet/constants.py
@@ -39,6 +39,7 @@
'which python3')
# Python executable, e.g., /opt/conda/bin/python3
SKY_PYTHON_CMD = f'$({SKY_GET_PYTHON_PATH_CMD})'
# Prefer SKY_UV_PIP_CMD, which is faster. TODO(cooper): remove all usages.
Review comment (Collaborator):

Mention in this code comment why we keep SKY_PIP_CMD, for future reference.

SKY_PIP_CMD = f'{SKY_PYTHON_CMD} -m pip'
# Ray executable, e.g., /opt/conda/bin/ray
# We need to add SKY_PYTHON_CMD before ray executable because:
@@ -50,6 +51,14 @@
SKY_REMOTE_PYTHON_ENV_NAME = 'skypilot-runtime'
SKY_REMOTE_PYTHON_ENV = f'~/{SKY_REMOTE_PYTHON_ENV_NAME}'
ACTIVATE_SKY_REMOTE_PYTHON_ENV = f'source {SKY_REMOTE_PYTHON_ENV}/bin/activate'
# uv is used for venv and pip, much faster than python implementations.
SKY_UV_INSTALL_DIR = '"$HOME/.local/bin"'
SKY_UV_CMD = f'{SKY_UV_INSTALL_DIR}/uv'
# This won't reinstall uv if it's already installed, so it's safe to re-run.
SKY_UV_INSTALL_CMD = (f'{SKY_UV_CMD} -V >/dev/null 2>&1 || '
'curl -LsSf https://astral.sh/uv/install.sh '
f'| UV_INSTALL_DIR={SKY_UV_INSTALL_DIR} sh')
SKY_UV_PIP_CMD = f'VIRTUAL_ENV={SKY_REMOTE_PYTHON_ENV} {SKY_UV_CMD} pip'
# Deleting the SKY_REMOTE_PYTHON_ENV_NAME from the PATH to deactivate the
# environment. `deactivate` command does not work when conda is used.
DEACTIVATE_SKY_REMOTE_PYTHON_ENV = (
@@ -148,28 +157,30 @@
'echo "Creating conda env with Python 3.10" && '
f'conda create -y -n {SKY_REMOTE_PYTHON_ENV_NAME} python=3.10 && '
f'conda activate {SKY_REMOTE_PYTHON_ENV_NAME};'
# Install uv for venv management and pip installation.
f'{SKY_UV_INSTALL_CMD};'
# Create a separate conda environment for SkyPilot dependencies.
f'[ -d {SKY_REMOTE_PYTHON_ENV} ] || '
# Do NOT use --system-site-packages here, because if users upgrade any
# packages in the base env, they interfere with skypilot dependencies.
# Reference: https://github.com/skypilot-org/skypilot/issues/4097
f'{SKY_PYTHON_CMD} -m venv {SKY_REMOTE_PYTHON_ENV};'
# --seed will include pip and setuptools, which are present in venvs created
# with python -m venv.
f'{SKY_UV_CMD} venv --seed {SKY_REMOTE_PYTHON_ENV};'
f'echo "$(echo {SKY_REMOTE_PYTHON_ENV})/bin/python" > {SKY_PYTHON_PATH_FILE};'
)

_sky_version = str(version.parse(sky.__version__))
RAY_STATUS = f'RAY_ADDRESS=127.0.0.1:{SKY_REMOTE_RAY_PORT} {SKY_RAY_CMD} status'
RAY_INSTALLATION_COMMANDS = (
f'{SKY_UV_INSTALL_CMD};'
'mkdir -p ~/sky_workdir && mkdir -p ~/.sky/sky_app;'
# Disable the pip version check to avoid the warning message, which makes
# the output hard to read.
'export PIP_DISABLE_PIP_VERSION_CHECK=1;'
# Print the PATH in provision.log to help debug PATH issues.
'echo PATH=$PATH; '
# Install setuptools<=69.5.1 to avoid the issue with the latest setuptools
# causing the error:
# ImportError: cannot import name 'packaging' from 'pkg_resources'"
f'{SKY_PIP_CMD} install "setuptools<70"; '
f'{SKY_UV_PIP_CMD} install "setuptools<70"; '
# Backward compatibility for ray upgrade (#3248): do not upgrade ray if the
# ray cluster is already running, to avoid the ray cluster being restarted.
#
@@ -183,10 +194,10 @@
# latest ray port 6380, but those existing cluster launched before #1790
# that has ray cluster on the default port 6379 will be upgraded and
# restarted.
f'{SKY_PIP_CMD} list | grep "ray " | '
f'{SKY_UV_PIP_CMD} list | grep "ray " | '
f'grep {SKY_REMOTE_RAY_VERSION} 2>&1 > /dev/null '
f'|| {RAY_STATUS} || '
f'{SKY_PIP_CMD} install --exists-action w -U ray[default]=={SKY_REMOTE_RAY_VERSION}; ' # pylint: disable=line-too-long
f'{SKY_UV_PIP_CMD} install -U ray[default]=={SKY_REMOTE_RAY_VERSION}; ' # pylint: disable=line-too-long
# In some envs, e.g. pip does not have permission to write under /opt/conda
# ray package will be installed under ~/.local/bin. If the user's PATH does
# not include ~/.local/bin (the pip install will have the output: `WARNING:
@@ -202,10 +213,21 @@
f'which ray > {SKY_RAY_PATH_FILE} || exit 1; }}; ')

SKYPILOT_WHEEL_INSTALLATION_COMMANDS = (
f'{{ {SKY_PIP_CMD} list | grep "skypilot " && '
f'{SKY_UV_INSTALL_CMD};'
f'{{ {SKY_UV_PIP_CMD} list | grep "skypilot " && '
'[ "$(cat ~/.sky/wheels/current_sky_wheel_hash)" == "{sky_wheel_hash}" ]; } || ' # pylint: disable=line-too-long
f'{{ {SKY_PIP_CMD} uninstall skypilot -y; '
f'{SKY_PIP_CMD} install "$(echo ~/.sky/wheels/{{sky_wheel_hash}}/'
f'{{ {SKY_UV_PIP_CMD} uninstall skypilot; '
# uv cannot install azure-cli normally, since it depends on pre-release
# packages. Manually install azure-cli with the --prerelease=allow flag
# first. This will allow skypilot to successfully install. See
# https://docs.astral.sh/uv/pip/compatibility/#pre-release-compatibility.
# We don't want to use --prerelease=allow for all packages, because it will
# cause uv to use pre-releases for some other packages that have sufficient
# stable releases.
'if [ "{cloud}" = "azure" ]; then '
f'{SKY_UV_PIP_CMD} install --prerelease=allow "azure-cli>=2.65.0"; fi;'
Review comment (Collaborator):

We should search for all occurrences of azure-cli across the project, including the pip installs. Are we planning to use uv for the controller dependencies as well?

'pip install "azure-cli>=2.31.0" azure-core '

Reply (Collaborator, Author):

Oh wow, I didn't know about this code. We should definitely fix this up. To be clear, everything should work as-is but ideally we would move this all to uv as well.

There are no other relevant mentions of azure-cli in the repo.
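
To illustrate the conditional discussed in this thread, here is a minimal sketch of how the azure-only pre-release step renders for different clouds. It assumes the `{cloud}` placeholder is filled by plain string substitution before the command is sent to the remote host. The `render` helper and the `AZURE_PRERELEASE_STEP` name are hypothetical and not part of this PR.

```python
# Value that SKY_UV_PIP_CMD expands to, per the constants in this diff.
SKY_UV_PIP_CMD = 'VIRTUAL_ENV=~/skypilot-runtime "$HOME/.local/bin"/uv pip'

# The first piece is a plain string, so "{cloud}" survives as a placeholder;
# the second piece is an f-string, so SKY_UV_PIP_CMD is expanded immediately.
AZURE_PRERELEASE_STEP = (
    'if [ "{cloud}" = "azure" ]; then '
    f'{SKY_UV_PIP_CMD} install --prerelease=allow "azure-cli>=2.65.0"; fi;')


def render(template: str, cloud: str) -> str:
    """Fill the placeholder (assumption: plain substitution, not str.format)."""
    return template.replace('{cloud}', cloud)


azure_cmd = render(AZURE_PRERELEASE_STEP, 'azure')
aws_cmd = render(AZURE_PRERELEASE_STEP, 'aws')
```

For any cloud other than azure, the rendered shell test compares two different strings, so the `--prerelease=allow` install is skipped on the remote host and only azure-cli is ever resolved with pre-releases allowed.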

# Install skypilot from wheel
f'{SKY_UV_PIP_CMD} install "$(echo ~/.sky/wheels/{{sky_wheel_hash}}/'
f'skypilot-{_sky_version}*.whl)[{{cloud}}, remote]" && '
'echo "{sky_wheel_hash}" > ~/.sky/wheels/current_sky_wheel_hash || '
'exit 1; }; ')
@@ -220,7 +242,7 @@
# The ray installation above can be skipped due to the existing ray cluster
# for backward compatibility. In this case, we should not patch the ray
# files.
f'{SKY_PIP_CMD} list | grep "ray " | '
f'{SKY_UV_PIP_CMD} list | grep "ray " | '
f'grep {SKY_REMOTE_RAY_VERSION} 2>&1 > /dev/null && '
f'{{ {SKY_PYTHON_CMD} -c '
'"from sky.skylet.ray_patches import patch; patch()" || exit 1; }; ')
2 changes: 1 addition & 1 deletion sky/templates/kubernetes-ray.yml.j2
@@ -414,7 +414,7 @@ available_node_types:
done
{{ conda_installation_commands }}
{{ ray_installation_commands }}
~/skypilot-runtime/bin/python -m pip install skypilot[kubernetes,remote]
VIRTUAL_ENV=~/skypilot-runtime ~/.local/bin/uv pip install skypilot[kubernetes,remote]
touch /tmp/ray_skypilot_installation_complete
echo "=== Ray and skypilot installation completed ==="
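
The hard-coded uv command in this template mirrors what the new constants in sky/skylet/constants.py expand to. Below is a minimal sketch of that expansion, with the values copied from the constants hunk in this diff. Nothing here is new API; it only shows the resulting shell strings.

```python
SKY_REMOTE_PYTHON_ENV_NAME = 'skypilot-runtime'
SKY_REMOTE_PYTHON_ENV = f'~/{SKY_REMOTE_PYTHON_ENV_NAME}'

# The install dir is quoted so that $HOME is expanded by the remote shell.
SKY_UV_INSTALL_DIR = '"$HOME/.local/bin"'
SKY_UV_CMD = f'{SKY_UV_INSTALL_DIR}/uv'

# `uv -V` succeeds when uv is already installed, so the `||` branch
# (download and run the installer) is only taken on a fresh machine.
SKY_UV_INSTALL_CMD = (f'{SKY_UV_CMD} -V >/dev/null 2>&1 || '
                      'curl -LsSf https://astral.sh/uv/install.sh '
                      f'| UV_INSTALL_DIR={SKY_UV_INSTALL_DIR} sh')

# VIRTUAL_ENV points uv at the SkyPilot runtime venv without activating it.
SKY_UV_PIP_CMD = f'VIRTUAL_ENV={SKY_REMOTE_PYTHON_ENV} {SKY_UV_CMD} pip'

print(SKY_UV_INSTALL_CMD)
print(SKY_UV_PIP_CMD)
```

Because the template spells the path out as `~/.local/bin/uv` rather than using the constant, a later change to `SKY_UV_INSTALL_DIR` would need a matching edit here.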