libshortfin CI and Releases #130

Open
4 of 6 tasks
stellaraccident opened this issue Aug 20, 2024 · 8 comments

@stellaraccident
Contributor

stellaraccident commented Aug 20, 2024

Bringing up libshortfin CI and releases

Overall Description

libshortfin is a C++ project providing serving-oriented APIs atop the IREE runtime. It aims to supersede prior pure-Python systems that used IREE via its low-level synchronous API. From the get-go, libshortfin is:

  • Multi-device
  • Parallel (designed to scale in a single process)
  • Exposing a high level Python API that directly mirrors the C++ API
  • Async (integrating with both Python asyncio and C++ coroutines)

Internally, it consists of three primary components:

  • libshortfin C++ library (built as both a static and a dynamic library)
  • _shortfin native Python module
  • shortfin pure Python module (re-exports from the native module and adds
    higher level Python-only features)

Note that we are transitioning to using the containing repository (currently called sharktank) as a monorepo for a variety of related model development and serving projects. As such, plan for some organization to the way that venvs/dev setup is done across the repo (and CI, etc).

Dependencies

Native deps

libshortfin currently depends purely on the IREE runtime C APIs. In the future, it will also expose an interface to the IREE compiler C API via a delay-loaded stub, allowing the compiler to be dynamically loaded (same as some other integrations). The compiler will remain an optional dep, and pure-inference or deployed systems can leave it off so long as whatever is built on libshortfin does not need it (i.e. if pipelines are pre-compiled).

libshortfin currently has the following additional native dependencies:

  • spdlog
  • fmt (dependency of spdlog and used internally)
  • xtensor
  • gtest (if testing is enabled)

Python deps

shortfin does not have any Python deps for minimal functionality, but it is expected that a number of deps will be useful for various applications of it, and it may provide features based on these deps in an optional fashion that people can use if helpful/available:

  • pytest (for testing)
  • numpy
  • torch
  • uvicorn (or other ASGI web server)
  • iree-turbine (for torch compilation)
    • Transitive deps on iree-compiler and iree-runtime
  • onnx (for onnx compilation)
  • etc

Variants

The native _shortfin.lib Python package is provided, similar to iree.runtime, by redirecting at runtime to a concrete native library module based on env flags. This allows selecting among multiple packaged runtimes such as "default", "tracy", "assert", etc. As with iree.runtime, we expect that our normal wheels will bundle the default and tracy versions so that users can always use an instrumented build.
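
A minimal sketch of what such a redirect could look like, assuming the variant is chosen by a single environment variable. The variable name `SHORTFIN_PY_RUNTIME`, the `_shortfin_<variant>` naming scheme, and the helper `resolve_variant_module` are assumptions for illustration (only `_shortfin_default` shows up in the build logs further down this thread); this is not the actual implementation.

```python
import os

# Assumed names for illustration only; not confirmed shortfin API.
VARIANT_ENV_VAR = "SHORTFIN_PY_RUNTIME"
KNOWN_VARIANTS = ("default", "tracy", "assert")


def resolve_variant_module(env=None):
    """Return the name of the native module for the selected variant.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    variant = env.get(VARIANT_ENV_VAR, "default")
    if variant not in KNOWN_VARIANTS:
        raise ValueError(
            f"unknown runtime variant {variant!r}; expected one of {KNOWN_VARIANTS}"
        )
    # Each variant is assumed to ship as its own native module.
    return f"_shortfin_{variant}"
```

A redirecting package would then pass the returned name to `importlib.import_module` and re-export the loaded module's symbols.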

Dependent Projects

It is intended that both C++ and pure/native Python extensions will be built atop libshortfin, providing application level support for various models, etc.

Moving towards release

There are several steps towards a robust release of the project.

Initial Packaging and CI

Initial Build Work

See the current README for developer-quality instructions. This just uses find_package for both IREE and other dependencies. A setup.py is added that works for development, but it needs to be completed to build a hermetic/production build.

This will necessitate supporting CMake options to enable bundled/pinned deps vs relying on find_package. The trickiest of these is IREE itself, and for that we should support two modes:

  • Bring your own source (SHORTFIN_IREE_SOURCE_DIR set, for example) which will do an add_subdirectory on IREE and include it directly in the build.
  • FetchContent a pinned IREE release in bundled mode and build off of that.

These bundling modes can be developed/tested in pure cmake in isolation. Once functional, setup.py can be extended to invoke CMake itself. See IREE runtime/setup.py for inspiration. The result should be that pip install libshortfin/ works on any reasonably modern Linux and Windows system with no further fuss.
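
As a rough sketch of how setup.py might translate the two modes into a CMake configure command: `SHORTFIN_IREE_SOURCE_DIR` comes from the description above, while the `SHORTFIN_BUNDLE_DEPS` option name and the `cmake_configure_args` helper are hypothetical placeholders, not existing options.

```python
import os


def cmake_configure_args(source_dir, binary_dir, iree_source_dir=None):
    """Build the argv for a CMake configure step.

    If `iree_source_dir` is given, use "bring your own source" mode;
    otherwise fall back to bundled mode (FetchContent of a pinned IREE).
    """
    args = ["cmake", "-S", os.fspath(source_dir), "-B", os.fspath(binary_dir)]
    if iree_source_dir is not None:
        # Bring your own source: IREE is add_subdirectory'd from this path.
        args.append(f"-DSHORTFIN_IREE_SOURCE_DIR={os.fspath(iree_source_dir)}")
    else:
        # Hypothetical option name for bundled/pinned deps mode.
        args.append("-DSHORTFIN_BUNDLE_DEPS=ON")
    return args
```

setup.py could then run the result with `subprocess.check_call(cmake_configure_args(...))` before invoking the build.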

There are a few things that should be done at this point:

  • Set a soname on libshortfin.so or shortfin-{soname}.dll and let it be configured (needed for coexistence and dependent extension linking).
  • Enable LTO build options for IREE, spdlog, and xtensor.
  • Set compiler/linker flags to include a linker script that enables symbol versioning.
  • Tweak hidden visibility and API re-export settings for deps (they should already be right for libshortfin and are tested).
  • Set up libshortfin CMake install/exports
  • Bundle development headers and CMake exports with the built Python wheels so that applications can be auto configured without fiddling with native deps.

Initial CI work

CI should be brought up to:

  • Build Python packages.
  • Run native unit tests via CTest (running on CPU is fine for these).
  • Run pytest integration tests on combinations of machines with specific hardware.
  • Push built binaries to the GH releases page using some form of pre-release versioning.

Note that the project is self-contained enough that it should build just fine on free Linux and Windows runners.

Versioning and Releasing

Releasing libshortfin properly will necessitate working out the versioning and releasing policies for some of its related deps (IREE, iree-turbine, sharktank, etc). This may be the end of the road for the nightly builds of those, and we may want to take this chance to come up with a real versioning policy and apply it uniformly to all of the projects. It would be great to have normal pre-releases and regular releases in stable places where everything can be installed by adding one index-url (vs the current state where everything is somewhere bespoke with all kinds of -f flags).

We also get requests for conda packages from some in the community, and this would be a good time to enable those.

@amd-chrissosa amd-chrissosa added the infra General category for infrastructure-related requests for common triaging and prioritization label Aug 20, 2024
@stellaraccident
Contributor Author

You always want to build Python packages that will be run on any other OS with a manylinux image. It's typically easier to just do that from the get-go. I almost always use one that has been forked locally and extended with needed packages.

ScottTodd added a commit that referenced this issue Sep 26, 2024
… subpackages." (#225)

Progress on #130.
This reverts #224 to
re-land #223.

This now uses `SOURCE_DIR` instead of `SETUPPY_DIR` to fix package
discovery when running in pre-built mode, which will fix the errors
reported on CI:
```
 ImportError while loading conftest '/home/runner/work/SHARK-Platform/SHARK-Platform/libshortfin/tests/conftest.py'.
tests/conftest.py:10: in <module>
    import shortfin as sf
E   ModuleNotFoundError: No module named 'shortfin'
```

Logs from a failed build:
```
   setup.py running in pre-built mode:
    SOURCE_DIR = /home/runner/work/SHARK-Platform/SHARK-Platform/libshortfin
    BINARY_DIR = /home/runner/work/SHARK-Platform/SHARK-Platform/libshortfin/build
  Found libshortfin packages: ['_shortfin_default']
```

Logs from a successful build:
```
   setup.py running in pre-built mode:
    SOURCE_DIR = /home/runner/work/SHARK-Platform/SHARK-Platform/libshortfin
    BINARY_DIR = /home/runner/work/SHARK-Platform/SHARK-Platform/libshortfin/build
  Found libshortfin packages: ['shortfin_apps', 'shortfin', '_shortfin', 'shortfin_apps.llm', 'shortfin_apps.llm.components', 'shortfin.interop', 'shortfin.support', 'shortfin.interop.fastapi', 'shortfin.interop.support']
```

---------

Co-authored-by: Marius Brehler <[email protected]>
@ScottTodd ScottTodd self-assigned this Sep 27, 2024
@ScottTodd
Member

Can we yank the old shortfin releases (https://pypi.org/project/shortfin/#history)? That only has version 0.1.dev3, for Python 3 (including pre-3.12).

If I try running python -m pip install shortfin -f https://github.com/ScottTodd/SHARK-Platform/releases/expanded_assets/dev-wheels from Python 3.10, that gets the stale package, since the new packages are only for Python 3.12 and 3.13.

ScottTodd added a commit that referenced this issue Sep 30, 2024
Progress on #130

This is forked from
*
https://github.com/llvm/torch-mlir/blob/main/build_tools/python_deploy/build_linux_packages.sh
*
https://github.com/iree-org/iree/blob/main/build_tools/python_deploy/build_linux_packages.sh

TODO (future PRs?):

- [x] Clean up script further
- [x] Mention in docs
- [x] Integrate into a github workflow
- [ ] Update manylinux docker image (2_28, not 2014)
- [ ] Test the wheels this builds (including tracing support)
ScottTodd added a commit that referenced this issue Sep 30, 2024
Progress on #130,
follow-up to #230.

## Overview

This gets us the general structure of a release pipeline that:
1. Computes some version information metadata based on build id and date
2. Calls `shortfin/build_tools/build_linux_package.sh`, which runs
`python -m pip wheel` under a manylinux Docker container
3. Uploads wheels to both GitHub artifacts (for the workflow run itself)
and GitHub releases (for archival and developer usage)

This follows how stablehlo
(https://github.com/openxla/stablehlo/releases/tag/dev-wheels) and
torch-mlir
(https://github.com/llvm/torch-mlir-release/releases/tag/dev-wheels)
both publish to a constantly growing "dev-wheels" release rather than
separate releases like IREE does. That makes it harder to directly
"promote" a candidate release, but it avoids polluting the release
history with nightly builds.

## Testing

* Sample run:
https://github.com/ScottTodd/SHARK-Platform/actions/runs/11078636913
* Dev wheels release:
https://github.com/ScottTodd/SHARK-Platform/releases/tag/dev-wheels
* Install with `python -m pip install shortfin -f
https://github.com/ScottTodd/SHARK-Platform/releases/expanded_assets/dev-wheels`

## What's next

In no particular order,

- [ ] Verify that the Tracy build enabled by `SHORTFIN_ENABLE_TRACING`
is functional
- [ ] Set up dependencies and optional components such that a user can
just `pip install shortfin` and have it bring along a compatible version
of packages like `iree-runtime` and `iree-compiler`
- [ ] Update the dockerfile version from manylinux2014 (that is
independent of this workflow)
- [x] Build for Python 3.13 free threaded and test that
- [ ] Build for other platforms/architectures (macOS, Windows, arm64,
etc.)
- [ ] Set up a workflow like IREE's pkgci that installs the packages and
runs some tests on them
- [ ] Document how to install the developer packages

---------

Co-authored-by: Marius Brehler <[email protected]>
@ScottTodd
Member

> Can we yank the old shortfin releases (https://pypi.org/project/shortfin/#history)? That only has version 0.1.dev3, for Python 3 (including pre-3.12).
>
> If I try running python -m pip install shortfin -f https://github.com/ScottTodd/SHARK-Platform/releases/expanded_assets/dev-wheels from Python 3.10, that gets the stale package, since the new packages are only for Python 3.12 and 3.13.

Bah, https://peps.python.org/pep-0440/#handling-of-pre-releases

> Pre-releases of any kind, including developmental releases, are implicitly excluded from all version specifiers, unless they are already present on the system, explicitly requested by the user, or if the only available version that satisfies the version specifier is a pre-release.
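
To make the interaction concrete, here is a simplified model of that rule combined with Python-version compatibility. It is a sketch, not pip's resolver: the `Candidate` type and `pick` function are made up for illustration, versions are ordered by their numeric release segment only, and the version string of the newer wheel is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    version: str          # display string, e.g. "0.1.dev3"
    release: tuple        # numeric release segment, e.g. (0, 1)
    is_prerelease: bool   # True for dev/alpha/beta/rc versions
    pythons: frozenset    # supported Python versions; empty means any


def pick(candidates, python, allow_pre=False):
    """Pick the candidate a PEP 440 style installer would prefer."""
    compatible = [c for c in candidates if not c.pythons or python in c.pythons]
    finals = [c for c in compatible if not c.is_prerelease]
    # Pre-releases are considered only if requested or if nothing else fits.
    pool = compatible if (allow_pre or not finals) else finals
    return max(pool, key=lambda c: c.release, default=None)


old = Candidate("0.1.dev3", (0, 1), True, frozenset())
new = Candidate("3.1.0.dev0", (3, 1, 0), True, frozenset({"3.12", "3.13"}))
```

With these inputs, `pick([old, new], "3.10")` returns the stale `old` package because it is the only compatible candidate, while `pick([old, new], "3.12")` returns `new`.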

That old 0.1.dev3 pre-release package for https://pypi.org/project/shortfin/ (and similarly for https://pypi.org/project/sharktank/) is making testing new packaging tricky:

  • The old shortfin package is for any Python version, not just 3.12+
  • The old shortfin package depends on sharktank, iree-runtime, etc.
  • The sharktank package depends on shark-turbine, which still needs to be migrated to iree-turbine

@ScottTodd
Member

ScottTodd commented Oct 3, 2024

Brain dump before vacation (could merge these into a single tasklist in the original issue comment):

TODO:

  • Build and publish releases of sharktank to the same location.
  • Settle on a versioning scheme and roll that out across each setup.py file and the various projects in scope (iree-compiler, iree-runtime, iree-turbine, sharktank, shortfin, etc.)
  • Ensure that all projects in scope work across our supported Python versions and operating systems. For IREE, I have "Build and publish Python 3.13 and 3.13t wheels" (iree-org/iree#18652) in progress
  • Add CI workflows that test the built packages (new smoketests, existing unit tests, nightly full model tests/workflows)
  • Rebase the shortfin build process from manylinux2014 to manylinux_2_28
  • Build for other platforms/architectures (macOS, Windows, arm64, etc.)
  • Publish releases to pypi, possibly using --pre for prerelease versions. Put less emphasis on installing from source or nightly builds
  • Refactor requirements.txt files and README.md files across https://github.com/nod-ai/SHARK-Platform , focusing on what each individual project needs for development and what the aggregate "platform" needs to be used

marbre added a commit to marbre/shark-ai that referenced this issue Oct 31, 2024
Progress on nod-ai#130 and nod-ai#294.

With this, every package has its own `version_info.json` which defaults
to a dev build (`X.Y.Z.dev`). Based on this version information the
version identifier for a release candidate can be computed
(`X.Y.ZrcYYYYMMDD`) which is written to corresponding
`version_info_rc.json` files. This PR further adapts the build packages
workflow to make use of the script and apply the new versioning scheme.
Even though sharktank packages are not yet built, the workflow already
generates the rc version numbers for sharktank and shortfin.
marbre added a commit that referenced this issue Oct 31, 2024
Progress on #130 and #294.

With this, every package has its own `version_info.json` which defaults
to a dev build (`X.Y.Z.dev`). Based on this version information the
version identifier for a release candidate can be computed
(`X.Y.ZrcYYYYMMDD`) which is written to corresponding
`version_info_rc.json` files. This PR further adapts the build packages
workflow to make use of the script and apply the new versioning scheme.
Even though sharktank packages are not yet built, the workflow already
generates the rc version numbers for sharktank and shortfin.
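
The version computation described above can be sketched roughly as follows. The JSON key `package-version` and both function names are assumptions for illustration; the actual script and file layout may differ.

```python
import datetime
import json


def compute_rc_version(dev_version, build_date):
    """Turn a dev version `X.Y.Z.dev` into `X.Y.ZrcYYYYMMDD`."""
    suffix = ".dev"
    if not dev_version.endswith(suffix):
        raise ValueError(f"expected a version ending in {suffix!r}, got {dev_version!r}")
    base = dev_version[: -len(suffix)]
    return f"{base}rc{build_date:%Y%m%d}"


def rc_version_info(version_info_json, build_date):
    """Map the text of version_info.json to the text of version_info_rc.json.

    The "package-version" key is an assumed name, not taken from the thread.
    """
    info = json.loads(version_info_json)
    info["package-version"] = compute_rc_version(info["package-version"], build_date)
    return json.dumps(info, indent=2)
```

For example, a dev version of `3.1.0.dev` built on 2024-10-31 would map to `3.1.0rc20241031`.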
ScottTodd added a commit that referenced this issue Nov 1, 2024
Progress on #130 and
#294 . These developer
docs can also be considered a draft for user docs
(#359).
@ScottTodd
Copy link
Member

The main chunk of work here is now done. We could keep this open for some of the smaller items (expanded CI coverage, upgrading to manylinux_2_28, making the CMake build more compatible with downstream projects, etc.)... or we could close this as fixed now.

@marbre
Collaborator

marbre commented Nov 18, 2024

I think this still has some information / requests that we either want to transfer to new issues, or we keep this open.

ScottTodd added a commit that referenced this issue Dec 2, 2024
Progress on #130.

The manylinux2014 image includes gcc 10.2.1 by default while
manylinux_2_28 includes gcc 12.2.1. At one point we had warnings/errors
building on the newer gcc version, but that is no longer the case.

With the new Rust dependency coming from
#610, we will likely want to
revive
https://github.com/nod-ai/base-docker-images/blob/main/dockerfiles/manylinux_x86_64.Dockerfile,
add more dependencies there, then switch from the upstream `quay.io/...`
image to that `ghcr.io/nod-ai/...` image.

Tested locally with `OUTPUT_DIR="/tmp/wheelhouse" sudo -E
./build_tools/build_linux_package.sh`. If the nightly package build
fails for some reason we can easily revert this.
@ScottTodd
Member

shortfin builds are working on Windows again, so we can start thinking about publishing Windows packages too.

From the repo root, this works for me on Windows:

python -m pip wheel --disable-pip-version-check --no-deps -v -w %CD%/shortfin/wheelhouse %CD%/shortfin

ls shortfin\wheelhouse\
# shortfin-3.1.0.dev0-cp311-cp311-win_amd64.whl

Should be able to add a build_windows_package.sh (or .ps1) next to https://github.com/nod-ai/shark-ai/blob/main/shortfin/build_tools/build_linux_package.sh and hook that up to a workflow pretty easily - no manylinux or Docker needed. Can also reference the script that IREE has been using: https://github.com/iree-org/iree/blob/main/build_tools/python_deploy/build_windows_packages.ps1.

monorimet pushed a commit that referenced this issue Dec 13, 2024
Progress on #130.

The manylinux2014 image includes gcc 10.2.1 by default while
manylinux_2_28 includes gcc 12.2.1. At one point we had warnings/errors
building on the newer gcc version, but that is no longer the case.

With the new Rust dependency coming from
#610, we will likely want to
revive
https://github.com/nod-ai/base-docker-images/blob/main/dockerfiles/manylinux_x86_64.Dockerfile,
add more dependencies there, then switch from the upstream `quay.io/...`
image to that `ghcr.io/nod-ai/...` image.

Tested locally with `OUTPUT_DIR="/tmp/wheelhouse" sudo -E
./build_tools/build_linux_package.sh`. If the nightly package build
fails for some reason we can easily revert this.