
[Feature Request] Add official support for onnxruntime-gpu on ARM64/aarch64 platforms #22903

Open
abhishek-iitmadras opened this issue Nov 20, 2024 · 4 comments
Labels
feature request request for unsupported feature or enhancement

Comments

@abhishek-iitmadras

Describe the feature request

Issue Description
Currently, the onnxruntime-gpu package lacks official support for the ARM64/aarch64 architecture, limiting GPU acceleration on increasingly popular ARM-based platforms.

Current Situation

No official pre-built wheels for onnxruntime-gpu on ARM64/aarch64
Limited documentation for ARM64 GPU deployment
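For context, an ARM64 user can check what their installed build actually offers with a short probe. This is a sketch, not part of the request: `describe_runtime_support` is a hypothetical helper name, while `platform.machine()` and onnxruntime's `get_available_providers()` are real APIs. On ARM64 today, a pip-installed onnxruntime typically reports only `CPUExecutionProvider`.

```python
import platform
import sys

def describe_runtime_support():
    """Report the machine architecture and, if onnxruntime is
    installed, which execution providers its build offers."""
    info = {"machine": platform.machine(), "python": sys.version_info[:2]}
    try:
        import onnxruntime as ort  # typically CPU-only on ARM64 today
        info["providers"] = ort.get_available_providers()
    except ImportError:
        info["providers"] = None  # onnxruntime not installed at all
    return info

print(describe_runtime_support())
```

A CUDA-enabled build would include `CUDAExecutionProvider` in the providers list; its absence on an aarch64 host is the gap this issue describes.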

Technical Details

Environment:
  Architecture: ARM64/aarch64
  Platform: Various (e.g. AWS Graviton)
  Python versions: 3.8+ compatibility needed

Proposed Solution

Official pre-built wheels for ARM64/aarch64
CI/CD pipeline additions for ARM64 builds

I would appreciate any feedback or guidance on how to make this happen.
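Until official wheels are published, the workaround is a from-source build. A rough sketch, assuming the CUDA toolkit and cuDNN are already installed under the common default paths (adjust for your system; flags follow onnxruntime's build script):

```shell
# Clone onnxruntime and build a CUDA-enabled wheel on an ARM64 host.
# /usr/local/cuda is an assumed install location, not a guarantee.
git clone --recursive https://github.com/microsoft/onnxruntime.git
cd onnxruntime
./build.sh --config Release \
    --use_cuda \
    --cuda_home /usr/local/cuda \
    --cudnn_home /usr/local/cuda \
    --build_wheel \
    --parallel

# The wheel is placed under build/Linux/Release/dist/
pip install build/Linux/Release/dist/onnxruntime_gpu-*.whl
```

This is exactly the per-user burden that official pre-built wheels would remove: the build requires a CUDA-capable ARM64 machine and can take hours.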

Describe scenario use case

Use case:
Growing adoption of ARM64 in edge/HPC computing.
Cloud deployments on ARM64-based servers (AWS Graviton, etc.)
Machine learning workloads on newer ARM-based development machines
IoT and embedded systems requiring GPU acceleration

@abhishek-iitmadras added the feature request label Nov 20, 2024
@skottmckay
Contributor

Do you have a specific use case? Different GPUs require different things. e.g. CUDA vs AMD vs integrated. Mobile vs server is also very different.

You mention AWS Graviton but I don't see a GPU option with that.

@adamreeve
Contributor

I'm also interested in this feature. We'd like to use onnxruntime with CUDA on Linux with arm64 Nvidia Grace CPUs. It would be great if we could install wheels from PyPI for this rather than needing to build onnxruntime ourselves.

I imagine that space pressure on PyPI is a concern, since you already need to remove old versions to make room for new releases (e.g. #22747), and adding another architecture would make this worse. If that's a blocker to adding a new architecture, then maybe providing builds from an Azure DevOps feed, as you do for CUDA 11 builds, would be an option?

You mention AWS Graviton but I don't see a GPU option with that.

G5g instances are Graviton machines with an Nvidia GPU.

@adamreeve
Contributor

If this is something that would be accepted, I or someone else on my team could help update the CI pipelines to add arm64 CUDA builds. We could probably use the existing onnxruntime-linux-ARM64-CPU-2019 machine pool for builds, but would ideally be able to run tests on arm64 GPU machines too. It doesn't look like there are any arm64 machines with Nvidia GPUs available in Azure from what I could see, though, so that might be a problem. Is there a way around that, or would it be acceptable to create the builds but leave them untested for now?
