This directory contains an SDXL inference server, a CLI, and support components. More information about SDXL is available on Hugging Face.
Installation is available for nightly releases and for our stable release.
The server will prepare runtime artifacts for you.
By default, the port is set to 8000. To change it, pass `--port` in each of the following commands.
On Linux, you can check whether this (or any) port is in use with `ss -ntl | grep 8000`.
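If `ss` is not available, the same check can be approximated from Python. The following is a minimal sketch, not part of this package: the helper name `port_in_use` is hypothetical, and it only tests whether a TCP listener accepts connections on the local host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the TCP connection succeeds,
        # which means a listener is present on that port.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(8000)` before starting the server indicates whether you need to pick a different `--port`.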
python -m shortfin_apps.sd.server --device=amdgpu --device_ids=0 --build_preference=precompiled --topology="spx_single"
- Wait until your server outputs:
INFO - Application startup complete.
INFO - Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
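When the server is started from a script, watching the log by hand is not practical. The sketch below polls the port until it accepts a TCP connection. It assumes that an accepted connection is a reasonable proxy for readiness. The log lines above remain the authoritative signal, and the name `wait_for_server` and its defaults are hypothetical.

```python
import socket
import time


def wait_for_server(port: int = 8000, host: str = "127.0.0.1",
                    timeout_s: float = 300.0, interval_s: float = 1.0) -> bool:
    """Poll host:port until it accepts a connection or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server socket is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry after a short pause.
            time.sleep(interval_s)
    return False
```

The generous default timeout is an assumption intended to leave room for artifact preparation on first launch. Adjust it to your environment.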
python -m shortfin_apps.sd.simple_client --interactive
Congratulations! At this point you can experiment with the server and client to suit your use case.
The SDXL server's implementation does not account for extremely large client batches. Normally, for heavy workloads, services would be composed under a load balancer to ensure each service is fed with requests optimally. For most cases outside of large-scale deployments, the server's internal batching/load balancing is sufficient.
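One way a client could avoid sending an extremely large batch to a single server is to split the work and spread it across several server instances. The sketch below is illustrative only: it does not use any API from this package, and the function names, batch size, and endpoint URLs are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def chunk_prompts(prompts: Sequence[str], max_batch: int) -> Iterator[List[str]]:
    """Split a large list of prompts into batches of at most max_batch items."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    for start in range(0, len(prompts), max_batch):
        yield list(prompts[start:start + max_batch])


def assign_round_robin(batches: Sequence[List[str]],
                       endpoints: Sequence[str]) -> List[Tuple[str, List[str]]]:
    """Pair each batch with an endpoint, cycling through the endpoints in order."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    return [(endpoints[i % len(endpoints)], batch)
            for i, batch in enumerate(batches)]


# Hypothetical example: two servers started on different ports.
plan = assign_round_robin(
    list(chunk_prompts(["a cat", "a dog", "a bird", "a fish", "a fox"], 2)),
    ["http://localhost:8000", "http://localhost:8001"],
)
```

Round-robin is the simplest policy. A real load balancer would typically also account for how busy each server is.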
Please see `--help` for both the server and the client for full usage instructions. The tables below give a quick snapshot of the available flags.
Server flags | Options |
---|---|
--host HOST | |
--port PORT | server port |
--root-path ROOT_PATH | |
--timeout-keep-alive | |
--device | local-task,hip,amdgpu |
--target | gfx942,gfx1100 |
--device_ids | |
--tokenizers | |
--model_config | |
--workers_per_device | |
--fibers_per_device | |
--isolation | per_fiber, per_call, none |
--show_progress | |
--trace_execution | |
--amdgpu_async_allocations | |
--splat | |
--build_preference | compile,precompiled |
--compile_flags | |
--flagfile FLAGFILE | |
--artifacts_dir ARTIFACTS_DIR | Where to store cached artifacts from the Cloud |

Client flags | Options |
---|---|
--file | |
--reps | |
--save | Whether to save image generated by the server |
--outputdir | output directory to store images generated by SDXL |
--steps | |
--interactive | |
--port | port to interact with server |
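When launching the server or client from a script, the flags above can be assembled programmatically. The following is a minimal sketch: the helper `build_command` is hypothetical, it assumes the `--name=value` form used in the server command shown earlier, and it treats a value of `True` as a bare switch such as `--interactive`.

```python
import sys
from typing import Dict, List, Union

FlagValue = Union[str, int, bool]


def build_command(module: str, flags: Dict[str, FlagValue]) -> List[str]:
    """Build an argv list for `python -m <module>` from a flag mapping."""
    cmd = [sys.executable, "-m", module]
    for name, value in flags.items():
        if value is True:
            # Bare switch, e.g. --interactive
            cmd.append(f"--{name}")
        elif value is False:
            # Omit disabled switches entirely.
            continue
        else:
            cmd.append(f"--{name}={value}")
    return cmd


server_cmd = build_command(
    "shortfin_apps.sd.server",
    {
        "device": "amdgpu",
        "device_ids": 0,
        "build_preference": "precompiled",
        "topology": "spx_single",
        "port": 8000,
    },
)
client_cmd = build_command("shortfin_apps.sd.simple_client", {"interactive": True})
```

The resulting lists can be passed to `subprocess.Popen` (server) or `subprocess.run` (client). Check `--help` for which flags take values before relying on this mapping.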