Add a primitive operation for bounded-length for loops #2664

carlosgmartin · 2024-12-09T17:11:09Z

Request description

Currently, StableHLO has a `while` op for unbounded while loops. It also gets used for bounded for loops, e.g. by JAX. Unfortunately, each iteration of a `while` requires a kernel launch on GPU, which slows down programs significantly.
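As a rough illustration of the point above, the following sketch lowers a small `lax.scan` (a loop whose trip count is known at trace time) and checks whether the emitted module text contains a while op. The function name `cumulative_sum` and the input are made up for the example, and the exact text of the lowered module may differ between JAX versions.

```python
import jax
import jax.numpy as jnp
from jax import lax


def cumulative_sum(xs):
    # A bounded loop: the trip count is xs.shape[0], known at trace time.
    def step(carry, x):
        carry = carry + x
        return carry, carry

    _, ys = lax.scan(step, jnp.zeros((), xs.dtype), xs)
    return ys


xs = jnp.arange(8, dtype=jnp.float32)

# Inspect the module that JAX hands to the compiler.
lowered_text = jax.jit(cumulative_sum).lower(xs).as_text()

# The bounded scan is expressed with the general-purpose while op.
has_while = "while" in lowered_text
print("lowered module contains a while op:", has_while)

last_value = float(cumulative_sum(xs)[-1])
```

Nothing in the lowered op itself marks the loop as having a fixed trip count; the bound is only implied by the loop condition.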

For context, see:

Improving the lowering and compilation of unrolled `lax.scan` loops jax-ml/jax#25336
Each iteration of `lax.scan` requires a kernel launch on GPU backends. Can this be resolved? jax-ml/jax#22611

This can be avoided by unrolling and inlining the entire loop. The downside of that approach is that it causes extremely long compilation times, as explained in the link above. (If we just inline the entire loop, the compiler downstream apparently can't easily "see" that it's a loop of the same thing being done over and over again.)
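The trade-off can be observed from JAX with the `unroll` argument of `lax.scan`. The sketch below is only illustrative: the function `ema`, the loop body, and the length of 64 are arbitrary choices, and the line counts it prints depend on the JAX version. It compares the size of the lowered module for a rolled loop against a fully unrolled one and checks that both compute the same values.

```python
from functools import partial

import jax
import jax.numpy as jnp
from jax import lax


def ema(xs, unroll=1):
    # Simple recurrence used as a stand-in for a real loop body.
    def step(carry, x):
        carry = 0.5 * carry + x
        return carry, carry

    _, ys = lax.scan(step, jnp.zeros((), xs.dtype), xs, unroll=unroll)
    return ys


xs = jnp.arange(64, dtype=jnp.float32)
n = xs.shape[0]

rolled = jax.jit(partial(ema, unroll=1))    # one loop body, iterated n times
unrolled = jax.jit(partial(ema, unroll=n))  # body replicated for every step

rolled_lines = len(rolled.lower(xs).as_text().splitlines())
unrolled_lines = len(unrolled.lower(xs).as_text().splitlines())
print("rolled module lines:  ", rolled_lines)
print("unrolled module lines:", unrolled_lines)

# Both variants compute the same result; only the emitted program differs.
same = bool(jnp.allclose(rolled(xs), unrolled(xs)))
```

The amount of program text handed to the compiler for the unrolled variant grows with the trip count, which is consistent with the long compilation times described above.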

To get the best of both worlds (fast execution and fast compilation), a potential solution would be to create a new primitive op for bounded loops (with a known, fixed number of iterations) that does not require multiple kernel launches or returning control to the host. Call it `for`. Then the body can be optimized while still being treated as an atomic unit for the purposes of compilation downstream, thus avoiding an extremely long compilation time for long sequences.
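To make the request concrete, here is one possible reading of what such an op would compute, written as plain Python. The name `bounded_for` and its signature are hypothetical and are not part of StableHLO; they simply mirror the existing `lax.fori_loop` API in JAX, which is shown alongside for comparison.

```python
from jax import lax


def bounded_for(num_iterations, body, init):
    # Hypothetical reference semantics: apply `body` exactly
    # `num_iterations` times, threading the loop state through.
    state = init
    for i in range(num_iterations):
        state = body(i, state)
    return state


def add_index(i, state):
    return state + i


# Reference result from the pure-Python sketch.
reference = bounded_for(5, add_index, 0)

# The same loop expressed with JAX's existing bounded-loop API.
jax_result = int(lax.fori_loop(0, 5, add_index, 0))

print(reference, jax_result)
```

The difference from `while` is that the trip count is an explicit, static part of the operation rather than something the compiler has to recover from a loop condition.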

By exposing the bounded-loop structure directly to the compiler, this could also potentially lead to further optimizations downstream.

carlosgmartin changed the title ~~Add an operation for bounded-length for loops~~ Add a primitive operation for bounded-length for loops Dec 9, 2024
