[FEA] Support host-side decompression and compression #17641

vuule · 2024-12-20T00:41:42Z

Depending on the number and sizes of the compression blocks, performing compression/decompression on the CPU can be faster than on the GPU. The difference is more pronounced for compression. libcudf should expose this as a run-time option. Ideally, the implementation would dynamically select where the operation runs based on the input parameters and the system.

The implementation should transparently dispatch between kernel and host operations to avoid having all readers/writers depend on this feature. For optimal performance, the host path should use a thread pool.
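A minimal sketch of what such a transparent dispatch could look like. All names here (`engine`, `dispatch_policy`, `choose_engine`, `dispatch`) are hypothetical and not part of libcudf, and the thresholds are placeholders that would need benchmarking. The sketch also assumes that the host path wins when there are few, small blocks, which is not established in this issue.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical names throughout; none of these are libcudf APIs.
enum class engine { HOST, DEVICE };

// Placeholder thresholds (assumption): real values would come from benchmarks
// and could also depend on the system (core count, GPU model).
struct dispatch_policy {
  std::size_t max_host_blocks = 128;
  std::size_t max_host_bytes  = std::size_t{16} << 20;  // 16 MiB
};

// Pick the host path only when both the block count and the total size are small.
inline engine choose_engine(std::vector<std::size_t> const& block_sizes,
                            dispatch_policy const& policy = dispatch_policy{})
{
  auto const total_bytes =
    std::accumulate(block_sizes.begin(), block_sizes.end(), std::size_t{0});
  bool const few_blocks  = block_sizes.size() <= policy.max_host_blocks;
  bool const small_total = total_bytes <= policy.max_host_bytes;
  return (few_blocks && small_total) ? engine::HOST : engine::DEVICE;
}

// Callers hand over both paths and never branch on the engine themselves.
// In a real implementation, host_path would fan the blocks out over a thread pool.
template <typename HostFn, typename DeviceFn>
engine dispatch(std::vector<std::size_t> const& block_sizes,
                HostFn&& host_path,
                DeviceFn&& device_path)
{
  auto const where = choose_engine(block_sizes);
  if (where == engine::HOST) {
    host_path();
  } else {
    device_path();
  }
  return where;
}
```

With this shape, a reader or writer would call `dispatch` with its block sizes and two callables, so the selection logic lives in one place.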

Compression

#17656 implements opt-in host-side compression. It is limited to the GZIP format, as it's the only host-side compression we currently have available in libcudf. We need to look into ways to add support for other widely used formats, namely:

Snappy: we already have a 300-line snap kernel, so an uncomplicated C++ implementation should be easy to maintain.
Zstandard: use libzstd as a (runtime) dependency; feasibility needs to be evaluated.

The current implementation of host compression copies the compressed data back to the device. This made it easy to integrate the API, but adds unnecessary H2D (and later D2H) copies. Writers do not further process the compressed chunks apart from two steps:

Select between the compressed and the original chunks (currently selects the smaller one);
Compact the selected (potentially mixed) chunks into contiguous chunks/streams.

If we move these steps from the writers to the compression API, we can return host data and avoid the round-trip. Significant changes to the writers are required to make this work.
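The two steps above, done entirely on the host, could be sketched roughly as follows. The types and the function name (`chunk`, `compacted_stream`, `select_and_compact`) are hypothetical, and keeping the original chunk on a size tie is an assumption; the issue only states that the smaller version is selected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types; not libcudf APIs.
using chunk = std::vector<std::uint8_t>;

// One contiguous host buffer plus per-chunk metadata.
struct compacted_stream {
  std::vector<std::uint8_t> data;    // selected chunks, stored back to back
  std::vector<std::size_t> offsets;  // start of each chunk within `data`
  std::vector<bool> is_compressed;   // which version was kept for each chunk
};

inline compacted_stream select_and_compact(std::vector<chunk> const& original,
                                           std::vector<chunk> const& compressed)
{
  assert(original.size() == compressed.size());

  compacted_stream out;
  out.offsets.reserve(original.size());
  out.is_compressed.reserve(original.size());

  for (std::size_t i = 0; i < original.size(); ++i) {
    // Step 1: keep the smaller version (assumption: ties keep the original).
    bool const use_compressed = compressed[i].size() < original[i].size();
    chunk const& selected     = use_compressed ? compressed[i] : original[i];

    // Step 2: append to the contiguous buffer and record where the chunk starts.
    out.offsets.push_back(out.data.size());
    out.is_compressed.push_back(use_compressed);
    out.data.insert(out.data.end(), selected.begin(), selected.end());
  }
  return out;
}
```

A compression API that returned something like `compacted_stream` would let a writer send the result to its sink directly, without the data passing through device memory again.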

Decompression

libcudf currently supports host decompression for GZIP, ZLIB, and Snappy.
Changes similar to #17656 could make these available through a generic decompression API.
Here, the data-location challenge is reversed: we want to avoid ingesting the data to the device only to copy it back to host when using host decompression. To resolve this, we can combine ingest and decompression into a higher-level abstraction. [TODO] Document how this abstraction could impact ingest as well (e.g. coalescing reads).
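As an illustration of the read coalescing mentioned above, here is a minimal sketch that merges nearby byte ranges so that fewer, larger reads are issued. The names (`byte_range`, `coalesce_reads`) and the `max_gap` parameter are hypothetical, and this is not how libcudf ingests data today.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical type: a region of the input file that holds one compressed block.
struct byte_range {
  std::size_t offset;
  std::size_t size;
};

// Merge ranges that overlap or are separated by at most `max_gap` bytes.
// Reading a few unneeded bytes in the gap is assumed to be cheaper than an extra read.
inline std::vector<byte_range> coalesce_reads(std::vector<byte_range> ranges,
                                              std::size_t max_gap)
{
  std::sort(ranges.begin(), ranges.end(), [](byte_range const& a, byte_range const& b) {
    return a.offset < b.offset;
  });

  std::vector<byte_range> merged;
  for (auto const& r : ranges) {
    if (!merged.empty()) {
      auto& last          = merged.back();
      auto const last_end = last.offset + last.size;
      if (r.offset <= last_end + max_gap) {
        // Extend the previous read to cover this range as well.
        auto const new_end = std::max(last_end, r.offset + r.size);
        last.size          = new_end - last.offset;
        continue;
      }
    }
    merged.push_back(r);
  }
  return merged;
}
```

An abstraction that owns both ingest and decompression could apply this kind of merging before deciding whether the bytes should land in host or device memory.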

vuule added the cuIO and feature request labels on Dec 20, 2024
