cuSPARSE Generic APIs - Hardware Memory Compression

Description

The sample demonstrates how to optimize sparse vector - dense vector scaling and sum (cusparseAxpby) by exploiting the Hardware Memory Compression feature of the NVIDIA Ampere architecture.
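For orientation, the operation that cusparseAxpby performs is Y = alpha * X + beta * Y, where X is a sparse vector and Y is a dense vector. The following is a minimal CPU reference sketch of that arithmetic only. It is not the sample's code and does not touch the GPU or memory compression. The function name `axpby_reference` and its parameter layout are hypothetical, and the sketch assumes zero-based, unique indices and single-precision values.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for Y = alpha * X + beta * Y.
 * X is sparse: nnz (index, value) pairs. Y is dense with `size` elements.
 * Hypothetical helper for illustration; not part of cuSPARSE.
 * Assumes zero-based, unique indices. */
void axpby_reference(float alpha,
                     size_t nnz,
                     const int *x_indices,
                     const float *x_values,
                     float beta,
                     size_t size,
                     float *y)
{
    /* Scale every element of the dense vector by beta. */
    for (size_t i = 0; i < size; ++i) {
        y[i] *= beta;
    }

    /* Add the scaled nonzeros of X at their positions in Y. */
    for (size_t k = 0; k < nnz; ++k) {
        assert(x_indices[k] >= 0 && (size_t)x_indices[k] < size);
        y[x_indices[k]] += alpha * x_values[k];
    }
}
```

Every element of Y is read and written once, whatever the sparsity of X, so the operation is dominated by memory traffic. That is presumably why it serves as a good showcase for hardware memory compression.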

cuSPARSE Optimization Notes

Nsight Compute can be used to measure the effect of memory compression:

nv-nsight-cu-cli --metrics lts__gcomp_input_sectors_compression_achieved_algo_sdc4to1.sum,lts__gcomp_input_sectors_compression_achieved_algo_sdc4to2.sum,fbpa__dram_read_sectors.sum,fbpa__dram_write_sectors.sum,lts__average_gcomp_input_sector_compression_rate.pct ./compression_example

Building

  • Command line

    nvcc -I<cuda_toolkit_path>/include compression_example.c -o compression_example -lcusparse -lcuda
  • Linux

    make
  • Windows/Linux

    mkdir build
    cd build
    cmake ..
    make

    On Windows, instead of running the last build step (make), open the Visual Studio solution that CMake generated and build it from there.

Support

  • Supported SM Architectures: SM 8.0, SM 8.6, SM 8.9, SM 9.0
  • Supported OSes: Linux, Windows, QNX, Android
  • Supported CPU Architectures: x86_64, ppc64le, arm64
  • Supported Compilers: gcc, clang, Intel icc, IBM xlc, Microsoft msvc, Nvidia HPC SDK nvc
  • Language: C++14

Prerequisites