
DTYPES : some newly created instances are not well protected #928

Open · junliume opened this issue Sep 20, 2023 · 4 comments
Labels: bug, code quality, urgency_high

@junliume (Contributor)

[Reproduce]

CXX=/opt/rocm/llvm/bin/clang++ cmake -DCMAKE_CXX_COMPILER_LAUNCHER="${COMPILER_LAUNCHER}" -DCMAKE_PREFIX_PATH=/opt/rocm -DDTYPES="fp16;fp32;bf16" -DCMAKE_BUILD_TYPE=Release -DINSTANCES_ONLY=ON -DGPU_TARGETS="gfx1100" ..

Note that -DDTYPES="fp16;fp32;bf16" excludes fp8 and bf8. Then:

make -j$(nproc)

[Observation]

/home/junliu/composable_kernel/library/src/tensor_operation_instance/gpu/grouped_gemm_fixed_nk/device_grouped_gemm_xdl_fixed_nk_f16_f8_f16_mk_kn_mn_instance.cpp:18:17: error: no type named 'f8_t' in namespace 'ck'
using F8  = ck::f8_t;
            ~~~~^

Since fp8 and bf8 were not included in the CMake command, some instance files are compiled without proper guards. The underlying type definitions themselves are guarded, so f8_t simply does not exist in this configuration:

#if defined CK_ENABLE_FP8
using f8_t = _BitInt(8);
#endif
#if defined CK_ENABLE_BF8
using bf8_t = unsigned _BitInt(8);
#endif
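
A minimal sketch of the guard the failing instance file appears to be missing; the macro name comes from the snippet above, and the exact placement inside device_grouped_gemm_xdl_fixed_nk_f16_f8_f16_mk_kn_mn_instance.cpp is an assumption:

#if defined(CK_ENABLE_FP8)
// Only define the f16_f8_f16 instances when the fp8 type exists.
using F8 = ck::f8_t;
// ... instance definitions that reference F8 ...
#endif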
@sgueko commented Dec 22, 2023

I have the same error:

[ 18%] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o
In file included from /tmp/ROCm-SBo/composable_kernel/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp:9:
In file included from /tmp/ROCm-SBo/composable_kernel/include/ck/tensor_operation/gpu/device/impl/device_gemm_dpp.hpp:14:
In file included from /tmp/ROCm-SBo/composable_kernel/include/ck/tensor_operation/gpu/grid/gridwise_gemm_dpp.hpp:13:
In file included from /tmp/ROCm-SBo/composable_kernel/include/ck/tensor_operation/gpu/block/blockwise_gemm_dpp.hpp:9:
/tmp/ROCm-SBo/composable_kernel/include/ck/tensor_operation/gpu/warp/dpp_gemm.hpp:453:81: error: use of undeclared identifier 'f8_t'
  453 |         is_same<BaseType, int8_t>::value || is_same<BaseType, f8_t>::value,
      |                                                                ^
1 error generated when compiling for gfx900.

cmake ... -DDTYPES="fp32;fp16;bf16" ...

Can I resolve this by adding fp8;bf8 to my DTYPES?

PIPIPIG233666 added a commit to PIPIPIG233666/gentoo_pppig that referenced this issue Dec 28, 2023
@illsilin (Collaborator) commented Jan 2, 2024

Yes, since we've added a few mixed-type kernels, we cannot decouple the fp16 and fp8 types.
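
In that case, the workaround asked about above would look like adding fp8 and bf8 to the DTYPES list, e.g.:

cmake ... -DDTYPES="fp32;fp16;bf16;fp8;bf8" ...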

@AngryLoki (Contributor)

The issue has probably grown (it is present in the rocm-6.3.0 and develop branches); there is now another failure:

CC=hipcc CXX=hipcc cmake -GNinja -DBUILD_DEV=OFF -DGPU_TARGETS="gfx942" -DBUILD_TESTING=OFF -DDTYPES="fp32;fp16" .. && ninja

/src/composable_kernel/profiler/src/profile_grouped_gemm_fixed_nk.cpp:92:11: error: unused type alias 'I8' [-Werror,-Wunused-local-typedef]
   92 |     using I8   = int8_t;
      |           ^
/src/composable_kernel/profiler/src/profile_grouped_gemm_fixed_nk.cpp:91:11: error: unused type alias 'BF16' [-Werror,-Wunused-local-typedef]
   91 |     using BF16 = ck::bhalf_t;
      |           ^
2 errors generated when compiling for gfx942.
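
A minimal sketch of how these aliases could be fixed; whether the profiler actually uses the CK_ENABLE_* macros this way is an assumption, and the [[maybe_unused]] attribute is standard C++17:

// Option 1: guard the aliases the same way as the code that uses them.
#if defined(CK_ENABLE_BF16)
    using BF16 = ck::bhalf_t;
#endif
#if defined(CK_ENABLE_INT8)
    using I8 = int8_t;
#endif

// Option 2: keep the aliases unconditional but silence -Wunused-local-typedef.
    using BF16 [[maybe_unused]] = ck::bhalf_t;
    using I8   [[maybe_unused]] = int8_t;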

And another failure:

CC=hipcc CXX=hipcc cmake -GNinja -DBUILD_DEV=OFF -DGPU_TARGETS="gfx1030" -DBUILD_TESTING=OFF -DDTYPES="fp32;fp16" .. && ninja

ld.lld: error: undefined symbol: ck::tensor_operation::device::instance::add_device_pool3d_fwd_ndhwc_bf16_instances(std::__1::vector<std::__1::unique_ptr<ck::tensor_operation::device::DevicePoolFwd<5, 3, unsigned short, unsigned short, int, ck::tensor_layout::convolution::NDHWC, ck::tensor_layout::convolution::NDHWC, (ck::ReduceTensorOp)5, false>, std::__1::default_delete<ck::tensor_operation::device::DevicePoolFwd<5, 3, unsigned short, unsigned short, int, ck::tensor_layout::convolution::NDHWC, ck::tensor_layout::convolution::NDHWC, (ck::ReduceTensorOp)5, false>>>, std::__1::allocator<std::__1::unique_ptr<ck::tensor_operation::device::DevicePoolFwd<5, 3, unsigned short, unsigned short, int, ck::tensor_layout::convolution::NDHWC, ck::tensor_layout::convolution::NDHWC, (ck::ReduceTensorOp)5, false>, std::__1::default_delete<ck::tensor_operation::device::DevicePoolFwd<5, 3, unsigned short, unsigned short, int, ck::tensor_layout::convolution::NDHWC, ck::tensor_layout::convolution::NDHWC, (ck::ReduceTensorOp)5, false>>>>>&)
>>> referenced by profile_pool3d_fwd.cpp
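
This looks like the converse problem: profile_pool3d_fwd.cpp references the bf16 pooling instances even though bf16 was excluded from DTYPES, so the symbol is never built. A sketch of the kind of guard the call site would need; the surrounding code and the instances variable are assumptions:

#if defined(CK_ENABLE_BF16)
    // Only reference the bf16 pool3d instances when bf16 is built.
    add_device_pool3d_fwd_ndhwc_bf16_instances(instances);
#endif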

It looks like this part of the README is no longer correct:

DTYPES (default is not set) can be set to any subset of "fp64;fp32;fp16;fp8;bf16;int8" to build instances of select data types only. The main default data types are fp32 and fp16; you can safely skip other data types.

@illsilin (Collaborator)

Yes, unfortunately, since we've introduced a number of mixed-precision operations, splitting the build by data types has become too tricky to maintain. The main motivation for the split was to isolate the INT8 data type, since some DL GEMM instances were taking very long to compile due to compiler issues, but those issues have been resolved by now.
