
DTYPES : some newly created instances are not well protected #928

Open · junliume opened this issue Sep 20, 2023 · 4 comments
Labels: bug, code quality, urgency_high

@junliume (Contributor)

[Reproduce]

CXX=/opt/rocm/llvm/bin/clang++ cmake -DCMAKE_CXX_COMPILER_LAUNCHER="${COMPILER_LAUNCHER}" -DCMAKE_PREFIX_PATH=/opt/rocm -DDTYPES="fp16;fp32;bf16" -DCMAKE_BUILD_TYPE=Release -DINSTANCES_ONLY=ON -DGPU_TARGETS="gfx1100" ..

Note that -DDTYPES="fp16;fp32;bf16" excludes fp8 and bf8. Then:

make -j$(nproc)

[Observation]

/home/junliu/composable_kernel/library/src/tensor_operation_instance/gpu/grouped_gemm_fixed_nk/device_grouped_gemm_xdl_fixed_nk_f16_f8_f16_mk_kn_mn_instance.cpp:18:17: error: no type named 'f8_t' in namespace 'ck'
using F8  = ck::f8_t;
            ~~~~^

Since fp8 and bf8 were not included in the CMake command, some instance files are compiled without proper guards. The underlying type definitions themselves are guarded, so f8_t simply does not exist in this configuration:

#if defined CK_ENABLE_FP8
using f8_t = _BitInt(8);
#endif
#if defined CK_ENABLE_BF8
using bf8_t = unsigned _BitInt(8);
#endif
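
A minimal sketch of the guard the failing instance file appears to be missing; the macro name comes from the snippet above, and the exact placement inside device_grouped_gemm_xdl_fixed_nk_f16_f8_f16_mk_kn_mn_instance.cpp is an assumption:

#if defined(CK_ENABLE_FP8)
// Only define the f16_f8_f16 instances when the fp8 type exists.
using F8 = ck::f8_t;
// ... instance definitions that reference F8 ...
#endif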
@sgueko commented Dec 22, 2023

I have the same error:

[ 18%] Building CXX object library/src/tensor_operation_instance/gpu/gemm/CMakeFiles/device_gemm_instance.dir/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp.o
In file included from /tmp/ROCm-SBo/composable_kernel/library/src/tensor_operation_instance/gpu/gemm/device_gemm_dpp_f16_f16_f16_km_kn_mn_instance.cpp:9:
In file included from /tmp/ROCm-SBo/composable_kernel/include/ck/tensor_operation/gpu/device/impl/device_gemm_dpp.hpp:14:
In file included from /tmp/ROCm-SBo/composable_kernel/include/ck/tensor_operation/gpu/grid/gridwise_gemm_dpp.hpp:13:
In file included from /tmp/ROCm-SBo/composable_kernel/include/ck/tensor_operation/gpu/block/blockwise_gemm_dpp.hpp:9:
/tmp/ROCm-SBo/composable_kernel/include/ck/tensor_operation/gpu/warp/dpp_gemm.hpp:453:81: error: use of undeclared identifier 'f8_t'
  453 |         is_same<BaseType, int8_t>::value || is_same<BaseType, f8_t>::value,
      |                                                                ^
1 error generated when compiling for gfx900.

cmake ... -DDTYPES="fp32;fp16;bf16" ...

Can I resolve this by adding fp8;bf8 to my DTYPES?

PIPIPIG233666 added a commit to PIPIPIG233666/gentoo_pppig that referenced this issue Dec 28, 2023
@illsilin (Collaborator) commented Jan 2, 2024

Yes, since we've added a few mixed-type kernels, we cannot decouple the fp16 and fp8 types.
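
In that case, the workaround asked about above would look like adding fp8 and bf8 to the DTYPES list, e.g.:

cmake ... -DDTYPES="fp32;fp16;bf16;fp8;bf8" ...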

@AngryLoki (Contributor)

The issue has probably grown (it is present in the rocm-6.3.0 and develop branches); there is now another failure:

CC=hipcc CXX=hipcc cmake -GNinja -DBUILD_DEV=OFF -DGPU_TARGETS="gfx942" -DBUILD_TESTING=OFF -DDTYPES="fp32;fp16" .. && ninja

/src/composable_kernel/profiler/src/profile_grouped_gemm_fixed_nk.cpp:92:11: error: unused type alias 'I8' [-Werror,-Wunused-local-typedef]
   92 |     using I8   = int8_t;
      |           ^
/src/composable_kernel/profiler/src/profile_grouped_gemm_fixed_nk.cpp:91:11: error: unused type alias 'BF16' [-Werror,-Wunused-local-typedef]
   91 |     using BF16 = ck::bhalf_t;
      |           ^
2 errors generated when compiling for gfx942.
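
A minimal sketch of how these aliases could be fixed; whether the profiler actually uses the CK_ENABLE_* macros this way is an assumption, and the [[maybe_unused]] attribute is standard C++17:

// Option 1: guard the aliases the same way as the code that uses them.
#if defined(CK_ENABLE_BF16)
    using BF16 = ck::bhalf_t;
#endif
#if defined(CK_ENABLE_INT8)
    using I8 = int8_t;
#endif

// Option 2: keep the aliases unconditional but silence -Wunused-local-typedef.
    using BF16 [[maybe_unused]] = ck::bhalf_t;
    using I8   [[maybe_unused]] = int8_t;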

And another failure:

CC=hipcc CXX=hipcc cmake -GNinja -DBUILD_DEV=OFF -DGPU_TARGETS="gfx1030" -DBUILD_TESTING=OFF -DDTYPES="fp32;fp16" .. && ninja

ld.lld: error: undefined symbol: ck::tensor_operation::device::instance::add_device_pool3d_fwd_ndhwc_bf16_instances(std::__1::vector<std::__1::unique_ptr<ck::tensor_operation::device::DevicePoolFwd<5, 3, unsigned short, unsigned short, int, ck::tensor_layout::convolution::NDHWC, ck::tensor_layout::convolution::NDHWC, (ck::ReduceTensorOp)5, false>, std::__1::default_delete<ck::tensor_operation::device::DevicePoolFwd<5, 3, unsigned short, unsigned short, int, ck::tensor_layout::convolution::NDHWC, ck::tensor_layout::convolution::NDHWC, (ck::ReduceTensorOp)5, false>>>, std::__1::allocator<std::__1::unique_ptr<ck::tensor_operation::device::DevicePoolFwd<5, 3, unsigned short, unsigned short, int, ck::tensor_layout::convolution::NDHWC, ck::tensor_layout::convolution::NDHWC, (ck::ReduceTensorOp)5, false>, std::__1::default_delete<ck::tensor_operation::device::DevicePoolFwd<5, 3, unsigned short, unsigned short, int, ck::tensor_layout::convolution::NDHWC, ck::tensor_layout::convolution::NDHWC, (ck::ReduceTensorOp)5, false>>>>>&)
>>> referenced by profile_pool3d_fwd.cpp
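
This looks like the converse problem: profile_pool3d_fwd.cpp references the bf16 pooling instances even though bf16 was excluded from DTYPES, so the symbol is never built. A sketch of the kind of guard the call site would need; the surrounding code and the instances variable are assumptions:

#if defined(CK_ENABLE_BF16)
    // Only reference the bf16 pool3d instances when bf16 is built.
    add_device_pool3d_fwd_ndhwc_bf16_instances(instances);
#endif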

It looks like this part of the README is no longer correct:

DTYPES (default is not set) can be set to any subset of "fp64;fp32;fp16;fp8;bf16;int8" to build instances of select data types only. The main default data types are fp32 and fp16; you can safely skip other data types.

@illsilin (Collaborator)

Yes, unfortunately, since we've introduced a number of mixed-precision operations, splitting the build by data types has become too tricky to maintain. The main motivation for the split was to isolate the INT8 data type, since some DL GEMM instances were taking very long to compile due to compiler issues, but those issues have been resolved by now.
