[QST]About how to create nested make_tiled_copy #2048

Open
LANSHANGH opened this issue Jan 20, 2025 · 2 comments

Comments

@LANSHANGH

What is your question?
TiledCopy copyA = make_tiled_copy(Copy_Atom<UniversalCopy<uint128_t>, TA>{},
                                  make_layout(make_shape(Int<4>{}, (Int<8>{}, Int<4>{})),
                                              make_stride(Int<4>{}, Int<1>{})),   // Thr layout 4x8x4 m-major
                                  make_layout(make_shape(Int<1>{}, Int<8>{})));   // Val layout 1x8 m-major
I want to lay the threads out as 4x8x4, but printing copyA shows a size of 16.
Please help.

@ccecka

ccecka commented Jan 20, 2025

Formatted, you have

TiledCopy copyA = make_tiled_copy(Copy_Atom<UniversalCopy<uint128_t>, TA>{}, 
                                  make_layout(make_shape (Int<4>{}, (Int<8>{}, Int<4>{})), 
                                              make_stride(Int<4>{}, Int<1>{})),   // Thr layout 4x8x4 m-major 
                                  make_layout(make_shape(Int<1>{}, Int<8>{})));   // Val layout  1x8 m-major

which is an unfortunate C++ism: (Int<8>{}, Int<4>{}) is a built-in comma-operator expression that evaluates to Int<4>{} rather than the tuple it looks like. The thread shape therefore collapses to (4, 4), which is why the printed size is 16.
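
To see the separator-versus-operator distinction in isolation, here is a minimal sketch in plain C++17 that compiles without CUTLASS. `Int`, `make_shape_sketch`, and `shape_size` are hypothetical stand-ins written for this illustration (they are not the CuTe implementations); `std::tuple` is assumed as the container only for simplicity.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Hypothetical stand-in for a compile-time integer (not cute::Int).
template <int N>
struct Int { static constexpr int value = N; };

// Hypothetical stand-in for make_shape: packs its arguments into a tuple.
template <class... Ts>
constexpr std::tuple<Ts...> make_shape_sketch(Ts... ts) {
  return std::tuple<Ts...>(ts...);
}

// Product of all extents, recursing into nested tuples.
template <int N>
constexpr int shape_size(Int<N>) { return N; }

template <class... Ts>
constexpr int shape_size(std::tuple<Ts...>) {
  return (1 * ... * shape_size(Ts{}));
}

// Commas directly inside a call's argument list are separators.
// Wrapping "a, b" in an extra pair of parentheses makes it ONE argument:
// the comma operator evaluates a, discards it, and yields b.
// (Compilers may flag the discarded operand under -Wunused-value.)
using Collapsed = decltype(make_shape_sketch(Int<4>{}, (Int<8>{}, Int<4>{})));

// An explicit nested call keeps both extents.
using Nested = decltype(make_shape_sketch(Int<4>{},
                                          make_shape_sketch(Int<8>{}, Int<4>{})));

static_assert(std::is_same_v<Collapsed, std::tuple<Int<4>, Int<4>>>);
static_assert(std::is_same_v<Nested,
                             std::tuple<Int<4>, std::tuple<Int<8>, Int<4>>>>);
static_assert(shape_size(Collapsed{}) == 16);   // 4 * 4
static_assert(shape_size(Nested{})    == 128);  // 4 * 8 * 4
```

The rule of thumb is that a comma is a separator when it sits directly inside an argument list, a template argument list, a braced initializer, or a declaration; inside a parenthesized sub-expression it is the operator.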

You need the extra make_shape (with corrected comments and example strides):

TiledCopy copyA = make_tiled_copy(Copy_Atom<UniversalCopy<uint128_t>, TA>{}, 
                                  make_layout(make_shape (Int<4>{}, make_shape (Int<8>{}, Int< 4>{})), 
                                              make_stride(Int<8>{}, make_stride(Int<1>{}, Int<32>{}))),  // Thr layout 4x8x4 k-major interleave 
                                  make_layout(make_shape(Int<1>{}, Int<8>{})));   // Val layout  1x8 k-major

or more compactly:

TiledCopy copyA = make_tiled_copy(Copy_Atom<UniversalCopy<uint128_t>, TA>{}, 
                                  Layout<Shape <_4,Shape <_8, _4>>,
                                         Stride<_8,Stride<_1,_32>>>{},  // Thr layout 4x8x4 k-major interleave 
                                  Layout<Shape<_1,_8>>{});              // Val layout  1x8 k-major
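
As a rough check of what these strides mean, the thread layout above sends coordinate (m, (k0, k1)) to thread index 8*m + 1*k0 + 32*k1. The sketch below is plain C++ with hypothetical helper names (`thread_index`, `is_compact`), not CuTe code. It verifies that both the example strides (8, (1, 32)) and plain row-major strides (32, (4, 1)) assign each of the 128 threads exactly once, so either is a valid 4x8x4 thread layout; they differ only in which thread lands on which coordinate.

```cpp
#include <cassert>

// Hypothetical helper: index of coordinate (m, (k0, k1)) in a layout with
// shape (4, (8, 4)) and strides (sm, (sk0, sk1)).
constexpr int thread_index(int m, int k0, int k1, int sm, int sk0, int sk1) {
  return m * sm + k0 * sk0 + k1 * sk1;
}

// True if every coordinate of the 4x8x4 shape maps to a distinct
// index in [0, 128), i.e. each thread is used exactly once.
constexpr bool is_compact(int sm, int sk0, int sk1) {
  bool seen[128] = {};
  for (int m = 0; m < 4; ++m) {
    for (int k0 = 0; k0 < 8; ++k0) {
      for (int k1 = 0; k1 < 4; ++k1) {
        int idx = thread_index(m, k0, k1, sm, sk0, sk1);
        if (idx < 0 || idx >= 128 || seen[idx]) return false;
        seen[idx] = true;
      }
    }
  }
  return true;
}

// Example strides from the reply: (8, (1, 32)).
static_assert(is_compact(8, 1, 32), "interleaved strides cover 0..127 once");
// Row-major strides over the same shape: (32, (4, 1)).
static_assert(is_compact(32, 4, 1), "row-major strides cover 0..127 once");

// With the interleaved strides, one step in m moves 8 threads,
// and one step in k1 moves a whole 32-thread group.
static_assert(thread_index(1, 0, 0, 8, 1, 32) == 8,  "m stride is 8");
static_assert(thread_index(0, 0, 1, 8, 1, 32) == 32, "k1 stride is 32");
```

To use row-major strides instead, the same nested form would be written with Stride<_32,Stride<_4,_1>> in place of Stride<_8,Stride<_1,_32>>.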

@LANSHANGH
Author

@ccecka

Thanks for your reply, I think I understand the problem now. I would still like to know under what circumstances a comma is treated as the comma operator in CUTLASS code, because I normally think of commas as delimiters, for example between variables. Secondly, why do the strides you wrote look so unusual? I wanted the 4x8x4 thread layout to have strides (32, 4, 1). Thank you very much.
