[QST]Why Does CUTLASS Use 3-4-3 Swizzle? #2015

ziyuhuang123 · 2024-12-27T04:02:21Z

May I kindly ask why the swizzle configuration in CUTLASS is specifically set to 3, 4, and 3? I would greatly appreciate any insights or explanations regarding the rationale behind this design choice. Thank you so much in advance!

// K-major GMMA layouts in units of bits
using Layout_K_INTER_Atom_Bits  = ComposedLayout<Swizzle<0,4,3>, smem_ptr_flag, Layout<Shape<_8, _128>,Stride< _128,_1>>>;
using Layout_K_SW32_Atom_Bits   = ComposedLayout<Swizzle<1,4,3>, smem_ptr_flag, Layout<Shape<_8, _256>,Stride< _256,_1>>>;
using Layout_K_SW64_Atom_Bits   = ComposedLayout<Swizzle<2,4,3>, smem_ptr_flag, Layout<Shape<_8, _512>,Stride< _512,_1>>>;
using Layout_K_SW128_Atom_Bits  = ComposedLayout<Swizzle<3,4,3>, smem_ptr_flag, Layout<Shape<_8,_1024>,Stride<_1024,_1>>>;


leimao · 2024-12-28T05:28:44Z

I have my own interpretations.
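A minimal sketch of one way to read the three parameters, written as standalone C++ rather than CuTe code. It assumes that `Swizzle<B, M, S>` XORs the `B` bits starting at bit `M + S` into the `B` bits starting at bit `M`, and that for these shared-memory atoms the swizzle ends up acting on byte offsets; both are assumptions to check against the CuTe headers. The helpers `byte_offset` and `chunk_slot` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Standalone re-implementation of the bit manipulation assumed for
// cute::Swizzle<B, M, S> (valid for S >= 0, as in all four atoms above).
//   M: number of low bits left untouched (a 2^M-byte unit stays contiguous)
//   S: distance between the source bits and the bits they are XORed into
//   B: number of bits taking part in the XOR
template <int B, int M, int S>
constexpr std::uint32_t swizzle(std::uint32_t offset) {
  constexpr std::uint32_t bit_mask = (1u << B) - 1u;
  constexpr std::uint32_t src_mask = bit_mask << (M + S);  // bits [M+S, M+S+B)
  return offset ^ ((offset & src_mask) >> S);               // XOR into bits [M, M+B)
}

// Hypothetical helper: byte offset of 16-byte chunk `chunk` in 128-byte row `row`.
constexpr std::uint32_t byte_offset(std::uint32_t row, std::uint32_t chunk) {
  return row * 128u + chunk * 16u;
}

// Hypothetical helper: which 16-byte slot (0..7) of its row an offset falls in.
constexpr std::uint32_t chunk_slot(std::uint32_t offset) {
  return (offset >> 4) & 7u;
}

// With Swizzle<3,4,3>, chunk c of row r moves to slot (c XOR r), so the same
// logical chunk in 8 consecutive rows lands in 8 different slots.
static_assert(chunk_slot(swizzle<3, 4, 3>(byte_offset(3, 0))) == 3, "row 3, chunk 0");
static_assert(chunk_slot(swizzle<3, 4, 3>(byte_offset(5, 2))) == 7, "row 5, chunk 2");
// Swizzle<0,4,3> has an empty mask and leaves every offset unchanged.
static_assert(swizzle<0, 4, 3>(byte_offset(5, 2)) == byte_offset(5, 2), "identity");
```

Under this reading, M = 4 keeps each 16-byte (128-bit) vector intact, S = 3 reflects that 2^3 such vectors make up one 128-byte row, and B (0 to 3) selects how many row bits are mixed in, which would correspond to the INTER, SW32, SW64, and SW128 atoms.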

ziyuhuang123 · 2025-01-01T08:38:58Z

> I have my own interpretations.

Thank you so much! Your post has been incredibly helpful! I actually studied L2 persistent before and remember coming across your post at that time. I'm part of a very active NV WeChat discussion group. I was wondering if you might be interested in joining us to explore CUDA technologies together? My WeChat ID is hermit_purple1.

ziyuhuang123 added the "? - Needs Triage" and "question" labels on Dec 27, 2024
