tAcA is the (tiled, sliced, partitioned) coordinate tensor. You're right that it is read-only.
tApA is the predicate tensor that actually stores bool values:

```cpp
Tensor tApA = make_tensor<bool>(shape(tAcA));
```

which is read/write. So the above code is precomputing and storing predicates (to reuse in a mainloop, for example) via the read-only coordinate tensor.
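For concreteness, here is a rough sketch of that precompute-then-reuse pattern (paraphrased, not the exact tutorial code; `tAgA`/`tAsA` are assumed to be the usual partitioned global/shared-memory tiles and `mA` the full source tensor):

```cpp
// Sketch only: tAcA is the read-only coordinate tensor, tApA gets real bool storage.
Tensor tApA = make_tensor<bool>(shape(tAcA));   // allocate per-thread predicate registers

// Precompute the predicates once by reading (not writing) the coordinate tensor
CUTE_UNROLL
for (int i = 0; i < size(tApA); ++i) {
  tApA(i) = elem_less(tAcA(i), shape(mA));      // is this coordinate inside the bounds of mA?
}

// Reuse the stored predicates, e.g. on every mainloop iteration
copy_if(tApA, tAgA, tAsA);                      // copy only the in-bounds elements
```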
I agree that I should revisit+update that documentation though...
In the tutorial:
https://github.com/NVIDIA/cutlass/blob/main/media/docs/cute/0y_predication.md
We have the following code snippet:
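Paraphrasing it here rather than quoting verbatim (names like `sA`, `tA`, and `thread_idx` follow the tutorial's naming), it is roughly:

```cpp
// Construct "identity" tensors whose elements are their own coordinates
Tensor cA = make_identity_tensor(make_shape(size<0>(sA), size<1>(sA)));  // (BLK_M,BLK_K) -> (blk_m,blk_k)
Tensor cB = make_identity_tensor(make_shape(size<0>(sB), size<1>(sB)));  // (BLK_N,BLK_K) -> (blk_n,blk_k)

// Repeat the tiling/partitioning on the identity tensors
Tensor tAcA = local_partition(cA, tA, thread_idx);
Tensor tBcB = local_partition(cB, tB, thread_idx);

// Allocate predicate tensors with real bool storage
Tensor tApA = make_tensor<bool>(shape(tAcA));
Tensor tBpB = make_tensor<bool>(shape(tBcB));
```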
In the above code, we created two predicate tensors, cA and cB.
I found that when calling the make_identity_tensor function, we are actually creating a tensor view (cA, cB). No memory is allocated to store the contents of cA or cB; instead, an iterator is created (https://github.com/NVIDIA/cutlass/blob/cc3c29a81a140f7b97045718fb88eb0664c37bd7/include/cute/tensor_impl.hpp).
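For example, a minimal host-side sketch (just to illustrate the point, not taken from the tutorial) shows that an identity tensor yields coordinates when read, with no backing array:

```cpp
#include <cstdio>
#include <cute/tensor.hpp>
using namespace cute;

int main() {
  // A 4x8 identity tensor: element (i,j) is simply the coordinate (i,j).
  // No array of values is allocated; the engine is a counting iterator over coordinates.
  Tensor c = make_identity_tensor(make_shape(Int<4>{}, Int<8>{}));

  print(c(1, 2));   // reading works: prints the coordinate (1,2)
  printf("\n");
  // c(1, 2) = ...  // but there is no underlying bool/int storage to assign into
  return 0;
}
```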
So why are we able to modify the cA and cB tensors in the tutorial code with:
Does this call implicitly convert the cB tensor from a tensor view into a tensor with actual allocated memory?