Another error is that it only supports FP16 inputs, so I convert q/k/v from FP32 to FP16 and convert the output from FP16 back to FP32 after fa2.
After these corrections, I can run fa2-cm normally.
However, the results look bad: the gradients explode.
Could my modifications above be the cause, or is something else likely going on?
Looking forward to your reply!
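One plausible cause of the exploding gradients is the FP32→FP16 round-trip itself: FP16 has a maximum value of 65504 and a much smaller subnormal range than FP32, so large activations overflow to `inf` and tiny gradients underflow to zero. The sketch below uses numpy as a stand-in for torch tensors to show both failure modes; it is an illustration of the casting issue, not the fa2 code itself.

```python
import numpy as np

# FP16 has a far smaller dynamic range than FP32.
FP16_MAX = float(np.finfo(np.float16).max)  # 65504.0

def to_fp16_and_back(x_fp32):
    """Round-trip a float32 array through float16, as done around fa2."""
    return x_fp32.astype(np.float16).astype(np.float32)

# Values beyond the FP16 range overflow to inf, which then propagates
# through the backward pass...
big = np.array([1e5], dtype=np.float32)
print(to_fp16_and_back(big))   # [inf]

# ...and tiny gradients underflow to zero. This is why mixed-precision
# training normally pairs FP16 with loss scaling.
tiny = np.array([1e-8], dtype=np.float32)
print(to_fp16_and_back(tiny))  # [0.]
```

If the casts are kept, wrapping training in a loss-scaling scheme (e.g. PyTorch's AMP grad scaler) is the usual mitigation.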
For FP32 support, I need to make a minor edit. This was attempted by someone else earlier, but their code was buggy so I had to revert it. I will make this edit later this week and let you know!
Do you have any idea why FP32 would produce wrong results in the backward pass?
Oh, it shouldn't. Basically, I hardcoded FP16 in the code, carried over from the original Triton example. It should instead take the dtype of the tensors being fed in.
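The fix described above, taking the dtype from the inputs rather than hardcoding FP16, can be sketched as follows. This is a hypothetical stand-in (numpy instead of torch, a naive softmax(QK^T)V reference instead of the Triton kernel), meant only to show the dtype-handling pattern.

```python
import numpy as np

def attention(q, k, v):
    # All inputs must share one floating dtype; use it for the output
    # rather than assuming float16 (the bug described above).
    assert q.dtype == k.dtype == v.dtype, "q/k/v dtypes must match"
    dtype = q.dtype

    # Naive softmax(QK^T)V reference; the result is cast back to the
    # inputs' dtype instead of a hardcoded one.
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights @ v).astype(dtype)

q = k = v = np.random.rand(4, 8).astype(np.float32)
out = attention(q, k, v)
assert out.dtype == np.float32  # output follows the input dtype
```

The same call with FP16 inputs would yield an FP16 output, so no explicit casting is needed at the call site.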
Hi, thanks for your work. I'm trying your fa2-cm, but it raises an error because of the following assertion:

I solved this problem by adding
`.contiguous()`
as follows:
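The assertion in question is the kind of stride check that fails on transposed or permuted tensor views, which is why `.contiguous()` fixes it. A minimal numpy analog of the workaround (assuming the assertion is a standard innermost-stride/contiguity check, which the issue does not show):

```python
import numpy as np

# A transposed array is a view with non-row-major strides; kernels that
# assert contiguity (innermost stride == 1) reject it.
x = np.ones((4, 8), dtype=np.float16).T   # shape (8, 4), transposed view
print(x.flags['C_CONTIGUOUS'])            # False

# Analog of torch's .contiguous(): copy into row-major layout so the
# kernel's stride assertion passes.
x_fixed = np.ascontiguousarray(x)
print(x_fixed.flags['C_CONTIGUOUS'])      # True
```

In torch, the equivalent is calling `q.contiguous()` (and likewise for k/v) before handing the tensors to the kernel; the copy has a cost, but it is usually negligible next to the attention computation.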