I am using the NVIDIA HPC SDK compiler, version 24.5.
When I try to build the following sample program with the flags "-O2" or "-O3 -fast", VkFFT produces invalid device code, which fails to compile.
The error is VkFFT.cu(453): error: identifier "f" is undefined
loc_1.x=fma(temp_2.x, 8.85456025653209896e-01f, loc_1.x);
loc_1.y=fma(temp_7.y, 8.85456025653209896e-01f, loc_1.y);
loc_12.x=fma(temp_2.y, 4.64723172043768546e-01f, loc_12.x);
loc_12.y=fma(temp_7.x, 4.64723172043768546e-01f, loc_12.y);
loc_2.x=fma(temp_2.x, -9.70941817426052027e-01f, loc_2.x);
loc_2.y=fma(temp_7.y, -9.70941817426052027e-01f, loc_2.y);
loc_11.x=fma(temp_2.y, f, loc_11.x); // HERE and also a lot of times later
loc_11.y=fma(temp_7.x, f, loc_11.y);
Passing flag "-O1" seems to work just fine.
VkFFT is the latest version from the develop branch.
Here is the sample: