WIP: [GPU] Use FP32 accumulator for QK multiplication for 2nd+ token calculation in PagedAttention #17056
| Job | Run time |
|---|---|
| | 21s |
| | 35m 33s |
| | 0s |
| | 0s |
| | 7m 58s |
| | 0s |
| | 0s |
| | 0s |
| | 57s |
| | 4m 21s |
| | 9m 47s |
| | 0s |
| | 1s |
| | 58m 58s |