WIP: [GPU] Use FP32 accumulator for QK multiplication for 2nd+ token calculation in PagedAttention #18501
Job | Run time |
---|---|
20s | |
48s | |
12m 38s | |
0s | |
0s | |
1m 45s | |
52s | |
1m 52s | |
0s | |
2m 45s | |
2m 39s | |
0s | |
1s | |
23m 40s |