[Speculative decoding] Fix draft_model tun in case of long prompt (#1… #8
Job | Run time |
---|---|
1s | |
1s | |
1s | |
17m 52s | |
18m 6s | |
14m 1s | |
15m 44s | |
1s | |
27m 4s | |
1s | |
1s | |
35m 20s | |
1s | |
1s | |
1s | |
1s | |
1s | |
1s | |
2h 8m 19s |