Commit

For bs=1 set kv_cache_type to paged.
Signed-off-by: MaheshRavishankar <[email protected]>
MaheshRavishankar committed Jan 7, 2025
1 parent 922b1c2 commit 05358b1
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion sharktank/sharktank/examples/export_paged_llm_v1.py
@@ -79,7 +79,7 @@ def main():
         hp,
         tensor_parallelism_size=tensor_parallelism_size,
         use_hf=False,
-        kv_cache_type="direct" if args.bs == [1] else "paged",
+        kv_cache_type="paged",
         attention_kernel=args.attention_kernel,
         block_seq_stride=args.block_seq_stride,
     )
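
The behavior change in this hunk can be sketched in isolation. This is a minimal illustration, not code from the repository: the helper names `select_kv_cache_type_before` and `select_kv_cache_type_after` are hypothetical, and it assumes `args.bs` is a list of integer batch sizes, as the comparison against `[1]` in the removed line suggests.

```python
def select_kv_cache_type_before(batch_sizes: list[int]) -> str:
    """Selection as written before this commit (hypothetical helper)."""
    # Only the exact list [1] selected the "direct" cache.
    return "direct" if batch_sizes == [1] else "paged"


def select_kv_cache_type_after(batch_sizes: list[int]) -> str:
    """Selection after this commit (hypothetical helper)."""
    # The batch sizes no longer influence the choice.
    return "paged"


if __name__ == "__main__":
    for bs in ([1], [4], [1, 4]):
        before = select_kv_cache_type_before(bs)
        after = select_kv_cache_type_after(bs)
        print(f"bs={bs}: before={before}, after={after}")
```

Under these assumptions, the only input whose result changes is `[1]`, which matches the commit title.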
