From 12079c366f1192b94265c621d3cebb8e93391cda Mon Sep 17 00:00:00 2001
From: Charlie Fu
Date: Thu, 13 Jun 2024 14:55:18 -0500
Subject: [PATCH] Update quark quantizer command in fp8 instruction (#49)

* update quark quantizer command

* typo
---
 ROCm_performance.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ROCm_performance.md b/ROCm_performance.md
index 1c47a818ec852..bae57ea62d47c 100644
--- a/ROCm_performance.md
+++ b/ROCm_performance.md
@@ -30,7 +30,7 @@ python3 quantize_quark.py --model_dir [llama2 checkpoint folder] \
     --output_dir output_dir \
     --quant_scheme w_fp8_a_fp8_o_fp8 \
     --num_calib_data 128 \
-    --export_safetensors \
+    --model_export vllm_adopted_safetensors \
     --no_weight_matrix_merge
 ```
 For more details, please refer to Quark's documentation.