diff --git a/docs/llama2.md.template b/docs/llama2.md.template
index 23b94e2..5fb4a98 100644
--- a/docs/llama2.md.template
+++ b/docs/llama2.md.template
@@ -20,6 +20,7 @@
 | [ctranslate](/bench_ctranslate/) | 46.26 ± 1.59 | 79.41 ± 0.37 | 48.20 ± 0.14 | - |
 | [vllm](/bench_vllm/) | 89.40 ± 0.22 | 89.43 ± 0.19 | - | 115.52 ± 0.49 |
 | [exllamav2](/bench_exllamav2/) | - | - | 125.58 ± 1.23 | 159.68 ± 1.85 |
+| [onnx](/bench_onnxruntime/) | 14.28 ± 0.12 | 19.42 ± 0.08 | - | - |
 
 **Performance Metrics:** GPU Memory Consumption (unit: MB)
 
@@ -35,6 +36,7 @@
 | [ctranslate](/bench_ctranslate/) | 29951.52 | 16282.29 | 9470.74 | - |
 | [vllm](/bench_vllm/) | 77928.07 | 77928.07 | - | 77768.69 |
 | [exllamav2](/bench_exllamav2/) | - | - | 16582.18 | 7201.62 |
+| [onnx](/bench_onnxruntime/) | 33072.09 | 19180.55 | - | - |
 
 *(Data updated: ``)
diff --git a/docs/mistral.md.template b/docs/mistral.md.template
index ecfa022..9279066 100644
--- a/docs/mistral.md.template
+++ b/docs/mistral.md.template
@@ -20,6 +20,7 @@
 | [ctranslate](/bench_ctranslate/) | 43.17 ± 2.97 | 68.03 ± 0.27 | 45.14 ± 0.24 | - |
 | [vllm](/bench_vllm/) | 84.91 ± 0.27 | 84.89 ± 0.28 | - | 106.03 ± 0.53 |
 | [exllamav2](/bench_exllamav2/) | - | - | 114.81 ± 1.47 | 126.29 ± 3.05 |
+| [onnx](/bench_onnxruntime/) | 15.75 ± 0.15 | 22.39 ± 0.14 | - | - |
 
 **Performance Metrics:** GPU Memory Consumption (unit: MB)
 
@@ -34,6 +35,7 @@
 | [ctranslate](/bench_ctranslate/) | 32602.32 | 17523.8 | 10074.72 | - |
 | [vllm](/bench_vllm/) | 73568.09 | 73790.39| - | 74016.88 |
 | [exllamav2](/bench_exllamav2/) | - | - | 21483.23 | 9460.25 |
+| [onnx](/bench_onnxruntime/) | 33629.93 | 19537.07 | - | - |
 
 *(Data updated: ``)