ASR Benchmark

Hui Zhang edited this page Jun 17, 2022 · 23 revisions

RTF Definition

RTF (real-time factor) = total processing time / total audio duration
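The definition above can be sketched in a few lines of Python. This is only an illustration of the formula, not the script used to produce the numbers on this page; the names `real_time_factor`, `measure_rtf`, and the `recognize` callable are hypothetical and stand in for whatever ASR entry point is being benchmarked.

```python
import time
from typing import Any, Callable, Iterable, Tuple


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return RTF = total processing time / total audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(
    recognize: Callable[[Any], Any],
    utterances: Iterable[Tuple[Any, float]],
) -> float:
    """Time a hypothetical `recognize` callable over a set of utterances.

    `utterances` yields (audio, duration_in_seconds) pairs. Times are summed
    over the whole set before dividing, matching the "total / total" form
    of the definition.
    """
    total_processing = 0.0
    total_audio = 0.0
    for audio, duration in utterances:
        start = time.perf_counter()
        recognize(audio)  # assumed to run decoding synchronously
        total_processing += time.perf_counter() - start
        total_audio += duration
    return real_time_factor(total_processing, total_audio)
```

For example, `real_time_factor(30.0, 100.0)` gives an RTF of 0.3: 100 seconds of audio processed in 30 seconds. An RTF below 1 means decoding runs faster than real time.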

Test Data

The Aishell-1 test set is used as the test set.

TODO: data distribution.

Non-Streaming ASR

Hardware: GPU: V100 32 GB; CPU: Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test script: CLI

Using 1 GPU

| Acoustic Model | decoding_method | ctc_weight | decoding_chunk_size | num_decoding_left_chunk | RTF |
| --- | --- | --- | --- | --- | --- |
| conformer_aishell | attention_rescoring | 0.5 | 16 | -1 | 0.0623 |
| conformer_wenetspeech | attention_rescoring | 0.5 | 16 | -1 | 0.0623 |
| deepspeech2offline_aishell | ctc_prefix_beam_search | - | 1 | - | 0.1787 |

Using CPU

| Acoustic Model | decoding_method | ctc_weight | decoding_chunk_size | num_decoding_left_chunk | RTF |
| --- | --- | --- | --- | --- | --- |
| conformer_aishell | attention_rescoring | 0.5 | 16 | -1 | 0.3 |
| conformer_wenetspeech | attention_rescoring | 0.5 | 16 | -1 | 0.51539 |
| deepspeech2offline_aishell | ctc_prefix_beam_search | - | 1 | - | 0.3953 |

Streaming ASR

Hardware: GPU: V100 32 GB; CPU: Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test script: Streaming Server

Using 1 GPU

| Acoustic Model | engine | decoding_method | ctc_weight | decoding_chunk_size | num_decoding_left_chunk | RTF |
| --- | --- | --- | --- | --- | --- | --- |
| conformer_online_multicn | python | attention_rescoring | 0.5 | 16 | -1 | 0.250782 |
| conformer_wenetspeech | python | attention_rescoring | 0.5 | 16 | -1 | 0.26339 |
| deepspeech2online_aishell | inference | ctc_prefix_beam_search | - | 1 | - | 0.351434 |

Using CPU

| Acoustic Model | Model Size | engine | decoding_method | ctc_weight | decoding_chunk_size | num_decoding_left_chunk | RTF |
| --- | --- | --- | --- | --- | --- | --- | --- |
| conformer_online_multicn | - | python | attention_rescoring | 0.5 | 16 | -1 | 1.55706 |
| conformer_wenetspeech | - | python | attention_rescoring | 0.5 | 16 | -1 | 0.895237 |
| deepspeech2online_aishell | - | inference | ctc_prefix_beam_search | - | 1 | - | 0.874739 |
| deepspeech2online_wenetspeech | - | inference | ctc_prefix_beam_search | - | 1 | - | 1.9108175171428279 (utts=80) |
| deepspeech2online_wenetspeech | 659MB | onnx | ctc_prefix_beam_search | - | 1 | - | 0.5617182449999291 (utts=80) |
| deepspeech2online_wenetspeech | 166MB | onnx quant | ctc_prefix_beam_search | - | 1 | - | 0.5617182449999291 (utts=80) |