ASR Benchmark

RTF Definition

RTF (real-time factor) = total processing time / total audio duration
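
The definition above can be sketched in a few lines of Python. This is only an illustration of the formula, not the script used to produce the numbers in this document: `real_time_factor`, `measure_rtf`, and the `recognize` callable are hypothetical names introduced here, and `recognize` stands in for whichever decoder is being benchmarked.

```python
import time
from typing import Callable, Iterable, Tuple


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return RTF = total processing time / total audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(
    utterances: Iterable[Tuple[object, float]],
    recognize: Callable[[object], str],
) -> float:
    """Time `recognize` over (audio, duration_in_seconds) pairs.

    `recognize` is a hypothetical stand-in for the system under test.
    The RTF is computed from the summed totals, not as a mean of
    per-utterance ratios.
    """
    total_processing = 0.0
    total_audio = 0.0
    for audio, duration in utterances:
        start = time.perf_counter()
        recognize(audio)
        total_processing += time.perf_counter() - start
        total_audio += duration
    return real_time_factor(total_processing, total_audio)
```

An RTF below 1 means the system processes audio faster than real time. With hypothetical totals of 6.23 s of processing for 100 s of audio, `real_time_factor(6.23, 100.0)` gives an RTF of about 0.0623.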

The Aishell-1 test set is used for evaluation.

TODO: data distribution.

Hardware: GPU: V100 32 GB; CPU: Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test script: CLI
Using 1 GPU

Acoustic Model	decoding_method	ctc_weight	decoding_chunk_size	num_decoding_left_chunk	RTF
conformer_aishell	attention_rescoring	0.5	16	-1	0.0623
conformer_wenetspeech	attention_rescoring	0.5	16	-1	0.0623
deepspeech2offline_aishell	ctc_prefix_beam_search	-	1	-	0.1787

Using CPU

Acoustic Model	decoding_method	ctc_weight	decoding_chunk_size	num_decoding_left_chunk	RTF
conformer_aishell	attention_rescoring	0.5	16	-1	0.3
conformer_wenetspeech	attention_rescoring	0.5	16	-1	0.51539
deepspeech2offline_aishell	ctc_prefix_beam_search	-	1	-	0.3953

Hardware: GPU: V100 32 GB; CPU: Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test script: Streaming Server
Using 1 GPU

Acoustic Model	engine	decoding_method	ctc_weight	decoding_chunk_size	num_decoding_left_chunk	RTF
conformer_online_multicn	python	attention_rescoring	0.5	16	-1	0.250782
conformer_wenetspeech	python	attention_rescoring	0.5	16	-1	0.26339
deepspeech2online_aishell	inference	ctc_prefix_beam_search	-	1	-	0.351434

Using CPU

Acoustic Model	Model Size	engine	decoding_method	ctc_weight	decoding_chunk_size	num_decoding_left_chunk	RTF
conformer_online_multicn	-	python	attention_rescoring	0.5	16	-1	1.55706
conformer_wenetspeech	-	python	attention_rescoring	0.5	16	-1	0.895237
deepspeech2online_aishell	-	inference	ctc_prefix_beam_search	-	1	-	0.874739
deepspeech2online_wenetspeech	-	inference	ctc_prefix_beam_search	-	1	-	1.9108175171428279 (utts=80)
deepspeech2online_wenetspeech	659MB	onnx	ctc_prefix_beam_search	-	1	-	0.5617182449999291 (utts=80)
deepspeech2online_wenetspeech	166MB	onnx quant	ctc_prefix_beam_search	-	1	-	0.5617182449999291 (utts=80)