Deep Speaker: speaker recognition system

Data Set: LibriSpeech
Reference paper: "Deep Speaker: an End-to-End Neural Speaker Embedding System" https://arxiv.org/pdf/1705.02304.pdf
Reference code: https://github.com/philipperemy/deep-speaker (Thanks to Philippe Rémy. I modified the code substantially during the experiment, but the overall approach is still similar.)

The model was trained on the LibriSpeech train-clean data and tested on the LibriSpeech test-clean data. With the CNN model, this code reaches an equal error rate (EER) of about 5% on LibriSpeech.

About Code

train.py
This is the main file. It trains the model, then saves the model and evaluates the result at a fixed interval of steps.
models.py
Implements the models used in this project: a CNN model (similar to the paper's CNN), a GRU model (similar to the paper's GRU), and a simplified simple_cnn model.
select_batch.py
Chooses the optimal batch to feed to the network. This is one of the core parts of the experiment.
triplet_loss.py
Computes the triplet loss used to train the network.
test_model.py
Evaluates (tests) the model and reports metrics such as EER.
eval_metrics.py
Contains the equal error rate, F-measure, accuracy, and other metrics used in the evaluation.
pretraining.py
Pre-trains the network with softmax classification.
pre_process.py
Reads the audio data, filters out silence, extracts fbank (filter bank) features, and saves the extracted features in .npy format.
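
The triplet loss mentioned for triplet_loss.py can be illustrated with a small framework-free sketch. It assumes the cosine-similarity form used in the reference paper, max(0, s_an - s_ap + margin), and is not the code from this repository. The function names and the margin value of 0.1 are illustrative assumptions. The `hardest_negative` helper shows one possible reading of "optimal batch" selection (picking the negative most similar to the anchor); whether select_batch.py works this way is also an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss on cosine similarities.

    anchor and positive come from the same speaker, negative from a
    different speaker. The loss is zero once the anchor is closer to the
    positive than to the negative by at least `margin` (assumed value).
    """
    s_ap = cosine_similarity(anchor, positive)
    s_an = cosine_similarity(anchor, negative)
    return max(0.0, s_an - s_ap + margin)


def hardest_negative(anchor, negatives):
    """Hypothetical helper: return the negative most similar to the anchor."""
    return max(negatives, key=lambda n: cosine_similarity(anchor, n))


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    positive = [0.9, 0.1]
    negatives = [[0.0, 1.0], [0.8, 0.6]]
    hard = hardest_negative(anchor, negatives)
    print("hardest negative:", hard)
    print("loss:", triplet_loss(anchor, positive, hard))
```

An easy triplet (negative far from the anchor) contributes zero loss, which is why selecting harder negatives tends to give more useful gradients during training.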

Results

Trained on the LibriSpeech train-clean data and tested on the LibriSpeech test-clean data, the CNN model reaches an equal error rate (EER) of about 5%.
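
For readers unfamiliar with the metric, the EER is the operating point where the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how it can be estimated from trial scores, assuming higher scores mean "same speaker". The function name and the simple threshold sweep are illustrative assumptions and not necessarily how eval_metrics.py computes it.

```python
def equal_error_rate(scores, labels):
    """Estimate the EER from similarity scores and binary labels.

    labels: 1 for a same-speaker trial, 0 for a different-speaker trial.
    Each observed score is tried as a threshold, and the threshold where
    the false acceptance and false rejection rates are closest is used.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both same-speaker and different-speaker trials")

    best_gap = float("inf")
    eer = None
    for threshold in sorted(set(scores)):
        # A trial is accepted when its score is at or above the threshold.
        false_accepts = sum(
            1 for s, y in zip(scores, labels) if s >= threshold and y == 0
        )
        false_rejects = sum(
            1 for s, y in zip(scores, labels) if s < threshold and y == 1
        )
        far = false_accepts / n_neg
        frr = false_rejects / n_pos
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
    labels = [1, 1, 0, 1, 0, 0]
    print("EER:", equal_error_rate(scores, labels))
```

This sweep is quadratic in the number of trials, which is fine for a small illustration; a real evaluation over many trials would typically sort the scores once and accumulate the error counts.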

For more details, please read 'deep_speaker实验报告.pdf' (the experiment report, written in Chinese). If you would like the details in English, please contact me.
