masoud parpanchi edited this page Jun 7, 2020 · 5 revisions

Speech To Text using TensorFlow, Deep Learning 🧠, Mozilla Deep Speech Wiki 📖

This is a TensorFlow implementation of the speech-to-text system described in the paper published by Baidu. The project also draws on ideas from the TensorFlow documentation, a Udacity course, and other readings such as Medium posts and web pages.

Compatibility

The code is tested with TensorFlow r2 under Ubuntu 18.04 with Python 3. It has also been tested in Kaggle Notebooks, so you can run it there too. The required Deep Speech library versions are defined in the Mozilla Deep Speech documentation; follow that documentation to create your own model.

Progress table 📋

| Date 📅 | Update 🔄 |
| --- | --- |
| March 20, 2020 | Research started: books, web pages, Medium posts, and many other sources. You can find these sources in the [Tutorial folder](https://github.com/shenasa-ai/speech2text/tree/master/Speeh2Text_Tutorial) of this repository. |
| April 1, 2020 | Attention model code created. |
| April 17, 2020 | Started using the Mozilla Deep Speech dataset and model. |
| April 30, 2020 | A guide to creating a Persian dataset was written. To download the PDF file, check the Tutorial folder of this repository. |
| May 7, 2020 | CTC custom data generator code created. |
| May 17, 2020 | Useful scripts collected. You can find them in the Useful Scripts folder. |
| May 27, 2020 | Started creating the dataset. You can find the scripts in the Useful Scripts folder. |
| June 5, 2020 | Started training the Mozilla_Deep_Speech and attention models. |

Models Output 🎯

Soon ...

| Model name | Loss/accuracy | Dataset | Architecture |
| --- | --- | --- | --- |
| Soon ... | Soon ... | Soon ... | Soon ... |

Inspirations ℹ️

As mentioned above, this code is heavily inspired by many sources, such as:
1. TensorFlow documentation
2. Udacity
3. Blogs
4. Web pages
5. More soon.

Training Data 💾

We used the Mozilla Deep Speech dataset, and we also created our own dataset from movies, news, audiobooks, and similar sources.

Pre Processing 🔨

Several preprocessing representations are commonly used in speech-to-text systems, such as spectrograms, MFCCs, and filter banks.

We used MFCC and spectrogram features. For more information about this preprocessing, check the Tutorial folder.
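To make the spectrogram step concrete, here is a minimal sketch in NumPy. It is not the project's actual preprocessing code: the function name `log_spectrogram`, the frame sizes (400 samples with a 160-sample step, i.e. 25 ms and 10 ms at an assumed 16 kHz sample rate), and the synthetic test tone are all assumptions chosen for illustration. MFCCs would add a mel filter bank and a discrete cosine transform on top of a spectrogram like this one; that part is not shown.

```python
import numpy as np

def log_spectrogram(signal, frame_length=400, frame_step=160, eps=1e-10):
    """Return a (num_frames, frame_length // 2 + 1) log-magnitude spectrogram.

    Hypothetical helper for illustration; the frame sizes are assumptions.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_length:
        # Pad short clips with zeros so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_length - len(signal)))
    num_frames = 1 + (len(signal) - frame_length) // frame_step
    # Index matrix: each row selects one overlapping frame of the signal.
    idx = (np.arange(frame_length)[None, :]
           + frame_step * np.arange(num_frames)[:, None])
    # Apply a Hann window to each frame to reduce spectral leakage.
    frames = signal[idx] * np.hanning(frame_length)
    # Magnitude of the real FFT of every frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # eps avoids log(0) for silent bins.
    return np.log(magnitude + eps)

# Assumed input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
spec = log_spectrogram(tone)
```

Each row of `spec` is one time frame and each column is one frequency bin, so the array can be fed to a model as a sequence of feature vectors.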

About preprocessing the texts:
We tokenize the texts into indexes at the character level. Both the TensorFlow tokenizer (TF.tokenizer) and our own tokenization code were used for this purpose.
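Character-level tokenization can be sketched in plain Python as follows. This is an illustrative example, not the project's own tokenization code: the helper names `build_vocab`, `encode`, and `decode`, the sample transcripts, and the choice to reserve index 0 for padding are assumptions.

```python
def build_vocab(texts):
    """Map every character seen in `texts` to an integer index.

    Index 0 is reserved for padding (an assumption of this sketch).
    """
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 1 for i, ch in enumerate(chars)}

def encode(text, vocab):
    """Convert a string into a list of character indexes."""
    return [vocab[ch] for ch in text]

def decode(indexes, vocab):
    """Convert indexes back into a string, skipping padding zeros."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in indexes if i != 0)

# Hypothetical transcripts used only to demonstrate the round trip.
transcripts = ["hello world", "speech to text"]
vocab = build_vocab(transcripts)
encoded = [encode(t, vocab) for t in transcripts]
decoded = [decode(seq, vocab) for seq in encoded]
```

Working at the character level keeps the vocabulary small, which suits CTC-style and attention-based models that emit one symbol at a time.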

Performance 🔭

Soon.
