Home
This is a TensorFlow implementation of the speech-to-text model described in the paper published by Baidu. The project also draws on ideas from the TensorFlow documentation, a Udacity course, and many other readings from Medium, web pages, etc.
The code is tested with TensorFlow r2 on Ubuntu 18.04 with Python 3. It has also been tested in Kaggle Notebooks, so you can use it there as well. The Deep Speech library versions are listed in the Mozilla Deep Speech documentation if you want to create your own model.
Date 📅 | Update 🔄 |
---|---|
March 20th, 2020 | Research started: books, web pages, Medium posts, and many other sources. You can find these sources in the [Tutorial folder](https://github.com/shenasa-ai/speech2text/tree/master/Speeh2Text_Tutorial) of this repository. |
April 1st, 2020 | Attention model code created. |
April 17th, 2020 | Started using the Mozilla Deep Speech dataset and model. |
April 30th, 2020 | A guide for creating a Persian dataset was written. To download the PDF, check the Tutorial folder of this repository. |
May 7th, 2020 | Custom CTC data generator code created. |
May 17th, 2020 | Useful scripts collected; you can find them in the Useful Scripts folder. |
May 27th, 2020 | Started creating the dataset. You can find the scripts in the Useful Scripts folder. |
June 5th, 2020 | Started training the Mozilla_Deep_Speech and attention models. |
Soon ...
Model Name | Loss/Accuracy | Dataset | Architecture |
---|---|---|---|
Soon ... | Soon ... | Soon ... | Soon ... |
As mentioned, this code is heavily inspired by many sources, such as:
1. TensorFlow documentation
2. Udacity
3. Blogs
4. Web pages
5. More soon...
We used the Mozilla Deep Speech dataset, and we also created our own dataset from movies, news, audiobooks, etc.
There are several useful preprocessing methods for speech-to-text systems, such as spectrograms, MFCCs, and filter banks.
We used MFCCs and spectrograms; to get more information about this preprocessing, check the Tutorial folder. A sketch of both feature types is shown below.
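Below is a minimal sketch of the two feature types, using librosa for illustration; the sample rate, window, and MFCC settings here are assumptions, not the exact values used in this repository.

```python
# Illustrative feature extraction for speech-to-text preprocessing.
# NOTE: librosa and all parameter values below are assumptions for this
# sketch; see the Tutorial folder for the settings actually used.
import numpy as np
import librosa

def extract_features(wav_path, sr=16000, n_fft=512, hop_length=160, n_mfcc=13):
    """Return (log-spectrogram, MFCC) feature matrices for one audio file."""
    audio, _ = librosa.load(wav_path, sr=sr)

    # Log-magnitude spectrogram, shape: (1 + n_fft // 2, num_frames)
    stft = np.abs(librosa.stft(audio, n_fft=n_fft, hop_length=hop_length))
    log_spectrogram = librosa.amplitude_to_db(stft, ref=np.max)

    # MFCCs, shape: (n_mfcc, num_frames)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc,
                                n_fft=n_fft, hop_length=hop_length)
    return log_spectrogram, mfcc
```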
About preprocessing the texts:
We tokenize the transcripts into index sequences at the character level. The TensorFlow Keras tokenizer, as well as our own tokenization code, was used for this purpose; a sketch follows below.
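As a minimal sketch of the character-level approach, assuming plain-string transcripts (the repository's own tokenization code may differ in detail):

```python
# Character-level tokenization with tf.keras; the example transcripts
# below are hypothetical placeholders.
import tensorflow as tf

transcripts = ["hello world", "speech to text"]

tokenizer = tf.keras.preprocessing.text.Tokenizer(char_level=True)
tokenizer.fit_on_texts(transcripts)

# Each transcript becomes a sequence of 1-based character indexes.
sequences = tokenizer.texts_to_sequences(transcripts)
print(tokenizer.word_index)  # e.g. {' ': 1, 'e': 2, ...}
print(sequences[0])          # index sequence for "hello world"
```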
Soon.