Address review comments
reuben committed Aug 6, 2020
1 parent 4d98958 commit 0b51004
Showing 27 changed files with 130 additions and 133 deletions.
2 changes: 1 addition & 1 deletion BIBLIOGRAPHY.md
@@ -1,5 +1,5 @@
This file contains a list of papers in chronological order that have been published
using Mozilla's DeepSpeech.
using Mozilla Voice STT.

To appear
==========
6 changes: 3 additions & 3 deletions README.rst
@@ -1,5 +1,5 @@
Project DeepSpeech
==================
Mozilla Voice STT
=================


.. image:: https://readthedocs.org/projects/deepspeech/badge/?version=latest
@@ -12,7 +12,7 @@ Project DeepSpeech
:alt: Task Status


DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
Mozilla Voice STT is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Mozilla Voice STT uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.

Documentation for installation, usage, and training models are available on `deepspeech.readthedocs.io <http://deepspeech.readthedocs.io/?badge=latest>`_.

28 changes: 14 additions & 14 deletions doc/BUILDING.rst
@@ -99,7 +99,7 @@ Now, ``cd`` into the ``DeepSpeech/native_client`` directory and use the ``Makefi
.. code-block::
cd ../DeepSpeech/native_client
make deepspeech
make mozilla_voice_stt
Installing your own Binaries
----------------------------
@@ -121,7 +121,7 @@ Included are a set of generated Python bindings. After following the above build
cd native_client/python
make bindings
pip install dist/deepspeech*
pip install dist/mozilla_voice_stt*
The API mirrors the C++ API and is demonstrated in `client.py <python/client.py>`_. Refer to the `C API <c-usage>` for documentation.

@@ -175,13 +175,13 @@ And your command line for ``LePotato`` and ``ARM64`` should look like:
While we test only on RPi3 Raspbian Buster and LePotato ARMBian Buster, anything compatible with ``armv7-a cortex-a53`` or ``armv8-a cortex-a53`` should be fine.

The ``deepspeech`` binary can also be cross-built, with ``TARGET=rpi3`` or ``TARGET=rpi3-armv8``. This might require you to setup a system tree using the tool ``multistrap`` and the multitrap configuration files: ``native_client/multistrap_armbian64_buster.conf`` and ``native_client/multistrap_raspbian_buster.conf``.
The ``mozilla_voice_stt`` binary can also be cross-built, with ``TARGET=rpi3`` or ``TARGET=rpi3-armv8``. This might require you to setup a system tree using the tool ``multistrap`` and the multitrap configuration files: ``native_client/multistrap_armbian64_buster.conf`` and ``native_client/multistrap_raspbian_buster.conf``.
The path of the system tree can be overridden from the default values defined in ``definitions.mk`` through the ``RASPBIAN`` ``make`` variable.

.. code-block::
cd ../DeepSpeech/native_client
make TARGET=<system> deepspeech
make TARGET=<system> mozilla_voice_stt
Android devices support
-----------------------
@@ -236,10 +236,10 @@ Please note that you might have to copy the file to a local Maven repository
and adapt file naming (when missing, the error message should states what
filename it expects and where).

Building C++ ``deepspeech`` binary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Building C++ ``mozilla_voice_stt`` binary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Building the ``deepspeech`` binary will happen through ``ndk-build`` (ARMv7):
Building the ``mozilla_voice_stt`` binary will happen through ``ndk-build`` (ARMv7):

.. code-block::
@@ -272,32 +272,32 @@ demo of one usage of the application. For example, it's only able to read PCM
mono 16kHz 16-bits file and it might fail on some WAVE file that are not
following exactly the specification.

Running ``deepspeech`` via adb
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Running ``mozilla_voice_stt`` via adb
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You should use ``adb push`` to send data to device, please refer to Android
documentation on how to use that.

Please push Mozilla Voice STT data to ``/sdcard/deepspeech/``\ , including:
Please push Mozilla Voice STT data to ``/sdcard/mozilla_voice_stt/``\ , including:


* ``output_graph.tflite`` which is the TF Lite model
* External scorer file (available from one of our releases), if you want to use
the scorer; please be aware that too big scorer will make the device run out
of memory

Then, push binaries from ``native_client.tar.xz`` to ``/data/local/tmp/ds``\ :
Then, push binaries from ``native_client.tar.xz`` to ``/data/local/tmp/stt``\ :

* ``deepspeech``
* ``mozilla_voice_stt``
* ``libmozilla_voice_stt.so``
* ``libc++_shared.so``

You should then be able to run as usual, using a shell from ``adb shell``\ :

.. code-block::
user@device$ cd /data/local/tmp/ds/
user@device$ LD_LIBRARY_PATH=$(pwd)/ ./deepspeech [...]
user@device$ cd /data/local/tmp/stt/
user@device$ LD_LIBRARY_PATH=$(pwd)/ ./mozilla_voice_stt [...]
Please note that Android linker does not support ``rpath`` so you have to set
``LD_LIBRARY_PATH``. Properly wrapped / packaged bindings does embed the library
26 changes: 13 additions & 13 deletions doc/DotNet-API.rst
@@ -2,17 +2,17 @@
==============


DeepSpeech Class
----------------
MozillaVoiceSttModel Class
--------------------------

.. doxygenclass:: DeepSpeechClient::DeepSpeech
.. doxygenclass:: MozillaVoiceSttClient::MozillaVoiceSttModel
:project: deepspeech-dotnet
:members:

DeepSpeechStream Class
----------------------
MozillaVoiceSttStream Class
---------------------------

.. doxygenclass:: DeepSpeechClient::Models::DeepSpeechStream
.. doxygenclass:: MozillaVoiceSttClient::Models::MozillaVoiceSttStream
:project: deepspeech-dotnet
:members:

@@ -21,33 +21,33 @@ ErrorCodes

See also the main definition including descriptions for each error in :ref:`error-codes`.

.. doxygenenum:: DeepSpeechClient::Enums::ErrorCodes
.. doxygenenum:: MozillaVoiceSttClient::Enums::ErrorCodes
:project: deepspeech-dotnet

Metadata
--------

.. doxygenclass:: DeepSpeechClient::Models::Metadata
.. doxygenclass:: MozillaVoiceSttClient::Models::Metadata
:project: deepspeech-dotnet
:members: Transcripts

CandidateTranscript
-------------------

.. doxygenclass:: DeepSpeechClient::Models::CandidateTranscript
.. doxygenclass:: MozillaVoiceSttClient::Models::CandidateTranscript
:project: deepspeech-dotnet
:members: Tokens, Confidence

TokenMetadata
-------------

.. doxygenclass:: DeepSpeechClient::Models::TokenMetadata
.. doxygenclass:: MozillaVoiceSttClient::Models::TokenMetadata
:project: deepspeech-dotnet
:members: Text, Timestep, StartTime

DeepSpeech Interface
--------------------
IMozillaVoiceSttModel Interface
-------------------------------

.. doxygeninterface:: DeepSpeechClient::Interfaces::IDeepSpeech
.. doxygeninterface:: MozillaVoiceSttClient::Interfaces::IMozillaVoiceSttModel
:project: deepspeech-dotnet
:members:
8 changes: 4 additions & 4 deletions doc/DotNet-Examples.rst
@@ -1,12 +1,12 @@
.NET API Usage example
======================

Examples are from `native_client/dotnet/DeepSpeechConsole/Program.cs`.
Examples are from `native_client/dotnet/MozillaVoiceSttConsole/Program.cs`.

Creating a model instance and loading model
-------------------------------------------

.. literalinclude:: ../native_client/dotnet/DeepSpeechConsole/Program.cs
.. literalinclude:: ../native_client/dotnet/MozillaVoiceSttConsole/Program.cs
:language: csharp
:linenos:
:lineno-match:
@@ -16,7 +16,7 @@ Creating a model instance and loading model
Performing inference
--------------------

.. literalinclude:: ../native_client/dotnet/DeepSpeechConsole/Program.cs
.. literalinclude:: ../native_client/dotnet/MozillaVoiceSttConsole/Program.cs
:language: csharp
:linenos:
:lineno-match:
@@ -26,4 +26,4 @@ Performing inference
Full source code
----------------

See :download:`Full source code<../native_client/dotnet/DeepSpeechConsole/Program.cs>`.
See :download:`Full source code<../native_client/dotnet/MozillaVoiceSttConsole/Program.cs>`.
12 changes: 6 additions & 6 deletions doc/Java-API.rst
@@ -1,29 +1,29 @@
Java
====

DeepSpeechModel
---------------
MozillaVoiceSttModel
--------------------

.. doxygenclass:: org::mozilla::deepspeech::libdeepspeech::DeepSpeechModel
.. doxygenclass:: org::mozilla::voice::stt::MozillaVoiceSttModel
:project: deepspeech-java
:members:

Metadata
--------

.. doxygenclass:: org::mozilla::deepspeech::libdeepspeech::Metadata
.. doxygenclass:: org::mozilla::voice::stt::Metadata
:project: deepspeech-java
:members: getNumTranscripts, getTranscript

CandidateTranscript
-------------------

.. doxygenclass:: org::mozilla::deepspeech::libdeepspeech::CandidateTranscript
.. doxygenclass:: org::mozilla::voice::stt::CandidateTranscript
:project: deepspeech-java
:members: getNumTokens, getConfidence, getToken

TokenMetadata
-------------
.. doxygenclass:: org::mozilla::deepspeech::libdeepspeech::TokenMetadata
.. doxygenclass:: org::mozilla::voice::stt::TokenMetadata
:project: deepspeech-java
:members: getText, getTimestep, getStartTime
8 changes: 4 additions & 4 deletions doc/Java-Examples.rst
@@ -1,12 +1,12 @@
Java API Usage example
======================

Examples are from `native_client/java/app/src/main/java/org/mozilla/deepspeech/DeepSpeechActivity.java`.
Examples are from `native_client/java/app/src/main/java/org/mozilla/voice/sttapp/MozillaVoiceSttActivity.java`.

Creating a model instance and loading model
-------------------------------------------

.. literalinclude:: ../native_client/java/app/src/main/java/org/mozilla/deepspeech/DeepSpeechActivity.java
.. literalinclude:: ../native_client/java/app/src/main/java/org/mozilla/voice/sttapp/MozillaVoiceSttActivity.java
:language: java
:linenos:
:lineno-match:
@@ -16,7 +16,7 @@ Creating a model instance and loading model
Performing inference
--------------------

.. literalinclude:: ../native_client/java/app/src/main/java/org/mozilla/deepspeech/DeepSpeechActivity.java
.. literalinclude:: ../native_client/java/app/src/main/java/org/mozilla/voice/sttapp/MozillaVoiceSttActivity.java
:language: java
:linenos:
:lineno-match:
@@ -26,4 +26,4 @@ Performing inference
Full source code
----------------

See :download:`Full source code<../native_client/java/app/src/main/java/org/mozilla/deepspeech/DeepSpeechActivity.java>`.
See :download:`Full source code<../native_client/java/app/src/main/java/org/mozilla/voice/sttapp/MozillaVoiceSttActivity.java>`.
4 changes: 2 additions & 2 deletions doc/ParallelOptimization.rst
@@ -1,8 +1,8 @@
Parallel Optimization
=====================

This is how we implement optimization of the DeepSpeech model across GPUs on a
single host. Parallel optimization can take on various forms. For example
This is how we implement optimization of the Mozilla Voice STT model across GPUs
on a single host. Parallel optimization can take on various forms. For example
one can use asynchronous updates of the model, synchronous updates of the model,
or some combination of the two.

28 changes: 14 additions & 14 deletions doc/SUPPORTED_PLATFORMS.rst
@@ -9,61 +9,61 @@ Linux / AMD64 without GPU
^^^^^^^^^^^^^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* Ubuntu 14.04+ (glibc >= 2.19, libstdc++6 >= 4.8)
* Full TensorFlow runtime (``deepspeech`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* Full TensorFlow runtime (``mozilla_voice_stt`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)

Linux / AMD64 with GPU
^^^^^^^^^^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* Ubuntu 14.04+ (glibc >= 2.19, libstdc++6 >= 4.8)
* CUDA 10.0 (and capable GPU)
* Full TensorFlow runtime (``deepspeech`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* Full TensorFlow runtime (``mozilla_voice_stt`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)

Linux / ARMv7
^^^^^^^^^^^^^
* Cortex-A53 compatible ARMv7 SoC with Neon support
* Raspbian Buster-compatible distribution
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)

Linux / Aarch64
^^^^^^^^^^^^^^^
* Cortex-A72 compatible Aarch64 SoC
* ARMbian Buster-compatible distribution
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)

Android / ARMv7
^^^^^^^^^^^^^^^
* ARMv7 SoC with Neon support
* Android 7.0-10.0
* NDK API level >= 21
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)

Android / Aarch64
^^^^^^^^^^^^^^^^^
* Aarch64 SoC
* Android 7.0-10.0
* NDK API level >= 21
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)

macOS / AMD64
^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* macOS >= 10.10
* Full TensorFlow runtime (``deepspeech`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* Full TensorFlow runtime (``mozilla_voice_stt`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)

Windows / AMD64 without GPU
^^^^^^^^^^^^^^^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* Windows Server >= 2012 R2 ; Windows >= 8.1
* Full TensorFlow runtime (``deepspeech`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* Full TensorFlow runtime (``mozilla_voice_stt`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)

Windows / AMD64 with GPU
^^^^^^^^^^^^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* Windows Server >= 2012 R2 ; Windows >= 8.1
* CUDA 10.0 (and capable GPU)
* Full TensorFlow runtime (``deepspeech`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages)
* Full TensorFlow runtime (``mozilla_voice_stt`` packages)
* TensorFlow Lite runtime (``mozilla_voice_stt_tflite`` packages)
6 changes: 3 additions & 3 deletions doc/TRAINING.rst
@@ -21,11 +21,11 @@ Clone the Mozilla Voice STT repository:
Creating a virtual environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In creating a virtual environment you will create a directory containing a ``python3`` binary and everything needed to run deepspeech. You can use whatever directory you want. For the purpose of the documentation, we will rely on ``$HOME/tmp/deepspeech-train-venv``. You can create it using this command:
In creating a virtual environment you will create a directory containing a ``python3`` binary and everything needed to run Mozilla Voice STT. You can use whatever directory you want. For the purpose of the documentation, we will rely on ``$HOME/tmp/stt-train-venv``. You can create it using this command:

.. code-block::
$ python3 -m venv $HOME/tmp/deepspeech-train-venv/
$ python3 -m venv $HOME/tmp/stt-train-venv/
Once this command completes successfully, the environment will be ready to be activated.
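
The same environment can, in principle, be created programmatically with Python's standard ``venv`` module. The following is only an illustrative sketch and not part of this commit: the helper name ``create_training_venv`` is hypothetical, and the target path simply reuses the ``$HOME/tmp/stt-train-venv`` location used in the text above.

```python
import sys
import venv
from pathlib import Path


def create_training_venv(target: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment at `target` and return its interpreter path.

    Hypothetical helper for illustration; it is roughly equivalent to running
    `python3 -m venv <target>` from a shell.
    """
    # EnvBuilder is the class behind the `python -m venv` command line.
    venv.EnvBuilder(with_pip=with_pip).create(target)

    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    if sys.platform == "win32":
        return target / "Scripts" / "python.exe"
    return target / "bin" / "python"


# Example call (assumed location, matching the documentation above):
#   create_training_venv(Path.home() / "tmp" / "stt-train-venv")
```

Activation is still a shell concern: the helper only builds the directory, and the ``activate`` script inside it has to be sourced from the shell as described in the next step.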

@@ -36,7 +36,7 @@ Each time you need to work with Mozilla Voice STT, you have to *activate* this v

.. code-block::
$ source $HOME/tmp/deepspeech-train-venv/bin/activate
$ source $HOME/tmp/stt-train-venv/bin/activate
Installing Mozilla Voice STT Training Code and its dependencies
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^