Merge branch 'rename-real'
reuben committed Aug 6, 2020
2 parents 2eb75b6 + 0b51004 commit ae9fdb1
Showing 184 changed files with 1,333 additions and 1,497 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -34,3 +34,6 @@
/doc/xml-java/
Dockerfile.build
Dockerfile.train
doc/xml-c
doc/xml-java
doc/xml-dotnet
2 changes: 1 addition & 1 deletion BIBLIOGRAPHY.md
@@ -1,5 +1,5 @@
This file contains a list of papers in chronological order that have been published
using Mozilla's DeepSpeech.
using Mozilla Voice STT.

To appear
==========
6 changes: 3 additions & 3 deletions Dockerfile.build.tmpl
@@ -149,20 +149,20 @@ RUN bazel build \
--copt=-msse4.2 \
--copt=-mavx \
--copt=-fvisibility=hidden \
//native_client:libdeepspeech.so \
//native_client:libmozilla_voice_stt.so \
--verbose_failures \
--action_env=LD_LIBRARY_PATH=${LD_LIBRARY_PATH}

# Copy built libs to /DeepSpeech/native_client
RUN cp bazel-bin/native_client/libdeepspeech.so /DeepSpeech/native_client/
RUN cp bazel-bin/native_client/libmozilla_voice_stt.so /DeepSpeech/native_client/

# Build client.cc and install Python client and decoder bindings
ENV TFDIR /DeepSpeech/tensorflow

RUN nproc

WORKDIR /DeepSpeech/native_client
RUN make NUM_PROCESSES=$(nproc) deepspeech
RUN make NUM_PROCESSES=$(nproc) mozilla_voice_stt

WORKDIR /DeepSpeech
RUN cd native_client/python && make NUM_PROCESSES=$(nproc) bindings
6 changes: 3 additions & 3 deletions README.rst
@@ -1,5 +1,5 @@
Project DeepSpeech
==================
Mozilla Voice STT
=================


.. image:: https://readthedocs.org/projects/deepspeech/badge/?version=latest
@@ -12,7 +12,7 @@ Project DeepSpeech
:alt: Task Status


DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
Mozilla Voice STT is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Mozilla Voice STT uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.

Documentation for installation, usage, and training models is available on `deepspeech.readthedocs.io <http://deepspeech.readthedocs.io/?badge=latest>`_.

14 changes: 4 additions & 10 deletions doc/DeepSpeech.rst → doc/AcousticModel.rst
@@ -1,11 +1,5 @@
DeepSpeech Model
================

The aim of this project is to create a simple, open, and ubiquitous speech
recognition engine. Simple, in that the engine should not require server-class
hardware to execute. Open, in that the code and models are released under the
Mozilla Public License. Ubiquitous, in that the engine should run on many
platforms and have bindings to many different languages.
Mozilla Voice STT Acoustic Model
================================

The architecture of the engine was originally motivated by that presented in
`Deep Speech: Scaling up end-to-end speech recognition <http://arxiv.org/abs/1412.5567>`_.
@@ -77,7 +71,7 @@ with respect to all of the model parameters may be done via back-propagation
through the rest of the network. We use the Adam method for training
`[3] <http://arxiv.org/abs/1412.6980>`_.

The complete RNN model is illustrated in the figure below.
The complete LSTM model is illustrated in the figure below.

.. image:: ../images/rnn_fig-624x598.png
:alt: DeepSpeech BRNN
:alt: Mozilla Voice STT LSTM
90 changes: 45 additions & 45 deletions doc/BUILDING.rst
@@ -1,12 +1,12 @@
.. _build-native-client:

Building DeepSpeech Binaries
============================
Building Mozilla Voice STT Binaries
===================================

This section describes how to rebuild the binaries. We already provide prebuilt binaries for all supported platforms;
it is highly advised to use them unless you know what you are doing.

If you'd like to build the DeepSpeech binaries yourself, you'll need the following pre-requisites downloaded and installed:
If you'd like to build the Mozilla Voice STT binaries yourself, you'll need the following pre-requisites downloaded and installed:

* `Bazel 2.0.0 <https://github.com/bazelbuild/bazel/releases/tag/2.0.0>`_
* `General TensorFlow r2.2 requirements <https://www.tensorflow.org/install/source#tested_build_configurations>`_
@@ -26,14 +26,14 @@ If you'd like to build the language bindings or the decoder package, you'll also
Dependencies
------------

If you follow these instructions, you should compile your own binaries of DeepSpeech (built on TensorFlow using Bazel).
If you follow these instructions, you should compile your own binaries of Mozilla Voice STT (built on TensorFlow using Bazel).

For more information on configuring TensorFlow, read the docs up to the end of `"Configure the Build" <https://www.tensorflow.org/install/source#configure_the_build>`_.

Checkout source code
^^^^^^^^^^^^^^^^^^^^

Clone DeepSpeech source code (TensorFlow will come as a submdule):
Clone Mozilla Voice STT source code (TensorFlow will come as a submodule):

.. code-block::
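The clone step can be sketched as follows; the repository URL and the ``tensorflow`` submodule name are assumptions based on the project layout referenced elsewhere in these docs:

```shell
# Assumption: standard GitHub repository URL for the project.
git clone https://github.com/mozilla/DeepSpeech.git
cd DeepSpeech
# TensorFlow is vendored as a git submodule; fetch it explicitly.
git submodule sync
git submodule update --init tensorflow
```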
@@ -56,24 +56,24 @@ After you have installed the correct version of Bazel, configure TensorFlow:
cd tensorflow
./configure
Compile DeepSpeech
------------------
Compile Mozilla Voice STT
-------------------------

Compile ``libdeepspeech.so``
Compile ``libmozilla_voice_stt.so``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Within your TensorFlow directory, there should be a symbolic link to the DeepSpeech ``native_client`` directory. If it is not present, create it with the follow command:
Within your TensorFlow directory, there should be a symbolic link to the Mozilla Voice STT ``native_client`` directory. If it is not present, create it with the following command:

.. code-block::
cd tensorflow
ln -s ../native_client
You can now use Bazel to build the main DeepSpeech library, ``libdeepspeech.so``. Add ``--config=cuda`` if you want a CUDA build.
You can now use Bazel to build the main Mozilla Voice STT library, ``libmozilla_voice_stt.so``. Add ``--config=cuda`` if you want a CUDA build.

.. code-block::
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libdeepspeech.so
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libmozilla_voice_stt.so
The generated binaries will be saved to ``bazel-bin/native_client/``.

@@ -82,12 +82,12 @@ The generated binaries will be saved to ``bazel-bin/native_client/``.
Compile ``generate_scorer_package``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Following the same setup as for ``libdeepspeech.so`` above, you can rebuild the ``generate_scorer_package`` binary by adding its target to the command line: ``//native_client:generate_scorer_package``.
Following the same setup as for ``libmozilla_voice_stt.so`` above, you can rebuild the ``generate_scorer_package`` binary by adding its target to the command line: ``//native_client:generate_scorer_package``.
Using the example from above you can build the library and that binary at the same time:

.. code-block::
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:generate_scorer_package
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libmozilla_voice_stt.so //native_client:generate_scorer_package
The generated binaries will be saved to ``bazel-bin/native_client/``.

@@ -99,7 +99,7 @@ Now, ``cd`` into the ``DeepSpeech/native_client`` directory and use the ``Makefile``:
.. code-block::
cd ../DeepSpeech/native_client
make deepspeech
make mozilla_voice_stt
Installing your own Binaries
----------------------------
@@ -121,9 +121,9 @@ Included are a set of generated Python bindings. After following the above build
cd native_client/python
make bindings
pip install dist/deepspeech*
pip install dist/mozilla_voice_stt*
The API mirrors the C++ API and is demonstrated in `client.py <python/client.py>`_. Refer to `deepspeech.h <deepspeech.h>`_ for documentation.
The API mirrors the C++ API and is demonstrated in `client.py <python/client.py>`_. Refer to the :ref:`C API <c-usage>` for documentation.

Install NodeJS / ElectronJS bindings
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -136,7 +136,7 @@ After following the above build and installation instructions, the Node.JS bindings
make build
make npm-pack
This will create the package ``deepspeech-VERSION.tgz`` in ``native_client/javascript``.
This will create the package ``mozilla_voice_stt-VERSION.tgz`` in ``native_client/javascript``.
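As a sketch, the resulting tarball can then be installed into a consuming Node.js project straight from disk; the path and ``VERSION`` below are placeholders, not exact build output names:

```shell
# Install the locally built package; adjust VERSION to match the actual file.
npm install native_client/javascript/mozilla_voice_stt-VERSION.tgz
```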

Install the CTC decoder package
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -165,23 +165,23 @@ So your command line for ``RPi3`` and ``ARMv7`` should look like:

.. code-block::
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=rpi3 --config=rpi3_opt -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libdeepspeech.so
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=rpi3 --config=rpi3_opt -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libmozilla_voice_stt.so
And your command line for ``LePotato`` and ``ARM64`` should look like:

.. code-block::
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=rpi3-armv8 --config=rpi3-armv8_opt -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libdeepspeech.so
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=rpi3-armv8 --config=rpi3-armv8_opt -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libmozilla_voice_stt.so
While we test only on RPi3 Raspbian Buster and LePotato Armbian Buster, anything compatible with ``armv7-a cortex-a53`` or ``armv8-a cortex-a53`` should be fine.

The ``deepspeech`` binary can also be cross-built, with ``TARGET=rpi3`` or ``TARGET=rpi3-armv8``. This might require you to setup a system tree using the tool ``multistrap`` and the multitrap configuration files: ``native_client/multistrap_armbian64_buster.conf`` and ``native_client/multistrap_raspbian_buster.conf``.
The ``mozilla_voice_stt`` binary can also be cross-built, with ``TARGET=rpi3`` or ``TARGET=rpi3-armv8``. This might require you to set up a system tree using the ``multistrap`` tool and the multistrap configuration files: ``native_client/multistrap_armbian64_buster.conf`` and ``native_client/multistrap_raspbian_buster.conf``.
The path of the system tree can be overridden from the default values defined in ``definitions.mk`` through the ``RASPBIAN`` ``make`` variable.

.. code-block::
cd ../DeepSpeech/native_client
make TARGET=<system> deepspeech
make TARGET=<system> mozilla_voice_stt
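For illustration, overriding the ``RASPBIAN`` system-tree path described above might look like this; the system-tree location is a hypothetical example, not a default:

```shell
# Assumes a Raspbian system tree previously created with multistrap.
cd ../DeepSpeech/native_client
make TARGET=rpi3 RASPBIAN=/opt/raspbian-buster-systree mozilla_voice_stt
```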
Android devices support
-----------------------
Expand All @@ -193,53 +193,53 @@ Please refer to TensorFlow documentation on how to setup the environment to buil
Using the library from Android project
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We provide uptodate and tested ``libdeepspeech`` usable as an ``AAR`` package,
We provide an up-to-date and tested STT library usable as an ``AAR`` package
for Android versions 7.0 through 11.0. The package is published on
`JCenter <https://bintray.com/alissy/org.mozilla.deepspeech/libdeepspeech>`_,
`JCenter <https://bintray.com/alissy/org.mozilla.voice/stt>`_,
and the ``JCenter`` repository should be available by default in any Android
project. Please make sure your project is set up to pull from this repository.
You can then include the library by adding this line to your
``build.gradle``, adjusting ``VERSION`` to the version you need:

.. code-block::
implementation 'deepspeech.mozilla.org:libdeepspeech:VERSION@aar'
implementation 'voice.mozilla.org:stt:VERSION@aar'
Building ``libdeepspeech.so``
Building ``libmozilla_voice_stt.so``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can build the ``libdeepspeech.so`` using (ARMv7):
You can build the ``libmozilla_voice_stt.so`` using (ARMv7):

.. code-block::
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 //native_client:libdeepspeech.so
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 //native_client:libmozilla_voice_stt.so
Or (ARM64):

.. code-block::
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm64 --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 //native_client:libdeepspeech.so
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm64 --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 //native_client:libmozilla_voice_stt.so
Building ``libdeepspeech.aar``
Building ``libmozillavoicestt.aar``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the unlikely event you have to rebuild the JNI bindings, source code is
available under the ``libdeepspeech`` subdirectory. Building depends on shared
object: please ensure to place ``libdeepspeech.so`` into the
``libdeepspeech/libs/{arm64-v8a,armeabi-v7a,x86_64}/`` matching subdirectories.
available under the ``libmozillavoicestt`` subdirectory. Building depends on the shared
object: please make sure to place ``libmozilla_voice_stt.so`` into the matching
``libmozillavoicestt/libs/{arm64-v8a,armeabi-v7a,x86_64}/`` subdirectories.

Building the bindings is managed by ``gradle`` and should be limited to issuing
``./gradlew libdeepspeech:build``, producing an ``AAR`` package in
``./libdeepspeech/build/outputs/aar/``.
``./gradlew libmozillavoicestt:build``, producing an ``AAR`` package in
``./libmozillavoicestt/build/outputs/aar/``.
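Putting the two steps above together, staging the shared objects and building the ``AAR`` might be sketched as follows; the source path for the ``.so`` files is hypothetical:

```shell
# Stage the per-ABI shared objects where the Gradle project expects them.
for abi in arm64-v8a armeabi-v7a x86_64; do
  mkdir -p libmozillavoicestt/libs/"$abi"
  cp /path/to/built/"$abi"/libmozilla_voice_stt.so libmozillavoicestt/libs/"$abi"/
done
# Produce the AAR package under ./libmozillavoicestt/build/outputs/aar/
./gradlew libmozillavoicestt:build
```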

Please note that you might have to copy the file to a local Maven repository
and adapt the file naming (when missing, the error message should state what
filename it expects and where).

Building C++ ``deepspeech`` binary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Building C++ ``mozilla_voice_stt`` binary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Building the ``deepspeech`` binary will happen through ``ndk-build`` (ARMv7):
Building the ``mozilla_voice_stt`` binary will happen through ``ndk-build`` (ARMv7):

.. code-block::
@@ -272,32 +272,32 @@ demo of one usage of the application. For example, it is only able to read PCM
mono 16kHz 16-bit files, and it might fail on WAVE files that do not follow
the specification exactly.

Running ``deepspeech`` via adb
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Running ``mozilla_voice_stt`` via adb
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You should use ``adb push`` to send data to the device; please refer to the Android
documentation on how to use it.

Please push DeepSpeech data to ``/sdcard/deepspeech/``\ , including:
Please push Mozilla Voice STT data to ``/sdcard/mozilla_voice_stt/``\ , including:


* ``output_graph.tflite`` which is the TF Lite model
* External scorer file (available from one of our releases), if you want to use
the scorer; please be aware that too large a scorer will make the device run out
of memory

Then, push binaries from ``native_client.tar.xz`` to ``/data/local/tmp/ds``\ :
Then, push binaries from ``native_client.tar.xz`` to ``/data/local/tmp/stt``\ :

* ``deepspeech``
* ``libdeepspeech.so``
* ``mozilla_voice_stt``
* ``libmozilla_voice_stt.so``
* ``libc++_shared.so``
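Taken together, the push steps above might be sketched as follows; the local file locations are assumptions:

```shell
# Model data onto the sdcard
adb shell mkdir -p /sdcard/mozilla_voice_stt/
adb push output_graph.tflite /sdcard/mozilla_voice_stt/
# Binaries and libraries extracted from native_client.tar.xz
adb shell mkdir -p /data/local/tmp/stt/
adb push mozilla_voice_stt libmozilla_voice_stt.so libc++_shared.so /data/local/tmp/stt/
```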

You should then be able to run as usual, using a shell from ``adb shell``\ :

.. code-block::
user@device$ cd /data/local/tmp/ds/
user@device$ LD_LIBRARY_PATH=$(pwd)/ ./deepspeech [...]
user@device$ cd /data/local/tmp/stt/
user@device$ LD_LIBRARY_PATH=$(pwd)/ ./mozilla_voice_stt [...]
Please note that the Android linker does not support ``rpath``, so you have to set
``LD_LIBRARY_PATH``. Properly wrapped / packaged bindings do embed the library
