Same as the original marl/openl3 repo, but with separate audio preprocessing and embedding functions to allow multi-threading, so that the CPU can handle preprocessing while the GPU runs model inference at the same time.
OpenL3 is an open-source Python library for computing deep audio and image embeddings.
Please refer to the documentation for detailed instructions and examples.
UPDATE: OpenL3 now has Tensorflow 2 support!
NOTE: Whoops! A bug was reported in the training code, with the effect that positive audio-image pairs that come from the same video do not necessarily overlap in time. Nonetheless, the embedding still seems to capture useful semantic information.
The audio and image embedding models provided here are published as part of [1], and are based on the Look, Listen and Learn approach [2]. For details about the embedding models and how they were trained, please see:
Look, Listen and Learn More: Design Choices for Deep Audio Embeddings
Aurora Cramer, Ho-Hsiang Wu, Justin Salamon, and Juan Pablo Bello.
IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pages 3852–3856, Brighton, UK, May 2019.
OpenL3 depends on the pysoundfile module to load audio files, which depends on the non-Python library libsndfile. On Windows and macOS, these will be installed via pip and you can therefore skip this step. However, on Linux this must be installed manually via your platform's package manager.
For Debian-based distributions (such as Ubuntu), this can be done by simply running
apt-get install libsndfile1
Alternatively, if you are using conda, you can install libsndfile simply by running
conda install -c conda-forge libsndfile
For more detailed information, please consult the pysoundfile installation documentation.
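As a quick sanity check after the steps above, the following minimal sketch reports whether the `soundfile` module (the import name used by pysoundfile) and a system-wide libsndfile can be located. It uses only the Python standard library; the function name `check_audio_dependencies` is illustrative and not part of OpenL3. Note that pip wheels on Windows and macOS may bundle libsndfile inside the package, in which case the system-wide lookup can report it as missing even though audio loading works.

```python
import ctypes.util
import importlib.util


def check_audio_dependencies():
    """Report whether the soundfile module and a system libsndfile can be located."""
    return {
        # True if `import soundfile` would find an installed module.
        "soundfile": importlib.util.find_spec("soundfile") is not None,
        # True if the platform's library search finds a shared libsndfile.
        "libsndfile": ctypes.util.find_library("sndfile") is not None,
    }


if __name__ == "__main__":
    for name, found in check_audio_dependencies().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If `libsndfile` is reported as missing on Linux, installing it through the package manager or conda as described above is the likely fix.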
Starting with openl3>=0.4.0, OpenL3 has been upgraded to use Tensorflow 2. Because Tensorflow 2 and higher now includes GPU support, tensorflow>=2.0.0 is included as a dependency and no longer needs to be installed separately.
If you are interested in using Tensorflow 1.x, please install using pip install 'openl3<=0.3.1'.
Because Tensorflow 1.x comes in CPU-only and GPU variants, we leave it up to the user to install the version that best fits their use case.
On most platforms, either of the following commands should properly install Tensorflow:
pip install "tensorflow<1.14" # CPU-only version
pip install "tensorflow-gpu<1.14" # GPU version
For more detailed information, please consult the Tensorflow installation documentation.
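The version rule described above (Tensorflow 2 from openl3 0.4.0 onward, Tensorflow 1.x for 0.3.1 and earlier) can be written down as a small helper. This is only an illustrative sketch: `expected_tf_major` is a hypothetical name, not an OpenL3 function, and it looks only at the leading `major.minor.patch` digits of the version string.

```python
import re


def expected_tf_major(openl3_version: str) -> int:
    """Return the Tensorflow major version that a given openl3 release targets."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", openl3_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {openl3_version!r}")
    release = tuple(int(part) for part in match.groups())
    # openl3 >= 0.4.0 depends on tensorflow >= 2.0.0; older releases use 1.x.
    return 2 if release >= (0, 4, 0) else 1


if __name__ == "__main__":
    for version in ("0.3.1", "0.4.0"):
        print(f"openl3 {version} -> Tensorflow {expected_tf_major(version)}.x")
```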
The simplest way to install OpenL3 is by using pip, which will also install the additional required dependencies if needed. To install OpenL3 using pip, simply run
pip install openl3
To install the latest version of OpenL3 from source:
- Clone or pull the latest version, only retrieving the main branch to avoid downloading the branch where we store the model weight files (these will be properly downloaded during installation).
git clone git@github.com:marl/openl3.git --branch main --single-branch
- Install using pip to handle Python dependencies. The installation also downloads model files, which requires a stable network connection.
cd openl3
pip install -e .
To help you get started with OpenL3 please see the tutorial.
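The motivation for this fork, overlapping CPU preprocessing with GPU inference, follows a standard producer/consumer pattern. The sketch below illustrates that pattern with the standard library only. `preprocess` and `embed` are hypothetical stand-ins for the separated preprocessing and embedding functions, not this repository's actual API, and `max_pending` is an assumed tuning parameter.

```python
import queue
import threading


def preprocess(clip):
    """Hypothetical stand-in for CPU-side audio preprocessing."""
    return [sample * 0.5 for sample in clip]


def embed(batch):
    """Hypothetical stand-in for GPU-side model inference."""
    return sum(batch)


def run_pipeline(clips, max_pending=4):
    """Preprocess clips in a background thread while the caller runs inference."""
    pending = queue.Queue(maxsize=max_pending)  # bounded to cap memory use
    sentinel = object()  # marks the end of the stream
    results = []

    def producer():
        try:
            for clip in clips:
                pending.put(preprocess(clip))
        finally:
            # Always signal completion so the consumer cannot block forever.
            pending.put(sentinel)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    # Consume preprocessed items in order while the producer keeps working.
    while True:
        item = pending.get()
        if item is sentinel:
            break
        results.append(embed(item))

    worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([[1.0, 2.0], [4.0]]))
```

A single producer and a FIFO queue keep results in input order. Threads give real overlap only where the heavy work releases Python's global interpreter lock, which numerical libraries typically do during their native computations.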
Please cite the following papers when using OpenL3 in your work:
[1] Look, Listen and Learn More: Design Choices for Deep Audio Embeddings
Aurora Cramer, Ho-Hsiang Wu, Justin Salamon, and Juan Pablo Bello.
IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pages 3852–3856, Brighton, UK, May 2019.
[2] Look, Listen and Learn
Relja Arandjelović and Andrew Zisserman.
IEEE International Conference on Computer Vision (ICCV), Venice, Italy, Oct. 2017.
The model weights are made available under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.