Merge pull request tensorflow#5722 from SNeugber/master

Switching to more robust pysoundfile for reading wav files
hotbbsun · Nov 20, 2018 · d299118
2 parents 5b8e8cd + de51e74

Showing 2 changed files with 7 additions and 5 deletions.
diff --git a/research/audioset/README.md b/research/audioset/README.md
@@ -49,14 +49,15 @@ VGGish depends on the following Python packages:
 * [`resampy`](http://resampy.readthedocs.io/en/latest/)
 * [`tensorflow`](http://www.tensorflow.org/)
 * [`six`](https://pythonhosted.org/six/)
+* [`pysoundfile`](https://pysoundfile.readthedocs.io/)
 
 These are all easily installable via, e.g., `pip install numpy` (as in the
 example command sequence below).
 
 Any reasonably recent version of these packages should work. TensorFlow should
-be at least version 1.0.  We have tested with Python 2.7.6 and 3.4.3 on an
-Ubuntu-like system with NumPy v1.13.1, SciPy v0.19.1, resampy v0.1.5, TensorFlow
-v1.2.1, and Six v1.10.0.
+be at least version 1.0.  We have tested that everything works on Ubuntu and
+Windows 10 with Python 3.6.6, Numpy v1.15.4, SciPy v1.1.0, resampy v0.2.1,
+TensorFlow v1.3.0, Six v1.11.0 and PySoundFile 0.9.0.
 
 VGGish also requires downloading two data files:
 

diff --git a/research/audioset/vggish_input.py b/research/audioset/vggish_input.py
@@ -17,11 +17,12 @@
 
 import numpy as np
 import resampy
-from scipy.io import wavfile
 
 import mel_features
 import vggish_params
 
+import soundfile as sf
+
 
 def waveform_to_examples(data, sample_rate):
   """Converts audio waveform into an array of examples for VGGish.
@@ -80,7 +81,7 @@ def wavfile_to_examples(wav_file):
   Returns:
     See waveform_to_examples.
   """
-  sr, wav_data = wavfile.read(wav_file)
+  wav_data, sr = sf.read(wav_file, dtype='int16')
   assert wav_data.dtype == np.int16, 'Bad sample type: %r' % wav_data.dtype
   samples = wav_data / 32768.0  # Convert to [-1.0, +1.0]
   return waveform_to_examples(samples, sr)
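
The read-and-scale step changed in the last hunk can be sketched on its own, as a rough illustration rather than the code in `vggish_input.py`. The helper names `int16_to_float` and `read_wav_as_float` are hypothetical and do not appear in the commit. The sketch assumes NumPy is installed, and `soundfile` is imported lazily so that the scaling helper can be tried without it. Note that `sf.read` returns `(data, samplerate)`, the reverse of the `(rate, data)` order returned by `scipy.io.wavfile.read`, and that passing `dtype='int16'` asks `soundfile` to deliver 16-bit samples, which is what the existing `assert` expects.

```python
import numpy as np


def int16_to_float(wav_data):
    """Scale int16 PCM samples to floats (hypothetical helper, not in the commit)."""
    wav_data = np.asarray(wav_data)
    if wav_data.dtype != np.int16:
        raise TypeError('Bad sample type: %r' % wav_data.dtype)
    # int16 spans -32768..32767, so the result lies in [-1.0, 1.0):
    # -1.0 is reachable, +1.0 is not.
    return wav_data / 32768.0


def read_wav_as_float(wav_file):
    """Read a WAV file and return (samples, sample_rate) (hypothetical helper)."""
    # Imported here so int16_to_float stays usable without soundfile installed.
    import soundfile as sf

    # soundfile returns (data, samplerate); scipy's wavfile.read returns
    # (rate, data), so the tuple order is swapped relative to the old code.
    wav_data, sr = sf.read(wav_file, dtype='int16')
    return int16_to_float(wav_data), sr


if __name__ == '__main__':
    demo = np.array([-32768, 0, 16384, 32767], dtype=np.int16)
    print(int16_to_float(demo))
```

For a mono file `sf.read` returns a 1-D array, and for a multi-channel file it returns a 2-D array of shape `(frames, channels)`. The division applies elementwise in both cases.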