Repository containing scripts to reproduce the results (and more) presented in "A Robust Deep Learning based System for Environmental Audio Compression and Classification", Audio Engineering Society Convention 154 (May 2023). Both auto-encoder models (5 folds) are contained in the folder pretrained_autoencoders.
- Retrieve data structure from the following link: https://drive.google.com/drive/folders/15yfoJC5PvlbIR0AZv3BGIXcbGB9ib5Pa?usp=sharing
- Save the folder into the root directory (EnvCompClass)
- Run autoencoders.py to train the auto-encoder architectures; set use_SE to True to include the Squeeze-and-Excitation network. The script stores each trained architecture in the autoencoders folder and reports validation metrics (PSNR, SSIM & PESQ) per fold and overall (mean & standard deviation)
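As a rough illustration of the reported aggregation, PSNR for normalized audio can be computed and averaged across folds as below (a minimal sketch; the fold values are made up, not results from the paper):

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Peak signal-to-noise ratio in dB for audio normalized to [-1, 1]."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    peak = 1.0  # maximum absolute amplitude of normalized audio
    return float(10.0 * np.log10(peak ** 2 / mse))

# Per-fold validation scores aggregated into mean and standard deviation,
# mirroring the per-fold + total report (values below are hypothetical).
fold_psnr = [32.1, 31.7, 33.0, 32.4, 31.9]
print(f"PSNR: {np.mean(fold_psnr):.2f} +/- {np.std(fold_psnr):.2f} dB")
```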
Two Classification Tasks are tested:
- ESC-50 total Classification, involving:
- ACDNet
- SE-ACDNet
- CNN-1D
- Binary Classification (PR & NPR), involving:
- ACDNet
- On Original & Reconstructed Audio
- SE-ACDNet
- On Original & Reconstructed Audio
- CNN-1D
- On Reconstructed Audio
- On Compressed representation

Experiments on the reconstructed & compressed representations define the overall system (compression & classification) tests.
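The SE- model variants above add a squeeze-and-excitation stage to the base networks. A minimal numpy sketch of the idea (random weights stand in for learned parameters; this is not the repository's implementation):

```python
import numpy as np

def se_block(x: np.ndarray, reduction: int = 16) -> np.ndarray:
    """Squeeze-and-excitation over channels for an input of shape (batch, C, T)."""
    b, c, t = x.shape
    s = x.mean(axis=2)                          # squeeze: global average pool -> (b, C)
    rng = np.random.default_rng(0)              # random weights stand in for trained ones
    w1 = 0.1 * rng.standard_normal((c, c // reduction))
    w2 = 0.1 * rng.standard_normal((c // reduction, c))
    z = np.maximum(s @ w1, 0.0)                 # excitation, layer 1 (ReLU)
    gate = 1.0 / (1.0 + np.exp(-(z @ w2)))      # excitation, layer 2 (sigmoid) -> (b, C)
    return x * gate[:, :, None]                 # scale: reweight each channel
```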
- ACDNet
- Visit https://github.com/mohaimenz/acdnet, follow instructions for data preparation.
- In opts.py, set opt.binary = True and opt.nClasses = 2 to run the binary classification experiments. In val_generator.py, set opt.batchsize = 1600 or lower
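A sketch of the settings above (a hypothetical stand-in for opts.py, not the actual file; only the option names from the instructions are assumed):

```python
# Hypothetical stand-in for the opts.py settings used in the binary experiments.
import types

opt = types.SimpleNamespace()
opt.binary = True      # binary (PR vs NPR) task instead of full 50-class ESC-50
opt.nClasses = 2       # number of output classes for the binary task
opt.batchsize = 1600   # validation batch size; lower it if memory is tight
```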
- Run ./common/prepare_dataset.py to generate data in appropriate format
- Run ./common/val_generator.py to generate validation & testing data in appropriate format
- Run ./torch/trainer.py and follow the on-screen instructions
- For Original audio as input:
- In ./torch/trainer.py, comment out the enc model and make sure it is also removed (commented out) from the training & validation processes
- Using ACDNet and SE-ACDNet:
- Reshape the input to match (batch_size, 1, 1, 22050)
- For Reconstructed audio as input:
- Include the enc model and specify the model's path in opts.py (an example is already provided)
- Using CAE as Auto-Encoder:
- In ./torch/resources/models.py, go to get_ae and remove the lines referring to squeeze-and-excitation networks (in class Autoencoders and in function get_ae)
- For Compressed audio as input:
- In ./torch/resources/models.py, specify the bottleneck output as the auto-encoder's output. Don't forget to include the auto-encoder in the trainer's __validate function and, if using (SE-)ACDNet, reshape to match the models' input
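The input shapes in the two cases can be sketched as follows (numpy arrays stand in for the actual tensors, and the 8x bottleneck reduction is an assumption for illustration only):

```python
import numpy as np

# Hypothetical batch of 4 one-second clips at 22.05 kHz.
batch = np.random.default_rng(0).standard_normal((4, 22050)).astype(np.float32)

# Reconstructed-audio path: the auto-encoder output keeps the input length,
# so it is reshaped to (batch, 1, 1, 22050) before entering (SE-)ACDNet.
acdnet_input = batch.reshape(4, 1, 1, 22050)

# Compressed path: the bottleneck is shorter (an assumed 8x reduction here,
# standing in for the real encoder output) and gets the same 4-D reshape.
bottleneck = batch[:, ::8]
acdnet_bottleneck_input = bottleneck.reshape(4, 1, 1, -1)
```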
- Run ./torch/tester.py and follow the on-screen instructions to test the models' performance
This repository is currently under construction; updates will follow soon.