This repository contains the vocoder for our model, HPMDubbing. It converts the mel-spectrograms generated by the model into time-domain waveforms.
We provide pretrained models. The generator checkpoints (e.g., g_05000000) can be downloaded from the folders listed below.
Folder Name | Sampling Rate | Hop Length | Segment Size | Win Length | Params. | Dataset | Fine-Tuned |
---|---|---|---|---|---|---|---|
HPM_Chem | 16000 Hz | 160 | 8000 | 640 | 55M | LibriTTS | No |
HPM_V2C | 22050 Hz | 220 | 9900 | 880 | 58M | LibriTTS | No |
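
The relationship between the columns above can be checked with a little arithmetic. The following is a minimal sketch, not part of this repository: the dictionary keys and the helper `frame_stats` are illustrative names chosen here, and only the numeric values come from the table.

```python
# Hyperparameters copied from the table above.
# The key names are assumptions made for this sketch.
CONFIGS = {
    "HPM_Chem": {"sampling_rate": 16000, "hop_length": 160,
                 "segment_size": 8000, "win_length": 640},
    "HPM_V2C": {"sampling_rate": 22050, "hop_length": 220,
                "segment_size": 9900, "win_length": 880},
}


def frame_stats(cfg):
    """Derive timing quantities from one row of the table (hypothetical helper)."""
    sr = cfg["sampling_rate"]
    hop = cfg["hop_length"]
    return {
        # Time between consecutive mel frames, in milliseconds.
        "hop_ms": 1000.0 * hop / sr,
        # Duration of one analysis window, in milliseconds.
        "win_ms": 1000.0 * cfg["win_length"] / sr,
        # Number of mel frames covered by one training segment.
        "frames_per_segment": cfg["segment_size"] // hop,
        # Mel frames per second of audio.
        "frames_per_second": sr / hop,
    }


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        stats = frame_stats(cfg)
        print(name, {k: round(v, 2) for k, v in stats.items()})
```

Both configurations use a window of four hops and a hop of roughly 10 ms, so a mel-spectrogram fed to either generator should be extracted with the matching sampling rate and hop length.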
- To train the vocoder, please run `python train_V2C_HiFiGAN.py --config config_V2C_22050Hz.json` (22050 Hz) or `python train_hifigan_16KHz.py --config config_Chem_16KHz.json` (16 kHz).
- `inference.py` (wav -> mel -> wav): `python inference.py --checkpoint_file [Your path of checkpoint_file]`
- `inference_e2e.py` (mel -> wav): `python inference_e2e.py --checkpoint_file [Your path of checkpoint_file]`
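
Both scripts take the path of a generator checkpoint such as g_05000000. As a convenience, the newest checkpoint in a folder can be located programmatically. This is a minimal sketch under the assumption that generator checkpoints are named `g_` followed by a step count, as in the example above; the helper `latest_generator_checkpoint` is hypothetical and is not provided by this repository.

```python
import re
from pathlib import Path
from typing import Optional


def latest_generator_checkpoint(folder: str) -> Optional[Path]:
    """Return the generator checkpoint with the highest step count, or None.

    Assumes files are named like g_05000000 (prefix "g_" plus digits).
    """
    pattern = re.compile(r"^g_(\d+)$")
    candidates = []
    for path in Path(folder).iterdir():
        match = pattern.match(path.name)
        if match and path.is_file():
            # Compare by integer step so that zero padding does not matter.
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

The returned path can then be passed to `--checkpoint_file` when calling either inference script.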
- To monitor training with TensorBoard, please run `tensorboard --logdir HifiGAN_16/logs/ --port=[Your port]` or `tensorboard --logdir My_vocoder_V2C/logs/ --port=[Your port]`.