Implement missing speed functions along with durable speech rate / speed changer function. #4115
Removes a (GPL) dependency
refactor(dataset): get audio length with torchaudio
refactor(bin.find_unique_chars): use existing function
Reverts c59f0ca (coqui-ai#13) Too many CI test timeouts from installing torch/nvidia packages with uv: astral-sh/uv#1912
Update repository links, package names, release script
Can be handled by adjusting logging levels instead.
Update links and Github actions
The XTTS model itself already supports Hindi, it was just in these components.
Use Python logging instead of print()
feat(xtts): support Hindi for sentence-splitting and fine-tuning
Add tokenizer logging, update version for release 0.23.0
…ids file Previously, running `LanguageManager.init_from_config(config)` would never use the `language_ids_file` if that field is present because it was overwritten in the next line with a new manager that manually parses languages from the datasets in the config. Now that is only used as a fallback.
fix(xtts): clearer error message when file given to checkpoint_dir
Expand Python API capabilities
Improve documentation
feat: allow both Path and strings where possible and add type hints
This way the outputs are available for further downstream processing, e.g. with grep. For TTS/bin/synthesize.py, if --pipe_out is set, log to stderr because then only the output audio stream should be on stdout, e.g. to pipe it to aplay.
fix(bin): log to stdout in cli tools
…ai#237) * Fix num2words call using non-standard lang code * build: update minimum num2words version --------- Co-authored-by: Enno Hermann <[email protected]>
…e durable latents. also missed tts speed implementations added.
Since I learned that this repo is no longer used or maintained, I am closing this PR here and reopening it against idiap's repo. It continues there: idiap#239
Added missing speed parameters to functions and ensured more durable, accurate speed adjustments with the new `adjust_speech_rate` function.
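To illustrate the general idea of a speed adjustment, here is a minimal sketch. It is not the code from this PR. It assumes the speed change is applied by resampling a sequence of frames (for example, latent vectors) along the time axis with linear interpolation. The name `adjust_speech_rate_sketch`, its signature, and the list-of-lists frame layout are all hypothetical; the actual `adjust_speech_rate` may work differently.

```python
from typing import List, Sequence


def adjust_speech_rate_sketch(
    frames: Sequence[Sequence[float]], speed: float
) -> List[List[float]]:
    """Hypothetical sketch: resample a (time, dim) sequence by linear interpolation.

    speed > 1.0 shortens the sequence (faster speech);
    speed < 1.0 lengthens it (slower speech).
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    n = len(frames)
    if n == 0:
        return []
    # Assumption: the new length is the old length divided by the speed factor.
    new_n = max(1, round(n / speed))
    if new_n == 1 or n == 1:
        # Nothing to interpolate between; repeat the first frame.
        return [list(frames[0]) for _ in range(new_n)]
    out = []
    for i in range(new_n):
        # Map output index i onto the input time axis with endpoints aligned.
        pos = i * (n - 1) / (new_n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


frames = [[0.0], [1.0], [2.0], [3.0]]
faster = adjust_speech_rate_sketch(frames, speed=2.0)  # half as many frames
slower = adjust_speech_rate_sketch(frames, speed=0.5)  # twice as many frames
```

Aligning the first and last frames keeps the start and end of the utterance unchanged while only the number of intermediate frames varies with `speed`.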