Update on getting styletts2 piper-tts and xtts to all work in one install #33
Comments
@DrewThomasson |
Here's what I've found while looking for ways to get Calibre's ebook-convert function and ffmpeg built into the pip install.
For Calibre: I was also looking at getting Calibre to work with a pip install instead and found https://github.com/gutenbergtools/ebookconverter, but it doesn't work on Windows :( I think we might be able to find the ebook-convert exe binary in a Calibre install on Windows and use that instead there. Info on that ebook-convert for Windows.
For FFmpeg: there are also these potential options for a static ffmpeg binary to include in it 👀 |
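A minimal sketch of the "find the ebook-convert binary in an existing Calibre install" idea, not code from this repo. The function name `find_ebook_convert` and the Windows folders listed are assumptions (the usual Calibre defaults, which may differ on a given machine). It checks PATH first, then falls back to candidate install folders.

```python
import os
import shutil
from typing import List, Optional

# Assumption: typical Calibre install folders on Windows; adjust as needed.
WINDOWS_CALIBRE_DIRS = [
    r"C:\Program Files\Calibre2",
    r"C:\Program Files (x86)\Calibre2",
]


def find_ebook_convert(extra_dirs: Optional[List[str]] = None) -> Optional[str]:
    """Return a path to ebook-convert, or None if it cannot be found."""
    # 1) Anything already on PATH wins (Linux, macOS and Windows alike).
    found = shutil.which("ebook-convert")
    if found:
        return found
    # 2) Otherwise look inside the candidate install folders.
    candidates = list(extra_dirs or [])
    if os.name == "nt":
        candidates += WINDOWS_CALIBRE_DIRS
    for folder in candidates:
        for name in ("ebook-convert.exe", "ebook-convert"):
            path = os.path.join(folder, name)
            if os.path.isfile(path):
                return path
    return None
```

If this returns `None`, the caller can tell the user to install Calibre instead of failing later in the middle of a conversion.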
I already explored what you just found, and I've found the one solution, which I'm working on. Don't worry about Calibre and FFmpeg, I found a way that doesn't break native use. FYI if you use ChatGPT or other A.I. to help you code, be aware that copy/paste can sometimes generate a big mess in the end :o) btw faster-whisper and whisperX are good engines for now. |
oh dang! kk👍 lol yeah I honestly Never expected this to blow up so my code was a pretty rushed job using chatgpt to cut corners ngl |
ooops btw faster-whisper and whisperX are more STT than TTS :o\ |
Oh faster-whisper/whisperX the.... fine tuning xtts script? |
lol yeah was confused when you mentioned it |
with Python, expect things to blow up at any time with a little glitch in the matrix :D |
piper styleTTS2 are nice indeed, great community, active repo. |
kk👍 lol exactly already in my upcoming plans :) ----> |
bark is also a nice funny engine |
True. It's supposed to be built into coqui tts, but I run into issues trying to run it through their API? I'll look further into it because I do quite like the model. What's unique to it is that it not only clones the voice but also changes the speaking style. So for instance you might have "Once upon a time", and if you're using a voice where the sample uses the word "like" a lot, it might come out as "So like once upon like a time". Very cool |
I'll try it with the new updated repo??? 👀 https://github.com/idiap/coqui-ai-TTS IDK HOW I JUST KNOW ABOUT THIS FINALLY A FORK WHERE UPDATES ARE BEING APPLIED |
I'll update you when I get a result from it lol |
didn't know too! |
Looks like it works on my end! At the moment I've gotten the random speaker thing to work with this from the docs:
from TTS.api import TTS
from TTS.tts.configs.bark_config import BarkConfig
from TTS.tts.models.bark import Bark

text = "Hello, my name is Manmay , how are you?"

config = BarkConfig()
model = Bark.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="path/to/model/dir/", eval=True)

# with random speaker
output_dict = model.synthesize(text, config, speaker_id="random", voice_dirs=None)

# cloning a speaker.
# It assumes that you have a speaker file in `bark_voices/speaker_n/speaker.wav` or `bark_voices/speaker_n/speaker.npz`
output_dict = model.synthesize(text, config, speaker_id="ljspeech", voice_dirs="bark_voices/")

# The two lines below are the ones with my own local path filled in
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=False)
model.load_checkpoint(config, checkpoint_dir="/Users/drew/Library/Application Support/tts/tts_models--multilingual--multi-dataset--bark", eval=True)
Here is the test output file. These tests were run on my M1 Pro 16 GB Mac laptop in a Python 3.10 env lol |
lol I once got this working in a beta version of VoxNovel in Google Colab, so I know I'll be able to get this working for this later. Adding it to the list tho lol |
ooo also gonna add to the plans a way to use deepfilternet2 to denoise any reference input audio files. Gradio space I made as a demo of it lol
|
excellent Drew! denoiser is amazing too! |
@ROBERT-MCDOWELL and I had to change the install instructions I had to specify the also I added the here you can see here
pip install coqui-tts==0.24.2 pydub nltk beautifulsoup4 ebooklib tqdm gradio==4.44.0
python -m nltk.downloader punkt
python -m nltk.downloader punkt_tab |
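Since the install command pins two versions, a quick check like the one below can confirm that an environment actually matches them. This is only a sketch and not part of the project: `PINS` and `check_pins` are hypothetical names. It relies only on the standard library's `importlib.metadata`.

```python
from importlib import metadata

# Pins copied from the pip install command in this thread.
PINS = {"coqui-tts": "0.24.2", "gradio": "4.44.0"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (wanted, installed_or_None)} for every mismatch."""
    problems = {}
    for package, wanted in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            problems[package] = (wanted, installed)
    return problems


if __name__ == "__main__":
    for package, (wanted, installed) in check_pins(PINS).items():
        print(f"{package}: want {wanted}, found {installed}")
```

An empty result means both pinned packages are installed at the expected versions.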
Hope that helps you out with your pip creation |
Any updates? |
it's coming :)... maybe tonight or tomorrow, first clean. other things later... |
@DrewThomasson |
@ROBERT-MCDOWELL
|
Interesting... I'm seeing docker auto-install for all of them? Is that needed for all of them to run? or just a preinstall for anyone to use the docker version? |
DockerfileUtils covers calibre and ffmpeg for now, and can easily be extended to further apps if needed just by adding the app to the list. This avoids many, many issues with version conflicts between Python and the OS. The install script installs python_env 3.11 into the repo folder, so we don't have to waste our time on all these conflicts. btw, this PR keeps backward compatibility with your current version, meaning a user who wants to run app.py directly can still do so, as long as their OS Python is ready to run it. Yes, please check on Mac and Windows, as I don't have a VM or a native install of either, and frankly I've already spent a lot of time on Linux so ;0) "Is that needed for all of them to run? or just a preinstall for anyone to use the docker version?" Anyhow, I'm creating a PR and it's up to you to merge or not... |
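To illustrate the "just add the app to the list" idea, here is a hedged sketch of a list-driven check for external programs. DockerfileUtils itself is not shown in this thread, so this is not its actual code. `REQUIRED_APPS` and `missing_apps` are hypothetical names, and the check only looks for executables on PATH.

```python
import shutil

# Hypothetical list: app name -> executable to look for on PATH.
# Supporting another app would mean adding one more entry here.
REQUIRED_APPS = {
    "calibre": "ebook-convert",
    "ffmpeg": "ffmpeg",
}


def missing_apps(apps=REQUIRED_APPS, which=shutil.which):
    """Return the names of required apps whose executable is not on PATH."""
    return [name for name, exe in apps.items() if which(exe) is None]


if __name__ == "__main__":
    missing = missing_apps()
    if missing:
        print("Missing external programs:", ", ".join(missing))
    else:
        print("All external programs found.")
```

Keeping the list as plain data means the check and any install step can share it.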
Weird, I don't see any pull requests on my end? I guess I'll push it manually then, once I've tested it on my end and finished reviewing lol |
I'm working on it :) |
Oh the samples folder is just examples of what xtts sounds like speaking those different languages. It's not needed lol I was gonna use the folder of sample texts within an automated test run script too. |
ha ok so I'm going to check each of them and move them into the right voices folder... |
Kk They all used the same voice sample to generate all the sample audio files anyway |
who "they"? |
The samples folder lol Anyway reviewing your PR rn
|
Confirmed Works in ARM MAC! 😄 |
EXCELLENT!!! :D you can keep the root_dir check fix, it doesn't bother. |
@ROBERT-MCDOWELL
I don't know if you'll find this helpful or not, but I've managed to get Coqui tts, piper-tts and styletts2 to all work from one requirements file
Google Colab of it working in
Testing_all_tts_services.ipynb.zip
Huggingface space showing them all working together
https://huggingface.co/spaces/drewThomasson/testing-all-tts-models-together