Add Czech language #123

Open
SkaceKamen opened this issue Dec 27, 2024 · 11 comments
Labels
feature request (feature requests for making ebook2audiobookxtts better), fixed in next update (pending)

Comments

@SkaceKamen

SkaceKamen commented Dec 27, 2024

Description

Due to a bug in the TTS package (coqui-ai/TTS#4098, idiap/coqui-ai-TTS#236), it's not possible to use the Czech language right now. I created an issue in the package repo, but found out it's no longer being maintained. I've made an issue and PR in the TTS library; once that's merged, this issue will be resolved.

The issue is caused by a fairly recent release of the num2words package, which TTS uses. See the PR here:
savoirfairelinux/num2words#587

So downgrading that package would help, but I'm not sure whether that's possible or even desirable. The version that breaks the Czech language is https://github.com/savoirfairelinux/num2words/releases/tag/v0.5.14

Steps to replicate

Try to use Czech language, get a crash.

Stacktrace

File "/usr/local/lib/python3.10/site-packages/TTS/api.py", line 366, in tts_to_file
    wav = self.tts(
  File "/usr/local/lib/python3.10/site-packages/TTS/api.py", line 312, in tts
    wav = self.synthesizer.tts(
  File "/usr/local/lib/python3.10/site-packages/TTS/utils/synthesizer.py", line 406, in tts
    outputs = self.tts_model.synthesize(
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 410, in synthesize
    return self.full_inference(text, speaker_wav, language, **settings)
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 479, in full_inference
    return self.inference(
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 525, in inference
    text_tokens = torch.IntTensor(self.tokenizer.encode(sent, lang=language)).unsqueeze(0).to(self.device)
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 666, in encode
    txt = self.preprocess_text(txt, lang)
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 652, in preprocess_text
    txt = multilingual_cleaners(txt, lang)
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 573, in multilingual_cleaners
    text = expand_numbers_multilingual(text, lang)
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 562, in expand_numbers_multilingual
    text = re.sub(_number_re, lambda m: _expand_number(m, lang), text)
  File "/usr/local/lib/python3.10/re.py", line 209, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 562, in <lambda>
    text = re.sub(_number_re, lambda m: _expand_number(m, lang), text)
  File "/usr/local/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 542, in _expand_number
    return num2words(int(m.group(0)), lang=lang if lang != "cs" else "cz")
  File "/usr/local/lib/python3.10/site-packages/num2words/__init__.py", line 98, in num2words
    raise NotImplementedError()
@ROBERT-MCDOWELL
Collaborator

Are you using our latest git version or v2.0.0? I see you are using your system Python. Are you running eb2ab with Docker?
Try uninstalling num2words and downgrading with pip install num2words==0.5.13. By the way, if you know where the issue comes from in v0.5.14 of num2words, you could try to create a PR on their repo.
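
Before pinning, it may help to confirm which num2words release is actually installed. The following is a minimal sketch, assuming that 0.5.14 and every later release reject the "cz" code (the thread only confirms this for 0.5.14). The helper names `parse_version`, `rejects_cz`, and `installed_num2words` are made up for illustration and are not part of any library.

```python
import itertools
from importlib.metadata import PackageNotFoundError, version

# First num2words release reported in this thread to reject lang="cz".
FIRST_BREAKING = (0, 5, 14)


def parse_version(text):
    """Turn '0.5.14' into (0, 5, 14), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def rejects_cz(installed):
    """Assumption: every release from 0.5.14 onward no longer accepts 'cz'."""
    return parse_version(installed) >= FIRST_BREAKING


def installed_num2words():
    """Return the installed num2words version string, or None if it is absent."""
    try:
        return version("num2words")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_num2words()
    if found is None:
        print("num2words is not installed")
    elif rejects_cz(found):
        print(f"num2words {found}: consider pinning num2words==0.5.13")
    else:
        print(f"num2words {found}: the 'cz' code should still be accepted")
```

Running this inside the container, rather than on the host, shows the version the TTS code actually imports.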

@SkaceKamen
Author

SkaceKamen commented Dec 27, 2024

Sorry for not specifying details. I was running the Docker version, the athomasson2/ebook2audiobookxtts:huggingface image, so I guess I was running 2.0?

This can be easily resolved by removing the lang=lang if lang != "cs" else "cz" mapping from the library itself or, as you noted, by pinning a different version.

The root issue actually isn't with num2words: they fixed their bug by replacing cz with cs, which is the correct parameter. The TTS library should be fixed, either by pinning an older version or by specifying the correct lang, but unfortunately it's not maintained anymore.

Maybe I can persuade num2words to still support cz as legacy value? AFAIK no language has that code anyway...
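
A version-tolerant alternative to patching either library is to try both codes. The sketch below assumes only what the stack trace shows: the converter is called as `converter(number, lang=...)` and raises `NotImplementedError` for a code it does not know. The names `czech_codes` and `expand_number` are hypothetical helpers, not functions from TTS or num2words.

```python
def czech_codes(lang):
    """Language codes to try, in order. Czech gets both spellings."""
    if lang in ("cs", "cz"):
        return ["cs", "cz"]  # "cs" first: it is the code newer releases accept
    return [lang]


def expand_number(number, lang, converter):
    """Call converter(number, lang=code), falling back to the next code on failure.

    `converter` is any callable with that shape; it is injected so the
    helper does not depend on a specific num2words version.
    """
    last_error = None
    for code in czech_codes(lang):
        try:
            return converter(number, lang=code)
        except NotImplementedError as err:
            last_error = err
    raise last_error
```

With the real package installed, the call would look like `expand_number(42, "cs", num2words.num2words)`. Trying "cs" first means the fallback to "cz" only triggers on older releases.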

@ROBERT-MCDOWELL
Collaborator

The right ISO codes for Czech are "cs", "ces", and "cze", so it's on coqui-tts to change it to "cs". We are working with an active fork, and you can tell them here: https://github.com/idiap/coqui-ai-TTS

@SkaceKamen
Author

Oh, thanks for that, I was looking at the wrong library. I've made an issue & PR in the maintained one:
idiap/coqui-ai-TTS#236
idiap/coqui-ai-TTS#237

@ROBERT-MCDOWELL
Collaborator

Meanwhile, if you find another TTS engine with the same or better quality than coqui-TTS, let us know so we can implement it.

ROBERT-MCDOWELL added the "feature request" label Dec 29, 2024
ROBERT-MCDOWELL changed the title from "Czech language broken" to "Add Czech language" Dec 29, 2024
@DrewThomasson
Owner

@ROBERT-MCDOWELL

"Meanwhile, if you find another TTS engine with the same or better quality than coqui-TTS, let us know so we can implement it."

It looks like espeak-ng supports the Czech language!

Details here:
https://github.com/espeak-ng/espeak-ng/blob/master/docs/languages.md

@DrewThomasson
Owner

DrewThomasson commented Jan 13, 2025

@ROBERT-MCDOWELL

Looks like coqui might actually already have a Czech VITS model built in!

That's the same kind of model that fairseq uses! :D

A demo of the audio can be heard in this video:
https://www.youtube.com/watch?v=Vnjv2L31eyQ&t=163s

It's the tts_models/cs/cv/vits model!
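
For anyone who wants to try that model, here is a minimal sketch using the coqui `TTS` Python API (`TTS(model_name=...)` and `tts_to_file`, the same entry point that appears in the stack trace above). It assumes the TTS package is installed and that the model can be downloaded on first use. The function name `synthesize_czech` and the default output path are made up for illustration.

```python
# Model name as quoted in this thread.
MODEL_NAME = "tts_models/cs/cv/vits"


def synthesize_czech(text, out_path="czech_sample.wav"):
    """Synthesize `text` with the Czech VITS model and write a WAV file."""
    # Imported lazily so this module loads even where TTS is not installed.
    from TTS.api import TTS

    tts = TTS(model_name=MODEL_NAME)  # downloads the model on first use
    tts.tts_to_file(text=text, file_path=out_path)
    return out_path
```

A call such as `synthesize_czech("Dobrý den")` should produce `czech_sample.wav` in the working directory. Whether the quality is good enough for audiobooks would still need to be checked by ear.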

@DrewThomasson
Owner

There's also a bunch of languages shown in that video that we were looking for before, like Croatian :D

Full language list from the video description:
00:00 Intro
00:55 Thanks for being part of my #youtube voice tech community
01:44 The reference phrase
02:32 Bulgarian
02:43 Czech
02:50 Danish
03:00 Estonian
03:07 Irish
03:18 English
03:29 Spanish
03:40 French
03:47 Dutch
03:57 German
04:31 Hungarian
04:39 Greek
04:48 Finnish
04:58 Croatian
05:06 Lithuanian
05:16 Latvian
05:27 Maltese
05:36 Polish
05:44 Portuguese
05:53 Romanian
06:02 Slovak
06:12 Slovenian
06:19 Swedish

@ROBERT-MCDOWELL
Collaborator

ROBERT-MCDOWELL commented Jan 13, 2025

That's good news, so we can include it as a fairseq model even if it's not...
Most of these EU languages are not supported by MMS fairseq, so it's a plus.

@DrewThomasson
Owner

yes yes yes! :D

@ROBERT-MCDOWELL
Collaborator

In fact, I should add the right TTS engine...

3 participants