feat: add Kokoro English TTS model and update example usage #73

Closed
wants to merge 1 commit into from


thewh1teagle
Owner

@thewh1teagle thewh1teagle commented Jan 16, 2025

Add kokoro and update sherpa-onnx

TODO:

  • Wait for new prebuilt release of sherpa
  • Fix vits and matcha config in tts.rs
  • Find a better way to initialize the TTS struct. sherpa-onnx expects a single struct that contains every TTS variant, so maybe we'll use some_field: unsafe { std::mem::zeroed() } instead of filling in a bunch of fields each time.
  • Fix kokoro: it failed with
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.55s
     Running `target/debug/examples/tts --text Hello --output audio.wav --model kokoro-en-v0_19/model.onnx --tokens kokoro-en-v0_19/tokens.txt --data-dir kokoro-en-v0_19/espeak-ng-data --voices-path kokoro-en-v0_19/voices.bin`
"kokoro-en-v0_19/model.onnx" kokoro-en-v0_19/voices.bin kokoro-en-v0_19/tokens.txt kokoro-en-v0_19/espeak-ng-data 1
fatal runtime error: Rust cannot catch foreign exceptions
[1]    82967 abort      RUST_LOG=DEBUG cargo run -p sherpa-rs --example tts --no-default-features  --

@csukuangfj

Are there more logs?

@thewh1teagle
Owner Author

thewh1teagle commented Jan 18, 2025

> Are there more logs?

It crashes in CreateOfflineTts(), but I couldn't get more logs.
I set everything except kokoro to null / 0.
I tried both the precompiled and the manually compiled sherpa.

let tts_config = sherpa_rs_sys::SherpaOnnxOfflineTtsConfig {
    max_num_sentences: 0,
    model: sherpa_rs_sys::SherpaOnnxOfflineTtsModelConfig {
        kokoro: sherpa_rs_sys::SherpaOnnxOfflineTtsKokoroModelConfig {
            model: model.as_ptr(),
            voices: voices.as_ptr(),
            tokens: tokens.as_ptr(),
            data_dir: data_dir.as_ptr(),
            length_scale: 1.0,
        },
        matcha: sherpa_rs_sys::SherpaOnnxOfflineTtsMatchaModelConfig {
            acoustic_model: null(),
            vocoder: null(),
            lexicon: null(),
            tokens: null(),
            data_dir: null(),
            noise_scale: 0.0,
            length_scale: 0.0,
            dict_dir: null(),
        },
        vits: sherpa_rs_sys::SherpaOnnxOfflineTtsVitsModelConfig {
            data_dir: null(),
            dict_dir: null(),
            length_scale: 0.0,
            lexicon: null(),
            model: null(),
            noise_scale: 0.0,
            noise_scale_w: 0.0,
            tokens: null(),
        },
        num_threads: config.num_threads.unwrap_or(2),
        debug: 1,
        provider: provider.as_ptr(),
    },
    rule_fars: null(),
    rule_fsts: null(),
};
let tts = unsafe { sherpa_rs_sys::SherpaOnnxCreateOfflineTts(&tts_config) };

@thewh1teagle
Owner Author

Maybe it's related to the fact that my default provider is coreml. I think I'm going to always set the provider to CPU by default to prevent such hard-to-debug errors.
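
A minimal sketch of what "CPU by default" could look like on the Rust side. The helper name `resolve_provider` is hypothetical and not part of sherpa-rs; it only assumes the provider arrives as an optional string and has to become a C string for the `provider` field:

```rust
use std::ffi::CString;

/// Hypothetical helper: pick the execution provider name, falling back
/// to "cpu" when the caller did not request one.
pub fn resolve_provider(requested: Option<&str>) -> CString {
    let name = requested.unwrap_or("cpu");
    // CString::new fails only when the input contains an interior NUL byte.
    // In that case fall back to "cpu" instead of panicking.
    CString::new(name)
        .unwrap_or_else(|_| CString::new("cpu").expect("literal contains no NUL byte"))
}
```

The returned CString has to stay alive for as long as the pointer from `as_ptr()` is in use, so it should be bound to a variable that outlives the call that creates the TTS.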

@thewh1teagle
Owner Author

Works with cpu! (I think the issue was with coreml)

cargo run --no-default-features --features="tts" --example tts -- \
        --text 'Hello! this audio generated by kokoro!' --output audio.wav --tokens "./kokoro-en-v0_19/tokens.txt" --model "./kokoro-en-v0_19/model.onnx" --voices ./kokoro-en-v0_19/voices.bin --data-dir "./kokoro-en-v0_19/espeak-ng-data" --sid 1 --provider cpu

@csukuangfj

Is the memory layout of the Rust TTS config struct the same as in the C API?

@thewh1teagle
Owner Author

thewh1teagle commented Jan 18, 2025

> Is the memory layout of the Rust TTS config struct the same as in the C API?

It should be, of course. I use bindgen, which generates Rust structs and functions that mirror the sherpa-onnx C API.
Then I import those raw bindings, create the structs, and call the functions as I pasted here.
It's a bit verbose currently: as you can see, I can't memset the main struct to zero to get defaults, so I need to fill in every field. But I'll find a way to do it.
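
For reference, a rough sketch of how the layout could be spot-checked from Rust. `DemoModelConfig` is a hypothetical stand-in, not the real bindgen output, and its fields do not mirror the sherpa-onnx header. The same three numbers can be compared against `sizeof`, `_Alignof`, and `offsetof` on the C side:

```rust
use std::mem::{align_of, size_of, MaybeUninit};
use std::os::raw::c_char;
use std::ptr::addr_of;

/// Hypothetical stand-in for a bindgen-generated struct. `#[repr(C)]`
/// asks Rust to lay the fields out the way a C compiler would.
#[repr(C)]
pub struct DemoModelConfig {
    pub model: *const c_char,
    pub length_scale: f32,
}

/// Byte offset of `length_scale`, computed from field addresses without
/// ever reading the uninitialized memory.
pub fn length_scale_offset() -> usize {
    let uninit = MaybeUninit::<DemoModelConfig>::uninit();
    let base = uninit.as_ptr();
    // SAFETY: addr_of! only computes the field address; nothing is read.
    let field = unsafe { addr_of!((*base).length_scale) };
    field as usize - base as usize
}

/// Returns (size, alignment, offset of `length_scale`) for the struct.
pub fn demo_layout() -> (usize, usize, usize) {
    (
        size_of::<DemoModelConfig>(),
        align_of::<DemoModelConfig>(),
        length_scale_offset(),
    )
}
```

A check like this only shows that the Rust struct matches the header the bindings were generated from. If the bindings and the linked library come from different sherpa-onnx versions, the struct can still be out of sync.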

Note to myself:
This should work

let matcha_config = unsafe { std::mem::zeroed::<sherpa_rs_sys::SherpaOnnxOfflineTtsMatchaModelConfig>() };
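
A sketch of how that could shrink the big literal, assuming hypothetical `Demo*` structs in place of the real bindings (the field sets are illustrative only). Struct update syntax fills every field that is not named with the zeroed value:

```rust
use std::os::raw::c_char;

// Hypothetical stand-ins for the bindgen structs, trimmed to a few fields.
#[repr(C)]
pub struct DemoVitsConfig {
    pub model: *const c_char,
    pub noise_scale: f32,
}

#[repr(C)]
pub struct DemoKokoroConfig {
    pub model: *const c_char,
    pub voices: *const c_char,
    pub length_scale: f32,
}

#[repr(C)]
pub struct DemoTtsModelConfig {
    pub vits: DemoVitsConfig,
    pub kokoro: DemoKokoroConfig,
    pub num_threads: i32,
}

/// Fill in only the kokoro section; everything else stays zero / null.
pub fn kokoro_only(model: *const c_char, voices: *const c_char) -> DemoTtsModelConfig {
    DemoTtsModelConfig {
        kokoro: DemoKokoroConfig {
            model,
            voices,
            length_scale: 1.0,
        },
        num_threads: 2,
        // SAFETY: every remaining field is a raw pointer, integer, or float,
        // and the all-zero bit pattern is a valid value for each of those.
        ..unsafe { std::mem::zeroed() }
    }
}
```

This is only sound while the struct holds plain C data. `std::mem::zeroed` is undefined behavior for types that contain references or other values that must never be zero.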

@thewh1teagle
Owner Author

thewh1teagle commented Jan 18, 2025
