Need guidance on how to process audio waveform to remove filler words and sounds? #2500

siddhsql · 2025-01-15T22:30:23Z

Hello,

I don't know much about speech processing, but I have an audio signal and want to remove filler words and sounds such as "umm" and "ugh" from it. Can someone guide me on how to do this? I don't think I should transcribe the signal (speech to text) and then re-encode the text as speech, since that would lose the precise timestamps, intonation, and so on. Any pointers on how to approach this, and on which libraries are available, if any, would be appreciated. Thanks.

Edit: to refine my question, is it possible to use Whisper to process an audio signal and get the timestamps (start and end) of the segments where the speech is unintelligible?

EtienneAb3d · 2025-01-16T08:39:42Z

You may try something like this Audio Tagger:
https://github.com/YuanGongND/whisper-at
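
Whichever tool ends up producing the timestamps, the removal step itself can operate directly on the waveform, so no text-to-speech round trip is needed. Below is a minimal sketch of that step, not a tested pipeline. It assumes word-level entries shaped like `{"word", "start", "end"}` (modelled on the word timestamps that openai-whisper can return with `word_timestamps=True`), and the `FILLERS` set and all function names are hypothetical. Note that a recognizer may leave fillers out of its transcript altogether, in which case the spans would have to come from an audio tagger instead.

```python
from typing import Iterable, List, Sequence, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds)

# Hypothetical filler vocabulary; adjust it to your own recordings.
FILLERS = {"um", "umm", "uh", "uhh", "ugh", "er", "erm", "hmm"}


def filler_spans(words: Iterable[dict], pad: float = 0.0) -> List[Span]:
    """Collect (start, end) spans for entries whose text is a filler word.

    Each entry is assumed to look like
    {"word": " umm,", "start": 1.2, "end": 1.5}.
    `pad` widens every span by that many seconds on both sides.
    """
    spans = []
    for w in words:
        token = w["word"].strip().strip(".,!?;:-").lower()
        if token in FILLERS:
            spans.append((max(0.0, w["start"] - pad), w["end"] + pad))
    return spans


def merge_spans(spans: Iterable[Span]) -> List[Span]:
    """Sort spans by start time and merge any that overlap or touch."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def cut_spans(samples: Sequence, sample_rate: int, spans: Iterable[Span]) -> list:
    """Return the samples with the given time spans removed.

    Times are converted to sample indices and clamped to the signal length,
    so a span that runs past the end of the audio is handled safely.
    """
    kept = []
    cursor = 0  # index of the first sample not yet copied or skipped
    for start, end in merge_spans(spans):
        first = min(len(samples), max(cursor, int(round(start * sample_rate))))
        last = min(len(samples), max(first, int(round(end * sample_rate))))
        kept.extend(samples[cursor:first])
        cursor = last
    kept.extend(samples[cursor:])
    return kept
```

With a NumPy array the same index arithmetic applies, and the kept slices can be joined with `np.concatenate` instead of a Python list. Hard cuts can produce audible clicks, so a short crossfade at each joint may be worth adding.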
