MS Seminar


Name of the Speaker: Mr. SHOUTRIK DAS (EE22S083)
Guide: Dr. Umesh S
Venue: CSD-308 (Conference Hall)
Online meeting link: http://meet.google.com/xwf-wsqb-hwt
Date/Time: 29th July 2025 (Tuesday), 3:00 PM
Title: Advancing Dysarthric Speech Restoration and Multilingual Speech-to-Text Translation for Indian Languages.

Abstract:

For dysarthric speech restoration, we introduce a novel framework based on Conditional Flow Matching. The method transforms impaired speech directly into natural-sounding, intelligible speech without requiring parallel data. Our approach uses discrete acoustic units as an intermediate representation, which preserves semantic content while filtering out disorder-specific characteristics. This strategy accelerates model convergence and yields clearer, more consistent outputs than operating directly on noisy acoustic features.

In parallel, we advance Automatic Spoken Language Translation (ST) for India's diverse linguistic landscape. We have compiled one of the first large-scale Indian ST datasets, comprising over 4,500 hours of English audio with high-quality human translations in eight Indian languages. Building on robust pretrained ASR/ST models, we adapt them to new languages and domains through targeted vocabulary expansion and domain-specific fine-tuning. We also propose novel conditioning strategies, including intermediate CTC supervision, language-aware feature modulation, and a chain-of-thought decoding mechanism that integrates ASR and translation.