Title: Dysarthria Detection and Speech-to-Text Transcription Using Deep Learning and Audio Processing
Authors: Garaga Srilakshmi, Vadakattu Sai Harsha, Kurakula Nitin, Bera Vamsi Krishna, Osipilli David Raju
Journal: Journal of Neonatal Surgery
Publisher: EL-MED-Pub Publishers
Country: Pakistan
Year: 2025
Volume: 14
Issue: 6S
Language: en
Keywords: Mel Frequency Logarithmic Spectrograms
Dysarthria is a motor speech disorder, caused by neurological damage, that impairs articulation, pitch, and rhythm. Early detection is crucial for effective therapy. This study presents a dysarthria detection approach based on Mel Frequency Logarithmic Spectrograms (MFLS) and Deep Convolutional Neural Networks (DCNN). Speech signals are preprocessed to extract MFLS, which capture the essential frequency and temporal features. These spectrograms serve as input to a DCNN, which identifies patterns associated with dysarthric speech.
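The feature-extraction step described above can be illustrated with a minimal sketch of a log-scaled mel spectrogram in NumPy. The abstract does not give the authors' preprocessing settings, so every parameter here (16 kHz sample rate, 25 ms Hann window, 10 ms hop, 512-point FFT, 40 mel bands) and every function name is an assumption chosen for illustration, and the result may differ from the MFLS used in the study.

```python
import numpy as np


def hz_to_mel(hz):
    # Common HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def log_mel_spectrogram(signal, sample_rate=16000, n_fft=512,
                        win_length=400, hop_length=160, n_mels=40):
    """Return a log-mel spectrogram in dB, shape (n_mels, n_frames).

    All default parameters are illustrative assumptions, not values
    taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < win_length:
        # Pad very short inputs so that at least one frame exists.
        signal = np.pad(signal, (0, win_length - signal.size))
    n_frames = 1 + (signal.size - win_length) // hop_length
    window = np.hanning(win_length)
    frames = np.stack([
        signal[i * hop_length: i * hop_length + win_length] * window
        for i in range(n_frames)
    ])
    # Power spectrum of each windowed frame, zero-padded to n_fft.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    # Small offset avoids log(0) in silent bands.
    return 10.0 * np.log10(mel_energy + 1e-10).T


# Usage: a synthetic 1-second, 440 Hz tone stands in for a speech recording.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 440.0 * t)
features = log_mel_spectrogram(tone, sample_rate=sample_rate)
print(features.shape)  # prints (40, 98)
```

The resulting two-dimensional array (mel bands by time frames) is the kind of image-like input a convolutional network can consume. In practice an audio library would normally replace the hand-written filterbank, and the network architecture itself is not specified in the abstract.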
The model was trained on publicly available datasets, achieving high accuracy and robustness across different severity levels. It performed well under varying conditions such as speech duration, speaker age, and recording quality. Integrating spectrogram-based feature extraction with deep learning enhances automated speech disorder diagnosis.
This study highlights the potential of advanced signal processing for reliable dysarthria detection. Future work may explore additional speech features, multilingual datasets, and real-time applications to improve clinical utility.