Dysarthria Detection and Speech-to-Text Transcription Using Deep Learning and Audio Processing


Article Information

Title: Dysarthria Detection and Speech-to-Text Transcription Using Deep Learning and Audio Processing

Authors: Garaga Srilakshmi, Vadakattu Sai Harsha, Kurakula Nitin, Bera Vamsi Krishna, Osipilli David Raju

Journal: Journal of Neonatal Surgery

HEC Recognition History
Category  From        To
Y         2023-07-01  2024-09-30
Y         2022-07-01  2023-06-30

Publisher: EL-MED-Pub Publishers

Country: Pakistan

Year: 2025

Volume: 14

Issue: 6S

Language: en

Keywords: Mel Frequency Logarithmic Spectrograms

Abstract

Dysarthria is a motor speech disorder, caused by neurological damage, that impairs articulation, pitch, and rhythm. Early detection is crucial for effective therapy. This study presents a novel dysarthria detection approach that combines Mel Frequency Logarithmic Spectrograms (MFLS) with a Deep Convolutional Neural Network (DCNN). Speech signals are preprocessed to extract MFLS, which capture essential frequency and temporal features. These spectrograms serve as input to the DCNN, which identifies patterns associated with dysarthric speech.
The model was trained on publicly available datasets and achieved high accuracy and robustness across different severity levels. It also performed well under varying conditions, including speech duration, speaker age, and recording quality. Integrating spectrogram-based feature extraction with deep learning enhances automated diagnosis of speech disorders.
This study highlights the potential of advanced signal processing for reliable dysarthria detection. Future work may explore additional speech features, multilingual datasets, and real-time applications to improve clinical utility.
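
The abstract does not give the parameters of the MFLS extraction step, so the following is only a minimal sketch of how a log-mel spectrogram can be computed from a raw signal, not the authors' implementation. The frame length, hop size, number of mel bands, the HTK-style mel formula, and all function names are assumptions chosen for illustration. It uses only the Python standard library and a naive DFT, which is far too slow for real recordings but keeps each step visible.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (HTK-style formula, assumed)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 DFT bins."""
    n_bins = n_fft // 2 + 1
    # n_mels + 2 edge points, evenly spaced on the mel scale from 0 Hz to Nyquist.
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            # Triangle: zero outside [lo, hi], peak of 1.0 at mid.
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(n^2) DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each holding n_mels log-scaled (dB) band energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        mel_energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # The small offset avoids log10(0) on silent bands.
        frames.append([10.0 * math.log10(e + 1e-10) for e in mel_energies])
    return frames


# Usage: a 1 kHz test tone sampled at 8 kHz (synthetic data, not a speech recording).
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(256)]
spectrogram = log_mel_spectrogram(tone, sample_rate=8000)
```

Each resulting frame is one column of the time-frequency image that a convolutional network would take as input. In practice this step would more likely be done with an audio library, for example `librosa.feature.melspectrogram` followed by `librosa.power_to_db`, using larger frames and more mel bands than the toy values assumed here.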

