Title: ENHANCING SPEECH EMOTION RECOGNITION WITH DEEP LEARNING THROUGH DATA FUSION, SPECTROGRAM AUGMENTATION, AND HYBRID FEATURE INTEGRATION
Authors: Muhammad Talha Jahangir, Mujahid Hussain, Nashitah Alwaz, Muhammad Musawir Saeed, Waheed Ahmad, Uzair Ahmad, Hammad Toheed Khan
Journal: Spectrum of Engineering Sciences
| Category | From | To |
|---|---|---|
| Y | 2024-10-01 | 2025-12-31 |
Publisher: Sociology Educational Nexus Research Institute
Country: Pakistan
Year: 2025
Volume: 3
Issue: 9
Language: en
Keywords: Deep learning, Speech Emotion Recognition, Convolutional Neural Network, MFCC, data fusion, Human-Computer Interaction (HCI), Bidirectional Long Short-Term Memory, Spectrogram Augmentation, Mel Spectrogram, Root mean square
Speech Emotion Recognition (SER), which lets computers infer human emotions from vocal cues, is one of the most important components of affective computing. Variability in speech patterns, scarcity of data, and the complexity of emotional expression still make high SER accuracy difficult to achieve. Our proposed deep learning-based SER architecture fuses data from four baseline datasets: RAVDESS, TESS, CREMA-D, and SAVEE. The model combines Convolutional Neural Networks (CNNs) with Bidirectional Long Short-Term Memory (BiLSTM) to capture spatial and temporal characteristics efficiently. With a classification accuracy of 98%, the proposed framework improves SER performance and enables computers to detect and respond to human emotions immediately, fostering a more empathetic and adaptive human-computer interaction.
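The title and keywords mention spectrogram augmentation and root-mean-square (RMS) features, but the abstract does not say how either is computed. The following is a minimal, dependency-free sketch under stated assumptions: RMS is taken per fixed-length frame, and augmentation is done by zeroing one random frequency band and one random time band (a SpecAugment-style masking, which may differ from what the authors used). The function names `frame_rms` and `mask_spectrogram`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import math
import random


def frame_rms(samples, frame_length=4, hop_length=2):
    """Root-mean-square energy for each frame of a 1-D signal (list of floats)."""
    values = []
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frame = samples[start:start + frame_length]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_length))
    return values


def mask_spectrogram(spec, max_freq_width=2, max_time_width=2, rng=None):
    """Return a copy of spec (rows = frequency bins, columns = time frames)
    with one random frequency band and one random time band set to zero.
    Assumed masking scheme; the source does not specify the augmentation."""
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency mask: zero a band of consecutive rows.
    f_width = rng.randint(0, min(max_freq_width, n_freq))
    f_start = rng.randint(0, n_freq - f_width)
    for f in range(f_start, f_start + f_width):
        out[f] = [0.0] * n_time

    # Time mask: zero a band of consecutive columns.
    t_width = rng.randint(0, min(max_time_width, n_time))
    t_start = rng.randint(0, n_time - t_width)
    for row in out:
        for t in range(t_start, t_start + t_width):
            row[t] = 0.0
    return out


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    print(frame_rms(signal))

    toy_spec = [[1.0] * 6 for _ in range(5)]  # 5 frequency bins, 6 frames
    for row in mask_spectrogram(toy_spec, rng=random.Random(0)):
        print(row)
```

In a real pipeline the spectrogram would come from an audio library rather than a toy list of lists, and masked copies would be added to the training set alongside the originals so that the classifier sees more varied inputs.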