Title: Optimized Music Classification with a Hybrid VGG16-RNN Using Mel-Spectrogram and MFCC Features
Authors: Mohsin Ashraf, Saima Ashraf
Journal: VAWKUM Transactions on Computer Sciences
Publisher: VFAST-Research Platform
Country: Pakistan
Year: 2024
Volume: 12
Issue: 2
Language: English
Music classification using deep neural networks has gained considerable attention in recent years, in part because it is difficult both to capture every essential aspect of music in features and to interpret the resulting classifiers. Research on integrating VGG16 with RNNs is limited, and few existing classifiers accurately capture intrinsic musical characteristics. Previous work in this field has focused primarily on spectral features, which has constrained overall performance. To address this issue, we propose a novel hybrid neural architecture based on Visual Geometry Group 16 (VGG16), which is highly effective at extracting important features from musical variations. We combine VGG16 with several recurrent neural network (RNN) variants: Gated Recurrent Unit (GRU), Bidirectional GRU (BiGRU), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM). We compare their performance on the GTZAN dataset using both Mel-spectrogram and Mel-Frequency Cepstral Coefficient (MFCC) features. Our results indicate that the VGG16+GRU model achieved the highest accuracy: 89.60% with Mel-spectrograms and 82.70% with MFCC features. These findings demonstrate the effectiveness of combining advanced feature extraction techniques with deep learning models for music genre classification.
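The abstract compares two input representations without describing how they relate. The sketch below is only an illustration of the standard textbook relationship: a Mel-spectrogram frame is a power spectrum passed through triangular Mel-scale filters, and MFCCs are a discrete cosine transform of the log of those filter energies. It is not the authors' pipeline. The function names, the HTK-style Mel formula, the unnormalized DCT-II, and all parameter values (`n_mels`, `n_fft`, `sample_rate`) are assumptions chosen for the example; the abstract does not state any of them.

```python
import math


def hz_to_mel(hz):
    # HTK-style Mel formula; one common convention, assumed here.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the Mel scale.
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_frame(power_spectrum, bank, eps=1e-10):
    """One Mel-spectrogram column: log of each filter's weighted energy."""
    return [
        math.log(sum(w * p for w, p in zip(row, power_spectrum)) + eps)
        for row in bank
    ]


def mfcc_from_log_mel(log_mel, n_mfcc):
    """Unnormalized DCT-II of the log-Mel energies; keeps the first n_mfcc."""
    n = len(log_mel)
    return [
        sum(log_mel[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
        for k in range(n_mfcc)
    ]


# Illustrative values only; the abstract does not report these settings.
bank = mel_filterbank(n_mels=8, n_fft=64, sample_rate=22050)
flat_spectrum = [1.0] * (64 // 2 + 1)
frame = log_mel_frame(flat_spectrum, bank)
coeffs = mfcc_from_log_mel(frame, n_mfcc=4)
```

Stacking `log_mel_frame` outputs over time gives the two-dimensional, image-like input that a convolutional network such as VGG16 can consume, while the MFCC step compresses each frame into a few decorrelated coefficients. In practice an audio library would normally handle both steps.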