Title: An Explainable Identifier of iGHBPs Peptides Based on Deep PSSM Features and Learning Approaches
Authors: Rahu Sikander, Mujeebu Rehman, Tarique Ali Brohi, Arif Ahmed, Ali Ghulam, Sultan Ahmed
Journal: Insights-Journal of Health and Rehabilitation
| Category | From | To |
|---|---|---|
| Y | 2024-10-01 | 2025-12-31 |
Publisher: Health And Research Insights (SMC-Private) Limited
Country: Pakistan
Year: 2024
Volume: 2
Issue: 2
Language: English
DOI: 10.71000/7dqqxs92
Keywords: Deep learning, DPC, ACC, GRU
A growth hormone binding protein (GHBP), also referred to as a soluble carrier protein, can interact effectively and non-covalently with growth hormone. Accurately recognizing GHBPs from a given protein sequence is crucial for understanding biological processes and cell growth. In the postgenomic era, a large volume of protein sequence data has accumulated, which makes it more urgent to build an integrated computational method that can quickly and precisely identify candidate GHBPs among a huge number of proteins. In this work, we present iGHBP, a GHBP predictor tool. To date, scant attention has been paid to the protein descriptors used in feature extraction, such as the amino acid index, a collection of 20 numerical values that represent different physicochemical and biological properties of amino acids, and dipeptide composition (DPC). This study introduces a novel machine learning predictor called accurate computational identification of growth hormone binding proteins (ac-iGHBPs), which uses a gated recurrent unit (GRU) technique. We performed a cross-validation analysis to demonstrate the effectiveness of our feature selection process, and the results showed that iGHBP had an accuracy of 84.9%, 7% higher than the control extremely randomized tree predictor trained with all features. Furthermore, in an independent evaluation on a different data set, our new iGHBP strategy performed better than the existing method.
To develop an accurate and efficient computational tool, iGHBP, for predicting growth hormone binding proteins (GHBP) using advanced feature extraction and machine learning techniques.
The study employed two feature extraction methods: amino acid composition (AAC) and dipeptide composition (DPC), both applied to position-specific scoring matrices (PSSM) generated by PSI-BLAST. Machine learning algorithms, including Gated Recurrent Unit (GRU), Random Forest (RF), and K-Nearest Neighbor (K-NN), were evaluated. Performance was assessed using five-fold and ten-fold cross-validation and an independent dataset, with evaluation metrics including accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), and area under the curve (AUC).
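The two PSSM-based descriptors named above can be illustrated with a minimal sketch. It assumes one common formulation: AAC-PSSM as the column-wise mean of an L x 20 PSSM, and DPC-PSSM as the average product of scores at consecutive positions for every pair of columns. The summary does not spell out the exact formulas the authors used, so the function names `aac_pssm` and `dpc_pssm`, the plain-list matrix type, and the absence of any score normalization are all assumptions made for illustration. Parsing PSI-BLAST output is omitted.

```python
from typing import List

# A PSSM is assumed to be L rows (one per residue) by 20 columns (one per amino acid).
Matrix = List[List[float]]


def aac_pssm(pssm: Matrix) -> List[float]:
    """Column-wise mean of the PSSM: one feature per column (20 for a real PSSM)."""
    n_rows = len(pssm)
    n_cols = len(pssm[0])
    return [sum(row[j] for row in pssm) / n_rows for j in range(n_cols)]


def dpc_pssm(pssm: Matrix) -> List[float]:
    """Average product of scores at consecutive positions for each column pair.

    Feature (i, j) = 1/(L-1) * sum_k pssm[k][i] * pssm[k+1][j],
    giving n_cols * n_cols features (400 for a real PSSM).
    """
    n_rows = len(pssm)
    n_cols = len(pssm[0])
    if n_rows < 2:
        raise ValueError("DPC-PSSM needs at least two residues")
    features = []
    for i in range(n_cols):
        for j in range(n_cols):
            total = sum(pssm[k][i] * pssm[k + 1][j] for k in range(n_rows - 1))
            features.append(total / (n_rows - 1))
    return features


if __name__ == "__main__":
    # Toy 5-residue matrix with made-up scores, only to show the output sizes.
    toy = [[float((r + c) % 7 - 3) for c in range(20)] for r in range(5)]
    print(len(aac_pssm(toy)), len(dpc_pssm(toy)))
```

Both functions return fixed-length vectors regardless of sequence length, which is what lets sequences of different lengths feed the same classifier.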
graph TD
A["Data Collection & Preparation"] --> B["Feature Extraction: AAC-PSSM, DPC-PSSM"];
B --> C["Machine Learning Model Training"];
C --> D["Model Evaluation: Cross-validation, Independent Dataset"];
D --> E["Performance Assessment: Accuracy, Sensitivity, Specificity, MCC, AUC"];
E --> F["Comparison with Existing Methods"];
F --> G["Conclusion: iGHBP Predictor"];
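The cross-validation step in the workflow can be sketched as a plain k-fold index splitter. This is a minimal illustration, not the authors' protocol: the function name `k_fold_indices`, the fixed random seed, and the unstratified shuffling are assumptions, and the sample count of 123 is borrowed from the dataset size quoted on this page purely as a toy value.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded shuffle for reproducibility
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


if __name__ == "__main__":
    for fold_no, (train, test) in enumerate(k_fold_indices(123, 5), start=1):
        print(f"fold {fold_no}: {len(train)} train, {len(test)} test")
```

Each sample appears in exactly one test fold, so every protein is scored once by a model that did not see it during training. A stratified variant would additionally keep the positive-to-negative ratio constant across folds.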
The study highlights the effectiveness of advanced computational methods, specifically feature mining and machine learning, for predicting GHBPs. The integration of Deep-PSSM with GRU demonstrated superior performance, emphasizing the importance of precise feature selection for high predictive accuracy. The findings suggest potential for advancing GHBP research and facilitating drug discovery. Limitations include a constrained dataset and the need for external validation with larger datasets.
The iGHBP predictor using AAC-PSSM features achieved an accuracy of 95.4%, sensitivity of 91.8%, specificity of 99.1%, and MCC of 92.1%. The DPC-PSSM approach yielded an accuracy of 93.4%, sensitivity of 93.7%, specificity of 93.9%, and MCC of 87.9%. The GRU model, particularly with AAC-PSSM features, outperformed other machine learning models like RF and K-NN.
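For reference, the four threshold-based metrics quoted above follow from a binary confusion matrix. The sketch below uses the standard definitions. The function name `binary_metrics` and the counts in the usage example are hypothetical and are not taken from the study. MCC is returned on its native scale of -1 to 1, whereas this page quotes it as a percentage.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    for name, value in binary_metrics(tp=45, fp=2, tn=48, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it a useful check when sensitivity and specificity diverge.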
The iGHBP predictor demonstrated exceptional performance in accurately identifying growth hormone binding proteins, offering a valuable tool for biological research and therapeutic development. Future work will focus on expanding datasets and incorporating advanced validation techniques to further enhance predictive accuracy.
1. The iGHBP predictor achieved an accuracy of 95.4% using AAC-PSSM features. (Confirmed in Results section).
2. The DPC-PSSM approach with GRU achieved an accuracy of 93.4%. (Confirmed in Results section).
3. The study initially used a dataset of 123 proteins and constructed an independent dataset of 46 positive and 46 negative samples. (Confirmed in Methods section).