Title: DETECTING PHISHING ATTACKS IN CYBERSECURITY USING MACHINE LEARNING WITH DATA PREPROCESSING AND FEATURE ENGINEERING
Authors: Sohaib Latif, Saher Pervaiz
Journal: Kashf Journal of Multidisciplinary Research (KJMR)
| Category | From | To |
|---|---|---|
| Y | 2024-10-01 | 2025-12-31 |
Publisher: Kashf Institute of Development & Studies
Country: Pakistan
Year: 2025
Volume: 2
Issue: 3
Language: en
DOI: 10.71146/kjmr335
Keywords: Ensemble learning, Fraud detection, Phishing Detection, Email Security, Spam Filtering
Phishing attacks are one of the most persistent cybersecurity threats, evolving rapidly to bypass traditional security measures. Given the widespread use of email for sensitive communications, detecting phishing attempts has become more critical than ever. This study explores the effectiveness of multiple machine learning models in classifying phishing emails using a dataset of 39,000 samples. To enhance accuracy, we employ preprocessing techniques such as feature engineering, vectorization, and class balancing with SMOTE (Synthetic Minority Over-sampling Technique). Our analysis compares various models, including Random Forest, XGBoost, Logistic Regression, Naïve Bayes, and AdaBoost, evaluating their performance using precision, recall, F1-score, and accuracy metrics. The results demonstrate that ensemble learning techniques, particularly XGBoost and Random Forest, significantly outperform other models, achieving accuracy rates as high as 99.00%. These findings reinforce the importance of advanced classification techniques and data preprocessing in phishing detection. Beyond academic implications, our research contributes to strengthening email security, mitigating financial losses, and protecting personal data from cyber threats. Future work could focus on integrating deep learning models and real-time detection systems to further improve accuracy and adaptability.
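As a rough illustration of two steps named in the abstract, the sketch below shows the interpolation idea behind SMOTE and how precision, recall, F1-score, and accuracy follow from confusion counts. It is not the authors' implementation: the function names `smote_oversample` and `binary_metrics`, the neighbour count `k=3`, and the fixed seed are assumptions made for this example, and it uses only the Python standard library. A real pipeline would more likely rely on established packages such as imbalanced-learn and scikit-learn.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points for the minority class.

    Each point lies on the segment between a randomly chosen minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea).
    Names and defaults here are illustrative assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1, and accuracy for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}
```

Oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the evaluation data. Reporting precision and recall alongside accuracy matters for phishing detection because a classifier can score high accuracy on an imbalanced dataset while still missing most phishing emails.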