Title: Mitigating Cyber Threats: Machine Learning and Explainable AI for Phishing Detection
Authors: Hafiz Muhammad Usman Akhtar, Muhammad Nauman, Nadeem Akhtar, Mustafa Hameed, Sidra Hameed, Muhammad Zeshan Tareen
Journal: VFAST Transactions on Software Engineering
Publisher: VFAST-Research Platform
Country: Pakistan
Year: 2025
Volume: 13
Issue: 2
Language: en
The rapid growth in the number of organizations and users online has accelerated the adoption of new technologies, increasing the complexity of online security. Phishing attacks surged significantly in 2024, with over 932,923 incidents reported in Q3 alone, driven by advanced AI-enabled social engineering tactics. From simple scams to sophisticated schemes exploiting emails, URLs, text messages, and social media platforms, phishing attacks deceive victims into disclosing sensitive information or inadvertently installing malware, often compromising devices as part of larger botnets. Despite advancements in cybersecurity measures, phishing remains a critical threat, causing substantial financial and reputational damage to businesses. Recently, Machine Learning (ML) algorithms have demonstrated remarkable efficacy in phishing detection; however, many high-performing models operate as black boxes, raising concerns about transparency, interpretability, and trustworthiness—factors essential in high-stakes applications for ensuring reliability, accountability, and regulatory compliance. This research integrates ML techniques with Explainable Artificial Intelligence (XAI) methodologies to address this issue and to enhance model interpretability and transparency in phishing detection. The proposed approach employs Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Random Forest, k-Nearest Neighbors (KNN), Twin Support Vector Machine (Twin SVM), and Convolutional Neural Networks (CNN), evaluated across four publicly available datasets to assess performance and interpretability. The findings reveal that XGBoost achieved the highest accuracy, at 99.65%. The Local Interpretable Model-agnostic Explanations (LIME) method was applied to elucidate feature importance and the models' decision-making processes.
This comprehensive approach aims to strengthen cybersecurity resilience against phishing threats while promoting model transparency and regulatory compliance.
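The pipeline the abstract describes—train a gradient-boosted classifier on tabular phishing features, then apply a model-agnostic explainer to surface which features drive each prediction—can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's method: it uses scikit-learn's `GradientBoostingClassifier` and permutation importance as stand-ins for XGBoost and LIME, and the feature names are invented placeholders rather than the features of the paper's four datasets.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a phishing feature table (URL length, dots in
# hostname, etc. are hypothetical labels, not the paper's features).
X, y = make_classification(n_samples=1000, n_features=6, n_informative=4,
                           random_state=0)
feature_names = ["url_length", "num_dots", "has_https", "domain_age",
                 "num_subdomains", "has_ip_host"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)

# Gradient boosting as an sklearn-only stand-in for XGBoost.
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)

# Permutation importance as a model-agnostic stand-in for LIME:
# it ranks features by how much shuffling each one degrades accuracy.
result = permutation_importance(clf, X_te, y_te, n_repeats=10,
                                random_state=0)
ranked = sorted(zip(feature_names, result.importances_mean),
                key=lambda t: -t[1])

print(f"test accuracy: {acc:.3f}")
for name, importance in ranked[:3]:
    print(f"{name}: {importance:.3f}")
```

Note that LIME differs from permutation importance in that it explains individual predictions by fitting a local surrogate model, whereas permutation importance is a global ranking; the sketch above only conveys the "train, then explain" workflow.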