Title: Analysis of Code Vulnerabilities in Repositories of GitHub and Rosettacode: A comparative Study
Authors: Abdul Malik, Muhammad Shumail Naveed
Journal: International Journal of Innovations in Science & Technology
Publisher: 50SEA JOURNALS (SMC-PRIVATE) LIMITED
Country: Pakistan
Year: 2022
Volume: 4
Issue: 2
Language: English
Keywords: Software Vulnerability, Software Security, Programming Portal, Vulnerability Severity
Open-source code hosted on online programming portals is present in 99% of commercial software, and reusing it is common practice among developers for rapid prototyping and cost-effective development. However, research reports that such code contains vulnerabilities that can result in catastrophic security compromises affecting individuals, organizations, and even national secrecy. One frustrating aspect of vulnerabilities is that they manifest in hidden ways that software developers are unaware of. Vulnerability detection is therefore one of the most critical tasks in ensuring software security, because vulnerabilities jeopardize core security concepts such as integrity, authenticity, and availability. This study explores security-related vulnerabilities in programs written in C, C++, and Java and presents the disparities between them as hosted at popular code repositories. To this end, 708 programs were examined against severity-based guidelines. A total of 1371 vulnerable codes were identified: 327 in C, 51 in C++, and 993 in Java. Statistical analysis indicated a substantial difference between the languages: the Kruskal-Wallis H-test p-value (reported as .000) is below the 0.05 significance level. The Mann-Whitney test mean ranks for GitHub (676.05) and Rosettacode (608.64) also differ. The novelty of this article lies in identifying security vulnerabilities and characterizing their severity in popular code repositories. The study ultimately offers a guideline for choosing a secure programming language and a testing technique that targets the vulnerabilities most likely to breach security.
The study's objective is to explore and compare security-related vulnerabilities in C, C++, and Java programs hosted on GitHub and RosettaCode, and to identify the disparities between them.
The study involved examining 708 programs from GitHub and RosettaCode, focusing on C, C++, and Java. These programs were selected based on common algorithms and tasks across various disciplines. The code was preprocessed to standardize it for analysis. The Yasca tool was used to scan for vulnerabilities, categorizing them by severity (Critical, High, Warning, Low, Informational). Statistical tests, including the Mann-Whitney Test, Kolmogorov-Smirnov Test, Wald-Wolfowitz Test, and Kruskal-Wallis H-test, were employed to analyze the significance of the findings.
graph TD;
A[Select Programs from GitHub & RosettaCode: C, C++, Java] --> B[Preprocess Code];
B --> C[Scan for Vulnerabilities using Yasca];
C --> D[Categorize Vulnerabilities by Severity];
D --> E[Perform Statistical Analysis: Mann-Whitney, Kruskal-Wallis, etc.];
E --> F[Analyze Results & Draw Conclusions];
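The categorization step in the workflow above can be sketched as a simple tally. This is a minimal illustration, not the study's pipeline: the record shape (`portal`, `language`, `severity` keys), the function names, and the sample records are all hypothetical, and the sketch does not parse Yasca's actual report format. Only the five severity labels come from the study.

```python
from collections import Counter

# Severity levels named in the study, ordered from most to least severe.
SEVERITIES = ["Critical", "High", "Warning", "Low", "Informational"]


def tally_by_severity(findings):
    """Count findings per (portal, language, severity) triple."""
    return Counter((f["portal"], f["language"], f["severity"]) for f in findings)


def severity_profile(findings, portal, language):
    """Return counts for one portal/language pair in fixed severity order."""
    counts = tally_by_severity(findings)
    return {s: counts.get((portal, language, s), 0) for s in SEVERITIES}


# Hypothetical records for illustration only, NOT data from the study.
sample_findings = [
    {"portal": "GitHub", "language": "Java", "severity": "Informational"},
    {"portal": "GitHub", "language": "Java", "severity": "Informational"},
    {"portal": "GitHub", "language": "Java", "severity": "High"},
    {"portal": "RosettaCode", "language": "C", "severity": "Warning"},
]

profile = severity_profile(sample_findings, "GitHub", "Java")
print(profile)
```

Keeping the severity order fixed makes profiles for different portal/language pairs directly comparable, including levels with zero findings.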
The study highlights that security vulnerabilities are a significant concern in open-source code, affecting software confidentiality, integrity, and availability. The comparison across portals (GitHub and RosettaCode) and languages (C, C++, and Java) shows that vulnerability counts are not evenly distributed: Java had the most reported vulnerabilities, and GitHub had more than RosettaCode. The prevalence of informational-severity vulnerabilities suggests that, although critical issues are less common, a large number of less severe vulnerabilities can still pose risks. The findings suggest that developers should consider both the programming language and the hosting portal when aiming for secure software development.
- A total of 1371 vulnerable codes were identified: 327 in C, 51 in C++, and 993 in Java.
- GitHub repositories contained more vulnerabilities than RosettaCode repositories.
- Java exhibited the highest number of vulnerabilities, followed by C, and then C++.
- Informational severity vulnerabilities were the most prevalent across all categories, while critical severity vulnerabilities were the least reported.
- Statistical analysis confirmed significant differences in code vulnerabilities between programming languages and between programming portals.
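As a quick arithmetic check on the reported totals, the per-language counts can be summed and converted to shares. The counts are taken from the findings above; the variable names are illustrative. Note that these are raw shares of all findings and are not normalized by the number of programs examined per language, which this summary does not break down.

```python
# Per-language counts of vulnerable codes as reported in the study.
counts = {"C": 327, "C++": 51, "Java": 993}

# The three counts should add up to the reported total of 1371.
total = sum(counts.values())

# Share of all reported findings per language, in percent (one decimal place).
shares = {lang: round(100 * n / total, 1) for lang, n in counts.items()}

print(total)
print(shares)
```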
The research concludes that GitHub (Java) is the most vulnerable, followed by RosettaCode (Java), GitHub (C), RosettaCode (C), GitHub (C++), and RosettaCode (C++). Informational severity vulnerabilities are most frequent. RosettaCode (C++) appears to be the most secure in terms of vulnerabilities. The study provides a guideline for selecting secure programming languages and effective testing approaches to identify vulnerabilities.
- Total vulnerable codes identified: 1371 (327 in C, 51 in C++, 993 in Java).
- Kruskal-Wallis H-test p-value: reported as .000 (that is, p < 0.001), which is below the 0.05 significance level and indicates a statistically significant difference in code vulnerabilities between the programming languages.
- Mann-Whitney Test mean ranks for GitHub (676.05) and RosettaCode (608.64) differ; the study reports this as a statistically significant difference in vulnerabilities between the portals.
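To make the rank-based statistics above more concrete, the following sketch shows how mean ranks and the Kruskal-Wallis H statistic are computed from pooled data. It is a minimal pure-Python illustration under stated assumptions: the function names and the toy groups are hypothetical, the data is not from the study, and the H formula omits the tie correction that statistical packages typically apply.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mean_ranks(groups):
    """Pool all groups, rank the pooled values, return the mean rank per group."""
    labels, pooled = [], []
    for name, vals in groups.items():
        labels.extend([name] * len(vals))
        pooled.extend(vals)
    ranks = average_ranks(pooled)
    return {
        name: sum(r for lab, r in zip(labels, ranks) if lab == name) / len(vals)
        for name, vals in groups.items()
    }


def kruskal_wallis_h(groups):
    """H without tie correction: 12 / (N(N+1)) * sum(R_g^2 / n_g) - 3(N+1)."""
    n_total = sum(len(v) for v in groups.values())
    means = mean_ranks(groups)
    # R_g^2 / n_g equals n_g * (mean rank of g)^2, since R_g = n_g * mean rank.
    rank_term = sum(len(v) * means[g] ** 2 for g, v in groups.items())
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


# Toy per-program vulnerability counts for two hypothetical groups.
toy_groups = {"group_a": [4, 5, 6], "group_b": [1, 2, 3]}
print(mean_ranks(toy_groups))
print(kruskal_wallis_h(toy_groups))
```

A group whose values tend to be larger receives a higher mean rank, which is how the reported mean ranks for the two portals should be read. Turning H or the Mann-Whitney U statistic into a p-value requires a reference distribution and is best left to a statistics library.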