AI vs. Human Programmers: Complexity and Performance in Code Generation

Article Information

Title: AI vs. Human Programmers: Complexity and Performance in Code Generation

Authors: Samina Azeem, Muhammad Shumail Naveed, Muhammad Sajid, Imran Ali

Journal: VAWKUM Transactions on Computer Sciences

HEC Recognition History

Category	From	To
Y	2024-10-01	2025-12-31
Y	2023-07-01	2024-09-30
Y	2022-07-01	2023-06-30

Publisher: VFAST-Research Platform

Country: Pakistan

Year: 2025

Volume: 13

Issue: 1

Language: en

DOI: 10.21015/vtcs.v13i1.2043

Abstract

Large language models, such as ChatGPT, have demonstrated the capability to perform diverse tasks across various domains, significantly enhancing efficiency. However, their growing adoption raises concerns about potential job displacement, especially in technical fields. While numerous studies have explored the performance of large language models in technical domains, a notable gap exists in evaluating their capabilities in programming. This study addresses that gap by comparing ChatGPT (GPT-4) with human experts in the programming domain to assess whether ChatGPT has reached a level where it could replace human programmers. To achieve this objective, the study generated 300 Python programs using ChatGPT (GPT-4) and compared them with functionally equivalent programs developed by three experienced human programmers. The evaluation encompassed both quantitative and qualitative analyses, employing metrics such as Halstead Complexity, Cyclomatic Complexity, and expert judgment from two human evaluators. The findings revealed statistically significant differences between ChatGPT-generated and human-written code. Programs generated by ChatGPT exhibited greater verbosity, complexity, and resource demands, as evidenced by higher program volume, difficulty, and cyclomatic complexity scores. In qualitative terms, ChatGPT’s code was more readable but lagged in key areas, including documentation quality, function structuring, and adherence to coding standards. Conversely, human-written programs excelled in maintainability, error handling, and addressing edge cases. Although ChatGPT demonstrated remarkable efficiency in generating functional code, its output required extensive review and refinement to meet standards. The study concluded that, while ChatGPT serves as a valuable tool for code generation, it has not yet reached the level required to replace human expertise in programming.
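For readers unfamiliar with the metrics named in the abstract, the following is a minimal Python sketch of how Halstead volume, Halstead difficulty, and cyclomatic complexity are commonly computed. It is not the tooling used in the study, which the abstract does not specify. The function names `halstead` and `cyclomatic_complexity`, the `SAMPLE` snippet, and the choice of which syntax nodes count as decision points are all illustrative assumptions. The Halstead function takes operator and operand counts as inputs rather than deriving them from source code.

```python
import ast
import math


def halstead(n1, n2, N1, N2):
    """Return (volume, difficulty) from Halstead's basic counts.

    n1, n2: number of distinct operators and distinct operands.
    N1, N2: total occurrences of operators and operands.
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)   # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)         # D = (n1 / 2) * (N2 / n2)
    return volume, difficulty


# Assumption: these node types are treated as decision points.
# Real tools may count additional constructs (e.g. comprehensions).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of `and`/`or` adds one more path.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample program, used only to exercise the functions.
SAMPLE = """
def classify(x):
    if x < 0 and x != -1:
        return "negative"
    for i in range(x):
        if i % 2 == 0:
            continue
    return "done"
"""

if __name__ == "__main__":
    print(halstead(4, 4, 8, 8))
    print(cyclomatic_complexity(SAMPLE))
```

Under these definitions, a longer program with more distinct symbols has a higher volume, and more branching yields a higher cyclomatic complexity, which is the direction of the differences the abstract reports for ChatGPT-generated code.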
