New Research On Copilot And Code Quality

Bill Harding’s research highlights that the use of AI tools like GitHub Copilot has led to increased code churn and a rise in copy-pasted code, raising concerns about code quality and maintainability. He emphasizes the need for developers, particularly juniors, to focus on understanding and improving existing code rather than relying solely on AI-generated suggestions. He also introduces GitClear as a tool to enhance code management practices.

In a recent discussion, Bill Harding shared insights from his research on the impact of AI tools like GitHub Copilot on code quality and development practices. One of the most striking findings was a significant increase in code churn, meaning how often the same lines of code are changed again after being written. Previously, code in open-source projects was typically revised on a cycle of around six months; that cycle has now accelerated to approximately two weeks. This rapid turnover indicates that while developers are producing code more quickly, they may not be revising or refactoring it effectively, leading to potential quality issues.
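To make the churn idea concrete, here is a minimal Python sketch that computes the share of lines revised within a fixed window of being added. It is not GitClear's methodology: the `churn_share` function, the 14-day default, and the sample dates are all assumptions made for illustration.

```python
from datetime import date, timedelta


def churn_share(lines, window_days=14):
    """Return the fraction of lines revised within `window_days` of being added.

    `lines` is a list of (added_on, revised_on) pairs, where `revised_on`
    is None for a line that was never changed. Hypothetical data model,
    not taken from the research discussed here.
    """
    if not lines:
        return 0.0
    window = timedelta(days=window_days)
    churned = sum(
        1
        for added_on, revised_on in lines
        if revised_on is not None and revised_on - added_on <= window
    )
    return churned / len(lines)


# Made-up sample: four lines with different revision histories.
sample = [
    (date(2024, 1, 1), date(2024, 1, 5)),   # revised after 4 days: churned
    (date(2024, 1, 1), date(2024, 3, 1)),   # revised after 60 days: not churned
    (date(2024, 1, 2), None),               # never revised: not churned
    (date(2024, 1, 3), date(2024, 1, 17)),  # revised after exactly 14 days: churned
]

print(churn_share(sample))  # 0.5
```

A real measurement would have to derive these dates from version-control history, which this sketch deliberately leaves out.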

Harding highlighted the rise in copy-pasted code as a concerning trend. His research revealed that for the first time, the amount of copy-pasted code exceeded the amount of moved or refactored code in repositories. This shift suggests that developers are increasingly relying on AI suggestions to duplicate code rather than reusing existing implementations. The data showed that over 10% of lines changed in commits were identical copies, raising concerns about maintainability and the potential for bugs due to duplicated logic across different parts of the codebase.
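The distinction between moved and copy-pasted lines can be illustrated with a small Python sketch. It assumes a deliberately simple rule: an added line that matches a line deleted in the same commit counts as moved, while an added line that matches text still present elsewhere in the codebase counts as copied. The `classify_added_lines` function and this rule are hypothetical and are not the classification used in Harding's research.

```python
from collections import Counter


def classify_added_lines(added, deleted, existing):
    """Label each added line as 'moved', 'copied', or 'new'.

    added    -- lines added by the commit
    deleted  -- lines deleted by the same commit
    existing -- lines still present elsewhere in the codebase
    Blank lines are ignored; matching is on whitespace-stripped text.
    """
    deleted_pool = Counter(line.strip() for line in deleted if line.strip())
    existing_set = {line.strip() for line in existing if line.strip()}
    counts = Counter()
    for raw in added:
        line = raw.strip()
        if not line:
            continue
        if deleted_pool[line] > 0:
            # A matching deletion in the same commit: treat as a move.
            deleted_pool[line] -= 1
            counts["moved"] += 1
        elif line in existing_set:
            # Same text already lives elsewhere: treat as a copy.
            counts["copied"] += 1
        else:
            counts["new"] += 1
    return counts


# Made-up commit contents for illustration.
result = classify_added_lines(
    added=["total = price * qty", "return total", "log(total)"],
    deleted=["return total"],
    existing=["total = price * qty", "x = 1"],
)
print(result["moved"], result["copied"], result["new"])  # 1 1 1
```

Exact text matching like this would over-count trivial lines such as closing braces, so a realistic tool would need to filter or weight them.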

The discussion also touched on the implications of increased deletions and of how often code is revised shortly after it is authored. Harding noted that 70% of new lines of code were revised within two weeks of being added, which aligns with findings from Google's DORA research that indicated a rise in defect rates associated with AI adoption. This trend suggests that while AI tools may enhance productivity in terms of code volume, they could also lead to a decline in code quality and an increase in technical debt.

As the conversation progressed, Harding emphasized the importance of understanding existing codebases and the need for developers, especially juniors, to focus on maintainability rather than merely producing more lines of code. He advised that junior developers should strive to demonstrate their ability to reuse and improve existing code rather than relying solely on AI-generated suggestions. This approach not only enhances code quality but also sets them apart in a competitive job market.

Finally, Harding introduced GitClear, a tool designed to help developers and teams track code changes and understand the evolution of their codebases. GitClear aims to alleviate the challenges associated with reviewing pull requests and identifying code movements, ultimately promoting better code management practices. The discussion concluded with a call for developers and managers to reconsider how productivity is measured, advocating for a focus on maintainability and the long-term health of codebases rather than just the quantity of code produced.