Can LLMs Fix Real-World Code Maintenance Issues?

Evaluating Copilot Chat and Llama 3.1 on actual GitHub maintainability problems

This research systematically evaluates how effectively Large Language Models resolve code maintainability issues found in real-world GitHub projects.

Key Findings:

  • Tested 127 maintainability issues across 10 GitHub repositories
  • Compared zero-shot prompting (Copilot Chat, Llama 3.1) vs. few-shot prompting (Llama 3.1); a prompt-construction sketch follows this list
  • Evaluated solutions for compilation errors, test failures, and introduction of new issues; a validation sketch follows the Engineering Impact note
  • Few-shot prompting with Llama 3.1 demonstrated the best overall performance
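
To make the comparison concrete, here is a minimal sketch of how the two prompting setups differ: zero-shot sends only the issue description and the affected code, while few-shot prepends worked before/after examples. The template, example content, and function names (build_zero_shot_prompt, build_few_shot_prompt) are illustrative assumptions, not the paper's actual prompts.

```python
# Hypothetical sketch of the two prompting strategies compared in the study.
# Template and example content are illustrative, not the paper's actual prompts.

FIX_REQUEST = (
    "The following code has a maintainability issue: {issue}\n"
    "Rewrite the code to fix the issue without changing its behavior.\n\n"
    "{code}"
)

# Few-shot examples as (issue, code before, code after) triples, ideally drawn
# from previously fixed issues in the same project. This sample is invented.
FEW_SHOT_EXAMPLES = [
    (
        "Method is too long and mixes parsing with validation.",
        "def load(path):\n    ...  # one 80-line method\n",
        "def load(path):\n    data = _parse(path)\n    _validate(data)\n    return data\n",
    ),
]


def build_zero_shot_prompt(issue: str, code: str) -> str:
    """Zero-shot: the bare fix request, with no worked examples."""
    return FIX_REQUEST.format(issue=issue, code=code)


def build_few_shot_prompt(issue: str, code: str) -> str:
    """Few-shot: prepend worked examples so the model sees the expected fix style."""
    shots = [
        FIX_REQUEST.format(issue=ex_issue, code=before) + "\nFixed code:\n" + after
        for ex_issue, before, after in FEW_SHOT_EXAMPLES
    ]
    return "\n\n---\n\n".join(shots + [FIX_REQUEST.format(issue=issue, code=code)])
```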

Engineering Impact: The results give development teams practical guidance on which LLM prompting approaches are most effective for code maintenance and technical debt reduction in real-world projects.
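
The three acceptance checks listed above suggest a validation loop along the following lines. This is a minimal sketch assuming a Maven-built Java project and a static analyzer that emits a JSON list of findings; the function names, commands, and report format are assumptions, not the paper's actual harness.

```python
# Hypothetical validation harness mirroring the study's three acceptance
# checks: does the patched project compile, do its tests pass, and does
# static analysis report new issues? The Maven commands and the JSON report
# format are assumptions about the setup, not the paper's actual harness.
import json
import subprocess


def run(cmd: list[str], cwd: str) -> bool:
    """Run a build/test command in the project directory; True on exit code 0."""
    return subprocess.run(cmd, cwd=cwd, capture_output=True).returncode == 0


def count_issues(report_path: str) -> int:
    """Count findings in an analyzer report, assumed to be a JSON list."""
    with open(report_path) as f:
        return len(json.load(f))


def validate_patch(project_dir: str, report_before: str, report_after: str) -> dict:
    compiles = run(["mvn", "-q", "compile"], project_dir)
    tests_pass = compiles and run(["mvn", "-q", "test"], project_dir)
    new_issues = tests_pass and count_issues(report_after) > count_issues(report_before)
    return {
        "compiles": compiles,
        "tests_pass": tests_pass,
        "introduces_new_issues": new_issues,
    }
```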

Paper: Evaluating the Effectiveness of LLMs in Fixing Maintainability Issues in Real-World Projects
