Why AI Code Agents Fail at GitHub Issue Resolution

Uncovering the pitfalls in LLM-based software development agents

This research systematically analyzes the failure patterns of AI-driven code agents when attempting to resolve real-world GitHub issues.

  • Planning failures (42%): Agents misunderstand the issue or fail to create an effective execution plan
  • Tool interaction failures (36%): Agents struggle to use development tools and APIs correctly
  • Code context understanding failures (22%): Agents have difficulty navigating and comprehending repository structures

For engineering teams, this research clarifies the current limitations of AI coding assistants, helping them set realistic expectations and identify where human oversight remains essential in development workflows.

Unveiling Pitfalls: Understanding Why AI-driven Code Agents Fail at GitHub Issue Resolution
