
Why AI Code Agents Fail at GitHub Issue Resolution
Uncovering the pitfalls in LLM-based software development agents
This research systematically analyzes the failure patterns of AI-driven code agents when they attempt to resolve real-world GitHub issues. It groups the failures into three categories:
- Planning failures (42%): Agents struggle to understand issues and to create effective execution plans
- Tool interaction failures (36%): Agents have trouble using development tools and APIs correctly
- Code context understanding failures (22%): Agents have difficulty navigating and comprehending repository structures
For engineering teams, this research offers insight into the current limitations of AI coding assistants, helping them set realistic expectations and identify where human oversight remains essential in development workflows.
Unveiling Pitfalls: Understanding Why AI-driven Code Agents Fail at GitHub Issue Resolution