
Bridging Code and Documentation
A new benchmark dataset for AI-powered software maintenance
CoDocBench provides a dataset for training AI systems to align code modifications with documentation changes, a crucial aspect of software maintenance.
Key Contributions:
- Dataset built from real GitHub projects for authentic software engineering tasks
- Enables training of AI agents to both modify code based on documentation changes and update documentation based on code changes
- Addresses a critical gap in software maintenance workflows where documentation and code often become misaligned
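To make the misalignment problem concrete, the sketch below classifies the change between two versions of a Python function as coupled (code and docstring both changed), code-only, doc-only, or unchanged. It is an illustration under stated assumptions, not CoDocBench's actual extraction pipeline or data schema: the helper names `split_function` and `classify_change` and the sample `area` function are hypothetical, and only Python's standard `ast` module is used.

```python
import ast


def split_function(source: str, name: str) -> tuple[str, str]:
    """Return (docstring, code fingerprint) for the named function.

    Hypothetical helper for illustration; not part of CoDocBench.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            doc = ast.get_docstring(node) or ""
            body = node.body
            # Drop the leading docstring expression so that only
            # executable statements are compared.
            if (
                body
                and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)
            ):
                body = body[1:]
            # ast.dump omits line numbers by default, so the fingerprint
            # does not change when the function merely moves in the file.
            parts = [ast.dump(node.args)] + [ast.dump(stmt) for stmt in body]
            return doc, "\n".join(parts)
    raise ValueError(f"function {name!r} not found")


def classify_change(old_src: str, new_src: str, name: str) -> str:
    """Label how a function changed between two source versions."""
    old_doc, old_code = split_function(old_src, name)
    new_doc, new_code = split_function(new_src, name)
    doc_changed = old_doc != new_doc
    code_changed = old_code != new_code
    if doc_changed and code_changed:
        return "coupled"
    if code_changed:
        return "code_only"  # the docstring may now be stale
    if doc_changed:
        return "doc_only"
    return "unchanged"


OLD = '''
def area(r):
    """Return the area of a circle."""
    return 3.14 * r * r
'''

NEW_COUPLED = '''
import math

def area(r):
    """Return the area of a circle, using math.pi."""
    return math.pi * r * r
'''

NEW_CODE_ONLY = '''
def area(r):
    """Return the area of a circle."""
    return 2 * 3.14 * r
'''

coupled_label = classify_change(OLD, NEW_COUPLED, "area")
code_only_label = classify_change(OLD, NEW_CODE_ONLY, "area")
```

In this sketch, the "code_only" case is the one that signals possible drift: the implementation changed while the docstring stayed the same. Coupled changes are the kind of paired example that a code-documentation alignment dataset is built around.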
Business Impact: This research gives engineering teams a foundation for tools that keep documentation consistent with code, which can reduce technical debt and improve developer productivity in large codebases.
CoDocBench: A Dataset for Code-Documentation Alignment in Software Maintenance