Bridging Code and Documentation

A new benchmark dataset for AI-powered software maintenance

CoDocBench is a dataset for training AI systems to align code modifications with the corresponding documentation changes, a crucial aspect of software maintenance.

Key Contributions:

  • Dataset built from real GitHub projects for authentic software engineering tasks
  • Enables training of AI agents to both modify code based on documentation changes and update documentation based on code changes
  • Addresses a critical gap in software maintenance workflows where documentation and code often become misaligned
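
To make the two task directions concrete, here is a minimal sketch of how one coupled change could be framed as a training example in each direction. The `CoupledChange` record, its field names, and both helper functions are hypothetical and chosen only for illustration; they are not the dataset's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoupledChange:
    """Hypothetical record: one function before and after a change
    that touched both its code and its documentation."""
    old_code: str
    old_doc: str
    new_code: str
    new_doc: str


def code_to_doc_task(change: CoupledChange):
    """Given the old pair and the new code, the target is the new documentation."""
    inputs = {
        "old_code": change.old_code,
        "old_doc": change.old_doc,
        "new_code": change.new_code,
    }
    return inputs, change.new_doc


def doc_to_code_task(change: CoupledChange):
    """Given the old pair and the new documentation, the target is the new code."""
    inputs = {
        "old_code": change.old_code,
        "old_doc": change.old_doc,
        "new_doc": change.new_doc,
    }
    return inputs, change.new_code


# Illustrative (made-up) example of a coupled change.
example = CoupledChange(
    old_code="def area(r):\n    return 3.14 * r * r",
    old_doc="Return the area of a circle of radius r.",
    new_code="def area(r, pi=3.14):\n    return pi * r * r",
    new_doc="Return the area of a circle of radius r, using the given value of pi.",
)
```

In each direction the model sees the old code, the old documentation, and one updated half, and must produce the other half; the target is never included in the inputs.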

Business Impact: This research supports engineering teams in keeping documentation consistent with code, which can reduce technical debt and improve developer productivity in large codebases.

CoDocBench: A Dataset for Code-Documentation Alignment in Software Maintenance
