
Keeping AI Code Assistants Up-to-Date
A new benchmark for evaluating how LLMs adapt to API changes
CodeUpdateArena introduces the first benchmark for evaluating whether code-generating AI models can update their knowledge when the APIs they rely on evolve.
- Tests LLMs' ability to incorporate information about updated APIs into the code they generate (see the sketch after this list)
- Measures knowledge editing effectiveness on synthetic updates to Python library functions
- Evaluates both traditional fine-tuning and newer knowledge editing techniques
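To make the task concrete, here is a minimal sketch of the kind of update-plus-problem pair such a benchmark might contain. The function `top_k`, its new `reverse` parameter, and the accompanying test are hypothetical illustrations, not items from the actual dataset.

```python
# Hypothetical API update: a library function gains a new keyword argument.

def top_k_old(scores: list[float], k: int) -> list[float]:
    """Pre-update behavior: always returns the k largest scores, descending."""
    return sorted(scores, reverse=True)[:k]

def top_k(scores: list[float], k: int, reverse: bool = True) -> list[float]:
    """Post-update behavior: reverse=False now yields the k smallest, ascending."""
    return sorted(scores, reverse=reverse)[:k]

# A synthesis problem that is only solvable with knowledge of the update:
# "Return the three smallest scores in ascending order, using top_k."
assert top_k([0.9, 0.1, 0.5, 0.7], k=3, reverse=False) == [0.1, 0.5, 0.7]
```

A model that only knows the pre-update signature would either reject the `reverse` argument or return the wrong ordering; the benchmark's question is whether fine-tuning or knowledge editing can teach it the new behavior without retraining from scratch.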
This research addresses a critical engineering challenge: as software libraries constantly change, AI code assistants must stay current with the latest API specifications to remain reliable development tools.
Paper: CodeUpdateArena: Benchmarking Knowledge Editing on API Updates