
Beyond Single Functions: Translating Entire Code Repositories
A new framework for evaluating large language models in code migration projects
This research introduces Skeleton-Guided-Translation, presented as the first framework for evaluating LLM performance on repository-level code translation with fine-grained quality metrics.
- Addresses a critical gap in code translation evaluation by focusing on entire repositories rather than isolated functions
- Introduces a novel code skeleton extraction approach that preserves structural dependencies while translating between languages
- Provides fine-grained quality metrics across syntax, semantics, structure, and functionality dimensions
- Establishes a new Java-to-C# benchmark with extensive evaluation of GPT-4 and other leading LLMs
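
To make the skeleton idea from the list above concrete, here is a minimal Python sketch. It is an illustration, not the framework's actual extraction method. The function name `extract_skeleton` and the placeholder text are hypothetical. It assumes a single top-level Java type with no nested classes, no brace-delimited field initializers, and no braces inside strings or comments. A real tool would use a proper Java parser instead of counting braces.

```python
def extract_skeleton(java_source: str, placeholder: str = "/* body omitted */") -> str:
    """Keep the type declaration and member signatures; drop method bodies.

    Simplifying assumption (hypothetical, for illustration only): every
    brace block opened directly inside the top-level type is a method body.
    """
    out = []
    depth = 0  # current brace nesting level
    for ch in java_source:
        if ch == "{":
            depth += 1
            if depth == 1:
                out.append(ch)  # opening brace of the type itself
            elif depth == 2:
                out.append("{ " + placeholder + " }")  # replace the whole body
            # deeper braces belong to a body that is being skipped
        elif ch == "}":
            if depth == 1:
                out.append(ch)  # closing brace of the type itself
            depth -= 1
        elif depth < 2:
            out.append(ch)  # text outside method bodies is kept verbatim
    return "".join(out)


# Usage: the signature of `total` survives, its implementation does not.
sample = """class Cart {
    private int count;
    int total(int price) { if (price < 0) { return 0; } return price * count; }
}"""
print(extract_skeleton(sample))
```

The resulting skeleton keeps the declarations that other files depend on, so each method body could in principle be translated separately and slotted back into a structure that stays consistent across the repository.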
For engineering teams modernizing legacy systems, this framework offers a way to assess how well LLMs handle complex, multi-file code translation projects while maintaining structural integrity.