
LLMs: Breaking the Language Barrier in Code Clone Detection
Leveraging Large Language Models to identify similar code across programming languages
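This research explores how Large Language Models (LLMs) can improve cross-language code clone detection, the task of finding code fragments that implement the same functionality in different programming languages, which remains a hard problem in modern software development.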
- Demonstrates LLMs' effectiveness in identifying functional similarities across different programming languages
- Evaluates multiple LLM approaches against traditional techniques using diverse benchmarks
- Shows significant improvements in detecting semantic code clones without language-specific engineering
- Offers practical implementation strategies for integrating LLMs into existing development workflows
For engineering teams working with multiple programming languages, this research provides a scalable approach to reduce code duplication, improve maintenance, and prevent the propagation of security vulnerabilities across codebases.
Large Language Models for cross-language code clone detection