LLMs: Breaking the Language Barrier in Code Clone Detection

This research explores how Large Language Models (LLMs) can improve cross-language code clone detection, a critical challenge in modern software development.

- Demonstrates LLMs' effectiveness in identifying functional similarities across different programming languages
- Evaluates multiple LLM approaches against traditional techniques using diverse benchmarks
- Shows significant improvements in detecting semantic code clones without language-specific engineering
- Offers practical implementation strategies for integrating LLMs into existing development workflows
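
As a rough illustration of what integrating an LLM into a workflow could look like, the sketch below builds a yes/no prompt for a pair of snippets in different languages and parses the reply. It is a minimal sketch under stated assumptions, not the method evaluated in the research: the prompt wording, the function names (build_prompt, parse_verdict, is_cross_language_clone), and the ask_model callable are all hypothetical, and ask_model stands in for whatever LLM client a team already uses.

```python
from typing import Callable

# Hypothetical prompt wording; the research may use different prompts.
PROMPT_TEMPLATE = (
    "Do the following two code snippets implement the same functionality? "
    "Answer with 'yes' or 'no' only.\n\n"
    "Snippet 1 ({lang_a}):\n{code_a}\n\n"
    "Snippet 2 ({lang_b}):\n{code_b}\n"
)


def build_prompt(code_a: str, lang_a: str, code_b: str, lang_b: str) -> str:
    """Fill the template with both snippets and their language names."""
    return PROMPT_TEMPLATE.format(
        lang_a=lang_a, code_a=code_a, lang_b=lang_b, code_b=code_b
    )


def parse_verdict(reply: str) -> bool:
    """Treat any reply that starts with 'yes' (case-insensitive) as a clone."""
    return reply.strip().lower().startswith("yes")


def is_cross_language_clone(
    code_a: str,
    lang_a: str,
    code_b: str,
    lang_b: str,
    ask_model: Callable[[str], str],  # assumed: takes a prompt, returns text
) -> bool:
    """Ask the supplied model whether the two snippets are functional clones."""
    return parse_verdict(ask_model(build_prompt(code_a, lang_a, code_b, lang_b)))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM provider, so it can be exercised with a stub in tests before being wired to a real client.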

For engineering teams working with multiple programming languages, this research provides a scalable approach to reducing code duplication, improving maintainability, and preventing the propagation of security vulnerabilities across codebases.

Large Language Models for cross-language code clone detection