LLMs: Breaking the Language Barrier in Code Clone Detection

Leveraging Large Language Models to identify similar code across programming languages

This research explores how Large Language Models (LLMs) can be applied to cross-language code clone detection, the task of identifying functionally similar code written in different programming languages and a critical challenge in modern software development.

  • Demonstrates LLMs' effectiveness in identifying functional similarities across different programming languages
  • Evaluates multiple LLM approaches against traditional techniques using diverse benchmarks
  • Shows significant improvements in detecting semantic code clones without language-specific engineering
  • Offers practical implementation strategies for integrating LLMs into existing development workflows
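
As a rough illustration of how an LLM might be asked to judge a cross-language pair, the sketch below builds a yes/no prompt and parses the reply. It is a minimal sketch, not the method evaluated in the research: the prompt wording, the function names, and the `ask_llm` callable (a stand-in for whatever model client a team uses) are all assumptions made for this example.

```python
from typing import Callable

# Hypothetical prompt wording; the research may use different prompts.
PROMPT_TEMPLATE = (
    "You are given two code snippets written in different programming languages.\n"
    "Snippet A ({lang_a}):\n{code_a}\n\n"
    "Snippet B ({lang_b}):\n{code_b}\n\n"
    "Do they implement the same functionality? Answer with 'yes' or 'no'."
)


def build_prompt(code_a: str, lang_a: str, code_b: str, lang_b: str) -> str:
    """Fill the template with both snippets and their language names."""
    return PROMPT_TEMPLATE.format(
        lang_a=lang_a, code_a=code_a, lang_b=lang_b, code_b=code_b
    )


def parse_verdict(reply: str) -> bool:
    """Treat a reply whose first word is 'yes' as a clone verdict."""
    words = reply.strip().lower().split()
    return bool(words) and words[0].strip(".,!:;'\"") == "yes"


def is_cross_language_clone(
    code_a: str,
    lang_a: str,
    code_b: str,
    lang_b: str,
    ask_llm: Callable[[str], str],  # assumed: takes a prompt, returns model text
) -> bool:
    """Ask the supplied model callable whether the two snippets are clones."""
    return parse_verdict(ask_llm(build_prompt(code_a, lang_a, code_b, lang_b)))
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider, so the same check could be wired into a code review bot or a CI step with whichever client is already in use.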

For engineering teams working with multiple programming languages, this research provides a scalable approach to reduce code duplication, improve maintenance, and prevent the propagation of security vulnerabilities across codebases.

Large Language Models for cross-language code clone detection
