LLMs as Bug Replicators

How language models perpetuate coding errors during code completion

This study reveals that Large Language Models frequently reproduce bugs when completing code in bug-prone contexts, raising significant concerns for software development and security.

  • LLMs show a strong tendency to replicate bugs that appear in their training data (see the sketch after this list)
  • The study evaluated seven language models on bug-prone code completion tasks
  • Models perform significantly worse on bug-prone code compared to non-buggy contexts
  • Different model architectures and sizes demonstrated varying susceptibility to reproducing bugs

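To make the replication failure mode concrete, here is a minimal hypothetical sketch (the function and the buggy variant are invented for illustration, not drawn from the paper's benchmark): when a prompt ends just before a conditional that is frequently written incorrectly in public code, a completion model is statistically drawn toward the common-but-wrong form.

```python
# Hypothetical sketch of a bug-prone completion context; the function
# and the defect are invented for illustration, not taken from the
# study's benchmark.

def days_in_february(year: int) -> int:
    """Return 28 or 29 depending on whether `year` is a leap year."""
    # Suppose the prompt ends just before the condition below. A model
    # that has seen many simplified (buggy) leap-year checks may complete
    # it as `if year % 4 == 0:`, which misses the century rule, instead
    # of the correct Gregorian test:
    if year % 4 == 0 and (year % 100 != 0 or year % 400 == 0):
        return 29
    return 28
```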
For engineering teams, this underscores the need for robust testing and validation when using AI code assistants in production, since automated completions can silently reintroduce known defects and security vulnerabilities; a minimal validation gate is sketched below.
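One simple guardrail is to gate model-generated changes behind the project's existing test suite before accepting them. The sketch below assumes a Python project tested with pytest; the repository path is a placeholder, and the gate is an illustrative pattern, not a method described in the paper.

```python
# Minimal sketch of a pre-merge check for model-generated code:
# accept a completion only if the repository's pytest suite still
# passes. The repo path below is a hypothetical placeholder.
import subprocess
import sys

def tests_pass(repo_dir: str) -> bool:
    """Return True iff the pytest suite in repo_dir passes."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0

# Usage: reject the suggested edit if it breaks the suite.
# if not tests_pass("/path/to/repo"):
#     print("Rejecting completion: regression detected")
```

Static analysis or linting can be added to the same gate; the point is that no AI-suggested change lands without an automated check.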

LLMs are Bug Replicators: An Empirical Study on LLMs' Capability in Completing Bug-prone Code
