Cracking the Code Watermark

Why current LLM code watermarking fails under attack

This study reveals critical vulnerabilities in watermarking techniques meant to identify AI-generated code, showing that simple, semantics-preserving transformations are enough to defeat them.

  • Code watermarks are significantly less robust than text watermarks against even basic attacks
  • Simple code transformations like variable renaming and code reformatting can effectively remove watermarks (see the sketch after this list)
  • All tested watermarking techniques failed under automated attack scenarios
  • Security implications are severe for plagiarism detection and preventing malicious code generation
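
As a concrete illustration of the second point, here is a minimal sketch of a variable-renaming attack, assuming the watermarked output is Python source. The helper names (rename_variables, VariableRenamer) are illustrative, not taken from the paper.

```python
import ast
import builtins


class VariableRenamer(ast.NodeTransformer):
    """Rewrite user-defined variable names to generic ones (v0, v1, ...)."""

    def __init__(self, targets):
        # Deterministic mapping from original names to generic names.
        self.mapping = {name: f"v{i}" for i, name in enumerate(sorted(targets))}

    def visit_Name(self, node):
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node


def rename_variables(source: str) -> str:
    tree = ast.parse(source)
    # Pass 1: collect every name that is assigned to (Store context),
    # skipping builtins so calls like print() keep working.
    stores = {
        n.id
        for n in ast.walk(tree)
        if isinstance(n, ast.Name)
        and isinstance(n.ctx, ast.Store)
        and not hasattr(builtins, n.id)
    }
    # Pass 2: rewrite all uses of those names consistently.
    tree = VariableRenamer(stores).visit(tree)
    return ast.unparse(tree)


watermarked = """
total = 0
for item in values:
    total = total + item
print(total)
"""
print(rename_variables(watermarked))
```

Because ast.unparse re-serializes the syntax tree, this attack also normalizes whitespace and layout as a side effect, so a single pass combines the renaming and reformatting transformations listed above.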

This research highlights an urgent security challenge as LLMs increasingly contribute to software development, demonstrating that current watermarking methods provide a false sense of security for detecting AI-generated code.

Is The Watermarking Of LLM-Generated Code Robust?
