Cracking the Code Watermark

Why current LLM code watermarking fails under attack

This study reveals critical vulnerabilities in watermarking techniques meant to identify AI-generated code, showing that simple, semantics-preserving transformations are enough to defeat them.

  • Code watermarks are significantly less robust than text watermarks against even basic attacks
  • Simple code transformations like variable renaming and code reformatting can effectively remove watermarks (see the sketch after this list)
  • All tested watermarking techniques failed under automated attack scenarios
  • Security implications are severe for plagiarism detection and preventing malicious code generation
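
As a concrete illustration of the second point, here is a minimal sketch of a variable-renaming attack, assuming the watermarked output is Python source. The helper names (rename_variables, VariableRenamer) are illustrative, not taken from the paper.

```python
import ast
import builtins


class VariableRenamer(ast.NodeTransformer):
    """Rewrite user-defined variable names to generic ones (v0, v1, ...)."""

    def __init__(self, targets):
        # Deterministic mapping from original names to generic names.
        self.mapping = {name: f"v{i}" for i, name in enumerate(sorted(targets))}

    def visit_Name(self, node):
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node


def rename_variables(source: str) -> str:
    tree = ast.parse(source)
    # Pass 1: collect every name that is assigned to (Store context),
    # skipping builtins so calls like print() keep working.
    stores = {
        n.id
        for n in ast.walk(tree)
        if isinstance(n, ast.Name)
        and isinstance(n.ctx, ast.Store)
        and not hasattr(builtins, n.id)
    }
    # Pass 2: rewrite all uses of those names consistently.
    tree = VariableRenamer(stores).visit(tree)
    return ast.unparse(tree)


watermarked = """
total = 0
for item in values:
    total = total + item
print(total)
"""
print(rename_variables(watermarked))
```

Because ast.unparse re-serializes the syntax tree, this attack also normalizes whitespace and layout as a side effect, so a single pass combines the renaming and reformatting transformations listed above.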

This research highlights an urgent security challenge as LLMs increasingly contribute to software development, demonstrating that current watermarking methods provide a false sense of security for detecting AI-generated code.

Is The Watermarking Of LLM-Generated Code Robust?
