
Invisible Fingerprints: Black-Box Watermarking for LLMs
Detecting AI-generated text without access to model internals
This research presents a black-box watermarking technique that makes AI-generated text detectable without requiring access to the model's internal probability distributions.
- Creates distortion-free watermarks, which leave the model's output distribution unchanged, by manipulating the sampling process
- Enables nested watermarking where multiple watermarks can be applied sequentially
- Achieves strong statistical detection while maintaining text quality
- Works with any API-based LLM access where only text outputs are available
This research supports content authentication: it gives organizations tools to verify text provenance and detect AI-generated content in practical deployments where model internals are inaccessible.
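
To make the idea of watermarking through sampling alone more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm: the selection rule (draw several candidate chunks and keep the one with the highest keyed hash score), the z-score detector, and the names `toy_sampler`, `keyed_score`, `generate`, and `detect` are all hypothetical, and `toy_sampler` is a stand-in for a real text-only LLM API. The sketch makes no attempt to be distortion-free or to support nested watermarks.

```python
import hashlib
import hmac
import math
import random

# Toy vocabulary for the stand-in sampler (assumption, not from the source).
VOCAB = [f"w{i}" for i in range(50)]


def toy_sampler(rng: random.Random) -> str:
    """Stand-in for a black-box LLM call: returns a random 3-word chunk."""
    return " ".join(rng.choice(VOCAB) for _ in range(3))


def keyed_score(key: bytes, chunk: str) -> float:
    """Map a text chunk to a pseudo-random number in [0, 1) with HMAC-SHA256."""
    digest = hmac.new(key, chunk.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def generate(key: bytes, rng: random.Random, n_chunks: int, n_candidates: int) -> list[str]:
    """Build text chunk by chunk, keeping the candidate with the highest keyed score.

    Only sampled text is used; no token probabilities are needed.
    With n_candidates=1 there is no selection, so the text is unwatermarked.
    """
    chunks = []
    for _ in range(n_chunks):
        candidates = [toy_sampler(rng) for _ in range(n_candidates)]
        chunks.append(max(candidates, key=lambda c: keyed_score(key, c)))
    return chunks


def detect(key: bytes, chunks: list[str]) -> float:
    """Z-score of the mean keyed score against the uniform null (mean 1/2, variance 1/12)."""
    n = len(chunks)
    mean = sum(keyed_score(key, c) for c in chunks) / n
    return (mean - 0.5) * math.sqrt(12 * n)


key = b"secret-watermark-key"  # hypothetical secret shared by generator and detector
rng = random.Random(0)
marked = generate(key, rng, n_chunks=200, n_candidates=4)
plain = generate(key, rng, n_chunks=200, n_candidates=1)
print("watermarked z-score:", detect(key, marked))
print("unwatermarked z-score:", detect(key, plain))
```

In this sketch, the kept chunks have keyed scores skewed toward 1, so watermarked text should produce a large positive z-score, while text generated without selection, or scored with the wrong key, should stay near zero. Detection needs only the text and the secret key, which mirrors the black-box setting described above.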