
Security Vulnerabilities in LLM Caching
How timing variations can reveal sensitive user data
This research shows how prompt caching in LLM APIs introduces a timing side channel that can expose user data.
- Cached prompts are processed faster than uncached ones, producing observable timing differences
- Attackers can exploit these timing differences to detect what other users are querying
- Most commercial API providers (including OpenAI, Anthropic, Google) employ some form of caching
- The researchers developed timing-based methods to detect caching behavior across providers (see the sketch after this list)
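
The core idea can be illustrated with a short timing audit: send a prompt with a fresh random prefix (a guaranteed cache miss), immediately resend the identical prompt (a potential cache hit), and compare the latency distributions. The sketch below is a minimal illustration of that approach, not the authors' code; the `send_prompt` helper, the random-prefix scheme, and the trial count are all assumptions, and the real API call must be plugged in for the measurements to mean anything.

```python
"""Sketch: probe for prompt caching by comparing response latencies."""
import random
import statistics
import string
import time


def send_prompt(prompt: str) -> float:
    """Hypothetical helper: issue one API request and return elapsed seconds.

    Replace the commented line with the real client call for the provider
    under test; as written, this only times an empty round trip.
    """
    start = time.perf_counter()
    # response = client.completions.create(prompt=prompt, max_tokens=1)
    return time.perf_counter() - start


def random_prefix(n_chars: int = 4096) -> str:
    """Long random prefix so each first request is unique and uncacheable."""
    return "".join(random.choices(string.ascii_letters, k=n_chars))


def measure(trials: int = 25) -> tuple[list[float], list[float]]:
    """Collect latencies for first sends (misses) and immediate repeats (hits)."""
    misses, hits = [], []
    for _ in range(trials):
        prompt = random_prefix() + " Summarize this text."
        misses.append(send_prompt(prompt))  # first request: cache miss
        hits.append(send_prompt(prompt))    # identical repeat: potential cache hit
    return misses, hits


if __name__ == "__main__":
    misses, hits = measure()
    print(f"median miss latency: {statistics.median(misses):.3f}s")
    print(f"median hit latency:  {statistics.median(hits):.3f}s")
    # A consistently lower repeat latency suggests the provider caches prompts.
```

In practice, an auditor would run many trials and apply a statistical test to the two latency samples rather than eyeballing medians, but the miss-versus-hit comparison above is the essence of the technique.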
These findings have serious privacy implications for businesses using shared LLM services: sensitive prompts can leak unless caches are properly managed and isolated between users, as illustrated below.
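
One simple form of isolation is scoping cache entries to a tenant, so that one user's cached prompt can never produce an observable speedup for another. This is a minimal sketch of that idea, not any provider's actual scheme; the `tenant_id` parameter and the keying format are illustrative assumptions.

```python
import hashlib


def cache_key(prompt: str, tenant_id: str) -> str:
    """Illustrative mitigation: bind the cache key to the tenant so identical
    prompts from different tenants never share a cache entry."""
    return hashlib.sha256(f"{tenant_id}:{prompt}".encode()).hexdigest()
```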