Targeted Prompt Injection Attacks Against Code LLMs

This research introduces TPIA (Target-specific Prompt Injection Attack), a novel attack paradigm that can manipulate Code LLMs to generate malicious code with specific harmful functionality.

Successfully demonstrated on major models including ChatGPT, Claude, and Gemini
Achieves high attack success rates while maintaining code functionality
Functions through carefully crafted prompts that bypass security measures
Exposes significant security vulnerabilities in widely-used code generation tools

Implications for Security: Organizations using AI-powered coding assistants need to implement enhanced security protocols and monitoring systems, as these attacks can be executed without access to model parameters or training data.

TPIA: Towards Target-specific Prompt Injection Attack against Code-oriented Large Language Models