Prompt Injection and Input Manipulation Threats

Studies on how adversaries can manipulate LLM inputs through prompt injection and other techniques

Research on Prompt Injection and Input Manipulation Threats in Large Language Models

SQL Injection: The Hidden Threat in LLM Applications

Uncovering security vulnerabilities in LLM-integrated web systems

Engineering AI Transparency

A Top-Down Approach to Monitoring and Controlling AI Cognition

Defending LLMs Against Prompt Injection

First benchmark for evaluating and mitigating indirect prompt injection attacks

Self-Attacking AI Vision Systems

How LLMs can generate their own deceptive content

Hidden Threats: AI Agent Secret Collusion

How LLMs can share unauthorized information through steganographic techniques

Security Vulnerabilities in AI-Powered Robotics

How input sensitivities create dangerous misalignments in LLM/VLM-controlled robots

The Instruction-Data Boundary Problem in LLMs

Addressing critical security vulnerabilities in language models

Detecting LLM Manipulation

A Novel Approach to Identifying Prompt Injections Using Activation Patterns

The Hidden Threat of Prompt Manipulation

How subtle word changes can dramatically bias LLM responses

Targeted Prompt Injection Attacks Against Code LLMs

New security vulnerabilities in AI code generation tools

Preventing Prompt Theft in LLMs

Unraveling and defending against prompt extraction attacks

Defending LLM System Prompts

New techniques to protect intellectual property in AI systems

PROMPTFUZZ: Strengthening LLM Security

Advanced testing framework to combat prompt injection attacks

Securing LLMs with Instruction Hierarchy

A novel embedding approach to prevent prompt attacks

AI Hacking Agents in the Wild

Detecting LLM-powered threats with a specialized honeypot system

Balancing LLM Defense Systems

Solving the Over-Defense Problem in Prompt Injection Guards

Defending LLMs Against Prompt Injection

Novel defense techniques using attackers' own methods

Backdoor Vulnerabilities in RAG Systems

Novel data extraction attacks compromise private information

Prompt Extraction Attacks: A New Security Threat

Reconstructing LLM prompts from limited output samples

Attacking LLM Tool Systems

Security vulnerabilities in tool-calling mechanisms

Defending Against LLM Prompt Injections

PromptShield: A deployable detection system for securing AI applications

Securing Military LLMs Against Prompt Injection

Vulnerabilities and Countermeasures for Federated Defense Models

Securing Edge-Cloud LLM Systems

Joint optimization of prompt security and system performance

Probing the Dark Side of AI Assistants

Using AI investigators to uncover potential vulnerabilities in language models

The Confusion Vulnerability in LLMs

How embedded instructions can mislead AI models despite explicit guidance

Defending Against LLM Prompt Attacks

Advancing AI security through automated prompt injection detection

Dynamic Command Attacks in LLM Tool Systems

How AutoCMD exploits tool-learning vulnerabilities for information theft

Automated Attacks Against LLM Systems

Using Multi-Agent Systems to Test LLM Security

Defending Against Prompt Injection

Detecting and Removing Malicious Instructions in LLMs

Breaking Through LLM Defenses

How Adaptive Attacks Bypass Security Measures in AI Agents

Hidden Threats in Text-to-SQL Models

Uncovering backdoor vulnerabilities in language models

Defending AI Agents Against Deception

Novel in-context defense mechanisms against visual and textual manipulation

Securing LLMs Against Prompt Injections

Architectural separation of instructions and data enhances model security

Typography as a Security Threat

Uncovering vulnerabilities in AI vision systems

Securing LLM Agents Against Prompt Injections

CaMeL: A Novel Defense System for LLM-Based Applications

Security Vulnerabilities in Autonomous LLM Agents

How SUDO attacks bypass safeguards in computer-use LLMs

Breaking LLM Safety Barriers

How distributed prompt processing bypasses AI safety filters

Securing LLM Apps Against Prompt Injection

A Permission-Based Defense Using Encrypted Prompts

Uncovering Prompt Vulnerabilities in Style Transfer

A benchmark dataset for reconstructing LLM style transformation prompts

Separator Injection Attacks in LLMs

Uncovering security vulnerabilities in conversational AI

Defending LLMs Against Prompt Injection

Using Mixture of Encodings to Enhance Security

Exploiting the Blind Spots in LLM Tabular Agents

Novel evolutionary attack strategy bypasses structural safeguards

Hidden Dangers in GUI Agents

How 'Fine-Print Injections' Threaten LLM-Powered Interfaces

Key Takeaways

A summary of research on prompt injection and input manipulation threats