Prompt Injection and Input Manipulation Threats
Studies on how adversaries can manipulate LLM inputs through prompt injection and other techniques

Prompt Injection and Input Manipulation Threats
Research on Large Language Models in Prompt Injection and Input Manipulation Threats

SQL Injection: The Hidden Threat in LLM Applications
Uncovering security vulnerabilities in LLM-integrated web systems

Engineering AI Transparency
A Top-Down Approach to Monitoring and Controlling AI Cognition

Defending LLMs Against Prompt Injection
First benchmark for evaluating and mitigating indirect prompt injection attacks

Self-Attacking AI Vision Systems
How LLMs can generate their own deceptive content

Hidden Threats: AI Agent Secret Collusion
How LLMs can share unauthorized information through steganographic techniques

Security Vulnerabilities in AI-Powered Robotics
How input sensitivities create dangerous misalignments in LLM/VLM-controlled robots

The Instruction-Data Boundary Problem in LLMs
Addressing critical security vulnerabilities in language models

Detecting LLM Manipulation
A Novel Approach to Identify Prompt Injections Using Activation Patterns

The Hidden Threat of Prompt Manipulation
How subtle word changes can dramatically bias LLM responses

Targeted Prompt Injection Attacks Against Code LLMs
New security vulnerabilities in AI code generation tools

Preventing Prompt Theft in LLMs
Unraveling and defending against prompt extraction attacks

Defending LLM System Prompts
New techniques to protect intellectual property in AI systems

PROMPTFUZZ: Strengthening LLM Security
Advanced testing framework to combat prompt injection attacks

Securing LLMs with Instruction Hierarchy
A novel embedding approach to prevent prompt attacks

AI Hacking Agents in the Wild
Detecting LLM-powered threats with a specialized honeypot system

Balancing LLM Defense Systems
Solving the Over-Defense Problem in Prompt Injection Guards

Defending LLMs Against Prompt Injection
Novel defense techniques using attackers' own methods

Backdoor Vulnerabilities in RAG Systems
Novel data extraction attacks compromise private information

Prompt Extraction Attacks: A New Security Threat
Reconstructing LLM prompts from limited output samples

Attacking LLM Tool Systems
Security vulnerabilities in tool-calling mechanisms

Defending Against LLM Prompt Injections
PromptShield: A deployable detection system for securing AI applications

Securing Military LLMs Against Prompt Injection
Vulnerabilities and Countermeasures for Federated Defense Models

Securing Edge-Cloud LLM Systems
Joint optimization for prompt security and system performance

Probing the Dark Side of AI Assistants
Using AI investigators to uncover potential vulnerabilities in language models

The Confusion Vulnerability in LLMs
How embedded instructions can mislead AI models despite explicit guidance

Defending Against LLM Prompt Attacks
Advancing AI security through automated prompt inject detection

Dynamic Command Attacks in LLM Tool Systems
How AutoCMD exploits tool-learning vulnerabilities for information theft

Automated Attacks Against LLM Systems
Using Multi-Agent Systems to Test LLM Security

Defending Against Prompt Injection
Detecting and Removing Malicious Instructions in LLMs

Breaking Through LLM Defenses
How Adaptive Attacks Bypass Security Measures in AI Agents

Hidden Threats in Text-to-SQL Models
Uncovering backdoor vulnerabilities in language models

Defending AI Agents Against Deception
Novel in-context defense mechanisms against visual and textual manipulation

Securing LLMs Against Prompt Injections
Architectural separation of instructions and data enhances model security

Typography as a Security Threat
Uncovering vulnerabilities in AI vision systems

Securing LLM Agents Against Prompt Injections
CaMeL: A Novel Defense System for LLM-Based Applications

Security Vulnerabilities in Autonomous LLM Agents
How SUDO attacks bypass safeguards in computer-use LLMs

Breaking LLM Safety Barriers
How distributed prompt processing bypasses AI safety filters

Securing LLM Apps Against Prompt Injection
A Permission-Based Defense Using Encrypted Prompts

Uncovering Prompt Vulnerabilities in Style Transfer
A benchmark dataset for reconstructing LLM style transformation prompts

Separator Injection Attacks in LLMs
Uncovering security vulnerabilities in conversational AI

Defending LLMs Against Prompt Injection
Using Mixture of Encodings to Enhance Security

Exploiting the Blind Spots in LLM Tabular Agents
Novel evolutionary attack strategy bypasses structural safeguards
