Securing the Future of AI

Exploring how Large Language Models are transforming cybersecurity, privacy protection, and defense strategies

Jailbreaking Attacks and Defense Mechanisms

Research exploring vulnerabilities in LLMs through jailbreaking attacks and developing effective defense strategies

Prompt Injection and Input Manipulation Threats

Studies on how adversaries can manipulate LLM inputs through prompt injection and other techniques

Privacy-Preserving Techniques for LLMs

Research on maintaining data privacy while utilizing LLMs through differential privacy and other methods

Detecting and Mitigating Harmful Content

Research on identifying and preventing harmful or malicious content generated by or input to LLMs

Trust, Reliability, and Hallucination Mitigation

Research on addressing hallucinations, improving trustworthiness, and ensuring reliable outputs from LLMs

Domain-Specific Security Applications

Research applying LLMs to security challenges in specific domains like code analysis, finance, and threat intelligence

Adversarial Robustness and Attack Vectors

Research on improving LLM resilience against various attack vectors and understanding vulnerabilities

Watermarking and Attribution for LLM Content

Research on embedding identifiable markers in LLM outputs for content attribution, detection of AI-generated content, and resistance to watermarking attacks

Ethical Alignment, Fairness, and Value Assessment

Research on improving the ethical alignment of LLMs, reducing bias, and ensuring fairness across different user groups and applications

Security in Multimodal LLMs and Vision-Language Models

Research on security challenges specific to multimodal LLMs and vision-language models, including cross-modal safety alignment

LLM Governance and Collective Decision Making

Research on using LLMs in governance contexts, voting systems, and collective decision-making processes while ensuring security and fairness

Machine Unlearning for LLMs

Research on methods to make LLMs forget specific knowledge or information, enhancing privacy and security and addressing copyright concerns

Side-Channel Attacks in LLM Infrastructure

Research on vulnerabilities in LLM serving systems that can be exploited through timing attacks and other side-channel techniques to extract sensitive information

Language-Specific Safety and Security Evaluation

Research focused on evaluating and enhancing LLM safety across different languages and cultural contexts, addressing language-specific security challenges

Data Contamination and Leakage Detection

Research on identifying, preventing, and mitigating data contamination and leakage in training and evaluation of large language models

Safety in Long-Context LLMs

Research on safety challenges and alignment techniques specific to long-context large language models

Decentralized AI Security and Blockchain Integration

Research on enhancing AI security through decentralization and blockchain technologies to address single points of failure and improve data privacy

AI-Generated Content Detection

Research on distinguishing AI-generated content from human-created content for security, integrity, and authenticity verification

Security in Retrieval-Augmented Generation

Research on security vulnerabilities, attack vectors, and defensive mechanisms specific to retrieval-augmented generation (RAG) systems that integrate external knowledge with LLMs

Model Provenance and Attribution

Research on identifying model origins, verifying model lineage, and ensuring proper attribution of foundation models and their derivatives

Security in Multi-Agent LLM Systems

Research on security challenges and safety issues in systems where multiple LLM agents interact, including evolutionary frameworks and social simulations

Security of Synthetic Data Generation

Research on security implications, auditing, and tracing of synthetic data generated by LLMs for downstream applications

Security of LLM Activation Functions and Architecture

Research on how architectural components like activation functions affect safety and security properties of LLMs

Security Implications of Model Editing

Research on security risks and vulnerabilities introduced by editing or modifying LLMs post-training, including knowledge editing techniques and their potential misuse

Model Tampering Attacks and Detection

Research on understanding, performing, and defending against targeted modifications to LLM weights and behavior through model tampering

Anomaly Detection with LLMs

Research on using LLMs for zero-shot or few-shot detection of anomalies, outliers, and unusual patterns across various domains

Safety Engineering for ML-Powered Systems

Research on proactive approaches and methodologies to identify, evaluate, and mitigate safety risks in ML-powered systems through systematic safety engineering practices

Security in Federated Learning for LLMs

Research on security challenges, attack vectors, and defensive mechanisms in federated learning environments for large language models

Security for Embodied AI Systems

Research on security vulnerabilities, attacks, and defenses for embodied AI systems including robots and autonomous vehicles

Community-Based Oversight and Fact-Checking

Research on collaborative human oversight mechanisms for LLMs, including community moderation, fact-checking systems, and distributed content verification

Interpretability for LLM Security

Research on understanding and explaining LLM internal states and mechanisms to improve security, detect vulnerabilities, and enable safer steering of model behavior

Security in Small Language Models

Research on security vulnerabilities, attacks, and defenses specific to small language models (SLMs) deployed on edge devices or with limited computational resources

Memory Manipulation and Injection Attacks

Research on vulnerabilities related to LLM agent memory systems, including injection attacks and defenses for memory banks in conversational AI

Misinformation Detection and Countermeasures

Research on using LLMs to detect, evaluate, and counter misinformation, including demographic factors in misinformation susceptibility

Persuasion Evaluation and Resistance

Research on evaluating persuasion capabilities and susceptibility of LLMs, including frameworks for measuring resistance to persuasion

Risk Assessment for LLMs

Research on methods for assessing, quantifying, and mitigating risks posed by large language models across various domains and applications

Access Control and Authentication for LLMs

Research on securing access to LLM resources through authentication mechanisms, identity verification, and permission management to prevent unauthorized use

Social Engineering Detection and Mitigation

Research on detecting, simulating, and mitigating social engineering attacks leveraging LLMs, including personalized protection systems

Code Security and Vulnerability Analysis with LLMs

Research on identifying and mitigating security issues in LLM-generated code, including API misuse, software defects, and weaknesses in code robustness

Security for Autonomous Systems and Vehicles

Research on security challenges, vulnerability discovery, and safety enhancement for autonomous systems and vehicles powered by LLMs

Intrusion and Anomaly Detection with LLMs

Research on using LLMs for detecting intrusions, anomalies, and malicious traffic in networks and computing systems

LLMs for Sociopolitical Analysis and Governance

Research on using LLMs to analyze political systems, assess regime characteristics, and measure democratic quality, with attention to the associated security implications

Tool Manipulation and Selection Security

Research on security vulnerabilities and attacks related to tool selection and manipulation in LLM agent systems that use external tools

Fact-Checking and Content Verification with LLMs

Research on using LLMs for fact-checking, content verification, and distinguishing between factual and fabricated information in various contexts