Language-Specific Safety and Security Evaluation

Research focused on evaluating and enhancing LLM safety across different languages and cultural contexts, addressing language-specific security challenges


Research on Large Language Models in Language-Specific Safety and Security Evaluation

Evaluating LLM Safety in Chinese Contexts

First comprehensive Chinese safety benchmark for LLMs

Lost in Translation: Safety Gaps in Multilingual LLMs

How LLM safety measures deteriorate across languages

Multilingual Cyber Threat Detection in Social Media

Comparing ML, DL, and LLM approaches for cross-language security monitoring

DuoGuard: Advancing Multilingual LLM Safety

A Reinforcement Learning Approach to Multilingual Safety Guardrails

Multilingual Safety for AI Assistants

Precision-targeting language-specific vulnerabilities in LLMs

Making LLMs Safe in All Languages

Novel safety alignment for low-resource languages like Singlish

Cross-Cultural LLM Safety Evaluation

Assessing AI risks in Kazakh-Russian bilingual contexts

Hidden Dangers in Multilingual AI

How backdoor attacks can spread across languages in LLMs

Cultural AI Safety: Beyond Words

Evaluating AI sensitivity to offensive non-verbal gestures across cultures

Exposing the Vulnerabilities of Chinese LLMs

JailBench: A novel security testing framework for Chinese language models

Language Evolution Under Content Moderation

How LLMs and genetic algorithms simulate user adaptation to platform regulations

Uncovering Stereotype Biases in Japanese LLMs

Novel evaluation of bias through direct prompt responses

Beyond Borders: #StopAsianHate as a Global Movement

How multilingualism and K-pop influenced transnational activism

Cross-Lingual Fact-Checking with LLMs

Detecting previously fact-checked claims across languages

Security Gaps in Multilingual LLMs

Detecting vulnerabilities in low-resource languages

Safety Across Languages: The Hidden Gap in LLM Alignment

How safety mechanisms transfer (or fail to transfer) across languages

PolyGuard: Breaking Language Barriers in AI Safety

Expanding safety moderation to 17 languages beyond the usual English-centric focus

Aligning AI with Persian Culture

First comprehensive benchmark for Persian LLM safety and ethics

Key Takeaways

A summary of the research above on language-specific safety and security evaluation