LLMs in High-Stakes Political Decision-Making

This study pioneers the assessment of Large Language Models (LLMs) for political decision-making by testing them on United Nations Security Council data.

Introduces a novel dataset of UN Security Council records for benchmarking LLMs
Evaluates LLMs' ability to process international security governance information
Explores AI limitations and potential in high-stakes political contexts
Identifies current gaps and future opportunities for LLM applications in political science

The findings bear directly on international security: they reveal both the potential and the limitations of AI systems in understanding complex geopolitical dynamics and supporting diplomatic processes.

Benchmarking LLMs for Political Science: A United Nations Perspective