Measuring Module Impact in LLM Agents

A Game-Theory Approach to Identify MVP Components

CapaBench is an evaluation framework that quantifies how much each module contributes to overall performance in a modular LLM agent architecture, so that optimization effort can be aimed at the modules that matter most.

  • Uses cooperative game theory (Shapley values) to measure component impact
  • Analyzes how modules like planning, reasoning, and reflection affect overall performance
  • Identifies which components deliver the most value across different tasks
  • Provides a systematic method to guide engineering investments in LLM agent development
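
To make the Shapley-value idea concrete, here is a minimal sketch of exact Shapley attribution over three agent modules. It is not CapaBench's implementation: the `shapley_values` helper, the `toy_success_rate` function, and every number in it are hypothetical, chosen only to show the mechanics. The sketch assumes each module is either left at a baseline or "upgraded", and that the agent's success rate can be measured for every combination of upgraded modules.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable player names (here, agent modules).
    value:   function mapping a frozenset of players to a number.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain from adding p to coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


MODULES = ["planning", "reasoning", "reflection"]


def toy_success_rate(upgraded):
    """Hypothetical success rate given the set of upgraded modules.

    All numbers are invented for illustration, not measured results.
    """
    rate = 0.20  # all modules at baseline
    if "planning" in upgraded:
        rate += 0.10
    if "reasoning" in upgraded:
        rate += 0.30
    if "reflection" in upgraded:
        rate += 0.05
    # Interaction term: planning and reasoning help more together.
    if {"planning", "reasoning"} <= upgraded:
        rate += 0.10
    return rate


scores = shapley_values(MODULES, toy_success_rate)
for module, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{module}: {score:.3f}")
```

In this toy game the interaction bonus is split evenly between planning and reasoning, and the three scores sum to the gap between the fully upgraded and fully baseline agent. Exact computation needs a performance measurement for all 2^n module combinations, which is practical only when the number of modules is small.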

This research matters for engineering because it replaces guesswork in LLM agent optimization with data-driven decisions: developers can focus resources on high-impact modules and build more efficient, effective AI systems.

Who's the MVP? A Game-Theoretic Evaluation Benchmark for Modular Attribution in LLM Agents
