
Measuring Module Impact in LLM Agents
A Game-Theoretic Approach to Identifying MVP Components
CapaBench introduces a novel evaluation framework that quantifies each module's contribution in modular LLM agent architectures, enabling targeted system optimization.
- Uses cooperative game theory (Shapley values) to measure component impact
- Analyzes how modules like planning, reasoning, and reflection affect overall performance
- Identifies which components deliver the most value across different tasks
- Provides a systematic method to guide engineering investments in LLM agent development
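As a rough illustration of the Shapley-value idea in the bullets above, the sketch below computes exact Shapley values for three modules. It assumes a coalition is the set of modules that are switched on (or upgraded) and that each coalition has a measured task score; the summary does not say how CapaBench scores coalitions, so this setup, the function name `shapley_values`, and every number in `SCORES` are hypothetical and chosen only to make the arithmetic visible.

```python
from itertools import combinations
from math import factorial


def shapley_values(modules, value):
    """Exact Shapley value of each module.

    `value` maps a frozenset of module names (a coalition) to a score.
    phi_i = sum over S not containing i of
            |S|! * (n - |S| - 1)! / n! * (v(S + {i}) - v(S))
    """
    n = len(modules)
    phi = {}
    for m in modules:
        others = [x for x in modules if x != m]
        total = 0.0
        for k in range(n):  # k = size of the coalition without m
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain from adding m to coalition s
                total += weight * (value(s | {m}) - value(s))
        phi[m] = total
    return phi


# Hypothetical success rates for every coalition of enabled modules.
# These are illustrative numbers, not results from the paper.
MODULES = ["planning", "reasoning", "reflection"]
SCORES = {
    frozenset(): 0.20,
    frozenset({"planning"}): 0.40,
    frozenset({"reasoning"}): 0.35,
    frozenset({"reflection"}): 0.22,
    frozenset({"planning", "reasoning"}): 0.60,
    frozenset({"planning", "reflection"}): 0.44,
    frozenset({"reasoning", "reflection"}): 0.38,
    frozenset({"planning", "reasoning", "reflection"}): 0.65,
}

phi = shapley_values(MODULES, SCORES.__getitem__)

# Rank modules by attributed contribution, highest first.
for name, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {contribution:.3f}")
```

The contributions always sum to the gap between the full system's score and the empty coalition's score, which is what makes them usable as a budget for deciding where to invest. Exact enumeration needs a score for all 2^n coalitions, so it is only practical for a small number of modules.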
This research matters for engineering because it replaces guesswork in LLM agent optimization with data-driven decisions: developers can measure which modules contribute most, focus resources on them, and build more efficient, effective AI systems.
Who's the MVP? A Game-Theoretic Evaluation Benchmark for Modular Attribution in LLM Agents