
Hybrid LLM Systems for Faster, Smarter Inference
Optimizing AI Decision-Making with Threshold-Based Control
This research introduces a task-oriented Age of Information (AoI) framework that combines large and small language models to improve both timeliness and accuracy in remote inference systems.
- Addresses the trade-off between inference speed and accuracy through threshold-based model-selection policies (see the sketch after this list)
- Uses a Semi-Markov Decision Process to determine optimal model selection based on data freshness
- Demonstrates significant improvements in timeliness and accuracy compared to single-model approaches
- Provides a practical engineering solution for systems where both speed and accuracy are critical
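To make the threshold idea concrete, the minimal Python sketch below selects between a small (fast, less accurate) and a large (slow, more accurate) model based on the current Age of Information. The model profiles, threshold value, and function names are illustrative assumptions only; the paper's actual policy is derived from a Semi-Markov Decision Process rather than hand-picked numbers.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    """Hypothetical per-model characteristics used by the policy."""
    name: str
    mean_latency_s: float  # expected inference time (adds to staleness)
    accuracy: float        # expected inference accuracy


def select_model(current_aoi_s: float,
                 small: ModelProfile,
                 large: ModelProfile,
                 aoi_threshold_s: float) -> ModelProfile:
    """Threshold rule: if the data is already stale (AoI at or above the
    threshold), use the fast small model to restore freshness; otherwise
    spend the extra latency on the more accurate large model."""
    return small if current_aoi_s >= aoi_threshold_s else large


# Example usage with made-up numbers for illustration
small_llm = ModelProfile("small-llm", mean_latency_s=0.2, accuracy=0.85)
large_llm = ModelProfile("large-llm", mean_latency_s=1.5, accuracy=0.95)

for aoi in (0.5, 3.0):
    chosen = select_model(aoi, small_llm, large_llm, aoi_threshold_s=2.0)
    print(f"AoI={aoi:.1f}s -> use {chosen.name}")
```

In the paper's framework, the threshold itself is an output of the optimization (over data freshness, latency, and task accuracy), not a fixed constant as in this sketch.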
This work has practical implications for engineering, offering an optimized architecture for remote AI systems that must balance computational resource constraints with inference quality requirements.
Original Paper: Task-oriented Age of Information for Remote Inference with Hybrid Language Models