Optimizing AI Coding Assistants for Developer Experience

This research introduces an architecture for AI coding assistants that is aware of Service Level Agreements (SLAs), optimizing responsiveness in development environments while managing resource constraints.

Identifies responsiveness requirements critical for developer productivity with AI coding tools
Proposes a novel architecture that balances latency against resource utilization
Demonstrates how the system automatically adapts to meet SLAs while minimizing infrastructure costs
Recommends best practices for deploying CodeLLMs in real-world engineering environments
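
To make the idea of adapting to an SLA while limiting cost more concrete, here is a minimal sketch of a generic feedback rule. It is not the architecture proposed in the research. The function names, the 95th-percentile target, the `headroom` factor, and the replica limits are all assumptions chosen for illustration.

```python
def p95(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def next_replica_count(latencies_ms, replicas, sla_ms,
                       headroom=0.5, min_replicas=1, max_replicas=8):
    """Pick the replica count for the next interval (hypothetical policy).

    Scale up by one when the observed p95 latency violates the SLA,
    scale down by one when it is comfortably below the SLA, and
    otherwise keep the current count.
    """
    observed = p95(latencies_ms)
    if observed > sla_ms:
        # SLA violated: add capacity, up to the assumed ceiling.
        return min(replicas + 1, max_replicas)
    if observed < headroom * sla_ms:
        # Well within the SLA: release capacity to reduce cost.
        return max(replicas - 1, min_replicas)
    return replicas
```

For example, with an assumed 200 ms target, a window of requests whose p95 is 300 ms would move a two-replica deployment to three, while a window at 50 ms would shrink it to one. The gap between the scale-up and scale-down thresholds is there to avoid oscillating between sizes.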

This research matters because it addresses a critical challenge in AI-assisted software development: maintaining responsive experiences for developers while keeping infrastructure costs manageable as these tools become essential for modern engineering teams.

SLA-Awareness for AI-assisted coding