
Evaluating AI Coding Assistants
Bringing Human-Centered Design to Automated LLM Evaluation
This research proposes a hybrid approach that combines HCI and AI methods to evaluate LLM-powered conversational coding assistants at scale while upholding human-centered design principles.
- Addresses the limitations of traditional human evaluation methods for LLM-based developer tools
- Advocates for automatic evaluation techniques informed by human-centered design (illustrated in the sketch after this list)
- Creates a framework to ensure AI coding assistants align with developers' actual needs
- Bridges the gap between qualitative human insights and quantitative AI evaluation
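One way to picture this bridge is an automated LLM-as-judge harness whose scoring rubric is derived from human-centered research rather than from purely technical metrics. The sketch below is only illustrative and is not the paper's framework: the rubric criteria, the prompt format, and the `evaluate_conversation` helper are hypothetical placeholders, and the judge model is injected as a generic callable so that no particular LLM API is assumed.

```python
# Illustrative sketch (not the paper's implementation): score a coding
# assistant's answer against rubric criteria distilled from human-centered
# research (e.g., interviews or user studies), using an LLM judge.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RubricCriterion:
    """One human-derived evaluation criterion: a name and what 'good' looks like."""
    name: str
    description: str


# Hypothetical criteria; a real framework would derive these from its own
# qualitative findings about developer needs and workflows.
HUMAN_CENTERED_RUBRIC: List[RubricCriterion] = [
    RubricCriterion("relevance", "Addresses the developer's actual question and context."),
    RubricCriterion("actionability", "Gives concrete next steps or code the developer can apply."),
    RubricCriterion("workflow_fit", "Respects the surrounding task and does not derail the developer."),
]


def build_judge_prompt(question: str, answer: str, criterion: RubricCriterion) -> str:
    """Format a grading prompt for a judge LLM from one rubric criterion."""
    return (
        f"Rate the assistant answer from 1 (poor) to 5 (excellent) on "
        f"'{criterion.name}': {criterion.description}\n\n"
        f"Developer question:\n{question}\n\n"
        f"Assistant answer:\n{answer}\n\n"
        f"Reply with a single integer."
    )


def evaluate_conversation(
    question: str,
    answer: str,
    judge: Callable[[str], str],  # any text-in/text-out LLM call
) -> Dict[str, int]:
    """Score one question/answer pair on every human-centered criterion."""
    scores: Dict[str, int] = {}
    for criterion in HUMAN_CENTERED_RUBRIC:
        raw = judge(build_judge_prompt(question, answer, criterion))
        try:
            scores[criterion.name] = max(1, min(5, int(raw.strip())))
        except ValueError:
            scores[criterion.name] = 1  # treat unparseable judge output as a failure
    return scores


if __name__ == "__main__":
    # Stub judge so the sketch runs without any model; swap in a real LLM call.
    demo_scores = evaluate_conversation(
        "How do I profile a slow SQL query?",
        "Use EXPLAIN ANALYZE to inspect the query plan, then index the filtered column.",
        judge=lambda prompt: "4",
    )
    print(demo_scores)
```

The design point is that the quantitative loop (automated judging at scale) stays cheap and repeatable, while the qualitative, human-centered work determines what the loop actually measures.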
This work is significant for engineering teams because it provides a practical pathway to evaluating AI coding assistants not just on technical metrics but on how well they serve actual developer workflows and expectations.
Bridging HCI and AI Research for the Evaluation of Conversational SE Assistants