Benchmarking LLMs for CFD Automation

This research evaluates how effectively different large language models (LLMs) can automate complex Computational Fluid Dynamics (CFD) tasks using OpenFOAM.

Commercial LLMs varied in how well they handled CFD tasks such as adjusting boundary conditions and solver configurations.
Smaller locally deployed models (e.g., QwQ-32B) struggled with complex simulation processes.
Zero-shot prompting consistently failed on intricate settings, even with the largest models.
Token costs and performance stability varied significantly across the LLMs tested.

The findings offer guidance for engineering teams seeking cost-effective CFD automation, and highlight both the potential and the limitations of current LLM technology for specialized engineering applications.

A Status Quo Investigation of Large Language Models towards Cost-Effective CFD Automation with OpenFOAMGPT: ChatGPT vs. Qwen vs. Deepseek