
Lost in the Code: LLM Vulnerability Detection
How large language models struggle with vulnerability detection in full-size code files
This research examines how effectively popular LLMs detect and localize security vulnerabilities when given entire code files rather than isolated functions.
Key Findings:
- LLMs show significant performance degradation when analyzing full-file contexts vs. individual functions
- Models exhibit a "lost in the end" phenomenon, struggling to find vulnerabilities located near the end of a file
- Position bias affects vulnerability detection regardless of the LLM's size or capabilities
For security teams, this research highlights critical limitations in current LLM-based code scanning approaches and suggests caution when using these tools for comprehensive vulnerability detection in production environments.
Paper: Large Language Models for In-File Vulnerability Localization Can Be "Lost in the End"
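
As a practical illustration only (not a technique from the paper), the sketch below assumes the straightforward mitigation these findings point to: splitting a file into function-level chunks so that no code sits deep in a long context before it reaches the model. The `split_into_functions` helper uses Python's `ast` module, and `scan_chunk_with_llm` is a hypothetical placeholder for whatever LLM-based scanner a team actually uses.

```python
# Minimal sketch of a function-level chunking pass before LLM scanning.
# Assumption: scanning small chunks sidesteps the position bias observed on full files.
import ast
from typing import Dict, List, Tuple


def split_into_functions(source: str) -> List[Tuple[str, str]]:
    """Return (name, code) pairs for every function defined in a Python source file."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            code = ast.get_source_segment(source, node)
            if code:
                chunks.append((node.name, code))
    return chunks


def scan_chunk_with_llm(name: str, code: str) -> List[str]:
    """Hypothetical placeholder: replace with a real call to your LLM-based scanner."""
    return []  # e.g. a list of findings reported for this chunk


def scan_file(path: str) -> Dict[str, List[str]]:
    """Scan a file one function at a time instead of as a single long prompt."""
    with open(path, encoding="utf-8") as fh:
        source = fh.read()
    return {name: scan_chunk_with_llm(name, code)
            for name, code in split_into_functions(source)}
```

Chunking has its own blind spot, of course: vulnerabilities that only become visible with cross-function, full-file context.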