
Olympus: The Universal Computer Vision Router
Transforming MLLMs into a unified framework for diverse visual tasks
Olympus introduces a novel approach that enables Multimodal Large Language Models to coordinate and execute a wide range of computer vision tasks through intelligent task routing.
- Uses a controller MLLM to delegate 20+ specialized vision tasks across images, videos, and 3D objects to dedicated modules
- Enables complex workflows through chained actions without training heavy generative models
- Creates a modular architecture that easily integrates with existing MLLMs
- Supports instruction-based routing for intuitive task management
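The routing and chaining ideas above can be illustrated with a minimal sketch. It assumes the controller MLLM emits its plan as tagged spans such as `<object_detection>...</object_detection>`; that tag format, the task names, and the helpers `parse_routes` and `run_chain` are hypothetical choices for illustration, not Olympus's actual interface. The specialist modules are stubs that return strings where a real system would call dedicated vision models.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical specialist modules. Each takes the routed prompt and the
# previous step's output, so that actions can be chained.
def detect_objects(prompt: str, prev: str) -> str:
    return f"detections({prompt})"

def edit_image(prompt: str, prev: str) -> str:
    return f"edited({prev or 'input'}; {prompt})"

# Registry mapping a task name to its module. Adding a task means adding
# an entry here; the controller itself is not retrained in this sketch.
REGISTRY: Dict[str, Callable[[str, str], str]] = {
    "object_detection": detect_objects,
    "image_editing": edit_image,
}

# Assumed output format: <task_name>prompt</task_name>, in execution order.
TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)

def parse_routes(controller_output: str) -> List[Tuple[str, str]]:
    """Extract (task, prompt) pairs from the controller's text output."""
    return [(m.group(1), m.group(2).strip()) for m in TAG.finditer(controller_output)]

def run_chain(controller_output: str) -> List[str]:
    """Run each routed task in order, feeding each result to the next step."""
    results: List[str] = []
    prev = ""
    for task, prompt in parse_routes(controller_output):
        handler = REGISTRY.get(task)
        if handler is None:
            raise KeyError(f"no module registered for task {task!r}")
        prev = handler(prompt, prev)
        results.append(prev)
    return results

if __name__ == "__main__":
    # Stand-in for what a controller MLLM might produce for the instruction
    # "find the cars, then blur their plates".
    plan = (
        "<object_detection>find cars</object_detection>"
        "<image_editing>blur plates</image_editing>"
    )
    for step in run_chain(plan):
        print(step)
```

The controller only has to produce text in the agreed format. All task execution lives in the registry, which is what keeps the design modular.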
For security applications, Olympus can support surveillance monitoring, object detection, and anomaly identification across images, videos, and 3D data, all through a single interface that simplifies complex visual analysis.