
Smart LLM Traffic Control
Using 'Number of Thoughts' to route prompts and detect attacks
This research introduces a novel Number of Thoughts (NofT) metric to measure prompt complexity, enabling more efficient use of LLMs in production:
- Achieved a 2% latency reduction by routing each task to an appropriately sized model
- Detected adversarial prompts with 95% accuracy by analyzing thought patterns
- Created an effective method that works across different model architectures
- Demonstrated a practical implementation with DeepSeek models (1.7B to 67B parameters)
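
The routing idea in the list above can be illustrated with a minimal sketch. It is not the paper's implementation: the tier names, the thresholds, and the rule that each non-empty line of a reasoning trace counts as one thought are all hypothetical choices made here for illustration.

```python
# Hypothetical tiers: (model tier name, maximum NofT it handles).
# These thresholds are illustrative assumptions, not values from the paper.
TIERS = (("small", 3), ("medium", 8))
LARGEST_TIER = "large"


def count_thoughts(reasoning: str) -> int:
    """Count thoughts in a reasoning trace.

    Assumption: each non-empty line is one thought. A real system
    would need a delimiter that matches its model's output format.
    """
    return sum(1 for line in reasoning.splitlines() if line.strip())


def route(nof_t: int) -> str:
    """Return the smallest tier whose threshold covers the thought count."""
    for name, max_thoughts in TIERS:
        if nof_t <= max_thoughts:
            return name
    return LARGEST_TIER


if __name__ == "__main__":
    trace = "Parse the question.\nRecall the formula.\n\nCompute the result.\nCheck units."
    n = count_thoughts(trace)
    print(n, route(n))
```

Under these assumptions, a prompt whose trace has few thoughts goes to the cheapest tier and only long traces reach the largest model, which is where a latency saving would come from.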
Security Impact: The NofT approach offers a powerful defense mechanism against prompt injection attacks while simultaneously optimizing performance, making it a dual-purpose solution for secure LLM deployments.
Harnessing Chain-of-Thought Metadata for Task Routing and Adversarial Prompt Detection