Smart LLM Traffic Control

Using 'Number of Thoughts' to route prompts and detect attacks

This research introduces a novel Number of Thoughts (NofT) metric to measure prompt complexity, enabling more efficient use of LLMs in production:

  • Achieved a 2% latency reduction by routing each task to an appropriately sized model
  • Detected adversarial prompts with 95% accuracy by analyzing thought patterns
  • Produced a method that works across different model architectures
  • Demonstrated a practical implementation with DeepSeek models (1.7B to 67B parameters)
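
To make the routing idea concrete, here is a minimal sketch of how a thought count could select a model tier. It is an illustration under stated assumptions, not the authors' implementation: the one-thought-per-line convention, the thresholds, and the tier names "small", "medium", and "large" are all hypothetical and do not come from the research.

```python
# Hypothetical (max thought count, model tier) pairs, checked in order.
# The source does not give thresholds or tier names; these are placeholders.
ROUTES = [(3, "small"), (8, "medium")]
DEFAULT_TIER = "large"


def count_thoughts(trace: str) -> int:
    """Count reasoning steps, assuming one thought per non-empty line."""
    return sum(1 for line in trace.splitlines() if line.strip())


def route(trace: str) -> str:
    """Return the first tier whose threshold covers the thought count."""
    n = count_thoughts(trace)
    for max_thoughts, tier in ROUTES:
        if n <= max_thoughts:
            return tier
    return DEFAULT_TIER


if __name__ == "__main__":
    short_trace = "Read the question.\nAnswer directly."
    print(route(short_trace))
```

In practice the thresholds would have to be tuned against measured latency and answer quality for the models actually deployed.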

Security Impact: The NofT approach offers a powerful defense mechanism against prompt injection attacks while simultaneously optimizing performance, making it a dual-purpose solution for secure LLM deployments.
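
One simple way to picture detection from thought counts is an outlier check against counts seen on benign traffic. The sketch below uses a plain z-score test, which is a stand-in chosen for illustration and not the detection method from the research; the function name `is_suspicious`, the baseline values, and the threshold of 3.0 are assumptions.

```python
from statistics import mean, pstdev


def is_suspicious(noft: int, baseline: list[int], z_threshold: float = 3.0) -> bool:
    """Flag a thought count that deviates strongly from a benign baseline.

    `baseline` is a hypothetical sample of thought counts from known-benign
    prompts; `z_threshold` is an illustrative cutoff, not a published value.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # No spread in the baseline: any different count is an outlier.
        return noft != mu
    return abs(noft - mu) / sigma > z_threshold


if __name__ == "__main__":
    benign_counts = [4, 5, 6, 5, 4, 6]
    print(is_suspicious(5, benign_counts))
    print(is_suspicious(20, benign_counts))
```

A check like this only captures unusually long or short reasoning traces, so it would be one signal among several in a real deployment.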

Harnessing Chain-of-Thought Metadata for Task Routing and Adversarial Prompt Detection
