
SudoLM: Selective Access to LLM Knowledge
Moving Beyond One-Size-Fits-All AI Safety with Authorization Alignment
SudoLM introduces a novel framework for credential-based access control within large language models, enabling qualified users to access restricted parametric knowledge.
- Implements authorization alignment that selectively grants access based on user credentials rather than blocking all users uniformly
- Introduces SUDO keys as a mechanism for authenticating qualified users to access sensitive knowledge
- Achieves a 91.1% success rate in enforcing access controls while minimizing impact on other capabilities
- Preserves LLM utility for advanced users while maintaining safety guardrails for others
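The access-control behavior described in the points above can be illustrated with a small sketch. It is not the paper's method: SudoLM trains this behavior into the model through authorization alignment, whereas the code below is an external wrapper that only mimics the intended input/output contract. The key value, the assumption that the key appears as a prompt prefix, the `is_restricted` flag, and the function names are all hypothetical and chosen for illustration.

```python
import hmac
from typing import Tuple

# Hypothetical SUDO key; a real deployment would never hard-code a secret.
SUDO_KEY = "example-sudo-key"


def split_sudo_key(prompt: str, key: str = SUDO_KEY) -> Tuple[bool, str]:
    """Return (authorized, query), assuming the key is sent as a prompt prefix."""
    prefix = prompt[: len(key)]
    # Constant-time comparison avoids leaking how much of the key matched.
    if hmac.compare_digest(prefix.encode("utf-8"), key.encode("utf-8")):
        return True, prompt[len(key):].lstrip()
    return False, prompt


def respond(prompt: str, is_restricted: bool) -> str:
    """Mimic the intended behavior: restricted queries need a valid key.

    `is_restricted` stands in for the model's own judgment of whether a
    query touches restricted knowledge (an assumption of this sketch).
    """
    authorized, query = split_sudo_key(prompt)
    if is_restricted and not authorized:
        return "REFUSE"
    return "ANSWER: " + query


if __name__ == "__main__":
    print(respond("example-sudo-key Describe the restricted procedure.", True))
    print(respond("Describe the restricted procedure.", True))
    print(respond("What is the capital of France?", False))
```

In this sketch, unrestricted queries are answered for every user, so the gate affects only the restricted subset. This mirrors the stated goal of preserving general utility while limiting sensitive knowledge to credentialed users.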
This research matters to security professionals because it lays a foundation for granular permission systems in AI deployments, similar to those in traditional computing environments. Such systems let organizations balance security and utility.
Original Paper: SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment