Private Compression of Large Language Models

A federated learning approach to building secure, task-specific small models

PPC-GPT introduces a privacy-preserving federated framework that compresses large language models into smaller, task-specific models while protecting sensitive domain knowledge.

  • Combines pruning techniques with Chain-of-Thought distillation to reduce model size while maintaining performance
  • Implements a server-client federated architecture that keeps private data on local clients (see the sketch after this list)
  • Addresses both privacy concerns and resource limitations in LLM deployment
  • Provides a practical solution for organizations needing secure AI with lower computational requirements
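The following is a minimal sketch of the overall flow described above, not the paper's actual implementation: a server prunes a large teacher model into a smaller student, and each client distills that student on its own private data, sharing only model weights with the server. Names such as `TinyLM`, `prune_layers`, and `distill_step` are illustrative assumptions, plain logit distillation stands in for the paper's Chain-of-Thought distillation, and random tokens stand in for a client's private task data.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLM(nn.Module):
    """Toy stand-in for an LLM: an embedding plus a stack of linear blocks."""
    def __init__(self, vocab=100, dim=32, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layers))
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):
        h = self.embed(tokens)
        for blk in self.blocks:
            h = torch.relu(blk(h))
        return self.head(h)

def prune_layers(teacher: TinyLM, keep: int) -> TinyLM:
    """Server-side structured pruning: keep only the first `keep` blocks."""
    student = copy.deepcopy(teacher)
    student.blocks = nn.ModuleList(list(student.blocks)[:keep])
    return student

def distill_step(student, teacher, private_tokens, opt, temperature=2.0):
    """Client-side distillation on private data; only weights leave the client."""
    with torch.no_grad():
        t_logits = teacher(private_tokens) / temperature
    s_logits = student(private_tokens) / temperature
    loss = F.kl_div(F.log_softmax(s_logits, dim=-1),
                    F.softmax(t_logits, dim=-1), reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# One illustrative federated round.
teacher = TinyLM()
student = prune_layers(teacher, keep=2)            # server compresses the model
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
private_tokens = torch.randint(0, 100, (8, 16))    # stays on the client
for _ in range(3):
    distill_step(student, teacher, private_tokens, opt)
client_update = student.state_dict()               # only weights are shared back
```

The key design point this sketch illustrates is data locality: the client's `private_tokens` never leave the client, and the server receives only the distilled student's parameters.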

This research is particularly valuable for security-conscious sectors like healthcare and finance that need efficient AI systems without compromising sensitive data.

PPC-GPT: Federated Task-Specific Compression of Large Language Models via Pruning and Chain-of-Thought Distillation
