
Taming the Long Tail in Multi-Label Classification
A Co-Occurrence Reranking Approach for Rare Label Detection
LabelCoRank addresses the persistent challenge of accurately classifying rare labels in multi-label text classification by leveraging label relationships rather than focusing solely on text semantics.
- Introduces a novel co-occurrence reranking approach that significantly improves rare label detection
- Demonstrates effectiveness across multiple datasets including medical literature (PubMed)
- Reports stronger performance than existing methods, particularly on infrequent labels
- Provides a practical solution that works alongside modern language models
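
The core idea of reranking with label co-occurrence can be illustrated with a minimal sketch. This is not the LabelCoRank architecture itself. It is a simplified, count-based stand-in that assumes a base classifier has already produced one score per label. The function names (`cooccurrence_probs`, `rerank`), the parameters `top_k` and `alpha`, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def cooccurrence_probs(label_sets):
    """Estimate P(b | a) from training label sets by simple counting."""
    count = defaultdict(int)
    pair = defaultdict(int)
    for labels in label_sets:
        unique = set(labels)
        for a in unique:
            count[a] += 1
            for b in unique:
                if a != b:
                    pair[(a, b)] += 1
    return {(a, b): n / count[a] for (a, b), n in pair.items()}


def rerank(scores, cond, top_k=2, alpha=0.5):
    """Blend each label's score with co-occurrence support from the
    top_k highest-scoring labels, then return labels sorted by the blend.
    (Illustrative blending rule, not the paper's formulation.)"""
    anchors = sorted(scores, key=scores.get, reverse=True)[:top_k]
    adjusted = {}
    for label, s in scores.items():
        # A label receives no support from itself.
        others = [a for a in anchors if a != label]
        support = max(
            (scores[a] * cond.get((a, label), 0.0) for a in others),
            default=0.0,
        )
        adjusted[label] = (1 - alpha) * s + alpha * support
    return sorted(adjusted, key=adjusted.get, reverse=True)


# Hypothetical training label sets and base-classifier scores.
train = [
    {"diabetes", "insulin"},
    {"diabetes", "insulin", "neuropathy"},
    {"diabetes", "neuropathy"},
    {"asthma", "inhaler"},
    {"asthma"},
]
cond = cooccurrence_probs(train)
scores = {
    "diabetes": 0.9,
    "asthma": 0.3,
    "neuropathy": 0.25,
    "insulin": 0.2,
    "inhaler": 0.1,
}
ranking = rerank(scores, cond, top_k=1, alpha=0.5)
print(ranking)
```

In this toy setup the base scores rank "asthma" second, but after reranking the labels that frequently co-occur with the confident "diabetes" prediction ("neuropathy" and "insulin") move ahead of it. A learned model would replace both the counting step and the fixed blending weight.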
In medical applications, this approach could enable more accurate classification of rare conditions, symptoms, and treatments in clinical documentation and research literature, which in turn could improve information retrieval and clinical decision support.
LabelCoRank: Revolutionizing Long Tail Multi-Label Classification with Co-Occurrence Reranking