
Bridging Language Models and Cell Biology
Leveraging AI embeddings to understand cellular landscapes
This research explores how techniques from large language models can improve the analysis of single-cell sequencing data by treating each cell as a token in a high-dimensional embedding space.
- Draws parallels between word tokens in NLP and cell embeddings in biology
- Applies what is known about the geometry of language embeddings to the analysis of cell embeddings
- Proposes new frameworks for visualizing and interpreting single-cell datasets
- Demonstrates cross-disciplinary innovation between AI and biology
The approach gives biologists new tools for mapping complex relationships between cells, which could accelerate discoveries in development, disease progression, and therapeutic response.
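To make the cell-as-token analogy concrete, here is a minimal sketch of one geometric operation that carries over directly from word embeddings: comparing vectors by cosine similarity and looking up nearest neighbors. The toy 3-dimensional vectors and the function names `cosine_similarity_matrix` and `nearest_neighbors` are illustrative assumptions, not the method or data from the paper; real cell embeddings typically have tens to hundreds of dimensions.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero vectors.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def nearest_neighbors(embeddings: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k most similar rows for each row, excluding itself."""
    sims = cosine_similarity_matrix(embeddings)
    np.fill_diagonal(sims, -np.inf)  # a cell is not its own neighbor
    return np.argsort(-sims, axis=1)[:, :k]


# Hypothetical toy data: four "cells", each a point in a 3-D embedding space.
# Rows 0-1 and rows 2-3 are constructed to point in similar directions.
cells = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
])

neighbors = nearest_neighbors(cells, k=1)
print(neighbors.ravel())  # nearest neighbor index for each cell
```

The same lookup is what a "most similar words" query does on token embeddings; here each row stands for a cell instead of a word.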
The cell as a token: high-dimensional geometry in language models and cell embeddings