LLMs Meet Medieval Texts

Bridging AI and Historical Language Processing

This research examines how large language models perform on a historical language, focusing on part-of-speech (POS) tagging of medieval Old Occitan texts.

  • Models achieved up to 87% accuracy in POS tagging despite orthographic variations
  • Performance varied significantly between medical and hagiographical texts
  • Fine-tuning with domain-specific data substantially improved results
  • Cross-domain generalization remains challenging for historical languages
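
To make the per-domain comparison concrete, here is a minimal sketch of how token-level POS tagging accuracy could be computed separately for each text domain. The function name `accuracy_by_domain`, the record layout, and the toy tag sequences are illustrative assumptions, not the study's evaluation code or data.

```python
from collections import defaultdict


def accuracy_by_domain(records):
    """Return token-level tagging accuracy for each domain.

    `records` is an iterable of (domain, gold_tags, predicted_tags),
    a hypothetical layout chosen for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for domain, gold, pred in records:
        if len(gold) != len(pred):
            raise ValueError("gold and predicted tag sequences must align")
        for g, p in zip(gold, pred):
            total[domain] += 1
            correct[domain] += int(g == p)
    return {d: correct[d] / total[d] for d in total}


# Made-up tag sequences for illustration only (not results from the study).
toy = [
    ("medical", ["DET", "NOUN", "VERB", "ADJ"], ["DET", "NOUN", "VERB", "NOUN"]),
    ("hagiographical", ["PRON", "VERB", "NOUN", "PUNCT"], ["PRON", "VERB", "NOUN", "PUNCT"]),
]
scores = accuracy_by_domain(toy)
for domain, acc in scores.items():
    print(f"{domain}: {acc:.2%}")
```

Reporting one score per domain, rather than a single pooled figure, is what makes a gap between text types visible.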

This work demonstrates both the potential and the limitations of modern NLP tools for the linguistic analysis of historical texts, offering insights for computational linguists working with non-standardized languages.

Modern Models, Medieval Texts: A POS Tagging Study of Old Occitan
