This is an area of active research and development. The context for some of the questions discussed below is set out in the section on the Early Modern Lexicon. Large language models (LLMs) are strong at identifying similarities within and across texts, which suggests that they can help develop a deeper understanding of pre-modern Italian literary and scientific culture, either by creating relatively accurate plain-text versions of scanned books or by querying collections of texts with highly specific prompts.
In Summer 2025, working with Madina Sotvoldieva (Class of 2028), we developed a preliminary workflow for converting scans to plain text. The workflow uses models installed locally on a laptop and run offline, which creates a protected environment for intellectual property. Our test cases were modern printed volumes that will be used for a class in Spring 2026. We are now refining the process for pre-modern print. In addition to publishing results, we hope to make a free and user-friendly workflow available to colleagues.
In parallel with that work, and in collaboration with Madina, Theo Barton (Class of 2026), and Professor Fernando Nascimento (DCS), we developed a protocol for evaluating methods of querying specialized data sets. Working with retrieval-augmented generation (RAG), we created two sets of reliable primary source material: one for Paul Ricoeur’s philosophical works published in English and another for Galileo Galilei’s works in Italian. We then experimented with the variables for customizing RAG systems to identify the best settings for prompts related to historical questions about the collections. We established two sets of parallel questions about the texts of Ricoeur and Galileo. Both sets look for responses related to conceptual definitions, stylistic evaluation, argumentation built from seemingly unrelated concepts, critical appropriation of the ideas of other authors, and establishment of authority. We have asked a panel of experts in our areas of study to evaluate the responses from our RAG system in comparison to those from a foundational LLM chatbot. We will spend the fall and winter writing up our results.
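For readers unfamiliar with RAG, the kinds of variables involved can be illustrated with a minimal Python sketch. This is not our project code. It is a toy under stated assumptions: the function names (`chunk_text`, `retrieve`, `build_prompt`), the parameters (`chunk_size`, `overlap`, `top_k`), and the sample sentences are all hypothetical, and simple word-count similarity stands in for the embedding models a real system would use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; the pattern also matches accented letters.
    return re.findall(r"\w+", text.lower())


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of chunk_size words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def build_prompt(question, passages):
    """Assemble the text that would be sent to a language model."""
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return (
        "Answer using only the passages below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    # Placeholder sentences for illustration only, not quotations.
    corpus = [
        "the comet moves through the sky",
        "the book of nature is written in mathematics",
        "a telescope reveals the moons",
    ]
    passages = retrieve("book of nature", corpus, top_k=1)
    print(build_prompt("What is the book of nature written in?", passages))
```

Changing `chunk_size`, `overlap`, or `top_k` changes which passages reach the model, which is why settings of this kind are worth evaluating against a fixed set of questions.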
From here:
- Explore The Interactive Shelves
- Explore The Virtual Library
- Explore Visualizing Galileo’s The Assayer
- Explore the GaLiLeO suite of tools