In Fall 2021 I capitalized on a few events to explore the value of visualizing text abstractly, not as words, but as color-coded blocks. The abstract, visual output provides a different way of reading the text based on the features of the words. (The Microsearch widget in Voyant Tools provides a great example of visualizing where words repeat in a text, but not word features.) Other methods could include reading the text for unique words (hapax legomena) or creating a scatter plot of the density of active vs. passive verb constructions over sections of the text. Alternatively, topic modeling identifies clusters of words that co-occur in different sections. These methods can answer certain kinds of questions, such as how an author creates intertextuality through the use of carefully chosen terms, how voice changes over time in a text, or how themes recur throughout a text. What they do not answer are questions about the ages or sources of that intertextuality, the rapid alternation of active and passive forms, or how less frequently used words subtly change the way a theme is developed. These are questions related to distant reading with the help of computation, but they are literary in their inspiration: how context shapes interpretation, how rhythms add to meaning, how expressions travel through time and across genres.
In 2021 three things came together: the prototype for the Early Modern Lexicon, a request to review a digital humanities project that visualized words in Dante’s Divine Comedy, and an advisee developing his portfolio for job and internship applications (Richard Ohia, Bowdoin Class of 2024). The result was an interactive site that visualizes the Italian text of Galileo’s 1623 letter-treatise Il Saggiatore (The Assayer) using several features of the language.

Landing page of the Galileo Visualized site.
Using the Comparison or Visualization tabs in the top menu, you can see a visual layout of the occurrences of different parts of speech across the text. Hovering over a word provides a snapshot of information about its occurrences in the corpus overall. Note that “Author” contains an alphabetically sorted list of the texts in which the word occurs. (This was a proof of concept, so there are still bugs in how some of the text is rendered.)
The site is a tool for asking deeper questions about the sections of Il Saggiatore in the contexts of the full work and the texts in Galileo’s library or the broader early modern Italian environment. The drop-down menu lets visitors choose what to visualize:
- Part of Speech: In a text purportedly about precision and measurement, what is the rhetorical effect of repeated use of superlative forms of adjectives, and why are they absent from other sections on the same topic?
- Tense (includes mood): How often does Galileo use the subjunctive form of verbs? Are there certain topics that he discusses with minimal use of direct, declarative verb forms?
- Relative and Raw Frequency: How often does a word appear in the book? How often does the word appear in texts in Galileo’s library? What are the hapax legomena in Il Saggiatore that have a long history in texts or other genres?
- Oldest Use: What is the date of the oldest text in Galileo’s library in which the word appears? Would his style have sounded dated to his readers?
- Lexical Family: In what sections do certain metaphors appear (or not)? Do any metaphors repeat across the book?
- Texts: Where else in Galileo’s library does the term appear? Is it rare? Common? Restricted to a certain type of book or genre?
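
For readers curious about what sits behind options like Relative and Raw Frequency or Oldest Use, the following is a minimal sketch in Python, not the site's actual code. The function names, the toy sentence, and the two-entry "library" with its titles and years are all invented for illustration; the real project works from the Early Modern Lexicon data, whose structure is not described here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def frequencies(tokens):
    """Return (raw counts, relative frequencies) for a list of tokens."""
    raw = Counter(tokens)
    total = len(tokens)
    relative = {word: count / total for word, count in raw.items()}
    return raw, relative


def hapax_legomena(raw):
    """Words that occur exactly once, sorted alphabetically."""
    return sorted(word for word, count in raw.items() if count == 1)


def oldest_use(word, library):
    """Earliest year among the library texts that contain the word, or None.

    `library` maps a title to (year, set of tokens). This layout is an
    assumption made for the sketch, not the project's data model.
    """
    years = [year for year, tokens in library.values() if word in tokens]
    return min(years) if years else None


# Toy sentence, invented for illustration (not a quotation from Il Saggiatore).
sample = "il libro è grande e il libro è aperto"
tokens = tokenize(sample)
raw, relative = frequencies(tokens)
hapaxes = hapax_legomena(raw)

# Hypothetical library entries: titles, years, and word sets are made up.
toy_library = {
    "Toy text A": (1321, {"libro", "grande"}),
    "Toy text B": (1581, {"libro", "aperto"}),
}

print(raw["libro"], round(relative["libro"], 3))
print(hapaxes)
print(oldest_use("aperto", toy_library))
```

A real pipeline would also need lemmatization, so that inflected forms of the same Italian word are counted together, which this sketch leaves out.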
Project Status & Next Steps
Getting the code to this point over the course of a semester is already a success. We realized that we need more data to continue:
- More contextual documents from Galileo’s library (see Creating Digital Texts for more information)
- A way to automate the identification of potentially interesting pairings or features that exhibit patterns worth further exploration (also explored in the GaLiLeO project)
While the project is on hold pending more data and more time to build out the general code, ongoing research is focusing specifically on the hapax legomena in Galileo’s Il Saggiatore.
From here:
- Explore the GaLiLeO suite of tools
- Explore the use of Large Language Models for this research
- Explore the Interactive Shelves
- Explore the Virtual Library
