Fourgrams

Based in part on a presentation at RSA in Boston, 2016…

Early work that compares ngram patterns in Ariosto’s poem, Tasso’s poem, and Galileo’s treatise on comets has revealed twice as many allusions to Ariosto as previously known and one implicit quotation from Tasso that has never been identified by critics, but has existed in plain sight to readers on the first page of the Saggiatore for centuries. I undertook this work with framing questions similar to those that informed the work on Galileo’s relationship to Ariosto and Tasso, the particular valence of abbarbagliare, and computational parallax. That is, how can we see the relationship between Galileo’s prose and the written culture of his day? More ambitiously, can understanding that relationship help us to better assess the mechanisms by which his correct and incorrect theories gained traction or met opposition beyond his circle of supporters?

This, of course, raises other questions. At a certain level of granularity, all of the texts from the period might look similar, even though we are learning that the frequencies of articles, pronouns, and conjunctions can help to classify similar texts. The method has gained popular attention as well, as shown by an NPR story from 2014 discussing how many identity markers can be inferred from such details: age, gender, background, relationship with a conversation partner, or whether someone is lying. If we are not aware of such subtleties, could we assume that an early modern reader was? Are we naive even to assume that late Renaissance Italian readers could perceive a “sound” in written language distant from them by only a century?

Here is a sample from the studies that I have run to see how similar popular late-Renaissance Italian texts are to Ariosto’s Orlando furioso. The titles in the left-most column are abbreviations of Galileo’s Massimi Sistemi/Chief World Systems (1632), Galileo’s Saggiatore/Assayer (1623), Marino’s Adone/Adonis (1623), Giordano Bruno’s Degli eroici furori/On the Heroic Frenzies (1585), Paolo Sarpi’s Istoria del concilio tridentino/History of the Council of Trent (1619), and Tasso’s Gerusalemme liberata/Jerusalem Delivered (1581). The comparison was made using transcriptions of modern critical editions available from Biblioteca italiana, a choice which I explain in detail in the work on computational parallax.

Percent similarity between ngrams in select Italian texts and Ariosto’s poem.
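
One way a figure of this kind could be computed is sketched below. It is not the procedure behind the table above: it assumes that fourgrams are sequences of four consecutive words, that tokens are lowercased runs of letters, and that "percent similarity" means the share of a text's distinct fourgrams that also occur in the reference text. The function names (tokenize, ngrams, percent_shared) are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its runs of letters (accents included)."""
    # [^\W\d_] matches any Unicode letter; punctuation and apostrophes split words.
    return re.findall(r"[^\W\d_]+", text.lower())


def ngrams(tokens, n=4):
    """Return every sequence of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def percent_shared(text, reference, n=4):
    """Percent of the distinct n-grams in `text` that also occur in `reference`.

    Assumption: this is only one plausible definition of "percent similarity".
    """
    own = set(ngrams(tokenize(text), n))
    if not own:
        return 0.0
    ref = set(ngrams(tokenize(reference), n))
    return 100.0 * len(own & ref) / len(own)
```

As a toy check, take the opening of the Furioso ("Le donne, i cavallier, l’arme, gli amori, le cortesie, l’audaci imprese io canto") as the reference and an invented phrase such as "canto l’arme, gli amori e le donne" as the text. Under these assumptions the phrase contains five distinct fourgrams, one of which ("l arme gli amori") also occurs in the reference, so percent_shared returns 20.0. A real comparison would need decisions about elision, spelling variants, and lemmatization that this sketch ignores.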

In an article that I am developing, I look closely at these patterns of similarity and argue that Galileo’s Ariosto, that is, the combination of similarities between his Saggiatore and the Furioso, is not the same Ariosto that his contemporaries are incorporating into their work. A sample of these results can be seen in the following graph, which shows the top fourgrams that Galileo’s letter on comets shares with the other works in this sample set.

Relative frequencies of the top fourgrams used by Galileo that are also in Ariosto’s poem, compared with the relative usage of those fourgrams by other authors.
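
The comparison in the graph can be approximated with a short sketch. Again, this is an illustration under stated assumptions rather than the code used for the graph: fourgrams are taken to be four consecutive lowercased words, and relative frequency is taken to be a fourgram's count divided by the total number of fourgrams in the same text. The names fourgram_counts, relative_frequencies, and top_shared are hypothetical.

```python
import re
from collections import Counter


def fourgram_counts(text, n=4):
    """Count every sequence of n consecutive lowercased words in `text`."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def relative_frequencies(counts):
    """Divide each count by the total number of n-grams in the text."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def top_shared(source, reference, others, k=10, n=4):
    """Relative frequencies of the k most frequent n-grams of `source`
    that also occur in `reference`, reported for `source` and for every
    text in `others` (a dict mapping a label to a text).
    """
    src = fourgram_counts(source, n)
    ref = fourgram_counts(reference, n)
    # Keep only the n-grams of the source that the reference also uses.
    shared = Counter({gram: c for gram, c in src.items() if gram in ref})
    top = [gram for gram, _ in shared.most_common(k)]
    tables = {"source": relative_frequencies(src)}
    for label, text in others.items():
        tables[label] = relative_frequencies(fourgram_counts(text, n))
    # An n-gram absent from a text gets a relative frequency of 0.0 there.
    return {gram: {label: tbl.get(gram, 0.0) for label, tbl in tables.items()}
            for gram in top}
```

Here source would stand for the Saggiatore, reference for the Furioso, and others for the remaining works in the sample set; the result maps each shared fourgram to one relative frequency per text, which is the shape of data a grouped bar chart needs. Dividing by each text's own total is what lets works of very different lengths be compared.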

The graph offers one glimpse of a new aspect of the relationship between Galileo and the written language of the time, chiefly the kinds of common erudite vocabulary that he shared with other authors. I see this discovery as a way to allow me to focus on words and word forms that would be more pronounced as peculiar or marked terms in the period. In this sense, I am following the theory of Wai Chee Dimock, who has proposed measuring literature through fractals, what she defines as the bumps and dents in the surface of the text that connect it unexpectedly to texts outside its time period, its body of national literature, or its presumed genre. Akin to Deleuze and Guattari’s rhizomes or Wittgenstein’s “family resemblance”, these fractals resist aggregating and averaging, but are a unit of measure nonetheless, robust across scales. Just as Dimock sees the presence of foreign words in Dante’s Divina Commedia as these dents, bumps, or fractals, I think the Ariostan vocabulary should also be a similar indication of the presence of families of ideas.