Matthew W. Wilson Lecture: Quantified Self-City-Nation: Digital Systems for Attentional Control


Monday, April 28
7:00 p.m.
Lancaster Lounge, Moulton Union

- Open to the public -

Matthew Wilson’s presentation draws parallels between the rising consumer-electronics sector associated with personal activity monitors and the rapid visioning of smart urbanism. He interrogates developments in interoperability and propriety, competition and habit, fashion and surveillance. He addresses the sociocultural and political implications of this refiguring of spatial thought and action, as well as the capacities reinforced and developed through the implementation of these technologies and techniques.

Matthew Wilson is currently Visiting Assistant Professor of Landscape Architecture and Urban Planning and Design at Harvard University and Assistant Professor of Geography at the University of Kentucky, where he co-directs the New Mapping Collaboratory. Matt holds a PhD in Geography from the University of Washington.

Sponsored by Bowdoin’s Digital and Computational Studies Initiative.

Reflections from “Data Driven Societies” by Rita Liao ’15

In the past year, the media, the school’s career counselors, and my coder friends have been telling me that I should learn to code. From a young age I had figured that computing was not my strong suit, so I resisted the advice. It was not until I interned at a digital marketing startup that I wanted to learn more about the algorithms behind marketing. Data and new technology allow for more effective product marketing, and coding skills give one a competitive advantage in today’s job market. At the same time, it was the dangerous power of data and algorithms that I saw at my internship that prompted me to become more computationally literate through the new Digital and Computational Studies Initiative course, “Data Driven Societies.”

It turns out that computational thinking is not just programming. Rather, it is a way of thinking, and I have benefited from the practical side of the course. For example, I have become much more adept at Excel and other graphing and mapping programs, and I can comprehend data visualizations much faster. Without a doubt, computing is empowering and, more importantly, the ethical aspects of technology that I learned about in the class allow me to make better normative evaluations of the use of data.

I used to think that I could just observe as an outsider, but a friend of mine, a coder, was right to ask, “How could you write about coding without understanding how it works?” This was also an unexpected realization. Learning computational skills is akin to developing analytical and critical thinking in my humanities classes – they are hard to translate onto a resume but will gradually unfurl their value in life. Just as I found parallels between writing a government essay and the procedure of due diligence at a past internship, I now find myself more able to “think through” and predict the outcome of certain actions. Indeed, the development of logical thinking is working to counterbalance my old thinking habit as an artist who tends to dwell in murky places and jump to new terrains. Data driven societies are a future I feel I will not only take part in but also a future I can take part in leading.

Rita Chengying Liao is a junior at Bowdoin, majoring in Government and minoring in Art History. She grew up in Shenzhen, a growing metropolis that draws people from throughout China. She is presently exploring her interests in contemplative studies, media politics, and photography. It makes her happy to help others live more fulfilled lives, so she uses her spare time to teach meditation and yoga and to write for aspiring startups in China.

David Stork Lecture: Computer Vision in the Study of Art: New Rigorous Approaches to the Study of Paintings and Drawings

Computer Vision in the Study of Art: New Rigorous Approaches to the Study of Paintings and Drawings
4/21/2014 | 4:15 PM – 6:00 PM
Location: Visual Arts Center, Beam Classroom
Open to the Public
Sponsored by the DCSI

What can computers reveal about images that even the best-trained connoisseurs, art historians, and artists cannot? How much more powerful and revealing will these methods become? In short, how is the “hard humanities” field of computer image analysis of art changing our understanding of paintings and drawings?

David Stork’s lecture will include computer vision, pattern recognition and image analysis of works by Jackson Pollock, Vincent van Gogh, Jan van Eyck, Hans Memling, Lorenzo Lotto, and several others. You may never see paintings the same way again!

Dr. Stork, Rambus Fellow at Rambus Labs, holds degrees in physics from the Massachusetts Institute of Technology and the University of Maryland at College Park. He studied art history at Wellesley College, was Artist-in-Residence through the New York State Council on the Arts, and is a Fellow of the International Association for Pattern Recognition and of SPIE, in part for his work on computer image analysis of art. Sponsored by Bowdoin’s Digital and Computational Studies Initiative.

Lecture: Jessa Lingel “Facebook is Anti-Drag” (3/31 @ 7pm)


Facebook is Anti-Drag:
Issues of Online Community and Communication


  • 3/31/2014 | 7:00 PM – 8:30 PM
  • Location: Moulton Union, Lancaster Lounge
  • Event Type: Lecture
  • Sponsor: Digital and Computational Studies Initiative
  • Open to the Public

Online technologies have provided a means of storytelling, visualization, community building, and educational resources that have particular significance for groups that have been historically disenfranchised.

Jessa Lingel addresses the role of technology in the lives of a specific queer community, performers in Brooklyn’s drag scene. Her talk addresses both the benefits and limitations of social media platforms for members of this particular set of queer lives and the intersection of queer theory with internet studies.

Jessa Lingel is a postdoctoral research fellow at Microsoft Research New England, working with the Social Media Collective.

First DCSI Hackathon on February 26th, 6 p.m. – 9 p.m. in the VAC



Announcing the 1st Digital & Computational Studies Initiative Hackathon!

A hackathon is a space for programmers and designers, from novices to experts, to collaborate intensively on software projects.

Come start or work on a project, learn a new coding language, visualize your data, or study how to protect your online privacy!

VAC 3rd Floor
February 26th, 6pm-9pm

Digital Humanities Faculty Workshop a Success

Reblogged from the Bowdoin News: “Workshop Gives Faculty the Keys to a Digital World”

Nearly two dozen Bowdoin faculty members are taking a turn as students in a four-day course for faculty titled “Digital Humanities @Bowdoin,” taught January 13-16 as part of the College’s new Digital and Computational Studies Initiative.

It was the first day of class, and five rows of students were seated expectantly – some a little nervously – in a Searles computer lab. “In the next half hour I’m going to teach you everything I know about computers,” said Professor of Computer Science Eric Chown to his audience – which consisted not of undergrads but of nearly two dozen Bowdoin faculty members, representing disciplines such as Romance languages, film studies, art, chemistry, English, history, German, Russian, environmental studies, and math.

Although Chown may have been exaggerating just a little bit for effect, it’s no stretch to say that in today’s increasingly digital world, understanding even the basics of computer science can make a world of difference for scholars and teachers in any field. “Computers are good at things we’re not good at: reading 10,000 books at once, or counting the number of pixels in an image that are more red than green,” Chown said. “They’re fantastic at these things, and these things lead us to think a little bit differently about what we’re studying.”
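Chown’s “more red than green” example can be sketched in a few lines of Python; the pixel values below are hypothetical stand-ins for data read from a real image:

```python
# Hypothetical stand-in for an image: a flat list of (R, G, B) pixel values.
pixels = [(200, 50, 10), (10, 200, 30), (90, 80, 70)]

# Count the pixels that are "more red than green," per Chown's example.
red_heavy = sum(1 for r, g, b in pixels if r > g)
print(red_heavy)  # 2 of the 3 pixels have more red than green
```

The same one-line loop scales unchanged from three pixels to the millions in a scanned painting, which is exactly the point of the example.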

How do computer programs simplify a complex world into zeros and ones, and how do simple components interact to perform highly complex tasks? What kinds of methods and tools can harness computing power, and what limitations do they face? Those are some of the things that the faculty-turned-students were eager to learn from the four-day workshop “Digital Humanities @ Bowdoin,” co-taught by Chown and Professor of Art History Pamela Fletcher.

Learn About 5 Current Projects in the Digital Humanities at Bowdoin

While some participants came into the class more comfortable than others with digital methods, all were convinced of the need to know more. “The digital humanities is exciting and folks want to know what’s going on, or they want to dip their toe into it – because they see that colleagues at other institutions are doing projects, they see that agencies tend to fund people who are doing digital humanities projects, or they see that their students are interested in it,” Chown said.

So many faculty members wanted to sign up for the course that some had to be turned away. “The level of faculty interest is extraordinary,” said Dean for Academic Affairs Cristle Collins Judd. “It not only reflects a strong commitment to the continued development of faculty research and teaching, but also highlights the important opportunities offered by the Digital and Computational Studies Initiative.”

Since its conception in 2012, that initiative has gained impressive momentum. In addition to the steering committee headed by Fletcher and Chown, two full-time faculty have joined the cause: Postdoctoral Fellow in the Humanities Crystal Hall – whose own digital humanities research has led to her book Galileo’s Library, which will be published in February 2014 – and New Media and Data Visualization Specialist Jen Jack Gieseking.

The initiative also boasted the debut of a full-fledged course in fall 2013: “Gateway to the Digital Humanities,” taught by the same team of professors (read about the student course on p. 14-15 of the Fall 2013 Bowdoin Magazine). The course covered four major categories of digital humanities techniques – image analysis, text analysis, spatial analysis, and network analysis – a breakdown inspired by a November 2012 talk at Bowdoin by digital humanities specialist Anne Helmreich.

Fletcher and Chown had to turn down a deluge of requests from faculty members to sit in on the fall course, prompting them to start thinking about developing a January workshop for faculty. After gleaning some ideas from a Northeast Regional Computing Program consortium in Boston, members of the initiative began drafting a program based on the fall course – which, according to both professors and students, was a resounding success.

The first day of the faculty workshop gave an overview of how computers work and what the digital humanities can accomplish, with some image analysis built in (Chown demonstrated, for instance, a basic way of analyzing the color choices in Rembrandt’s “The Night Watch”).

“Programming is about abstracting, and scaling over and over and over, until you’re doing things that look really complicated – but the individual parts of it are very simple,” Chown said. “What makes programming so exciting in the digital humanities is that you can play; you can try things out. You can reverse the colors of a Van Gogh and find out that he was playing with negative space – something I discovered on my own just by playing around.”


The theoretical overviews were followed by hands-on experience: participants embarked on their first lab assignment and discovered just what it means to operate at a fundamental digital level. By typing in code using the programming language Python, they made tiny neon turtles maneuver around to create geometric shapes on their computer screens.

Throughout the rest of the course, participants had an opportunity to home in on the remaining three categories of analysis. Tuesday covered text analysis – using a tool called Voyant to assess word frequency, for instance – and Wednesday covered spatial analysis, with a special look at GIS projects that participants are already involved in. In today’s final session of the workshop, they’ll cover network analysis, using tools such as Gephi.

The goal of the course is not for everyone to become an expert programmer. It’s about gaining basic fluency in a discipline that’s closely tied to just about every other discipline. “Increasingly, the format for information circulation is digital,” Hall said. “Staying current in any field means at least understanding what’s going on with the digital component – the implications of interface choice, of media choices. It’s incredibly important.”

Just as important as the content covered in the workshop is the opportunity to exchange knowledge and ideas with the instructors and fellow participants. Humanities professors are getting a new perspective on their own fields from computer scientists, and the opposite is also true. “I see the humanities as a great source of ideas,” Chown said. For instance, humanities projects often run up against the limitations of digital tools that aren’t quite suited to the task at hand – providing fertile ground for innovation in computer science.

“The fun thing about this initiative has been gathering up a lot of people from art history, and computer science, and sociology, and earth and oceanographic studies, and all of these other disciplines, and getting them in a room, and having them talk about this stuff,” Chown said. “The ideas that have come out of it have just been phenomenal.”

Digital Reconstructions of Libraries

Libraries are very much on my mind these days as I grapple with the best methodologies for reconstructing and visualizing Galileo’s library. I am also working constantly with digital collections: institutional libraries, archives of organizations, and single studies of authors. Perhaps it is no surprise, then, that when first asked to suggest possible readings for the section of the Gateway to Digital Humanities course that focuses on textual analysis, I immediately recommended Jorge Luis Borges’s “The Library of Babel.”

To me this short story represents many of the possibilities and pitfalls of digital and computational library studies. Borges imagines a library that holds one copy of every book that could possibly be written. Some contain gibberish, others perfect copies of known works. Scholars live in the library searching for answers to questions about human experience. Ideological camps form and battles ensue, but all the while, even this hyperbolically complete library remains enigmatic to its users due to its sheer size. In parallel ways, computers have the potential to create a similar digital library. Natural language processing has already shown that computers can generate prose that has the “sound” of known authors like Immanuel Kant. Programming loops (of the kind the Gateway to Digital Humanities students are applying to images) perform the same action repeatedly (changing one pixel at a time, for example) and could conceptually be employed to produce the infinite variety of texts that populate “The Library of Babel.”

For readers of the Python programming language, I tried to express this impossible program in loop terms in Jython. Strings and concatenation would help, but I think this still conveys the message in a light-hearted form:

Screenshot (Crystal Hall, 2013) of JES Jython platform.

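In runnable Python (as opposed to the deliberately error-filled Jython in the screenshot), a finite corner of the Library can be enumerated with a loop; the alphabet and page length below are hypothetical and absurdly small, since Borges’s 25 symbols and 3,200-character pages would be astronomically beyond any machine:

```python
from itertools import product

# Toy parameters (hypothetical): a 4-symbol alphabet and 3-character "pages."
ALPHABET = "ab ."
PAGE_LENGTH = 3

def library_of_babel(alphabet=ALPHABET, length=PAGE_LENGTH):
    """Yield every possible page: all strings of `length` over `alphabet`."""
    for chars in product(alphabet, repeat=length):
        yield "".join(chars)

pages = list(library_of_babel())
print(len(pages))  # 4 ** 3 = 64 distinct pages
```

Even this toy version makes the pitfall visible: each added character multiplies the library’s size by the alphabet’s, so exhaustive generation collapses almost immediately.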

The above attempt at code (syntactically legal Jython, but an error-filled program) is a futile approach to bringing order to chaos. Some Digital Humanities (DH) scholars would argue that digital and computational studies could offer partial solutions for comprehending and organizing this vast quantity of textual information. This is quite optimistic, given that estimates suggest 340 million new 140-character tweets are posted on Twitter daily, not to mention the 3.77 billion (and growing) indexed pages on the World Wide Web.

Working even with the available (and manageable) digital data, certain assumptions are made by tools and certain information is lost in their application, all of which gives me pause as I reconstruct and try to find analytical pathways through the library of a person about whom ideological fields have been defined and passionate battles have been fought for centuries. Matt Jockers has led the field of DH with his work on macroanalysis, currently focused on establishing patterns in nineteenth-century fiction, but his approach relies on only the books for which a digital copy has been made. The Google Books Ngram Viewer allows users to compare the frequencies of words that appear in digital or digitized books during different time periods, but it assumes consistency of cataloguing and metadata entry across all participating institutions, which is not always the case.

Screenshot (Crystal Hall, 2013) of Google books Ngram Viewer.

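The Ngram Viewer’s core operation is simple: a word’s relative frequency within each time slice of a corpus. A toy sketch, with a hypothetical in-memory corpus standing in for millions of digitized books:

```python
from collections import Counter

# Hypothetical year-keyed corpus: each year maps to a list of "book" texts.
corpus = {
    1850: ["the telescope was new", "the stars moved"],
    1900: ["the engine roared", "steam and steel"],
}

def relative_frequency(word, year):
    """Fraction of all tokens in `year`'s books that equal `word`."""
    tokens = [t for text in corpus[year] for t in text.split()]
    return Counter(tokens)[word] / len(tokens)

print(relative_frequency("the", 1850))  # 2 of 7 tokens
```

The cataloguing assumption is visible even here: everything hinges on each text being filed under the right year, which is exactly the metadata consistency that cannot be taken for granted.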

As I revisit the data for my own project on Galileo, I wonder where I will enter the ideological disputes that surround the interested fields; I worry about what information will be excluded from the data; and I wonder how my users will navigate the digital library I am about to create.



Excel Data and Gephi Data Laboratory

My goal for this blog entry is to explain how to organize data within an Excel Spreadsheet (that will be saved as a Comma Separated Values file or .csv) to import into Gephi for visualization and analysis of nodes (individual elements represented as points) and edges (relationships represented by connective lines) in a network. My explanation assumes familiarity with the Gephi tutorials based on prepared .gexf files (the extension for files readable by Gephi) of Les Miserables or Facebook data. I assume that my reader is now thinking about applying network analysis to her own research.

New users of Gephi may not have any familiarity with .gexf files, XML mark-up, or other code for organizing data, but they can still make good use of Gephi. Excel is typically a more user-friendly application for this kind of organization, and most databases (Microsoft Access, for example) can be converted to an Excel workbook (.xls) or directly to a .csv file. The explanations below assume a basic understanding of storing, copying, and sorting data in Excel. The organizational principles described can be applied in whichever application you use to generate the tabular .csv files that you will load into Gephi. Other supported formats and their functionality can be found at Gephi’s site.

I am using screenshots from my own research data on the books in Galileo Galilei’s library to help demonstrate the kinds of information each column should contain. Below is a screen shot of one spreadsheet in the Excel workbook that I have used to organize all of my notes related to the project:

There are many spreadsheets listed in the tab bar at the bottom of the screen for the different kinds of information I have for the project. Importantly, a .csv file only retains the information in the active worksheet (“By author” in this case, the tab in white) and will not save the other sheets. It is important to copy the information you want to use from your primary workbook (multiple sheets) to a single-spreadsheet workbook for nodes and a single-spreadsheet workbook for edges. Also, the column headings in my workbook (“My#”, “Fav’s#”, “Author. Favaro’s full citation”, “Year”, etc.) are my shorthand and cannot be interpreted by Gephi, another reason that copying the information you want to use to new single-spreadsheet workbook files is highly recommended.

1)   You will need to create two .csv files: a node table and an edge table. I use Excel as my tabular application, and Excel files save by default to the .xlsx format. In order to get the .csv, you need to choose that option for file format when saving.

2)   The node table tells Gephi all of the possible nodes in a network and must have at least the columns Id and Label. There should be one line for every node that will appear in either column of the edge table:


This seems easy enough, but what kinds of information are best placed in the Id column, and how should that differ from the Label? The example above is taken from a spreadsheet that I use to organize information about Galileo’s library. All of my nodes in this example are the proper nouns that are found in titles in the library and the titles themselves (about 2650 nodes total). The example above is, in a word, clunky. It is redundant and ultimately makes my network visualization unreadable if I try to add labels over the nodes. Consider the following example in which full titles would become labels over roughly 650 nodes (obscuring nodes and edges in the process):


Having a unique identifying number (the Id that Gephi expects) allows me to store a lot of information about that node in a spreadsheet or database that I can later choose to access as necessary. Since my organizational system was created long before I knew about Gephi, my Label column corresponds to the Full Title column in my spreadsheet (which ultimately clutters my visualization to the point of illegibility if I add labels). To make this more readable, I need to change the data in the Label column to the data from a “Short Title” column.
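A minimal node table with just the Id and Label columns can be generated with Python’s csv module; the Ids and short titles below are hypothetical stand-ins for real catalogue data:

```python
import csv

# Hypothetical rows: Id is an arbitrary unique number, Label a short title.
nodes = [
    ("1", "Sidereus Nuncius"),
    ("2", "Cologne Academy"),
    ("3", "Dialogo"),
]

with open("nodes.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Id", "Label"])  # the two headings Gephi expects
    writer.writerows(nodes)
```

Writing the file programmatically, rather than saving from Excel, guarantees that only the two columns Gephi needs (and none of the shorthand headings) end up in the .csv.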

3)   As you might notice, there are other columns in the first screen shot for the node table. The node table can also include attributes (in parentheses in the example because they are not necessary for a basic visualization of a network). Attributes are a way to categorize data, perhaps by gender, race, age, etc. While not necessary for exploring data with Gephi, they allow for a more nuanced exploration of a network. For example, I will want to add attribute columns for religious affiliation (Jesuit, Benedictine, Protestant, Catholic, etc.) and genre to start visualizing the data in a way that helps me answer my research questions. Attribute columns can also be added in the “Data Laboratory” section of the Gephi interface even after you have loaded the .csv files for the nodes and edges.
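Adding an attribute is just a third column alongside Id and Label; a sketch with hypothetical names and affiliations:

```python
# Hypothetical nodes with an added attribute column for religious affiliation.
nodes = [("12", "Clavius"), ("13", "Castelli")]
orders = {"12": "Jesuit", "13": "Benedictine"}

node_rows = [["Id", "Label", "Order"]] + [
    [node_id, label, orders[node_id]] for node_id, label in nodes
]
print(node_rows[1])  # ['12', 'Clavius', 'Jesuit']
```

Once loaded, an attribute column like this can drive Gephi’s partition coloring, so all Jesuit-affiliated nodes, for instance, share a color.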

4)   The time interval is another optional column of information to include about your data, which may or may not be applicable or useful. I copy here a partial screenshot from the Gephi wiki as a reference:

The Gephi wiki also displays the code behind this process.


Thinking about my own dataset, I need a Time Interval column for every title that shows the earliest year that a book could have entered the library. I will stop my time intervals with Galileo’s death in 1642. From the examples in part 3, the time interval information would look like this in the .csv version of the spreadsheet, with the columns Id, Time Start, Time End:
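In plain .csv form that layout is just three columns; the rows below are hypothetical stand-ins for the real catalogue data:

```csv
Id,Time Start,Time End
1,1610,1642
2,1616,1642
3,1623,1642
```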




Once you have uploaded the .csv files, you can merge the Time Start and Time End columns in the Data Laboratory using the merge strategy “Create Time Interval.” This will concatenate and format what you need in order to view the change of the network over time.

5)   The edge table (the second .csv file that you need to create) then tells Gephi the connections that exist between the nodes. It must have the columns Source and Target:

This is where having a unique identifier for every node becomes very convenient. My Source above is the title to which I have given the identifier 299, in which the Cologne Academy is mentioned as a contributor. Book titles can mention people or places (Targets), but people or places cannot mention titles (Sources), so my edges are directed, and the distinction between source nodes and target nodes is critical.
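The edge table is built the same way as the node table; the sketch below uses the source’s 299 identifier with hypothetical target Ids:

```python
import csv

# Hypothetical directed edges: Source is a title's Id, Target the Id of a
# person or place that the title mentions.
edges = [
    ("299", "2"),  # title 299 mentions the Cologne Academy (node 2)
    ("299", "5"),
]

with open("edges.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Source", "Target"])  # the headings Gephi requires
    writer.writerows(edges)
```

Because every value in Source and Target is a node Id, each row here must correspond to a row in the node table, which is why the node table is loaded first.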

6)   Similarly to the node table, there are many optional categories that can add nuance to an analysis of a network. The edge table can also include a Label column to help categorize relationship types, a unique Id for the relationship (generated by Gephi), Attributes (e.g., family, friend, co-worker, classmate for social networks), and a Time Interval.

7)   The edge table can also include information not found in the node table. Type indicates whether the relationship is directed or undirected. This column can be auto-filled on upload and is visible in the Data Laboratory.

8)   Another option for the edge table is to weight relationships. Weight is your opportunity to give more importance to certain relationships by assigning them a numerical value.

Remember to save the files as .csv, then load them in Gephi, nodes first, using the “Import .csv” option in the Data Laboratory toolbar.  Be sure to indicate which type of file you are uploading (node table or edge table), otherwise you risk error messages.

Data can simply be input directly into the Data Laboratory of Gephi, but I am most familiar with the functionality of Excel, have organized my research data using spreadsheets, and prefer to make adjustments, filter data, and store my information in one format. Programming languages such as R seem particularly adept at creating the tabular information needed here, particularly when automatically pulling data from a large corpus.
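Pulling the tabular data from a corpus automatically, as suggested above for R, works equally well in Python; a minimal sketch in which the titles, names, and Ids are invented for illustration:

```python
# Hypothetical corpus scan: build Source,Target pairs by checking which
# known names appear in which titles.
titles = {
    "299": "Acta of the Cologne Academy",
    "300": "Letters to Kepler",
}
names = {"Cologne Academy": "2", "Kepler": "7"}

edges = sorted(
    (title_id, name_id)
    for title_id, text in titles.items()
    for name, name_id in names.items()
    if name in text
)
print(edges)  # [('299', '2'), ('300', '7')]
```

A loop like this turns hand-curated spreadsheet work into a repeatable step, though the matching rule (simple substring search here) embeds exactly the kind of tool assumption discussed earlier.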

My approach may not work for everyone or every project, but hopefully seeing real data in a raw format provides context for its presentation in the Data Laboratory:



In turn, that should make the analysis of something as complex as the visualization of the connections between names in Galileo’s library less opaque.


“Terms and Conditions May Apply” Storify of Screening and Discussion

Following the screening of “Terms and Conditions May Apply,” Profs. Elias & Gieseking (Government, Digital and Computational Studies) of Bowdoin, and USM Prof. Clearwater (Law) held a brief discussion of the film and issues of privacy, transparency, and participation on the Internet. The Storify of the highlights of their discussion is below.