{"id":691,"date":"2017-05-19T10:29:45","date_gmt":"2017-05-19T15:29:45","guid":{"rendered":"https:\/\/researchbdev.wpengine.com\/digital-computational-studies\/?p=691"},"modified":"2017-05-19T10:29:45","modified_gmt":"2017-05-19T15:29:45","slug":"methodology","status":"publish","type":"post","link":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/digital-computational-studies\/methodology\/","title":{"rendered":"Methodology"},"content":{"rendered":"<p>The digital aspect of this project involves creating a visual network of communication using a combination of Gephi and R scripts.\u00a0 The goal of this network is to visualize how information moves throughout the community.\u00a0 I will represent each character in the novel as an individual node and consider how the characters interact with one another.\u00a0 Revealing this network will help answer a number of important questions about the text.\u00a0 Who has and shares knowledge, and who receives it?\u00a0 How does that knowledge move between people?\u00a0 Who is the object of the most gossip, and where does the gossip end?\u00a0 Beyond the value of the model itself, the process of creating it raises a multitude of questions about the text.\u00a0 We might consider how Austen structures the text around dialogue and narration and when the narrator interrupts sequences of dialogue.\u00a0 Further, attempting to divide the text into units raises important questions about chapter divisions: Is there only one conversation per chapter? 
Do conversations ever span multiple chapters?\u00a0 Why does Austen structure the text in this way, and what can we learn from it?\u00a0 These are just a few of the questions that digitizing Austen\u2019s communities can raise.<\/p>\n<p>My networks track dialogue between characters.\u00a0 However, digitizing dialogue proved to be more complicated than it initially appeared.\u00a0 In creating a network of dialogue, one might expect to portray the characters as individual nodes with edges connecting them to show interactions.\u00a0 But how do we determine where a connection starts and where it ends?\u00a0 Let us start with a basic example.\u00a0 Character A is having a conversation with Character B.\u00a0 We might digitize this with edges running from A to B and back again for each piece of dialogue.\u00a0 We might then complicate this picture by adding a third person, C, to the conversation.\u00a0 When A speaks, is he speaking to B or C, or perhaps both?\u00a0 How do we determine this?\u00a0 One way may be to assume that the next person to speak is the one to whom the first character is speaking. 
Another would be to assume that each character is always speaking to all the characters in the group.\u00a0 Both of these methods raise problems of accuracy.\u00a0 Further, let us suppose that A, B, and C are gossiping about a fourth person, D.\u00a0 How do we portray this connection?\u00a0 Any automatic method of portraying gossip is inherently flawed, but working through these problems also helps us to consider how dialogue works in the text.<\/p>\n<p>In my research, I discovered a project similar to my own entitled <em>Austen Said: Patterns of Diction in Jane Austen\u2019s Major Novels<\/em>.\u00a0 This project explores forms of discourse in Austen\u2019s novels, most prominently free indirect discourse.\u00a0 The researchers used XML markup to tag each passage in all of Austen\u2019s major works with its speaker and its form of discourse (direct discourse, indirect discourse, or free indirect discourse).\u00a0 The researchers have helpfully provided their marked-up data on their website, and I use this data in my own study of Austen\u2019s novels.\u00a0 Although <em>Austen Said<\/em> primarily focuses on free indirect discourse, its XML markup does provide information about who \u201csays\u201d each passage in the book, which I used to generate my graphs automatically.<\/p>\n<p>I decided to digitally visualize the full novel as well as two individual chapters in greater detail.\u00a0 I outlined the following steps for creating my graphs:<\/p>\n<ol>\n<li>Automatically generate a graph of gossip in the entire novel in which each character speaks to the next character who speaks, using the XML markup from <em>Austen Said<\/em> to identify speakers and listeners.<\/li>\n<li>Automatically generate a graph of each of the individual chapters of focus using the same method as in step 1.<\/li>\n<li>Automatically generate a graph of gossip in the entire novel in which each character speaks about any character they name in their dialogue. 
For example, if Emma mentions Mrs. Weston in her dialogue, I draw an edge from Emma to Mrs. Weston.<\/li>\n<li>Replicate step 3 for each individual chapter of focus.<\/li>\n<li>Manually re-create the chapter graphs generated in step 2, based on my own reading of the chapters.<\/li>\n<li>Manually re-create the chapter graphs generated in step 4, based on my own reading of the chapters.<\/li>\n<li>Add gender as a node attribute.<\/li>\n<\/ol>\n<p>The first step of the project was to perform an initial analysis of the data for the whole novel.\u00a0 This initial phase identified the speaker of each passage using the \u201cwho\u201d attribute of the markup.\u00a0 The speaker named in the next passage\u2019s \u201cwho\u201d attribute became the recipient of the current passage.\u00a0 This step ignores the narrator and characters speaking as each other, and it excludes any additional information, such as gender or social class.\u00a0 This version of the graph is inherently flawed, as it makes a whole set of assumptions about the text, such as the idea that each character is addressing the next character who speaks.\u00a0 However, it provides a basic starting point from which to proceed.\u00a0 As this step was almost entirely automated, I performed it first on the novel as a whole, and then on the individual chapters.<\/p>\n<p>In order to accomplish this, my advisor, Professor Crystal Hall, provided me with a piece of code in R that identifies the tagged speaker of each passage and uses the following speaker as the recipient of that passage.\u00a0 I ran this code on the XML markup of the text and imported the resulting spreadsheet into Gephi.\u00a0 Then, I identified the characters I did not want to include in the graph.\u00a0 These included the narrator, the narrator speaking as a character (indicative of free indirect discourse), and characters speaking as each other.\u00a0 I wanted to see gossip as it occurs between characters, so although we might consider the narrator a gossip in this novel, I eliminated her from the graph.\u00a0 Additionally, I didn\u2019t want \u201cEmma as Knightley\u201d to appear as a separate character in 
the graph, so I deleted these unnecessary nodes.\u00a0 Then, I rearranged the nodes in order to see them all individually.<\/p>\n<p>The next step was to create a graph that shows characters talking about one another.\u00a0 Again, Professor Hall provided me with an R script that searches for a list of character names and creates edges from the character speaking to those they speak about.\u00a0 This script also accounted for time, considering one chapter as a unit of time.\u00a0 I developed the list of character names, making sure to include variations like \u201cJane,\u201d \u201cMiss Fairfax,\u201d and \u201cJane Fairfax.\u201d\u00a0 In Gephi, I then combined nodes that were different names for the same character.\u00a0 I performed this analysis on both the whole novel and each individual chapter.\u00a0 Next, I repeated each of the steps above manually for the individual chapters, recording my own interpretations of the chapters in an Excel spreadsheet.\u00a0 Finally, I retroactively added gender as an attribute to all of my graphs so that I could color the nodes accordingly.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The digital aspect of this project involves creating a visual network of communication using a combination of Gephi and R script.\u00a0 The goal of this network is to visualize how information moves throughout the community.\u00a0 I will visualize each character in the novel as an individual node and consider how they interact with one another.\u00a0 
[&hellip;]<\/p>\n","protected":false},"author":105,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_genesis_hide_title":false,"_genesis_hide_breadcrumbs":false,"_genesis_hide_singular_image":false,"_genesis_hide_footer_widgets":false,"_genesis_custom_body_class":"","_genesis_custom_post_class":"","_genesis_layout":"","footnotes":""},"categories":[1,25],"tags":[26],"class_list":{"0":"post-691","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-digital-computational-studies","7":"category-jane-austen","8":"tag-jane-austen-project","9":"entry"},"_links":{"self":[{"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/posts\/691","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/users\/105"}],"replies":[{"embeddable":true,"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/comments?post=691"}],"version-history":[{"count":0,"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/posts\/691\/revisions"}],"wp:attachment":[{"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/media?parent=691"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/categories?post=691"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/research.bowdoin.edu\/digital-computational-studies\/wp-json\/wp\/v2\/tags?post=691"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}