Example: Slave Narratives – WordSeer Project Page

The North American antebellum slave narratives are a collection of works written by fugitive slaves in the decades before the Civil War with the support of abolitionist sponsors. Scholars agree about the slave narrative’s most basic conventions, but these narratives, with their extreme repetitiveness, likely also manifest other regular features that have yet to be detected. This project aimed to uncover these patterns with computer-assisted techniques.

In this case study, we began with a specific task in mind: investigating James Olney’s master plan for slave narratives. In his 1984 work on the subject, Olney set out a number of narrative stereotypes that, he asserted, were “so early and firmly established that one can imagine a sort of master outline drawn from the great narratives and guiding the lesser ones”. His master plan began as follows:

1. a first sentence beginning “I was born…”, then specifying a place, but not a date of birth;
2. a sketchy account of parentage often involving a white father;
3. a description of a cruel master, mistress, or overseer, details of first observed whipping and numerous subsequent whippings, with women very frequently the victims;
4. an account of one extraordinarily strong, hardworking slave – often “pure African” – who, because there is no reason for it, refuses to be whipped;
5. record of the barriers raised against slave literacy and the overwhelming difficulties encountered in learning to read and write;

It continued on to:

10. description of successful attempt(s) to escape, lying by during the day, travelling by night guided by the North Star, reception in a free state by Quakers who offer a lavish breakfast and much genial thee/thou conversation;

And so on, in this specific manner. Our question was simple: “how true is this assertion?” Are these narrative conventions really present in all the narratives? Are there narratives that do not match these patterns? Are there some conventions more “conventional” than others?

We chose this area of exploration because it satisfied two criteria. First, it was interesting to the literary scholars we were collaborating with, and not just a task created to demonstrate the computational powers of a tool. Second, it was a novel problem in the field of text analysis systems, and therefore interesting to the computer scientist in the collaboration, not just a simple application of previously existing tools.

We proceeded with the following two-phase plan of action. In the first phase, we would find examples of the stereotypes: we would use the tool to gather examples, and attempt to translate these instances of stereotypes into some representation in terms of words and the relationships between them. In the second phase, we would develop a computational tool for examining the prevalence of any particular stereotype, and for comparing different stereotypes. We would use this tool to examine the patterns of occurrence of the various computationally represented stereotypes, and draw conclusions about Olney’s master plan.

Each phase of the plan presented a new computational challenge. In the first phase, we would have to develop a way to easily find instances of various stereotypes. In the second, we would have to create a way to get a sense of the distribution and prevalence of particular stereotypes.

Tools

In the first phase, we built a tool to help gather examples of the various narrative stereotypes. In the second phase, we analyzed the gathered examples and came up with queries representing the stereotypes.

Grammatical Search

We needed a good way to search the narrative collection for stereotypes. Keyword search, familiar to us from web search engines, is only an approximate way of searching, because our true information need cannot always be expressed as a collection of words. When searching for descriptions of cruel overseers, the words “cruel”, “harsh”, and “overseer” come to mind. Simply typing the words “cruel harsh overseer” into a keyword search, however, does not work well: it matches text that merely contains those words, whatever the relationship between them. A more precise way of expressing our information need is the following: text in which an overseer is described as cruel or harsh.

To address this problem, we created grammatical search. Existing natural language processing technology has made it possible to automatically extract grammatical relationships between words. For example, in the sentence “The good God has given every man intellect”, it is possible, using freely-available tools, to automatically extract that the adjective “good” is being applied to the word “God”, and that “God” is the agent of the verb “give”. The process of extracting this information from text is called dependency parsing, and the relationships between words are called dependencies.
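The dependencies described above can be pictured as simple (relation, governor, dependent) records. The following is a minimal sketch, not the project's actual data model: the `Dependency` type and the `relations_of` helper are hypothetical names, and the triples for the example sentence are written by hand in the style of common dependency labels, so a real parser's output may differ.

```python
from typing import NamedTuple


class Dependency(NamedTuple):
    """One grammatical relationship between two words (hypothetical type)."""
    relation: str   # e.g. "amod" (adjective modifier), "nsubj" (subject)
    governor: str   # the head word
    dependent: str  # the word attached to the head


SENTENCE = "The good God has given every man intellect"

# Hand-written for illustration; a real dependency parser may label
# some of these relations differently.
DEPENDENCIES = [
    Dependency("det", "God", "The"),
    Dependency("amod", "God", "good"),      # "good" is applied to "God"
    Dependency("nsubj", "given", "God"),    # "God" is the agent of "give"
    Dependency("aux", "given", "has"),
    Dependency("det", "man", "every"),
    Dependency("iobj", "given", "man"),
    Dependency("dobj", "given", "intellect"),
]


def relations_of(word, deps):
    """Return every dependency in which `word` takes part."""
    return [d for d in deps if word in (d.governor, d.dependent)]


for dep in relations_of("God", DEPENDENCIES):
    print(f"{dep.relation}({dep.governor}, {dep.dependent})")
```

Storing each relationship as a small record like this is what makes it possible to search on the relationship itself, and not only on the words.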

In a novel step, we created a search engine based on this technology. We performed dependency parsing on the entire slave narratives collection and created a way for a scholar to specify the grammatical relationships between words. Grammatical search allows users to be precise about the relationships between query words. For example, in Figure 1, instead of just typing in “overseer cruel” to retrieve sentences in which “overseer” is described as “cruel”, the scholar directly specifies that the amod (adjective modifier) relationship should exist between “overseer” and “cruel”. Figure 1 shows the results. Even though there are only six, each result precisely represents an instance of an overseer being described as cruel.

Figure 1: Grammatical search for the cruel treatment stereotype.
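The difference between keyword search and grammatical search can be sketched in a few lines. This is an illustration under stated assumptions, not the search engine itself: the three sentences are invented, their dependency triples are hand-written, and the function names `keyword_search` and `grammatical_search` are hypothetical.

```python
# Toy corpus: invented sentences with hand-written dependency triples
# in the form (relation, governor, dependent).
CORPUS = [
    {"text": "The cruel overseer whipped her.",
     "deps": [("det", "overseer", "the"), ("amod", "overseer", "cruel"),
              ("nsubj", "whipped", "overseer"), ("dobj", "whipped", "her")]},
    {"text": "The overseer pitied the victims of a cruel master.",
     "deps": [("det", "overseer", "the"), ("nsubj", "pitied", "overseer"),
              ("dobj", "pitied", "victims"), ("amod", "master", "cruel")]},
    {"text": "My master was kind.",
     "deps": [("poss", "master", "my"), ("nsubj", "kind", "master")]},
]


def keyword_search(corpus, words):
    """Return sentences that contain every query word, in any relationship."""
    wanted = {w.lower() for w in words}
    hits = []
    for sentence in corpus:
        tokens = {t.strip(".,;:!?").lower() for t in sentence["text"].split()}
        if wanted <= tokens:
            hits.append(sentence["text"])
    return hits


def grammatical_search(corpus, relation, governor, dependent):
    """Return sentences in which `dependent` stands in `relation` to `governor`."""
    target = (relation, governor.lower(), dependent.lower())
    return [s["text"] for s in corpus
            if any((r, g.lower(), d.lower()) == target for r, g, d in s["deps"])]


print(keyword_search(CORPUS, ["overseer", "cruel"]))
print(grammatical_search(CORPUS, "amod", "overseer", "cruel"))
```

In this toy corpus the keyword query also matches the second sentence, where it is the master who is cruel, while the grammatical query matches only the sentence in which “cruel” modifies “overseer”.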

The power of grammatical search does not stop at issuing precise queries. It can also be used for discovery. Leaving one side of the search box blank returns all the words that match the query in the form of a bar graph. For example, as shown in Figure 2, we can search for all the “sources of cruelty” by leaving the first box blank and issuing the query “_ (described as) cruel”. This returns all the words that are modified by the adjective “cruel”, arranged in order from most to least frequent.

Figure 2: All words described as cruel, filtered (by clicking) on the word “master”. The visualization creates an explorable picture of the sources of cruelty in a slave’s life as reflected by the narratives in the collection.

All the words that are ever described as “cruel” are displayed in an interactive graph. We can immediately see that there are close to 40 instances of “cruel treatment”, followed by “cruel master”, “cruel man”, and “cruel masters”. Clicking on a word in the graph filters the result set to match only that word. In the figure, the results have been filtered on the word “master” – zooming in on the 24 instances of “cruel master”.
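A discovery query of this kind amounts to fixing the relation and one word, then counting whatever fills the blank slot. The sketch below assumes dependencies are available as (relation, governor, dependent) triples. The function name `discover` is hypothetical, and the counts in the toy data are made up for illustration; they are not the figures from the collection.

```python
from collections import Counter

# Toy triples (relation, governor, dependent); counts are illustrative only.
TOY_TRIPLES = (
    [("amod", "treatment", "cruel")] * 3
    + [("amod", "master", "cruel")] * 2
    + [("amod", "man", "cruel"),
       ("amod", "master", "kind"),
       ("nsubj", "whipped", "master")]
)


def discover(triples, relation, governor=None, dependent=None):
    """Count the words filling the blank slot, most frequent first.

    Exactly one of `governor` / `dependent` must be left as None.
    """
    if (governor is None) == (dependent is None):
        raise ValueError("leave exactly one of governor/dependent blank")
    counts = Counter()
    for rel, gov, dep in triples:
        if rel != relation:
            continue
        if governor is None and dep == dependent:
            counts[gov] += 1
        elif dependent is None and gov == governor:
            counts[dep] += 1
    return counts.most_common()


# "_ (described as) cruel": which words are modified by "cruel"?
for word, count in discover(TOY_TRIPLES, "amod", dependent="cruel"):
    print(f"{word:<10} {'#' * count} {count}")
```

Filtering on one word, as done by clicking in the interface, would correspond to keeping only the matching sentences for that word.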

We believed that using this tool, our literature scholar collaborators would have a much easier time locating instances of Olney’s stereotypes. By allowing them to see the grammatical “neighborhoods” of words, the system could also help them discover other words relevant to the stereotype. This information could previously only be learned from reading. With this tool, scholars would be able to make a quick assessment of how widespread a certain grammatical construction was, and determine whether or not it was prevalent enough to be considered stereotypical.

Newspaper-column visualization

Figure 3: The distribution of the stereotype “I was born”.

The ability to investigate the prevalence of a stereotype was central to the second phase of our analysis. To this end, we developed a visualization of the entire collection (Figure 3) using the newspaper-column visualization. Each vertical column is a narrative. The narratives are segmented into vertical blocks corresponding to 30 sentences each. A block is highlighted if the term occurs anywhere within those sentences. Essentially, this amounts to arranging the narratives side by side, and highlighting occurrences of the query in a given color. Such visualizations are popular in text analysis interfaces as a way of showing a user the distribution of a search query.
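The block-highlighting step can be sketched as follows. This is a minimal illustration, assuming each narrative is already split into a list of sentences and that a "hit" is a case-insensitive substring match; the function names and the placeholder narrative are hypothetical, and the text rendering stands in for the colored columns of the actual interface.

```python
def highlight_blocks(sentences, query, block_size=30):
    """Split a narrative into blocks of `block_size` sentences and mark
    each block True if the query occurs anywhere inside it."""
    query = query.lower()
    blocks = []
    for start in range(0, len(sentences), block_size):
        block = sentences[start:start + block_size]
        blocks.append(any(query in sentence.lower() for sentence in block))
    return blocks


def render_column(blocks):
    """Draw one narrative as a vertical column: '#' = highlighted block."""
    return "\n".join("#" if hit else "." for hit in blocks)


# Placeholder narrative: 70 sentences, with the phrase in the first one.
narrative = ["I was born in a place I shall not name."] + ["Filler sentence."] * 69

blocks = highlight_blocks(narrative, "I was born")
print(render_column(blocks))
```

A narrative of 70 sentences yields three blocks here (30, 30, and 10 sentences); placing many such columns side by side gives the collection-wide picture.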

Figure 4: Stereotypes compared. The visualization shows the occurrences of the “I was born” stereotype (blue) and the “North Star” guided escape stereotype (yellow/orange). The first is relatively more prevalent, and the two do not occur in the same places.

Figure 3 shows the distribution of the exact phrase “I was born”. It is immediately apparent that this phrase occurs mostly towards the beginnings of narratives. Compared with the stereotype about north-star-guided escapes (Figure 4), it also seems much more prevalent.
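The two observations read off the figures, how widespread a phrase is and where in a narrative it tends to fall, could also be put into numbers. The sketch below is one hypothetical way to do so, not a measure used in the study: it assumes each narrative is represented as a list of booleans, one per block, and the toy columns are invented.

```python
def prevalence(columns):
    """Fraction of narratives with at least one highlighted block."""
    return sum(any(col) for col in columns) / len(columns)


def mean_relative_position(columns):
    """Average position of highlighted blocks across all narratives,
    where 0.0 is the first block of a narrative and 1.0 the last.
    Returns None if nothing is highlighted."""
    positions = []
    for col in columns:
        for index, hit in enumerate(col):
            if hit:
                positions.append(index / (len(col) - 1) if len(col) > 1 else 0.0)
    return sum(positions) / len(positions) if positions else None


# Invented toy data: three narratives, one boolean per block.
born_columns = [[True, False, False], [True, False, False, False], [False, False]]
star_columns = [[False, False, True], [False, False, False, False], [False, False]]

print(prevalence(born_columns), mean_relative_position(born_columns))
print(prevalence(star_columns), mean_relative_position(star_columns))
```

In this toy data the first query appears in more narratives and only in opening blocks, which is the pattern the figures suggest for “I was born”.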

Analysis

Prevalent stereotypes

Figure 5(a): Widespread usage of words related to punishment.

Figure 5(b): Widespread description of punishments in cruel terms.

Figure 5: Evidence of the “cruel punishments” stereotype.

We were able to determine that there were indeed certain events that occurred so frequently in the collection as to be rightly called stereotypical. The first of these was the “cruel treatment” stereotype. Of the many listed by Olney, it was also the easiest to search for. As shown in Figure 5a, almost all the narratives had multiple occurrences of the simple keywords: “punish”, “beat”, or “whip”. Such events were not restricted to the few narratives Olney mentioned by name. Grammatical search (Figure 5b) reveals the other side of the stereotype. The numbers on the graph confirm the high prevalence revealed by the newspaper-column visualization, but the adjectives accompanying these actions paint a more complete picture: one of severe, cruel, and inhuman treatment.

Other similarly prevalent stereotypes were separation from parents (Figure 6a), escape (Figure 6b), and the “I was born” stereotype (Figure 3).

Figure 6: Prevalent stereotypes.

Less prevalent stereotypes

We were also able to identify at least two “stereotypes” that did not appear to be as prevalent in the collection as implied by Olney’s language: those of escapes guided by the north star, and of being received by Quakers.

As already shown in comparison with “I was born”, mentions of the north star (Figure 4) do occur in some narratives. Upon investigation, we also found that almost all occurrences of the “north star” are indeed related to escapes. The remaining occurrences referred to a periodical called “North Star”. The accounts of escape, however, are far from being representative of the collection as a whole. Instead they are clustered around a few narratives that mention the north star multiple times. In particular, the title “Uncle Tom’s Companions: Or, Facts Stranger Than Fiction. A Supplement to Uncle Tom’s Cabin: Being Startling Incidents in the Lives of Celebrated Fugitive Slaves” contains 20 of the 86 references to the north star in the collection. A much more conservative “stereotype” around the north star seems to be indicated: whenever the north star is mentioned, it is almost always in relation to escape, but the converse is not true. In these narratives, escape is not necessarily guided by the north star.
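The degree of clustering can be expressed as the share of all mentions that come from a single title. The short calculation below uses only the two counts given above (20 and 86); the helper name `share` is hypothetical.

```python
def share(part, total):
    """Return `part` as a fraction of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


TOTAL_MENTIONS = 86       # north star references in the whole collection
SUPPLEMENT_MENTIONS = 20  # references in "Uncle Tom's Companions"

supplement_share = share(SUPPLEMENT_MENTIONS, TOTAL_MENTIONS)
print(f"{supplement_share:.1%} of north star mentions come from one title")
```

That one title accounts for roughly 23% of all references, which is why the distribution cannot be taken as representative of the collection.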

Similarly, Quakers are mentioned in some, but not all, narratives (Figure 7). When mentioned, they are always examples of kindness and of sympathy towards the abolitionist cause. Nevertheless, this convention is far less prevalent than the cruelty, escape, or separation stereotypes. As with the north star, a more conservative restatement seems more appropriate. It is stereotypical to portray Quakers as sympathetic to slaves’ escapes, but not all escapes involve reception by Quakers.

In Conclusion

Over the course of our analysis, it became apparent that many stereotypes were beyond our computational grasp. Even though grammatical search could express certain concepts very precisely, others were more difficult to describe. These included the “white father” concept, the “barriers against slave literacy” concept, and the “description of amounts and kinds of food and clothing” stereotype. We identified the following two sources of this difficulty: the vocabulary problem, and the “event characterization” problem.

The vocabulary problem is a well-known problem in search interfaces. It refers to the great variety of words with which different people can express the same concept. In our case, it manifested itself as the problem of not knowing which words were used to describe a particular stereotype in the narratives. Our scholars would think of a few words and synonyms, but would not be able to capture the stereotype because of the great variety of words with which it was actually expressed in the text. Often, they would not be able to include these words in a search because doing so would also capture a lot of other events in the text that were not related to the stereotype.

The second problem we faced was that some stereotypes were hard to characterize in terms of grammatical relationships. The concept of a white father, for example, was usually conveyed over multiple lines of text, and interspersed with other events from the narrator’s childhood. It was rarely the case that the adjective “white” was directly used to describe “father”, but it was often understood that the father was white because, for example, he was also the master or the overseer.