The increasing prevalence of digitized source material in the humanities has led to uncertainty about how this suddenly available information will change scholars’ research methods. What balance will scholars strike between in-depth examination of a few sources and a more distant reading of a large number of them? As computer scientists and literary scholars, we see this as an opportunity to tackle a challenge shared by human-computer interaction and the humanities. Our focus is specifically on text collections: comparing texts, getting a sense of stylistic and thematic similarities, and tracing patterns of language use. These tasks are not widely supported by any current software, but humanities researchers who want to use digitized text collections on a larger scale will need to do exactly these things.

Our goal is for our English scholars to be able to use WordSeer to gather accurate information in a manner that satisfies them. Our project is innovative in three ways: first, in the advanced computational language processing and information retrieval that it brings to the service of humanistic analysis; second, in its emphasis on supporting the entire humanistic scholarly process, a cycle of reading, writing, exploration, and understanding; and third, in our case-study-based, user-centered development approach.

WordSeer is supported by a grant from the National Endowment for the Humanities, Office of Digital Humanities, NEH HK-50011.