Monday, March 4, 2013

What are the main points of argument in "Digital Humanities: Theorizing Research Practices" by Ted Underwood?

Underwood's thesis is that researchers in the humanities have grown to rely on search algorithms without fully understanding or theorizing how they work. He argues that this can lead to research that is "at odds with historicism" in that research is guided by the mathematics of the search algorithm rather than by historical knowledge.
For Underwood, the advent of full-text search in the 1990s made it possible to explore almost any verbal pattern in literature. Given a large enough data set, however, the results would always confirm whatever hypothesis one started with. Researchers are also often unwittingly guided to privilege certain results over others by the complex (and often proprietary) relevance algorithms used to rank results. In other words, one finds what one looks for. This is what Underwood calls "confirmation bias."
To combat this, Underwood suggests using tools that explore the verbal relationships that exist within a set of texts, rather than positing a relationship and searching for matches. For example, rather than searching for the terms "shame" and "blush" in eighteenth- and nineteenth-century literature, a researcher can use a topic modeling tool, which shows that the word most commonly associated with "blush" is "artless." The researcher's thesis can then evolve along with the relationships discovered in the data.
The point of this example, however, is not to argue that topic modeling is superior to keyword search; rather, Underwood suggests that humanists need to interrogate the digital tools they use in research in order to better understand how underlying algorithms affect their work.
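To make the contrast between searching and exploring more concrete, here is a minimal Python sketch. It is not Underwood's method and it is not topic modeling; it only counts which words share a passage with a target word, which is a far simpler stand-in for the idea of letting associations surface from the texts. The function name, the three passages, and the stopword list are all invented for illustration.

```python
import re
from collections import Counter

def cooccurring_words(passages, target, stopwords=frozenset()):
    """Count how many passages containing `target` also contain each other word."""
    counts = Counter()
    for passage in passages:
        # Lowercase and reduce each passage to its set of distinct words.
        words = set(re.findall(r"[a-z']+", passage.lower()))
        if target in words:
            counts.update(words - {target} - stopwords)
    return counts

# Invented toy passages, not drawn from any real corpus.
passages = [
    "her artless blush betrayed her",
    "an artless smile and a sudden blush",
    "he felt shame at the memory",
]
STOPWORDS = frozenset({"a", "an", "and", "at", "he", "her", "the"})

counts = cooccurring_words(passages, "blush", STOPWORDS)
print(counts.most_common(1))
```

The researcher never types "artless" or "shame" here. The associated words come out of the passages themselves, so even this small tool shapes what the researcher sees through its choices about passage boundaries and stopwords.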
