Journalist Kathryn Schulz began publishing a column called The Mechanic Muse in The New York Times on applications of computing technology to literary scholarship. Her first column, titled "What Is Distant Reading?", concerned the work to date of Franco Moretti, a Stanford professor of English and comparative literature, and his team at the Stanford Literary Lab.
"We need distant reading, Moretti argues, because its opposite, close reading, can’t uncover the true scope and nature of literature. Let’s say you pick up a copy of 'Jude the Obscure,' become obsessed with Victorian fiction and somehow manage to make your way through all 200-odd books generally considered part of that canon. Moretti would say: So what? As many as 60,000 other novels were published in 19th-century England — to mention nothing of other times and places. You might know your George Eliot from your George Meredith, but you won’t have learned anything meaningful about literature, because your sample size is absurdly small. Since no feasible amount of reading can fix that, what’s called for is a change not in scale but in strategy. To understand literature, Moretti argues, we must stop reading books.
"The Lit Lab seeks to put this controversial theory into practice (or, more aptly, this practice into practice, since distant reading is less a theory than a method). In its January pamphlet, for instance, the team fed 30 novels identified by genre into two computer programs, which were then asked to recognize the genre of six additional works. Both programs succeeded — one using grammatical and semantic signals, the other using word frequency. At first glance, that’s only medium-interesting, since people can do this, too; computers pass the genre test, but fail the 'So what?' test. It turns out, though, that people and computers identify genres via very different features. People recognize, say, Gothic literature based on castles, revenants, brooding atmospheres, and the greater frequency of words like 'tremble' and 'ruin.' Computers recognize Gothic literature based on the greater frequency of words like . . . 'the.' Now, that’s interesting. It suggests that genres 'possess distinctive features at every possible scale of analysis.' More important for the Lit Lab, it suggests that there are formal aspects of literature that people, unaided, cannot detect.
"The lab’s newest paper seeks to detect these hidden aspects in plots (primarily in Hamlet) by transforming them into networks. To do so, Moretti, the sole author, turns characters into nodes ('vertices' in network theory) and their verbal exchanges into connections ('edges'). A lot goes by the wayside in this transformation, including the content of those exchanges and all of Hamlet’s soliloquies (i.e., all interior experience); the plot, so to speak, thins. But Moretti claims his networks 'make visible specific ‘regions’ within the plot' and enable experimentation. (What happens to Hamlet if you remove Horatio?). . . ." (http://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse-what-is-distant-reading.html?pagewanted=2, accessed 06-25-2011).
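The network transformation described in the column can be illustrated with a minimal Python sketch. It assumes a small, hypothetical list of character pairs who exchange words; the pairs, the function names, and the use of connected components as a measure of how the plot "holds together" are illustrative assumptions, not Moretti's data or method. Characters become nodes, verbal exchanges become undirected edges, and removing a character shows which parts of the network lose their connection to the rest.

```python
from collections import defaultdict

# Hypothetical toy data: pairs of characters who speak to each other.
# Illustrative only; this is not the dataset used in Moretti's paper.
EXCHANGES = [
    ("Hamlet", "Horatio"),
    ("Hamlet", "Claudius"),
    ("Hamlet", "Gertrude"),
    ("Hamlet", "Ophelia"),
    ("Claudius", "Gertrude"),
    ("Claudius", "Laertes"),
    ("Horatio", "Marcellus"),
    ("Horatio", "Bernardo"),
]


def build_graph(exchanges):
    """Turn character pairs into an undirected graph (node -> set of neighbors)."""
    graph = defaultdict(set)
    for a, b in exchanges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def remove_character(graph, name):
    """Return a copy of the graph without the named node and its edges."""
    return {c: nbrs - {name} for c, nbrs in graph.items() if c != name}


def components(graph):
    """Return the connected components of the graph as a list of sets."""
    seen = set()
    result = []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        result.append(comp)
    return result


if __name__ == "__main__":
    full = build_graph(EXCHANGES)
    without_horatio = remove_character(full, "Horatio")
    print("components with Horatio:   ", len(components(full)))
    print("components without Horatio:", len(components(without_horatio)))
```

In this toy network, removing Horatio cuts Marcellus and Bernardo off from the court, which is the kind of experiment the column's parenthetical question gestures at. A real analysis would derive the edge list from the play's text rather than from a hand-written list.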